OpenbookQA

OpenbookQA is a benchmark for commonsense-intensive open-book question answering (Mihaylov et al., 2018).

  • Task: question answering
  • What: ?
  • When: ?
  • Who: ?
  • Language: English
  1. EM
  2. ECE (10-bin)
  3. EM (Robustness)
  4. EM (Fairness)
  5. Denoised inference time (s)
  6. # eval
  7. # train
  8. truncated
  9. # prompt tokens
  10. # output tokens
  11. # trials
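
To make the first two columns above concrete, here is a minimal sketch of how an exact-match score (EM) and a 10-bin expected calibration error (ECE) could be computed. The page does not say how the benchmark normalizes answers or forms its bins, so the whitespace stripping, the equal-width bins, and the function names `exact_match` and `expected_calibration_error` are all assumptions made for illustration, not the benchmark's own implementation.

```python
from typing import Sequence


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the two strings match after stripping whitespace, else 0.0.

    Assumption: only surrounding whitespace is ignored; the benchmark may
    normalize answers differently.
    """
    return 1.0 if prediction.strip() == reference.strip() else 0.0


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    num_bins: int = 10,
) -> float:
    """Binned ECE: sum over bins of (n_b / N) * |accuracy_b - mean_confidence_b|.

    Assumption: bins are equal-width over [0, 1]; other binning schemes
    (for example equal-mass bins) give different values.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * num_bins  # summed confidence per bin
    hit_sum = [0.0] * num_bins   # number of correct predictions per bin
    count = [0] * num_bins       # number of predictions per bin

    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for b in range(num_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / total) * abs(accuracy - mean_conf)
    return ece
```

In this sketch, EM for a run would be the mean of `exact_match` over all evaluated instances, and ECE would be computed from the model's confidence in each chosen answer together with whether that answer was correct.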