NarrativeQA

NarrativeQA is a benchmark for reading comprehension over narratives (Kočiský et al., 2017).

  • Task: question answering
  • What: passages are books and movie scripts, questions are unknown
  • When: ?
  • Who: ?
  • Language: English

  1. F1
  2. ECE (10-bin)
  3. F1 (Robustness)
  4. F1 (Fairness)
  5. Stereotypes (race)
  6. Stereotypes (gender)
  7. Representation (race)
  8. Representation (gender)
  9. Toxic fraction
  10. Denoised inference time (s)
  11. # eval
  12. # train
  13. truncated
  14. # prompt tokens
  15. # output tokens
  16. # trials
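
The first two columns in the list above, F1 and ECE (10-bin), can be illustrated with a minimal sketch. It is not the benchmark's own implementation. It assumes F1 means token-overlap F1 between a predicted and a reference answer, with only lowercasing and whitespace splitting as normalization, and that ECE uses 10 equal-width confidence bins. The function names `token_f1` and `expected_calibration_error` are hypothetical, and the actual normalization and binning scheme may differ.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 (assumed definition; normalization is simplified)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins (an assumption, not the benchmark's spec)."""
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of examples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

For example, `token_f1("the cat sat", "the cat")` has precision 2/3 and recall 1, which gives an F1 of 0.8 under these assumptions.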