The Pile

The Pile is a corpus for measuring language model performance across various domains (Gao et al., 2020).

  • Task: language modeling
  • What: ?
  • When: ?
  • Who: ?
  • Language: English, code
  1. BPB
  2. Denoised inference time (s)
  3. # eval
  4. # train
  5. truncated
  6. # prompt tokens
  7. # output tokens
  8. # trials
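
BPB in the list above stands for bits per byte: the model's total negative log-likelihood on a text, converted to bits and divided by the text's length in UTF-8 bytes. Because it is normalized by bytes rather than tokens, it is comparable across tokenizers. The following is a minimal sketch, assuming per-token log-probabilities in natural log are already available. The function name `bits_per_byte` and its inputs are hypothetical, and this is not necessarily how the benchmark's own implementation computes the value.

```python
import math


def bits_per_byte(token_logprobs, text):
    """Return bits per byte for `text`, given natural-log token log-probabilities.

    `token_logprobs` is assumed to hold one log-probability (in nats) per
    token of `text`, as scored by some language model (hypothetical input).
    """
    num_bytes = len(text.encode("utf-8"))
    if num_bytes == 0:
        raise ValueError("text must contain at least one byte")
    total_nats = -sum(token_logprobs)       # total negative log-likelihood in nats
    total_bits = total_nats / math.log(2)   # convert nats to bits
    return total_bits / num_bytes


# Illustrative values only: two tokens, each costing 2 bits, over a 4-byte string.
example_logprobs = [-2 * math.log(2), -2 * math.log(2)]
example_bpb = bits_per_byte(example_logprobs, "abcd")
```

Lower values are better: a model that assigns higher probability to the text spends fewer bits per byte.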