bAbI
bAbI is a benchmark for measuring understanding and reasoning (Weston et al., 2015).
- Task: question answering
- What: reasoning
- When: 2015
- Who: synthetic
- Language: English
EM | Denoised inference time (s) | # eval | # train | truncated | # prompt tokens | # output tokens | # trials