MATH
MATH is a benchmark for measuring mathematical problem solving on competition math problems (Hendrycks et al., 2021).
- Task: ?
- What: n/a
- When: n/a
- Who: n/a
- Language: synthetic
Columns: Equivalent | Denoised inference time (s) | # eval | # train | truncated | # prompt tokens | # output tokens | # trials