HumanEval (Code)
The HumanEval benchmark measures the functional correctness of programs synthesized from docstrings (Chen et al., 2021).
- Task: code generation from docstrings
- What: n/a
- When: n/a
- Who: n/a
- Language: synthetic
Metrics:
- pass@1
- Denoised inference time (s)
- # eval
- # train
- truncated
- # prompt tokens
- # output tokens
- # trials
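The pass@1 metric above is the k=1 case of the pass@k metric defined by Chen et al. (2021): the probability that at least one of k sampled completions passes the problem's unit tests. A minimal sketch of their unbiased estimator, assuming n samples are drawn per problem and c of them pass:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from Chen et al. (2021).

    n: total completions sampled for a problem
    c: number of those completions that pass the unit tests
    k: number of completions considered
    """
    if n - c < k:
        # Too few failures to fill a k-sample draw with only failures.
        return 1.0
    # 1 - C(n-c, k) / C(n, k), computed as a product for numerical stability
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# With k=1 this reduces to the plain pass rate c/n,
# which is what the pass@1 column reports.
```

With a single trial per problem (n = 1, k = 1), the estimator is simply 1 if the sample passes and 0 otherwise, and the benchmark score is the mean over all eval instances.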