Commit

Update README.md
RyanMarten authored Jan 30, 2025
1 parent 9b5bfb3 commit f30dcad
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -25,7 +25,7 @@ Evalchemy is a unified and easy-to-use toolkit for evaluating language models, f

#### [2025.01.29] New Reasoning Benchmarks

-- [Reasoning Benchmarks](https://www.open-thoughts.ai/blog/measuring): AIME24, AMC23, MATH500, LiveCodeBench, GPQA-Diamond, HumanEvalPlus, MBPPPlus, BigCodeBench, MultiPL-E, and CRUXEval benchmarks added as part of our [Open Thoughts](https://github.com/open-thoughts/open-thoughts) project
+- AIME24, AMC23, MATH500, LiveCodeBench, GPQA-Diamond, HumanEvalPlus, MBPPPlus, BigCodeBench, MultiPL-E, and CRUXEval benchmarks added as part of our [Open Thoughts](https://github.com/open-thoughts/open-thoughts) project. See the [blog post](https://www.open-thoughts.ai/blog/measuring) for more.

#### [2025.01.28] New Model Support
- [vLLM models](https://blog.vllm.ai/2023/06/20/vllm.html): High-performance inference and serving engine with PagedAttention technology