Cao Model Benchmarks (old)

MAMatt-38 • Created May 15, 2026

Instructions

╔════════════════════════════════╗
║           Cao - MBs            ║
╚════════════════════════════════╝

5/16/26: Go to the new UI: https://scratch.mit.edu/projects/1219333300/
02/22/26: Added a small new user interface for selecting benchmarks and viewing the model ranking.

Cao Model Benchmarks:

Please note: all benchmarks were run with the model parameters set as similarly as possible. As a reminder, these results do not fully reflect a model's true capability, so they should be interpreted with caution. I ran these tests to compare which model performed best on Q&A benchmarks (based on my own dataset, not on questions and answers written by humans).

Project Details

Project ID: 1320581318

Created: May 15, 2026

Last Modified: May 16, 2026

Shared: May 16, 2026

Comments: Allowed