╔════════════════════════════════╗
║           Cao - MBs            ║
╚════════════════════════════════╝

05/16/26: Try the new UI: https://scratch.mit.edu/projects/1219333300/
02/22/26: Added a small new user interface for selecting benchmarks and model rankings.

Cao Model Benchmarks:

Please note: all benchmarks were run with model parameters set as similarly as possible. These results do not fully reflect a model's true capabilities, so they should be interpreted with caution.

I ran these tests to compare which model performed best on Q&A benchmarks (based on my own dataset, not on questions and answers written by humans).