FretBench

A benchmark suite for evaluating guitar fretboard note-name reasoning in language models.
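
As a rough illustration of the kind of task being scored, the sketch below computes the note name at a given string and fret. It is not FretBench's own code. The tuning definitions are assumptions about what the leaderboard labels mean (Std as standard, DropD as drop D, HSD as half step down, DropDb as drop D-flat), and the names `NOTES`, `TUNINGS`, and `note_at` are hypothetical.

```python
# Chromatic scale spelled with sharps; index 0 is C.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Open-string notes from the 6th (lowest) string to the 1st (highest).
# Assumption: these are the tunings behind the leaderboard abbreviations.
TUNINGS = {
    "Std":    ["E", "A", "D", "G", "B", "E"],
    "DropD":  ["D", "A", "D", "G", "B", "E"],
    "HSD":    ["D#", "G#", "C#", "F#", "A#", "D#"],  # half step down
    "DropDb": ["C#", "G#", "C#", "F#", "A#", "D#"],  # drop D, half step down
}


def note_at(tuning: str, string: int, fret: int) -> str:
    """Return the note name at `fret` on `string` (6 = lowest, 1 = highest)."""
    if not 1 <= string <= 6:
        raise ValueError("string must be between 1 and 6")
    if fret < 0:
        raise ValueError("fret must be non-negative")
    open_note = TUNINGS[tuning][6 - string]
    # Each fret raises the pitch by one semitone; wrap around after 12.
    return NOTES[(NOTES.index(open_note) + fret) % 12]
```

For example, `note_at("Std", 6, 3)` gives `"G"`, while the same position in the assumed DropD tuning, `note_at("DropD", 6, 3)`, gives `"F"`. Sharps are used throughout for simplicity; a real grader would presumably also need to accept enharmonic spellings such as `Db` for `C#`.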

 

Leaderboard

| # | Model | Provider | Score | Tuning Breakdown | Cost | Last Tested |
|---|-------|----------|-------|------------------|------|-------------|
| 1 | Qwen 3.5 Flash [OW] | Alibaba | 94.5% | Std 97%, DropD 97%, HSD 93%, DropDb 89% | $0.166 | Mar 9, 2026 |
| 2 | DeepSeek V3.2 Speciale [OW] | DeepSeek | 94.5% | Std 98%, DropD 97%, HSD 93%, DropDb 86% | $0.286 | Mar 10, 2026 |
| 3 | Qwen 3.5 Plus [OW] | Alibaba | 94.5% | Std 98%, DropD 97%, HSD 93%, DropDb 86% | $0.378 | Mar 9, 2026 |
| 4 | Kimi K2.5 (Reasoning) [OW] | Moonshot | 94.5% | Std 98%, DropD 97%, HSD 93%, DropDb 86% | $0.596 | Mar 10, 2026 |
| 5 | DeepSeek V3.2 Speciale (Reasoning) [OW] | DeepSeek | 93.4% | Std 97%, DropD 97%, HSD 93%, DropDb 82% | $0.297 | Mar 10, 2026 |
View all results →