FretBench
A benchmark suite for evaluating guitar fretboard note-name reasoning in language models.
| # | Model | Provider | Score | Tuning Breakdown | Cost | Last Tested |
|---|---|---|---|---|---|---|
| 1 | Qwen 3.5 Flash (OW) | Alibaba | 94.5% | Std 97%, DropD 97%, HSD 93%, DropDb 89% | $0.166 | Mar 9, 2026 |
| 2 | DeepSeek V3.2 Speciale (OW) | DeepSeek | 94.5% | Std 98%, DropD 97%, HSD 93%, DropDb 86% | $0.286 | Mar 10, 2026 |
| 3 | Qwen 3.5 Plus (OW) | Alibaba | 94.5% | Std 98%, DropD 97%, HSD 93%, DropDb 86% | $0.378 | Mar 9, 2026 |
| 4 | Kimi K2.5 (Reasoning) (OW) | Moonshot | 94.5% | Std 98%, DropD 97%, HSD 93%, DropDb 86% | $0.596 | Mar 10, 2026 |
| 5 | DeepSeek V3.2 Speciale (Reasoning) (OW) | DeepSeek | 93.4% | Std 97%, DropD 97%, HSD 93%, DropDb 82% | $0.297 | Mar 10, 2026 |
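
To make the task concrete, here is a minimal sketch of the kind of lookup that fretboard note-name reasoning involves. It is not FretBench's own code or scoring logic. It assumes a six-string guitar, that the table's abbreviations mean standard (Std), drop D (DropD), half-step down (HSD), and drop D-flat (DropDb) tuning, and that notes are spelled with sharps. The names `TUNINGS` and `note_at` are hypothetical.

```python
# Illustrative sketch only; not taken from FretBench.
# Chromatic scale spelled with sharps (an assumption about note spelling).
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Flat spellings are normalized to their enharmonic sharps before lookup.
FLAT_TO_SHARP = {"Db": "C#", "Eb": "D#", "Gb": "F#", "Ab": "G#", "Bb": "A#"}

# Open-string notes from string 6 (lowest) to string 1 (highest).
# The mapping of abbreviations to tunings is assumed, not confirmed.
TUNINGS = {
    "Std": ["E", "A", "D", "G", "B", "E"],
    "DropD": ["D", "A", "D", "G", "B", "E"],
    "HSD": ["Eb", "Ab", "Db", "Gb", "Bb", "Eb"],
    "DropDb": ["Db", "Ab", "Db", "Gb", "Bb", "Eb"],
}


def note_at(tuning: str, string: int, fret: int) -> str:
    """Return the note name at a fret, with string 6 lowest and string 1 highest."""
    if not 1 <= string <= 6:
        raise ValueError("string must be between 1 and 6")
    if fret < 0:
        raise ValueError("fret must be non-negative")
    open_note = TUNINGS[tuning][6 - string]
    open_note = FLAT_TO_SHARP.get(open_note, open_note)
    # Each fret raises the pitch by one semitone; wrap around the octave.
    return NOTES[(NOTES.index(open_note) + fret) % 12]


if __name__ == "__main__":
    print(note_at("Std", 6, 3))     # low E string, 3rd fret
    print(note_at("DropDb", 6, 2))  # lowest string tuned to Db, 2nd fret
```

Under these assumptions, the same string and fret give different answers in different tunings, which is why a per-tuning breakdown is informative alongside the overall score.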