Model Stats (/model-stats)
Compares Claude, GPT-4o, and Gemini performance across all evaluations.
Metrics per model
- Total evaluations scored
- Average score given
- Win rate on evaluations where the model recommended trading
- Accuracy — how often the model’s recommendation matched the outcome
- Agreement rate — how often models agree with each other
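
As a rough illustration, the metrics above could be computed along these lines. This is a minimal sketch, not the page's actual implementation: the `Evaluation` record, its field names, and the `model_stats` and `ratio` helpers are all hypothetical. It assumes a recommendation "matches the outcome" when a recommended trade turned out profitable or a skipped one did not, and it measures agreement per model as the share of pairings with other models on the same item where both gave the same recommendation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evaluation:
    # Hypothetical record shape; the real schema is not given in this document.
    item_id: str             # the thing being evaluated (assumed identifier)
    model: str               # e.g. "claude", "gpt-4o", "gemini"
    score: float             # score the model gave
    recommended_trade: bool  # did the model recommend trading?
    profitable: bool         # assumed outcome flag for the item


def ratio(hits: int, total: int) -> Optional[float]:
    """Return hits/total, or None when there is nothing to measure."""
    return hits / total if total else None


def model_stats(evals):
    by_model = defaultdict(list)
    by_item = defaultdict(dict)  # item_id -> {model: recommendation}
    for e in evals:
        by_model[e.model].append(e)
        by_item[e.item_id][e.model] = e.recommended_trade

    stats = {}
    for model, rows in by_model.items():
        traded = [r for r in rows if r.recommended_trade]

        # Compare this model's recommendation with every other model's
        # recommendation on the same item.
        agree = pairs = 0
        for r in rows:
            for other, rec in by_item[r.item_id].items():
                if other != model:
                    pairs += 1
                    agree += rec == r.recommended_trade

        stats[model] = {
            "total": len(rows),
            "avg_score": sum(r.score for r in rows) / len(rows),
            # Wins counted only among evaluations where a trade was recommended.
            "win_rate": ratio(sum(r.profitable for r in traded), len(traded)),
            # Assumption: "trade" matches a profitable outcome, "skip" an unprofitable one.
            "accuracy": ratio(
                sum(r.recommended_trade == r.profitable for r in rows), len(rows)
            ),
            "agreement_rate": ratio(agree, pairs),
        }
    return stats
```

Returning None rather than 0 for an undefined rate (for example, the win rate of a model that never recommended trading) keeps "no data" distinguishable from "always wrong".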