Model Stats (`/model-stats`)

Compares Claude, GPT-4o, and Gemini performance across all evaluations.

Metrics per model

Total evaluations scored
Average score given
Win rate on evaluations where the model recommended trading
Accuracy: how often the model’s recommendation matched the outcome
Agreement rate: how often the model’s recommendation matched the other models’ recommendations
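
As a rough illustration of how these per-model metrics could be derived, here is a minimal Python sketch. It is not the page's actual implementation. The `Evaluation` record, its field names, and the `model_stats` helper are hypothetical, and the sketch assumes that a recommendation "matches the outcome" when the model recommended a trade that won or advised against a trade that lost, and that agreement is measured pairwise against the other models on the same signal.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evaluation:
    # All field names are hypothetical; the real schema is not shown in this document.
    signal_id: str           # key linking each model's view of the same setup
    model: str               # e.g. "claude", "gpt-4o", "gemini"
    score: float             # score the model gave
    recommended_trade: bool  # True if the model said to take the trade
    trade_won: bool          # True if the trade was (or would have been) a winner


def ratio(hits: int, total: int) -> Optional[float]:
    """Return hits / total, or None when there is nothing to divide by."""
    return hits / total if total else None


def model_stats(evals: list[Evaluation]) -> dict[str, dict]:
    by_model = defaultdict(list)
    by_signal = defaultdict(dict)
    for e in evals:
        by_model[e.model].append(e)
        by_signal[e.signal_id][e.model] = e.recommended_trade

    stats = {}
    for model, rows in by_model.items():
        # Evaluations where this model recommended trading.
        taken = [r for r in rows if r.recommended_trade]
        # One comparison per (evaluation, other model) pair on the same signal.
        pairs = [
            by_signal[r.signal_id][other] == r.recommended_trade
            for r in rows
            for other in by_signal[r.signal_id]
            if other != model
        ]
        stats[model] = {
            "total": len(rows),
            "avg_score": sum(r.score for r in rows) / len(rows),
            "win_rate": ratio(sum(r.trade_won for r in taken), len(taken)),
            # Assumption: "matched the outcome" means recommended == won.
            "accuracy": ratio(
                sum(r.recommended_trade == r.trade_won for r in rows), len(rows)
            ),
            "agreement_rate": ratio(sum(pairs), len(pairs)),
        }
    return stats
```

Returning `None` instead of `0.0` for an empty denominator keeps "no trades recommended" distinguishable from "recommended trades that all lost".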

Model comparison

Side-by-side table showing which model is most accurate, most conservative, and most aggressive.
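
The labels in the comparison table could be assigned along the following lines. This is only a sketch under stated assumptions: the document does not define "conservative" or "aggressive", so the code assumes they mean the lowest and highest share of evaluations in which a model recommended trading. The `compare_models` helper and the `accuracy` and `trade_rate` keys are hypothetical names.

```python
def compare_models(stats: dict[str, dict[str, float]]) -> dict[str, str]:
    """Pick the model that leads on each label of the comparison table.

    `stats` maps a model name to its metrics. Each entry is assumed to hold
    "accuracy" and "trade_rate" (share of evaluations where the model
    recommended trading), both as fractions between 0 and 1.
    """
    return {
        "most_accurate": max(stats, key=lambda m: stats[m]["accuracy"]),
        # Assumption: conservative = recommends trading least often.
        "most_conservative": min(stats, key=lambda m: stats[m]["trade_rate"]),
        # Assumption: aggressive = recommends trading most often.
        "most_aggressive": max(stats, key=lambda m: stats[m]["trade_rate"]),
    }
```

On ties, `max` and `min` return the first matching model in dictionary order, so a real table might need an explicit tie-breaking rule.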

Signals Drift Monitor