A golf forecaster that ranks #1 on ProphetArena Sports, ahead of GPT-5 and Grok 4
A sample training example — question, source, and outcome-derived label.
Benchmark comparisons against frontier models.
Foresight V1 32B ranks #1 on ProphetArena Sports, ahead of Grok 4, Gemini 2.5 Pro, and multiple GPT-5 variants. On 855 held-out golf questions, it achieved a Brier Skill Score of +17.0% versus +12.8% for GPT-5, with 41% lower calibration error (ECE 0.062 vs. 0.106).
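For readers less familiar with these metrics, here is a minimal sketch of how Brier Skill Score and expected calibration error (ECE) are commonly computed. It is not the evaluation code behind the numbers above. The function names, the toy forecasts, the constant 0.5 reference forecast, and the 10 equal-width bins are all assumptions made for illustration; the baseline and binning scheme used in the actual benchmark are not stated here.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def brier_skill_score(
    probs: Sequence[float],
    outcomes: Sequence[int],
    reference_probs: Sequence[float],
) -> float:
    """1 - BS_model / BS_reference. Positive means better than the reference."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(reference_probs, outcomes)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Size-weighted mean gap between average forecast and observed frequency.

    Uses equal-width probability bins (an assumption; other schemes exist).
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue  # empty bins contribute nothing
        mean_forecast = sum(p for p, _ in bucket) / len(bucket)
        observed_freq = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_forecast - observed_freq)
    return ece


def relative_reduction(ours: float, theirs: float) -> float:
    """Fractional reduction of `ours` relative to `theirs`."""
    return 1.0 - ours / theirs


# Hypothetical toy data, not taken from the benchmark.
toy_probs = [0.9, 0.8, 0.2, 0.1]
toy_outcomes = [1, 1, 0, 0]
toy_reference = [0.5] * len(toy_probs)  # assumed uninformative baseline

toy_bss = brier_skill_score(toy_probs, toy_outcomes, toy_reference)
toy_ece = expected_calibration_error(toy_probs, toy_outcomes)

# The reported ECE values give 1 - 0.062 / 0.106, roughly 0.415,
# which matches the "41% lower calibration error" figure.
ece_reduction = relative_reduction(0.062, 0.106)
```

A higher Brier Skill Score is better, while a lower ECE is better, which is why the two comparisons in the paragraph above point in opposite directions.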
Papers, models, datasets, notebooks, and write-ups for this case study.
Leverage your own raw data or use public sources. No labeling required.