Military Strikes Forecasting

A compact military-strikes forecaster built from public news

10.6% lower Brier score than GPT-5.4
Better calibration than GPT-5.4
8.4× larger Brier Skill Score lift than GPT-5.4

We trained a compact forecaster for Numinous-style military-strikes questions using public news, resolved outcomes, and Foresight Learning. On a held-out set of military-strikes forecasts, it beat GPT-5.4 on Brier score, calibration, and Brier Skill Score.
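
For readers less familiar with these metrics, here is a minimal sketch of the standard textbook definitions of Brier score and Brier Skill Score for binary questions. It is not the evaluation code behind the numbers above. The function names and the toy data are hypothetical, and the choice of reference forecast (the base rate of the evaluated outcomes) is an assumption, since this page does not say which baseline the Brier Skill Score was measured against.

```python
from typing import Optional, Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 is a perfect forecast.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(
    probs: Sequence[float],
    outcomes: Sequence[int],
    reference_prob: Optional[float] = None,
) -> float:
    """Relative improvement over a constant reference forecast.

    BSS = 1 - BS_model / BS_reference. Higher is better; 0.0 means no
    improvement over the reference. If reference_prob is None, the base
    rate of `outcomes` is used (an assumption made for this sketch).
    """
    if reference_prob is None:
        reference_prob = sum(outcomes) / len(outcomes)
    reference = brier_score([reference_prob] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / reference


if __name__ == "__main__":
    # Toy data for illustration only; not drawn from the held-out set.
    toy_outcomes = [1, 0, 0, 1]
    toy_probs = [0.8, 0.2, 0.1, 0.6]
    print("Brier score:", brier_score(toy_probs, toy_outcomes))
    print("Brier Skill Score:", brier_skill_score(toy_probs, toy_outcomes))
```

Under these definitions, a "lower Brier score" is a relative reduction in mean squared error, while a "larger Brier Skill Score lift" compares how far each model improves on the same reference forecast.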

What we did


Read more

Primary artifacts for this case study.