Political Event Forecasting

+29% better calibration than baseline on political outcomes

↗ Model card

682 held-out test questions evaluated on live political events


Example prediction questions

The kinds of questions a model trained on your data can answer.

DATASET Policy actions

Question: Will the Jan 27, 2025 executive order threatening 25% tariffs on Canadian imports face a successful federal court injunction or vacatur within 90 days of signing?
Question source: New York Times, Jan 27, 2025, "Trump Threatens Canada With 25% Tariffs Over Border Security"
Label: Yes
Type: binary
Confidence: 0.92
Label source: Reuters, Apr 22, 2025, "Federal judge enjoins enforcement of 25% Canadian import tariffs pending merits briefing"

DATASET Policy actions

Question: What is the probability that EPA’s March 2025 light-duty vehicle emissions standard is withdrawn or replaced via notice-and-comment rulemaking before the Nov 3, 2026 midterm elections?
Question source: The Hill, Mar 10, 2025, "Administration officials defend rule as final after interagency review"
Label: 0.34
Type: continuous
Confidence: 0.77
Label source: Federal Register, Sep 2, 2025, "Agency publishes notice of proposed rulemaking to replace prior standard"

DATASET Policy actions

Question: Will the Senate pass the FY2026 National Defense Authorization Act conference report with at least 60 votes before the Aug 2025 recess, given current Armed Services Committee markup timing?
Question source: Defense News, Jun 9, 2025, "SASC chair targets floor vote by late July after UAV procurement fight"
Label: Yes
Type: binary
Confidence: 0.88
Label source: U.S. Senate, Jul 22, 2025, "NDAA FY2026 passes 68–29 after Ukraine aid title compromise"

DATASET Policy actions

Question: What is the probability of at least one People’s Liberation Army Navy vessel crossing the median line of the Taiwan Strait in Aug 2026 after the Jul 2026 joint U.S.–Philippines exercise schedule is published?
Question source: CSIS ChinaPower, Jul 1, 2026, "PLAN sorties up 18% week-on-week ahead of Balikatan follow-on"
Label: 0.61
Type: continuous
Confidence: 0.74
Label source: Taiwan Ministry of National Defense, Aug 31, 2026, "Daily PLA tracker: 14 median-line crossings logged in August"
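
Each example above carries the same fields: a dataset name, a question with its source, a label with its type and confidence, and a label source. A minimal sketch of how one such record might be represented in Python follows. The class and field names are hypothetical and are not the dataset's published schema, and the range check on continuous labels assumes they are probabilities in [0, 1], as in the examples shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union


@dataclass(frozen=True)
class Source:
    """An article or official record cited by a question or its label."""
    outlet: str
    published: date
    headline: str


@dataclass(frozen=True)
class ForecastRecord:
    """One example card. Field names are hypothetical, not the real schema."""
    dataset: str
    question: str
    question_source: Source
    label: Union[bool, float]
    label_type: str  # "binary" or "continuous"
    confidence: float
    label_source: Source

    def __post_init__(self) -> None:
        if self.label_type == "binary":
            if not isinstance(self.label, bool):
                raise ValueError("binary labels must be True or False")
        elif self.label_type == "continuous":
            # Assumption: continuous labels are probabilities in [0, 1].
            if isinstance(self.label, bool) or not 0.0 <= self.label <= 1.0:
                raise ValueError("continuous labels must lie in [0, 1]")
        else:
            raise ValueError(f"unknown label type: {self.label_type!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# The NDAA example from the cards above, with "Yes" encoded as True.
record = ForecastRecord(
    dataset="Policy actions",
    question=(
        "Will the Senate pass the FY2026 National Defense Authorization Act "
        "conference report with at least 60 votes before the Aug 2025 recess, "
        "given current Armed Services Committee markup timing?"
    ),
    question_source=Source(
        "Defense News", date(2025, 6, 9),
        "SASC chair targets floor vote by late July after UAV procurement fight",
    ),
    label=True,
    label_type="binary",
    confidence=0.88,
    label_source=Source(
        "U.S. Senate", date(2025, 7, 22),
        "NDAA FY2026 passes 68–29 after Ukraine aid title compromise",
    ),
)
```

Keeping the label type explicit lets downstream code decide whether to score a record as a yes/no outcome or as a probability target.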

Key results

Benchmark comparisons against frontier models

Calibration on 682 Political Test Questions

Trump-Forecaster achieves lower Expected Calibration Error (ECE) than GPT-5 in both the context-aware setting (0.079 vs. 0.091) and the context-free setting (0.164 vs. 0.191). That is a relative ECE reduction of about 13% with context and about 14% without, so the larger gain comes when no additional context is available.
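
For readers unfamiliar with the metric, the sketch below shows one common way to compute ECE for binary forecasts and how the relative reductions follow from the reported scores. It assumes ten equal-width probability bins; the page does not state which binning scheme was used, and the function names are illustrative rather than taken from the model card or notebook.

```python
from typing import Sequence


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """ECE with equal-width bins (an assumed scheme, not the page's).

    probs: predicted probability that each question resolves "Yes".
    outcomes: 1 if the question resolved "Yes", else 0.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")

    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        hit_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's calibration gap by its share of the questions.
        ece += (len(members) / total) * abs(mean_prob - hit_rate)
    return ece


def relative_reduction(model_ece: float, reference_ece: float) -> float:
    """Fraction by which model_ece is lower than reference_ece."""
    return (reference_ece - model_ece) / reference_ece


# ECE values reported above for Trump-Forecaster vs. GPT-5.
with_context = relative_reduction(0.079, 0.091)
without_context = relative_reduction(0.164, 0.191)
print(f"with context: {with_context:.1%}, without context: {without_context:.1%}")
```

Lower ECE is better: a score of 0 means that, within every bin, the average predicted probability matches the observed frequency of "Yes" outcomes.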

Figure: grouped bar chart comparing the ECE of Foresight and GPT-5 in the with-context and without-context conditions on the 682 held-out political questions.


Explore

Primary write-ups and artifacts for this solution.

Dataset: WWTD-2025
Model: Trump-Forecaster
Notebook: Political Event Fine-Tuning Notebook

