Soccer Totals Model

Over/Under goal predictions for EPL matches using expected goals (xG) analytics and Poisson distribution modeling.

Methodology

The Soccer Totals model projects expected goals for each team from xG-based attack and defense ratings, then applies a Poisson distribution to calculate the Over/Under probability for each total line.

// Projection Approach
Expected goals calculated for each team
Based on attack vs defense matchup ratings
Factors:
• Team attack and defense strength
• Home field advantage
• Recent form weighting
Output:
Probabilities for each total line via Poisson
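
The projection step above can be sketched as follows. This is a minimal illustration, not the model's actual formula: the page does not state how the factors are combined, so the multiplicative form, the function name `project_expected_goals`, and the way the home edge is split between the two sides are all assumptions. Ratings are taken to be already form-weighted and expressed relative to the league average (1.0 = average).

```python
LEAGUE_AVG_XG = 1.35  # approximate EPL xG per team per match (figure from this page)
HOME_EDGE = 1.10      # home xG roughly 10% higher than away (figure from this page)


def project_expected_goals(home_attack, home_defense, away_attack, away_defense,
                           league_avg=LEAGUE_AVG_XG, home_edge=HOME_EDGE):
    """Return (home_lambda, away_lambda), the projected goals for each side.

    Assumed form: league average x own attack rating x opposing defense rating,
    scaled so that two average teams end up with a home/away ratio of home_edge.
    """
    # Split the edge symmetrically: home is scaled up, away is scaled down.
    home_factor = home_edge ** 0.5
    away_factor = 1.0 / home_factor
    home_lambda = league_avg * home_attack * away_defense * home_factor
    away_lambda = league_avg * away_attack * home_defense * away_factor
    return home_lambda, away_lambda


# Hypothetical matchup: strong home attack against a slightly leaky away defense.
home_lambda, away_lambda = project_expected_goals(
    home_attack=1.30, home_defense=0.85, away_attack=0.95, away_defense=1.10
)
```

The two returned values are the Poisson means (lambda) used to price each total line.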

Rating Metrics

  • Attack Rating: xG For relative to league
  • Defense Rating: xG Against relative to league
  • EPL Average: ~1.35 xG per team
  • Total Average: ~2.7 goals per match
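
As a rough illustration of "relative to league", a rating can be read as a team's per-match xG divided by the league average. The helper name `relative_rating` is hypothetical, and the page does not confirm that a simple ratio is the exact definition used.

```python
LEAGUE_AVG_XG = 1.35  # approximate EPL xG per team per match (figure from this page)


def relative_rating(xg_per_match, league_avg=LEAGUE_AVG_XG):
    """Express a per-match xG figure as a multiple of the league average."""
    return xg_per_match / league_avg


# A team creating 2.0 xG per match rates well above 1.0 in attack.
attack_rating = relative_rating(2.0)
# A team conceding 1.0 xG per match rates below 1.0 in defense (lower is better).
defense_rating = relative_rating(1.0)
```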

Confidence Tiers

  • MAX (2.0u): Highest EV picks
  • STRONG (1.5u): Strong EV picks
  • STANDARD (1.0u): Solid EV picks
  • All picks must meet minimum thresholds

Key Factors

xG Attack Rating

Measures offensive quality based on expected goals created. Top teams like Man City average 2.0+ xG per match.

xG Defense Rating

Measures defensive quality based on expected goals conceded. Values below 1.0 indicate better-than-average defense.

Home Field Advantage

EPL home teams average ~10% higher xG than away teams. This factor adjusts projections for venue.

Time Weighting

Recent matches are weighted more heavily. Last-5 and last-10 match windows capture current form.
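
One simple way to combine the two windows is a weighted blend of the last-5 and last-10 averages. The sketch below assumes that approach; the function name `form_weighted_xg` and the 0.6/0.4 weights are placeholders chosen for illustration, since the page does not publish the actual weighting scheme.

```python
def form_weighted_xg(match_xg, w_last5=0.6, w_last10=0.4):
    """Blend last-5 and last-10 average xG.

    match_xg: per-match xG values, oldest first.
    The weights are illustrative assumptions, not the model's values.
    """
    if not match_xg:
        raise ValueError("need at least one match")
    last5 = match_xg[-5:]
    last10 = match_xg[-10:]
    avg5 = sum(last5) / len(last5)
    avg10 = sum(last10) / len(last10)
    return (w_last5 * avg5 + w_last10 * avg10) / (w_last5 + w_last10)


# Hypothetical team whose chance creation improved over its last five matches.
recent = [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
form_xg = form_weighted_xg(recent)
```

Because the last five matches appear in both windows, they carry more weight than matches six to ten, which is the intended effect.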

Lines Analyzed

  • 1.5: Low-scoring matches (~75% hit rate Over)
  • 2.5: Most popular line (~52% hit rate Over)
  • 3.5: High-scoring matches (~28% hit rate Over)
  • 4.5: Shootout potential (~12% hit rate Over)

The model evaluates all lines and selects the one with highest expected value for each match.

Frequently Asked Questions

Expected goals (xG) measures the quality of a shot based on historical data - the likelihood that a shot from a given position with specific characteristics will result in a goal. We use xG instead of actual goals because it provides a more stable and predictive signal. A team might score 4 goals from 1.2 xG one week (lucky) and 0 goals from 2.5 xG the next (unlucky). xG smooths this variance.
We use Poisson distribution modeling. First, we calculate expected goals for each team based on their attack rating vs opponent defense rating. These expected values (lambda) feed into Poisson PMFs to calculate the probability of each scoreline (0-0, 1-0, 1-1, etc.). We sum relevant scorelines to get over/under probabilities for each line (1.5, 2.5, 3.5, 4.5).
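
The scoreline summation described above can be sketched in a few lines. This version assumes the two teams' goal counts are independent Poisson variables, which the page implies but does not state outright; the function names are illustrative. It handles half-goal lines only, so there is no push to account for.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def over_probability(home_lambda, away_lambda, line):
    """P(total goals > line) for a half-goal line such as 2.5.

    Sums every scoreline whose total stays under the line, then takes
    the complement. Assumes independent Poisson goal counts.
    """
    max_total = math.floor(line)  # e.g. 2 for a 2.5 line
    under = sum(
        poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
        for h in range(max_total + 1)
        for a in range(max_total + 1 - h)
    )
    return 1.0 - under


# Two league-average sides (about 1.35 xG each) across the four lines.
probabilities = {line: over_probability(1.35, 1.35, line)
                 for line in (1.5, 2.5, 3.5, 4.5)}
```

The Under probability for any line is simply one minus the Over probability.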
Ratings are derived from xG data, measuring how a team performs relative to league average. Values above 1.0 indicate above-average performance. We weight recent matches more heavily and apply regression to handle small sample sizes early in the season.
Different lines offer different value. Over 2.5 might be overpriced while Over 3.5 offers edge in the same match. By analyzing all common lines, we find the best risk/reward opportunity. The model selects the line with the highest expected value (EV) for each match.
Edge represents the expected value (EV) of the bet. It's calculated as: (Model Probability × Decimal Odds) - 1. An edge of 5% means for every $100 wagered, you expect to profit $5 on average. Only positive EV plays are surfaced as picks.
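
The edge formula and the pick-the-best-line step can be sketched together. The EV formula is the one given above; the function names, the tuple layout, and the example probabilities and odds are hypothetical.

```python
def expected_value(model_prob, decimal_odds):
    """Edge per unit staked: (model probability x decimal odds) - 1."""
    return model_prob * decimal_odds - 1.0


def best_line(candidates):
    """Pick the candidate with the highest positive EV.

    candidates: iterable of (label, model_prob, decimal_odds).
    Returns (label, ev), or None when no candidate has positive EV.
    """
    scored = [(label, expected_value(prob, odds)) for label, prob, odds in candidates]
    positive = [item for item in scored if item[1] > 0]
    return max(positive, key=lambda item: item[1]) if positive else None


# Made-up model probabilities and market odds for a single match.
pick = best_line([
    ("Over 2.5", 0.55, 1.91),   # EV about +5%
    ("Over 3.5", 0.32, 3.40),   # EV about +9%
    ("Under 2.5", 0.45, 1.91),  # negative EV, never surfaced
])
```

In this made-up match, Over 3.5 is selected even though Over 2.5 is also positive EV, because its edge is larger.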
Confidence tiers (MAX, STRONG, STANDARD) are based on expected value thresholds. Higher EV picks are assigned to higher tiers and warrant larger unit sizing. Only picks meeting our minimum EV requirements are surfaced.
Picks are generated on the morning of each match day (UK time), so they reflect the latest odds and team news. Picks lock before kickoff to capture closing line value.
Closing line value (CLV) measures how much the line moved toward your position after you locked your pick. If you took Over 2.5 at -110 and it closed at -130, you captured CLV, indicating the market agreed with your position. Consistent positive CLV is widely regarded as one of the strongest predictors of long-term profitability.
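
To put a number on the -110 to -130 example, one common convention compares the decimal price you took with the closing decimal price. The page does not say how CLV is quantified, so the ratio below and both function names are assumptions; no vig is removed from either price.

```python
def american_to_decimal(american):
    """Convert American odds (e.g. -110, +150) to decimal odds."""
    if american == 0:
        raise ValueError("American odds cannot be zero")
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def closing_line_value(taken_american, closing_american):
    """Relative price advantage versus the close.

    Positive means the price taken was better than the closing price.
    """
    taken = american_to_decimal(taken_american)
    closing = american_to_decimal(closing_american)
    return taken / closing - 1


# Took Over 2.5 at -110, market closed at -130.
clv = closing_line_value(-110, -130)
```

Here `clv` comes out at roughly 0.079, meaning the price taken was about 7.9% better than the closing price.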
The English Premier League has the deepest betting markets, tightest odds, and most comprehensive data coverage. Market efficiency means our edges are real rather than artifacts of thin markets. We plan to expand to other top leagues (La Liga, Bundesliga, Serie A) in future updates.
Early in the season, we have limited current-season data. The model uses a blend of previous season ratings with regression toward league average. As more matches are played, current-season data gradually takes over. This prevents overreaction to small samples while still capturing genuine changes.
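
A minimal sketch of this kind of blend, assuming a weight on current-season data that grows with matches played. The function name and both tuning constants (`prior_regression`, `prior_weight_matches`) are hypothetical placeholders; the page does not publish the actual values or the exact blending rule.

```python
def blended_rating(current_rating, matches_played, prior_rating,
                   prior_regression=0.3, prior_weight_matches=10):
    """Blend last season's rating with the current-season rating.

    prior_regression: share of last season's deviation from average (1.0)
        that is discarded. Illustrative value.
    prior_weight_matches: number of matches at which current-season data
        and the prior count equally. Illustrative value.
    """
    # Pull last season's rating part of the way toward the league average.
    regressed_prior = 1.0 + (1.0 - prior_regression) * (prior_rating - 1.0)
    # The weight on current-season data rises from 0 toward 1 as matches accumulate.
    weight = matches_played / (matches_played + prior_weight_matches)
    return weight * current_rating + (1.0 - weight) * regressed_prior


# Hypothetical team rated 1.5 last season and running at 2.0 this season.
preseason = blended_rating(2.0, matches_played=0, prior_rating=1.5)
after_ten = blended_rating(2.0, matches_played=10, prior_rating=1.5)
late_season = blended_rating(2.0, matches_played=38, prior_rating=1.5)
```

With these placeholder settings the rating starts at the regressed prior and moves steadily toward the current-season figure as the sample grows.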