
Our process · 8 min read

What a stat-led research process looks like

The same checklist runs on every pick. Here is what StatLine checks before publishing, and what gets left on the cutting-room floor.

Why we wrote this

“Stat-led” is a phrase plenty of services use without explaining what it means. We use it to describe a specific discipline (same inputs, same threshold, same checklist on every pick), and we want it to mean the same thing to a reader as it does to us. This article sets out exactly what StatLine does between live odds and a published Telegram post. If you wanted to run a thinner version of the same process yourself, you could.

Step 1: Ingest, normalise, and snapshot

Every relevant data point — bookmaker odds, line history, team form, player stats, injuries, suspensions, late lineups, weather, venue effects — gets ingested on a schedule and stored in a database. Two properties of the storage matter:

  • Reproducibility. Every value is stamped with its source and the time we observed it. We can reconstruct the exact view of the world we had when we made a past pick.
  • Append-only history. Bookmaker prices change. We snapshot every observed quote rather than overwriting. That gives us the line history needed to compute closing-line value later.

This sounds banal. It is the most expensive part of the operation by hours and infrastructure. The discipline is: if a piece of data wasn't observed, it doesn't get used. No retro-fitting.
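
As a rough illustration of the two storage properties above, here is a minimal sketch of an append-only quote store. It assumes SQLite and a hypothetical `quotes` table; the table layout, column names, and function names are illustrative and are not StatLine's actual schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical append-only quote store."""
    conn = sqlite3.connect(path)
    # observed_at holds ISO-8601 UTC strings, which sort chronologically.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS quotes (
               market_id    TEXT NOT NULL,
               bookmaker    TEXT NOT NULL,
               selection    TEXT NOT NULL,
               decimal_odds REAL NOT NULL,
               observed_at  TEXT NOT NULL
           )"""
    )
    return conn


def record_quote(conn, market_id, bookmaker, selection, decimal_odds, observed_at):
    """Append one observed price. Rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO quotes VALUES (?, ?, ?, ?, ?)",
        (market_id, bookmaker, selection, decimal_odds, observed_at),
    )
    conn.commit()


def price_as_of(conn, market_id, bookmaker, selection, as_of):
    """Return the latest price observed at or before `as_of`, or None."""
    row = conn.execute(
        """SELECT decimal_odds FROM quotes
           WHERE market_id = ? AND bookmaker = ? AND selection = ?
             AND observed_at <= ?
           ORDER BY observed_at DESC LIMIT 1""",
        (market_id, bookmaker, selection, as_of),
    ).fetchone()
    return row[0] if row else None
```

Because every row keeps its own timestamp, `price_as_of` can rebuild the view of a market at any past moment, which is what makes both reproducibility and later closing-line comparisons possible.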

Step 2: Generate candidates

For every market that is open and within our coverage window, the engine runs a model that outputs a probability for each side. The model uses only data that was observable before the open. (In other words: no information leakage from the future.)

A market becomes a candidate when our probability estimate diverges from the bookmaker's implied probability by more than a configured threshold. The threshold is conservative — most markets do not generate candidates on most days. That is the point.
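
The candidate test can be sketched in a few lines. This assumes decimal odds, removes the bookmaker margin by simple normalisation (one common convention, not necessarily the one StatLine uses), and treats a candidate as a side where the model's probability exceeds the implied one. The 0.05 threshold is a made-up placeholder, not StatLine's configured value.

```python
def implied_probabilities(decimal_odds):
    """Convert the decimal odds for every side of one market into
    probabilities that sum to 1 (margin removed by normalisation)."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)  # greater than 1 when the book carries a margin
    return [r / total for r in raw]


def is_candidate(model_prob, implied_prob, threshold=0.05):
    """True when the model's probability beats the bookmaker's implied
    probability by more than the (hypothetical) threshold."""
    return (model_prob - implied_prob) > threshold
```

For example, a two-way market priced 1.90 / 1.90 implies 50% per side after normalisation. A model estimate of 58% clears the placeholder threshold; an estimate of 53% does not.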

Step 3: Score and filter

Each candidate is scored against a set of features the analyst desk also reviews manually:

  • Sample size of the underlying historical pattern.
  • Volatility of the relevant team / player / market.
  • Recent line movement direction (sharp money already in the trade?).
  • Available bookmaker prices (do other books agree, or is this an outlier?).
  • Risk flags (injury status, weather extremes, late lineup uncertainty).

Candidates that fail any structural filter are dropped. Candidates that pass move to the desk for manual review.
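
A sketch of what a structural filter pass might look like, under stated assumptions: the `Candidate` fields, the three checks, and every cut-off value below are hypothetical stand-ins for the features listed above, not StatLine's real rules. Risk flags are carried through for the desk rather than used to drop the candidate.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Candidate:
    market_id: str
    sample_size: int      # matches behind the historical pattern
    volatility: float     # any agreed volatility measure
    best_price: float     # decimal odds we would take
    other_prices: list    # the same selection at other bookmakers
    risk_flags: list = field(default_factory=list)


def failed_filters(c, min_sample=50, max_volatility=0.30, max_outlier_ratio=1.10):
    """Return the names of the structural filters this candidate fails.
    An empty list means it moves on to manual review."""
    failures = []
    if c.sample_size < min_sample:
        failures.append("sample_size")
    if c.volatility > max_volatility:
        failures.append("volatility")
    # A price far above the other books is more likely a stale or
    # mistaken quote than a genuine edge.
    if c.other_prices and c.best_price > median(c.other_prices) * max_outlier_ratio:
        failures.append("price_outlier")
    return failures
```

Returning the list of failed filters, rather than a bare True/False, keeps a record of why each candidate was dropped.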

Step 4: Manual review

A human looks at every candidate before it is published. The reviewer has the score, the model's evidence bullets, the line-history graph, and the underlying data, and they ask three questions:

  1. Does the explanation hold up? The model is not always right; if it is leaning on a feature that doesn't apply this week (e.g. a forecast based on a key player who is out), the candidate is killed.
  2. Is there market noise we can't see? Sometimes lines move in ways our line-history doesn't capture (a bookmaker pulling a market briefly, a steam move from a known syndicate, etc.). When in doubt, we don't publish.
  3. Is this the right write-up? The pick is published with evidence bullets and a risk flag. If the risk flag would be high but the evidence bullets weak, that is usually a kill — readers should not have to weigh weak evidence on a high-risk play.

The kill rate at this stage is meaningful. A given match might generate three candidates and result in zero published picks. That is fine. Volume is not the goal.

Step 5: Publish

Picks that survive the desk's review go out simultaneously to the channel (free or VIP, depending on the candidate's edge size) and to a snapshot in our own database. Each published pick records:

  • The exact price taken at the moment of publish.
  • The bookmaker the price was observed at.
  • The candidate's evidence bullets and risk flag.
  • The model's probability and the bookmaker's implied probability.
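
The fields above map naturally onto an immutable record. The sketch below assumes a Python dataclass; the class name, the field names, and the extra `market_id`, `selection`, and `published_at` fields are illustrative assumptions, not StatLine's schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class PublishedPick:
    market_id: str
    selection: str
    bookmaker: str        # where the price was observed
    price_taken: float    # exact decimal odds at the moment of publish
    model_prob: float
    implied_prob: float
    evidence: tuple       # evidence bullets, stored as an immutable tuple
    risk_flag: str
    published_at: str     # ISO-8601 UTC timestamp

    def to_json(self):
        """Serialise the record for storage alongside the channel post."""
        return json.dumps(asdict(self), sort_keys=True)
```

Freezing the record is a design choice that mirrors the article's point: once a pick is published, its price and evidence should not be editable after the fact.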

Step 6: Stamp the closing line

A scheduled job runs in the window 5 to 35 minutes before each kick-off and snapshots the closing line for every market we have an open pick on. The closing-line price is stamped onto the pick's record automatically. From this moment, we can compute the pick's closing-line value before the game has even started.

This is the critical piece. Because the closing line is stamped just before kick-off, we don't need to wait for the result to evaluate whether the pick was a good decision. (See CLV.)
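
A minimal sketch of the two calculations involved, assuming decimal odds. The CLV formula shown (price taken divided by closing price, minus one) is one common convention and may differ from the definition StatLine uses; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def in_closing_window(observed_at, kickoff, min_minutes=5, max_minutes=35):
    """True when a quote was observed 5 to 35 minutes before kick-off."""
    lead = kickoff - observed_at
    return timedelta(minutes=min_minutes) <= lead <= timedelta(minutes=max_minutes)


def closing_line_value(price_taken, closing_price):
    """CLV as a fraction: positive when we took a better price than the close."""
    return price_taken / closing_price - 1.0


kickoff = datetime(2024, 5, 1, 19, 0, tzinfo=timezone.utc)
snapshot = datetime(2024, 5, 1, 18, 40, tzinfo=timezone.utc)
if in_closing_window(snapshot, kickoff):
    clv = closing_line_value(price_taken=2.00, closing_price=1.90)
```

Under this convention, taking 2.00 on a selection that closes at 1.90 gives a CLV of roughly +5.3%, known before the match starts.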

Step 7: Settle the pick

After the match, an automated settlement run reads results from our match-data feeds and marks each pick won, lost, or void. Both result and CLV stay on the record. Both feed the public /track-record page. Wins, losses, voids — all visible.
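
Settlement logic can be sketched as follows, assuming a simple win-or-lose market where the results feed reports a winning selection, or nothing when the market is voided. The names and the flat profit calculation are illustrative assumptions, not StatLine's settlement code.

```python
from enum import Enum


class Result(Enum):
    WON = "won"
    LOST = "lost"
    VOID = "void"


def settle(selection, winner):
    """Mark a pick from the reported winner; None means the market was voided."""
    if winner is None:
        return Result.VOID
    return Result.WON if selection == winner else Result.LOST


def profit(result, stake, price_taken):
    """Profit or loss in stake units at decimal odds; a void returns the stake."""
    if result is Result.WON:
        return stake * (price_taken - 1.0)
    if result is Result.LOST:
        return -stake
    return 0.0
```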

Step 8: Roll up and re-evaluate

On a rolling basis, we look at:

  • Average CLV across the last 100 / 250 / 1,000 picks.
  • CLV by sport and market type.
  • CLV by edge bucket (do the candidates we mark high-edge actually post higher CLV?).
  • Win rate, ROI, and stake-adjusted P/L (secondary metrics).

If the rolling CLV regresses, we treat that as a signal that the model has drifted or the market has tightened. We retune, and we don't hide the regression: the track record is honest by design.
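
The roll-ups above reduce to averages over slices of the pick history. A minimal sketch, assuming each pick is a plain dict with a `clv` value plus whatever grouping keys you track (the `sport` and `edge_bucket` keys here are hypothetical):

```python
from collections import defaultdict
from statistics import mean


def rolling_mean_clv(clvs, window):
    """Average CLV over the most recent `window` picks (oldest first)."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = clvs[-window:]
    return mean(recent) if recent else None


def mean_clv_by(picks, key):
    """Average CLV grouped by a pick attribute such as sport or edge bucket."""
    groups = defaultdict(list)
    for pick in picks:
        groups[pick[key]].append(pick["clv"])
    return {group: mean(values) for group, values in groups.items()}
```

Calling `mean_clv_by(picks, "edge_bucket")` answers the question in the third bullet directly: if the high-edge bucket does not post a higher average CLV than the low-edge one, the edge labels are not carrying information.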

What the process explicitly avoids

  • Cherry-picking. Every published pick is logged. We can't hide losses; the pick row is public on /track-record the moment it is published.
  • Hindsight retro-fitting. We ingest before we model and we model before we publish. The system has no way to retroactively decide a candidate was a pick if it didn't make the threshold.
  • Confidence inflation. Risk flags are honest. A high-edge candidate carrying meaningful risk is labelled accordingly; we don't paper over the risk to ship a pick.
  • Volume for volume's sake. Some weeks the channel is quiet. The alternative — publishing thin candidates to fill space — is exactly what unwinds credibility over the long run.

How a reader could replicate a small-scale version

Even without our infrastructure, a serious bettor can run a thinner version:

  1. Pick one or two markets you understand deeply (e.g. AFL line, NBA totals).
  2. Build a back-of-envelope probability estimate for each match. (A model in a Google Sheet is enough to start.)
  3. Compare your estimate to the consensus of bookmaker prices. Bet only when your estimate beats the consensus by a meaningful margin.
  4. Track every bet (see tracking your performance), including odds taken and closing odds.
  5. Re-evaluate quarterly. If average CLV is positive at meaningful sample size, the process is working. If not, something needs to change — usually the model, not the bankroll discipline.
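
Step 3 of this list can be sketched in a few lines. The example assumes decimal odds, builds the consensus as the plain average of each bookmaker's margin-free implied probability, and uses a placeholder margin of 4 percentage points. All three choices are illustrative; pick values that suit your own market.

```python
from statistics import mean


def consensus_probability(odds_by_book, selection):
    """Average the margin-free implied probability of `selection`
    across bookmakers. `odds_by_book` maps bookmaker -> {selection: odds}."""
    probs = []
    for prices in odds_by_book.values():
        raw = {sel: 1.0 / odds for sel, odds in prices.items()}
        probs.append(raw[selection] / sum(raw.values()))
    return mean(probs)


def should_bet(my_prob, consensus, margin=0.04):
    """Bet only when your own estimate beats the consensus by the margin."""
    return my_prob - consensus >= margin


odds = {
    "BookA": {"home": 1.90, "away": 1.90},
    "BookB": {"home": 2.00, "away": 1.80},
}
consensus = consensus_probability(odds, "home")
decision = should_bet(0.55, consensus)
```

Here the consensus for the home side works out to about 48.7%, so an estimate of 55% qualifies under the placeholder margin while an estimate of 50% would not.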

Why this is hard to do without a team

The boring parts (data ingestion, line snapshotting, settlement, infrastructure) take most of the hours. The interesting parts (modelling, manual review) take the intellectual energy. Doing both sustainably is what separates a paid service that survives from one that quietly disappears after a bad quarter.

That is also why we charge a subscription. Doing this work honestly costs money, and we'd rather be funded by the readers than by bookmaker affiliate kickbacks. The funding model determines what the service is allowed to say.


Educational content only — not personal financial advice. Sports are uncertain and any bet can lose. Past results do not predict future results. 18+. Gamble responsibly. Responsible gambling resources.
