Category: engineering | 2026-02-17 | 7 min

Why Most Sportsbooks Cannot Replay Their Own State

Replayability is an audit primitive. If you can’t deterministically replay state, you can’t prove what happened under regulation.

Author
SmartBet Engineering
We write about architecture, trading systems, risk, and real-time infrastructure for sportsbooks.

Key takeaways

  • Replayability is the minimal audit primitive: deterministic re-execution of historical inputs must reproduce identical state.
  • Most sportsbook stacks cannot replay because they are built around mutable databases, asynchronous side effects, and non-deterministic pricing/risk code paths.
  • Regulation imposes “prove what happened” requirements that cannot be met with screenshots, logs, or database snapshots alone.
  • Infrastructure-first fixes center on event capture, deterministic engines, idempotent side effects, and verifiable state commitments.

Replayability is not observability

Most operators can observe what happened: logs, traces, dashboards, even “audit tables.” Replayability is different: the ability to take the exact historical inputs and deterministically recompute the same derived state (markets, prices, limits, exposures, settlements, customer balances) at any point in time.

Under regulation, replayability is not a nice-to-have. It is the basis for:

  • Auditability: demonstrating the chain of causality from inputs to outcomes.
  • Dispute resolution: re-running a customer’s bet lifecycle to verify acceptance, rejection, re-offer, or settlement.
  • Change control: proving that a code/config change did or did not affect specific decisions.
  • Forensics: reconstructing state after incidents (partial outages, message loss, clock skew, human interventions).

If you cannot deterministically replay state, you cannot reliably prove what happened. You can only show what your systems currently claim happened.

What “replay state” means in trading terms

A sportsbook “state” is not a single table. It is a compound of interdependent views:

  • Market definitions and lifecycles (open/suspend/close/void)
  • Price formation and margining
  • Eligibility, stake limits, and customer segmentation
  • Exposure and liability aggregation across correlated markets
  • Bet placement outcomes (accept/reject/reoffer) and timestamps
  • Settlements, adjustments, and wallet movements

Replayability requires that for any time window T, given the same ordered inputs and the same code/config, the system produces identical outputs and internal state transitions. This is the operational definition of determinism discussed in /en/insights/engineering/determinism-is-a-competitive-advantage-in-regulated-trading.

Why most sportsbooks cannot replay their own state

1) State lives in mutable databases, not in a reconstructible event history

The common architecture is “current state in SQL + append logs for debugging.” That breaks replay because:

  • Updates overwrite prior truth (last-write-wins).
  • “Audit tables” are often incomplete, inconsistent, or out of transaction scope.
  • Cross-service state transitions are not captured as a single ordered history.

A replayable system treats the event history as the authoritative record and derives materialized views from it. Without that, reconstruction becomes guesswork.
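
As a rough illustration of deriving a view from history, the sketch below folds an ordered list of market lifecycle events into a status view, either as of now or as of an earlier sequence number. The `MarketEvent` type, the `STATUS_AFTER` mapping, and the sample data are hypothetical names chosen for this example, not part of any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketEvent:
    seq: int        # position in the ordered history (assumed unique)
    market_id: str
    kind: str       # "open", "suspend", "close", or "void"


# Hypothetical mapping from event kind to the lifecycle status it produces.
STATUS_AFTER = {
    "open": "OPEN",
    "suspend": "SUSPENDED",
    "close": "CLOSED",
    "void": "VOID",
}


def market_status_view(events, up_to_seq=None):
    """Fold the history into a {market_id: status} view as of up_to_seq."""
    view = {}
    for event in sorted(events, key=lambda e: e.seq):
        if up_to_seq is not None and event.seq > up_to_seq:
            break
        view[event.market_id] = STATUS_AFTER[event.kind]
    return view


history = [
    MarketEvent(1, "m1", "open"),
    MarketEvent(2, "m2", "open"),
    MarketEvent(3, "m1", "suspend"),
    MarketEvent(4, "m1", "open"),
]

current = market_status_view(history)
as_of_seq_3 = market_status_view(history, up_to_seq=3)
```

Because the view is a pure function of the history, the "suspended at sequence 3" answer remains recoverable even though the market later reopened. A last-write-wins row would have lost it.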

2) Time is not a controlled input

Most trading stacks leak wall-clock time into decisions:

  • Pricing depends on now() (time-to-start, time-in-play, decay curves).
  • Risk decisions depend on “current exposure” at acceptance time, which is subject to racing writes.
  • Settlement depends on delayed feeds and late-arriving corrections.

If time is not explicitly modeled (monotonic sequence numbers, effective timestamps, deterministic scheduling), replay will drift. The same inputs replayed later will use different time values and produce different results.
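
One way to model time explicitly is to let events carry their effective timestamp and have the engine read time only from a clock those events advance. The sketch below assumes a hypothetical `EventClock` class and `minutes_to_start` helper; a real engine would have more elaborate scheduling.

```python
from datetime import datetime, timezone


class EventClock:
    """Time source advanced only by event timestamps, never by the wall clock."""

    def __init__(self):
        self._now = None

    def advance(self, effective_at):
        # Never move backwards, so a late-arriving event cannot rewind time.
        if self._now is None or effective_at > self._now:
            self._now = effective_at

    def now(self):
        if self._now is None:
            raise RuntimeError("clock has not seen any event yet")
        return self._now


def minutes_to_start(clock, kickoff):
    """Whole minutes until kickoff, as seen by the injected clock."""
    return int((kickoff - clock.now()).total_seconds() // 60)


kickoff = datetime(2026, 2, 17, 15, 0, 0, tzinfo=timezone.utc)
clock = EventClock()
clock.advance(datetime(2026, 2, 17, 14, 3, 12, tzinfo=timezone.utc))
remaining = minutes_to_start(clock, kickoff)
```

Replaying the same events a year later yields the same `remaining` value, because nothing in the calculation reads the machine's current time.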

3) Asynchronous side effects break determinism

A bet lifecycle is typically distributed:

  • bet placement API
  • risk service
  • wallet/ledger
  • trading engine
  • settlement service
  • notifications, CRM, reporting

If side effects occur before a decision is durably recorded, or if retries are not idempotent, then replay cannot reproduce the same external outcomes. Common failures:

  • “At-least-once” delivery without idempotency keys causes duplicate settlements or wallet adjustments.
  • A risk decision is logged but the corresponding state mutation is lost (or vice versa).
  • A downstream system writes enriched data back upstream (out-of-band mutation).

Replayability requires strict separation between decision events and effects, with deterministic effect application.

4) Feeds are not captured with provenance and ordering guarantees

Sportsbooks rely on external inputs: odds feeds, fixtures, live data, results, manual trader actions. Replay fails when:

  • Raw feed messages are not stored exactly as received.
  • Normalization pipelines mutate inputs without versioning.
  • Ordering is inferred from receipt time rather than source sequence.
  • Manual actions are not captured as first-class events with operator identity and justification.

If you cannot replay the inputs, you cannot replay the state.
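
The ordering point can be made concrete with a small sketch. It assumes each provider assigns contiguous integer sequence numbers, which is not true of every feed; `FeedMessage`, `replay_order`, and `missing_sequences` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedMessage:
    provider: str
    source_seq: int     # sequence number assigned by the provider
    received_at: float  # local receipt time, kept as metadata only
    raw: bytes          # payload stored exactly as received


def replay_order(messages):
    """Order by provider and source sequence, ignoring receipt time."""
    return sorted(messages, key=lambda m: (m.provider, m.source_seq))


def missing_sequences(messages, provider):
    """Sequence numbers absent between the lowest and highest seen."""
    seen = {m.source_seq for m in messages if m.provider == provider}
    if not seen:
        return []
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


inbox = [
    FeedMessage("feedA", 1, 100.0, b'{"odds": 1.90}'),
    FeedMessage("feedA", 3, 100.2, b'{"odds": 2.00}'),  # arrived before seq 2
    FeedMessage("feedA", 2, 100.3, b'{"odds": 1.95}'),
    FeedMessage("feedA", 5, 100.9, b'{"odds": 2.10}'),  # seq 4 never arrived
]

ordered = [m.source_seq for m in replay_order(inbox)]
gaps = missing_sequences(inbox, "feedA")
```

Sorting by `received_at` here would apply message 3 before message 2. Keeping the raw bytes alongside the source sequence also makes a gap (sequence 4) an explicit, auditable fact rather than a silent omission.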

5) Pricing and risk code paths are non-deterministic by construction

Non-determinism shows up in subtle places:

  • Floating-point differences across runtime versions or hardware
  • Randomized algorithms (e.g., sampling, jitter) without a seeded PRNG
  • Concurrency-dependent aggregation (unordered maps, parallel reductions)
  • Dependency on external services at decision time (feature flags, ML scores, third-party lookups) without versioned snapshots

A deterministic trading engine treats these as controlled inputs or removes them from the critical path.
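
Two of these sources are easy to demonstrate in a few lines. The sketch below shows that float addition depends on reduction order, then shows one possible remedy: summing fixed-point decimals in a stable key order and seeding the random generator explicitly. The function names and the two-decimal rounding rule are assumptions made for the example.

```python
import random
from decimal import Decimal, ROUND_HALF_EVEN

# Float addition is not associative, so reduction order changes the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left


def total_liability(liabilities_by_selection):
    """Sum in sorted key order with decimals and an explicit rounding rule."""
    total = Decimal("0")
    for selection_id in sorted(liabilities_by_selection):
        total += Decimal(liabilities_by_selection[selection_id])
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)


def sample_jitter(seed, n):
    """Draw n values from a generator seeded by a recorded input."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


total = total_liability({"b": "0.2", "a": "0.1", "c": "0.3"})
same_draws = sample_jitter(7, 3) == sample_jitter(7, 3)
```

If the seed is stored with the decision event, a replay draws the same values the original run did.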

6) Risk controls are implemented as scattered checks, not a coherent limit architecture

When stake limits, exposure limits, velocity checks, and trader overrides live across multiple services and databases, the acceptance decision becomes a distributed emergent property.

That makes it hard to answer: “Why was this bet accepted at 14:03:12 but rejected at 14:03:15 with the same parameters?”

Replayability improves when risk controls are designed as a cohesive, versioned policy layer with explicit inputs and outputs. See /en/insights/strategy/limit-architecture-designing-risk-controls-that-scale for how limit design impacts auditability and reproducibility.
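
To illustrate what "explicit inputs and outputs" can look like, here is a minimal sketch of an acceptance decision written as a pure function of a versioned policy and the exposure observed at decision time. `LimitPolicy`, `Decision`, `decide`, and the numbers used are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LimitPolicy:
    version: str
    max_stake: Decimal
    max_market_exposure: Decimal


@dataclass(frozen=True)
class Decision:
    accepted: bool
    reason: str
    policy_version: str
    exposure_before: Decimal  # recorded so the decision can be explained later
    liability: Decimal


def decide(policy, stake, price, exposure_before):
    """Pure acceptance check: same inputs always give the same Decision."""
    liability = stake * (price - Decimal("1"))
    if stake > policy.max_stake:
        reason = "stake_limit"
    elif exposure_before + liability > policy.max_market_exposure:
        reason = "exposure_limit"
    else:
        reason = "ok"
    return Decision(reason == "ok", reason, policy.version, exposure_before, liability)


policy = LimitPolicy("limits-v12", Decimal("500"), Decimal("1000"))
first = decide(policy, Decimal("100"), Decimal("3.00"), Decimal("700"))
second = decide(policy, Decimal("100"), Decimal("3.00"), Decimal("900"))
```

The two calls use identical bet parameters, yet the second is rejected. The recorded `exposure_before` and `policy_version` answer the "why" directly, with no need to reconstruct it from several services.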

Regulation forces “proof,” not “narrative”

Regulated environments increasingly require operators to demonstrate:

  • Exact sequence of inputs (market data, results, manual interventions)
  • Exact decision logic at the time (code + configuration + feature flags)
  • Exact outputs (accept/reject, price offered, settlement, adjustments)
  • Integrity guarantees (tamper-evident records, retention, access control)

A narrative built from partial logs and mutable database records is not a proof. Deterministic replay is a proof mechanism: you can re-run the chain and show it matches.

Replayability as an infrastructure primitive

Event capture: record the world as it was seen

Minimum requirements:

  • Persist raw inbound feed messages with source metadata (provider, sequence, timestamp, signature if available).
  • Persist internal commands and decisions as immutable events (who/what/when).
  • Use a single ordering model per stream (sequence numbers) and define cross-stream ordering rules.

This is not “log everything.” It is “log the authoritative inputs needed to recompute.”

Deterministic execution: make the engine replay-safe

Key techniques:

  • Replace wall-clock reads with an injected time source derived from events.
  • Make computations deterministic: stable ordering, explicit rounding rules, fixed numeric types where needed.
  • Version code and configuration; bind a decision to the exact config snapshot used.
  • Eliminate external dependencies from decision time, or snapshot their outputs as inputs.
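
Binding a decision to its configuration can be as simple as storing a fingerprint of the canonicalized config next to the decision. The sketch below assumes the config is JSON-serializable; `config_fingerprint` and `bind_decision` are illustrative names, and sorted-key JSON is only one possible canonical form.

```python
import hashlib
import json


def config_fingerprint(config):
    """SHA-256 over a canonical JSON form (sorted keys, no extra whitespace)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def bind_decision(decision_id, outcome, code_version, config):
    """Decision record that names the exact code and config that produced it."""
    return {
        "decision_id": decision_id,
        "outcome": outcome,
        "code_version": code_version,
        "config_hash": config_fingerprint(config),
    }


config_a = {"margin": "0.05", "flags": {"cashout": True}}
config_b = {"flags": {"cashout": True}, "margin": "0.05"}  # same content, new key order
config_c = {"margin": "0.06", "flags": {"cashout": True}}  # margin changed

record = bind_decision("d-1001", "accept", "engine-4.2.0", config_a)
```

A replay can then refuse to run, or flag a mismatch, when the config it is given does not hash to the value stored on the decision.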

Idempotent side effects: separate decision from effect

A replayable system records a decision event first, then applies effects in an idempotent manner:

  • Wallet movements reference a unique decision ID.
  • Settlement adjustments are derived and applied exactly once, with dedupe.
  • Downstream projections can be rebuilt from scratch without double-counting.
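
A minimal in-memory sketch of an effect keyed by decision ID follows. The `Wallet` class is hypothetical, and a production ledger would need to commit the dedupe record and the balance change atomically in durable storage, which this example does not attempt.

```python
from decimal import Decimal


class Wallet:
    """Applies each decision's movement at most once, keyed by decision ID."""

    def __init__(self):
        self.balance = Decimal("0")
        self._applied = set()

    def apply(self, decision_id, amount):
        """Return True if applied, False if this decision was already applied."""
        if decision_id in self._applied:
            return False
        self._applied.add(decision_id)
        self.balance += amount
        return True


wallet = Wallet()
first_delivery = wallet.apply("settle-42", Decimal("25.00"))
redelivery = wallet.apply("settle-42", Decimal("25.00"))  # at-least-once retry
```

With this shape, at-least-once delivery stops being a correctness problem: the retry is recognized and ignored, and rebuilding the wallet from the full decision stream gives the same balance.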

Verifiable state commitments: make tampering detectable

Even with replay, you need integrity:

  • Hash chains / Merkle commitments over event streams
  • WORM storage policies for critical journals
  • Access controls and audit trails over administrative actions

This enables “we can recompute and we can prove the record wasn’t altered.”
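
A hash chain is the simplest of these commitments: each entry's hash covers the previous hash plus the event payload, so altering any earlier event changes every later hash. The sketch below uses SHA-256 and an all-zero genesis value as assumptions; `chain_hashes` and `verify_chain` are illustrative names.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for an empty journal


def chain_hashes(payloads):
    """Return the running hash after each payload (bytes) in order."""
    previous = GENESIS
    hashes = []
    for payload in payloads:
        previous = hashlib.sha256(previous.encode("ascii") + payload).hexdigest()
        hashes.append(previous)
    return hashes


def verify_chain(payloads, recorded_hashes):
    """True only if every payload still matches its recorded hash."""
    return chain_hashes(payloads) == recorded_hashes


journal = [b"bet:1:accept", b"bet:2:reject", b"bet:1:settle:win"]
recorded = chain_hashes(journal)

tampered = [b"bet:1:accept", b"bet:2:accept", b"bet:1:settle:win"]
```

Publishing or escrowing only the latest hash is enough for a third party to detect later edits to the journal, provided they can obtain the events to recompute the chain.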

Practical replay questions auditors and incident responders ask

A system is “replayable” when it can answer, deterministically:

  • What was the offered price and margin for a given market at time t?
  • What was the computed limit and exposure at acceptance time, and what inputs produced it?
  • What feed messages and manual actions caused a market to suspend?
  • Why did two identical bet requests produce different outcomes?
  • What code/config version decided the outcome, and can we reproduce it in an isolated environment?

If these require manual reconstruction from multiple databases and best-effort logs, replayability is missing.

What to prioritize if you’re starting from a typical stack

  1. Define the event model for trading decisions, feed ingestion, manual trader actions, and settlements.
  2. Capture raw inputs with provenance and immutable storage.
  3. Build a deterministic replay harness that can recompute key views (prices, exposures, limits) from events.
  4. Make side effects idempotent and driven by recorded decision IDs.
  5. Introduce integrity commitments for event streams and administrative actions.
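
Step 3 can start small. The sketch below replays exposure events through a pure transition function and compares a digest of the recomputed view with a digest recorded at the time. The event shape, integer cents, and the function names are assumptions made for illustration.

```python
import hashlib
import json


def apply_event(state, event):
    """Pure transition: returns a new exposure map, never mutates the input."""
    new_state = dict(state)
    market = event["market_id"]
    new_state[market] = new_state.get(market, 0) + event["liability_cents"]
    return new_state


def replay(events):
    """Recompute exposure per market by applying events in sequence order."""
    state = {}
    for event in sorted(events, key=lambda e: e["seq"]):
        state = apply_event(state, event)
    return state


def state_digest(state):
    """Stable digest of a view, suitable for comparing two runs."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_recorded(events, recorded_digest):
    return state_digest(replay(events)) == recorded_digest


events = [
    {"seq": 2, "market_id": "m1", "liability_cents": 2500},
    {"seq": 1, "market_id": "m1", "liability_cents": 10000},
    {"seq": 3, "market_id": "m2", "liability_cents": 4000},
]
recorded = state_digest({"m1": 12500, "m2": 4000})
replayed = replay(events)
```

Running such a check continuously against production digests turns "we believe we are deterministic" into something that fails loudly when it stops being true.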

This is infrastructure work, not UI work. But it is the shortest path to credible auditability and operational control. For more context across regulated trading systems, see /en/insights.
