Prediction Markets are Learning Algorithms

On the surface, prediction markets and large‑scale ML training seem to live in different worlds.

Prediction markets are messy, human systems where people buy shares in claims such as “Dodgers will win 2025 World Series” on Polymarket or Kalshi. By contrast, large-scale ML is often a wall of GPUs quietly grinding through trillions of tokens.

However, after stripping away surface-level UIs and jargon, the two are ultimately solving the same problem:

Take fragmented, noisy information about the future and compress it into probabilities.

Over the last decade a literature in CS theory has made this link explicit. Cost‑function prediction markets can be reinterpreted as online learning algorithms, and some learning algorithms can be implemented directly as markets.

These insights are more than just mathematically elegant observations; if you think of a training run as a kind of prediction market — over gradients, data, or models — you naturally stumble on a decentralized world working at scale:

  • many agents each having partial signals,
  • disparate incentives influencing a shared system,
  • and a protocol that decides how much to trust each agent.

This is exactly the world decentralized training networks have been moving towards.

In this piece we’ll unpack this similarity and reveal that, in many cases, they are formally equivalent in a strong sense. We’ll discuss which classes of prediction markets are mathematically identical to standard online learning, how “machine learning markets” can formally implement model ensembles and probabilistic inference, and why this perspective increasingly matters for modern ML and RL (especially in decentralized, multi-agent training).

1. Two ways of buying the future

Popular platforms such as Kalshi and Polymarket have made prediction markets ubiquitous, and the space continues maturing quickly. These platforms don’t just entertain; they attempt to produce probability estimates that journalists, traders, and policymakers now track alongside opinion polls.

On the ML side, training a frontier model has become a multi‑billion‑dollar exercise in forecasting—next tokens, next states, next user actions. Each new generation of models scales up parameters, data, and compute in pursuit of slightly more calibrated probability distributions over these forecasts.

From this framing we begin to see that both industries are selling essentially the same product: beliefs about the future constructed from many weak signals.

2. Prediction markets as probabilistic learners

2.1 A compressed intro to prediction markets

A typical prediction market platform allows you to trade contracts that pay $1 if some event happens and $0 otherwise. When markets are well-designed, the current price behaves like a probability: a contract trading at $0.67 says “the crowd thinks there’s a ~67% chance this happens”.

Perhaps surprisingly, given their simplicity, these mechanisms have a solid empirical track record. Classic work by Wolfers & Zitzewitz (2004) found that election prediction markets like the Iowa Electronic Markets typically had average errors of around 1.5 percentage points in the final week before US elections, beating major opinion polls that erred by about 2.1 points. Other surveys and meta-analyses have reinforced the view that well-designed prediction markets are among the best tools we have for aggregating dispersed information.

So far this may sound like straightforward social science: people trade, prices move, the crowd is wise. But around 2010, CS theorists noticed that the math underlying a large class of these markets seems to coincide with the math underlying online learning algorithms.

2.2 From trading to no-regret learning

Consider a standard cost-function prediction market (CFM): the market maker maintains a vector of outstanding shares q, describing how many contracts of each outcome are held across the market, and a convex cost function C(q) that determines what trades cost. In a CFM, the instantaneous price of outcome i is the partial derivative of C(q) with respect to qᵢ. Traders submit orders that change q, and the market maker charges them the resulting change in C(q).
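To make this concrete, the widely used logarithmic market scoring rule (LMSR) takes C(q) = b · log Σᵢ exp(qᵢ/b), whose gradient is a softmax over shares. A minimal sketch, where the liquidity b and the share vectors are illustrative choices:

```python
import numpy as np

def lmsr_cost(q, b=100.0):
    # C(q) = b * log(sum_i exp(q_i / b)); log-sum-exp for numerical stability
    z = q / b
    return b * (np.max(z) + np.log(np.sum(np.exp(z - np.max(z)))))

def lmsr_prices(q, b=100.0):
    # p_i = dC/dq_i = softmax(q / b); prices sum to 1, like probabilities
    z = q / b
    e = np.exp(z - np.max(z))
    return e / e.sum()

def trade_cost(q, delta, b=100.0):
    # A trader moving shares from q to q + delta pays the change in cost
    return lmsr_cost(q + delta, b) - lmsr_cost(q, b)

q = np.zeros(2)                  # two-outcome market, no shares outstanding
buy = np.array([50.0, 0.0])      # buy 50 shares of outcome 0
cost = trade_cost(q, buy)        # what the trader pays the market maker
new_prices = lmsr_prices(q + buy)  # outcome 0's price rises above 0.5
```

Buying shares of an outcome pushes its price up, and prices always sum to 1, which is what lets them be read as probabilities.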

Chen and Vaughan (2010) showed that if you interpret each trade as a “loss vector” and look at how the market maker updates prices over time, any such CFM with bounded loss can be interpreted as a no-regret learning algorithm in the classic “prediction with expert advice” setting. The bounded-loss constraint may seem restrictive, but it holds for a large class of scoring-rule markets such as LMSR. (It does rule out markets based on order books; in an upcoming follow-up post we’ll argue why we think order books are ill-suited for prediction.)

More concretely:

  • In online learning a learner repeatedly chooses a probability distribution pₜ over predictions made by a set of “experts” and then observes a loss vector ℓₜ.
  • In a CFM the market maker repeatedly quotes prices pₜ (one per outcome) and then observes traders’ demand, which Chen & Vaughan map to an effective loss vector.

Their results show that:

  • Every convex cost function corresponds to a Follow-The-Regularized-Leader (FTRL) algorithm, with the convex conjugate of the cost function playing the part of the regularizer.
  • Bounded worst-case monetary loss for the CFM translates into regret of order √T for the corresponding learner.
  • Specific markets reduce to specific learning algorithms; for example, the quadratic market scoring rule corresponds to online gradient descent.

From this angle a prediction market is literally an online learner (and vice versa) where:

  • The “state” of the learner is the current price vector pₜ.
  • Each incoming order is a training sample (a hint that the current probabilities are wrong).
  • The market maker’s pricing rule is a no-regret update that keeps loss bounded.
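For LMSR, this correspondence can be checked numerically: it maps to the classic Hedge / multiplicative-weights update, with each trade’s negated share purchases (scaled by the liquidity) playing the role of a loss vector. A toy verification, with random orders standing in for real traders:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

b = 10.0                                   # LMSR liquidity parameter
rng = np.random.default_rng(0)
trades = rng.normal(size=(5, 3))           # 5 random orders over 3 outcomes

# Market view: accumulate shares, quote prices via the LMSR gradient
q = np.zeros(3)
for dq in trades:
    q += dq
market_prices = softmax(q / b)

# Learner view: Hedge with learning rate 1/b on loss vectors l_t = -dq_t
cum_loss = np.zeros(3)
for dq in trades:
    cum_loss += -dq
learner_weights = softmax(-cum_loss / b)

assert np.allclose(market_prices, learner_weights)
```

The two views produce identical numbers: quoting LMSR prices after a sequence of trades is the same computation as running exponential weights on the corresponding losses.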

2.3 Mirror descent and price dynamics

The connection between CFMs and online learning was revealed to be even deeper in later work; Frongillo et al. (2012) showed that if traders behave like Kelly bettors (maximizing expected log utility), then a natural class of automated market makers implements stochastic mirror descent on the space of prices. The stationary distribution of prices coincides with the Walrasian equilibrium of an underlying market.

Very recently, Nueve & Waggoner (2025) show how to design market makers so that the aggregate effect of traders corresponds to general steepest-gradient-descent algorithms, not just the FTRL family. In other words:

By picking your cost function appropriately, you can make “a bunch of traders in a market” collectively run almost any first-order learning algorithm you like.

With this result in hand, price paths in a liquid prediction market start to look less like a purely social phenomenon and more like a familiar optimization trace: the noisy trajectory of an online learner trying to minimize loss.
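To see the shape of such a trace, here is entropic mirror descent on the probability simplex (the multiplicative update) driven by the expected gradient of log loss. This is an illustrative sketch rather than the exact market model from the papers above; individual trades would add noise around this deterministic path:

```python
import numpy as np

def mirror_descent_step(p, grad, eta):
    # Entropic mirror descent on the simplex:
    # multiplicative update followed by renormalization
    w = p * np.exp(-eta * grad)
    return w / w.sum()

target = np.array([0.7, 0.2, 0.1])   # "true" outcome frequencies
p = np.ones(3) / 3                   # start from uniform prices
for _ in range(500):
    grad = -target / p               # expected gradient of E[-log p_outcome]
    p = mirror_descent_step(p, grad, eta=0.01)
# p has drifted close to `target`, tracing an optimization path
# analogous to a market price converging on the crowd's belief
```

In continuous time this update reduces to dpᵢ ≈ η(targetᵢ − pᵢ), so the price vector relaxes toward the true distribution at a rate set by the step size.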

3. Machine learning markets: ensembles as equilibria

So far we’ve looked at how a well-designed prediction market is itself an online learner. Amos Storkey’s Machine Learning Markets flips the script: what if the agents in the market are machine learning models?

3.1 Agents as models, prices as ensemble predictions

In Storkey’s framework each agent is a probabilistic model: it holds beliefs over outcomes, has a utility function over “wealth”, and trades securities whose payoffs depend on the outcomes.

Under a no-arbitrage assumption, prices again behave like probabilities. But now Storkey shows that equilibrium prices implement classic model-ensembling mechanisms:

  • Logarithmic utility yields a wealth-weighted average of agents’ beliefs, which is essentially a mixture-of-experts.
  • Negative-exponential utility yields a geometric mean of agents’ beliefs, which corresponds to a product-of-experts.
  • If each agent only cares about a subset of variables, the resulting equilibrium maps directly to factor graphs and message passing algorithms used in graphical-model inference.

Stated plainly: Many probabilistic models we train today—mixtures/products-of-experts, Markov random fields—can be interpreted as markets at equilibrium between specialized agents.
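The first two correspondences are easy to compute directly. A minimal sketch with made-up beliefs and wealths, showing the arithmetic (mixture) and geometric (product) poolings side by side:

```python
import numpy as np

beliefs = np.array([[0.8, 0.2],   # each row: one agent's distribution
                    [0.4, 0.6],   # over two outcomes
                    [0.6, 0.4]])
wealth = np.array([1.0, 2.0, 1.0])
w = wealth / wealth.sum()         # normalized wealth weights

# Log utility -> wealth-weighted arithmetic mean (mixture of experts)
mixture = w @ beliefs

# Negative-exponential utility -> weighted geometric mean, renormalized
# (product of experts)
log_geo = w @ np.log(beliefs)
product = np.exp(log_geo) / np.exp(log_geo).sum()
```

Richer agents pull the equilibrium price toward their own beliefs, exactly as a higher-weight expert dominates an ensemble.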

3.2 Artificial prediction markets as classifiers

Going beyond theory, Barbu & Lay (2012) explicitly build supervised learning algorithms that look like small prediction markets:

  • Each feature or base classifier is a “trader” endowed with a budget.
  • Traders bet on classes they think are likely for each labelled example.
  • Budgets are updated based on correctness; the contract price for each class becomes the model’s predicted probability.
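A toy version of this mechanism fits in a few lines. This is a simplified sketch of the budget/payout loop, not Barbu & Lay’s exact algorithm; the class and parameter names are invented:

```python
import numpy as np

class ToyMarketClassifier:
    """Toy sketch: base classifiers as budgeted traders in a class market."""

    def __init__(self, n_traders, stake=0.1):
        self.budgets = np.ones(n_traders)
        self.stake = stake                     # fraction of budget bet per round

    def prices(self, beliefs):
        # beliefs: (n_traders, n_classes) class probabilities per trader
        bets = self.budgets[:, None] * self.stake * beliefs
        return bets.sum(axis=0) / bets.sum()   # contract prices = predicted probs

    def update(self, beliefs, true_class):
        bets = self.budgets[:, None] * self.stake * beliefs
        prices = bets.sum(axis=0) / bets.sum()
        # Traders spend their bets; shares of the true class pay out $1 each
        self.budgets += bets[:, true_class] / prices[true_class] - bets.sum(axis=1)

# A sharp trader and a clueless one; class 0 is always correct here
market = ToyMarketClassifier(n_traders=2)
beliefs = np.array([[0.9, 0.1],
                    [0.5, 0.5]])
for _ in range(20):
    market.update(beliefs, true_class=0)
# budget flows from the uniform trader to the accurate one, so the
# market's predicted probability of class 0 rises over time
```

Total wealth is conserved: the payout mechanism only redistributes budget from poorly calibrated traders to well-calibrated ones, which is what sharpens the prices.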

Despite the unusual framing, these markets achieved competitive accuracy on standard ML benchmarks at the time they were published. Follow-up work explored artificial markets for medical imaging and other domains, and more recent work shows how such markets can support human–AI hybrid classification, where human-like “exogenous agents” and ML models trade side by side.

4. From markets to modern ML & RL

We’ve now explored two tight links between prediction markets and ML:

  1. Well-designed market makers behave like online learning algorithms.
  2. Market equilibria behave like model ensembles and inference algorithms.

So what does all of this tell us about the current state of ML, RL, and large-scale training?

4.1 Online learning ≈ market making ≈ training loops

Most large-scale training pipelines are online learning algorithms in disguise: we maintain a parameter vector $\theta$ (the weights of a model), stream mini-batches of data, compute loss gradients, and update $\theta$ using some variant of (often regularized) gradient or mirror descent.

Chen & Vaughan’s result tells us that a CFM with convex C corresponds to an FTRL algorithm over probabilities. Nueve & Waggoner show that more general market designs can implement steepest gradient descent in arbitrary norms.

Taken together we have a neat dictionary of equivalence:

  • Market cost function ↔ regularizer / geometry of your optimizer.
  • Liquidity parameter ↔ learning rate / step size.
  • Trader order flow ↔ loss or gradient information.
  • Bounded loss of the market maker ↔ regret bound of the online learner.

If you’re designing new optimizers, curriculum-learning schemes, or decentralized training protocols, this equivalence suggests you can borrow intuitions from market design—and vice versa.

4.2 Data selection as a tiny prediction market

A direct application of this viewpoint shows up in recent work on data-efficient training for LLMs by Jha et al. (2025). They treat each training example as a “contract” and run a miniature LMSR-style prediction market where different “signals” (uncertainty, rarity, diversity) act as traders.

In their approach, LMSR defines a convex cost function whose gradient gives a probability distribution over examples; a liquidity parameter controls whether the market concentrates on a few high-utility examples or spreads weight more evenly; and a token-aware price-per-token rule biases selection toward examples that are “cheap” in terms of compute.
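The mechanics can be sketched in a few lines. This is a toy illustration of the general idea (signals as traders, liquidity-controlled softmax, token-aware discount), not Jha et al.’s implementation; all names, scores, and parameters below are invented:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def select_weights(signal_scores, token_counts, b=5.0, alpha=1.0):
    """Toy LMSR-style data market.

    signal_scores: (n_signals, n_examples) utilities from e.g.
                   uncertainty / rarity / diversity "traders"
    token_counts:  (n_examples,) length of each example in tokens
    b:             liquidity -- small b concentrates, large b spreads weight
    alpha:         strength of the per-token price penalty
    """
    q = signal_scores.sum(axis=0)          # each signal "buys shares"
    q = q - alpha * np.log(token_counts)   # token-aware discount
    return softmax(q / b)                  # prices = selection weights

scores = np.array([[2.0, 0.5, 1.0],        # signal 1's utility per example
                   [1.5, 0.2, 2.0]])       # signal 2's utility per example
tokens = np.array([100, 40, 400])
weights = select_weights(scores, tokens)
# weights sum to 1; a token budget can be spent proportionally to them
```

The liquidity parameter plays the same role here as a temperature: lowering b pushes the budget onto the few examples the signals agree are most valuable.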

Empirically they showed:

  • On GSM8K, with a strict 60k-token budget, their selector matches strong single-signal baselines in accuracy while reducing variance across random seeds, and the selection step itself takes <0.1 GPU-hours.
  • On AGNews classification, selecting just 5–25% of the data via the market achieves competitive accuracy with improved stability vs baselines.

Despite these strong results, the algorithm boils down to letting simple bots “trade” on examples and then using the resulting prices as data-selection weights: the “prediction market as a learning algorithm” story slotted directly, and successfully, into an LLM training pipeline.

A complementary line of work on active learning markets by Huang and Pinson (2025) treats label acquisition itself as a market, where analysts have budgets and sellers offer labels at different prices; variance-based and query-by-committee strategies are embedded into a market-clearing optimization. These “label markets” achieve better prediction performance with fewer purchased labels than random baselines on real estate and energy forecasting datasets.

4.3 Profit vs prediction: RL meets market design

Prediction markets are supposed to be good forecasters, but many real markets are full of arbitrage and inefficiency. Recent work by Zhu et al. (2024) studies this explicitly, showing that optimizing algorithms for profit can yield different behaviour than optimizing for calibration and low regret, and that there are trade-offs between informational efficiency and revenue.

This tension becomes very familiar when framed in terms of RL:

  • A profit-maximizing bookmaker is like a reinforcement learner myopically maximizing external reward signals, which may result in a poorly calibrated policy.
  • A calibration-focused market maker is closer to an online learner minimizing regret over time, which may mean forgoing some profit along the way.

As markets become more automated and algorithmic—particularly in crypto prediction platforms—the lines between market maker and RL policy begin blurring. Tools from RL (exploration vs exploitation, risk-sensitive objectives, off-policy evaluation) and tools from learning-in-games (no-regret dynamics converging to equilibrium) become natural when reasoning about market algorithms and their failures.

5. Decentralization: markets as training protocols

So far we’ve talked about how connections between markets and ML can serve as guideposts for one another. However, we believe a more interesting direction is to directly treat prediction‑market mechanisms as concrete blueprints for decentralized training.

By design, a prediction market is:

  • Open: anyone with information can join.
  • Asynchronous: participants act on their own cadence.
  • Partially trusted: individual traders may be biased or adversarial, but the mechanism still delivers usable aggregate beliefs (within limits).

These properties parallel what a trustless, global decentralized training network needs: myriad nodes with local objectives, local data, and local inductive biases, interacting without constraints on synchronicity or hardware heterogeneity.

In future work we’ll explore concrete proposals for making these connections explicit: for example, running markets over gradients, data, or even models themselves.


Get Involved:
- Discord
- X
- LinkedIn