A Market Approach to Forecasting: Background, Theory & Practice

Georgios G. Tziralis

presented in partial fulfillment of the requirements for the degree of Doctor of Philosophy, in the Sector of Industrial Management and Operational Research, School of Mechanical Engineering, National Technical University of Athens
Athens, 21 June 2013

0. Contents
I. Introduction II. Background III. Theoretical Properties IV. Theoretical Evaluation V. Empirical Analysis VI. Advanced Concepts VII. Conclusions

I. Introduction

1. Scope 2. Approaches 3. Motivation

I.1 Scope
What men have seen they know; but what shall come hereafter, no man before the event can see.
Sophocles, Ajax, Chorus, lines 1417-1419, ca. 450-440 BC

Forecasting, a task of aggregation, filtering and processing: a) ideally, collect all potentially related information, then b) filter only the relevant to the target variable parts & c) process the remaining data to arrive at conclusions.

I.2a Approaches

Three core approaches to forecasting: 1. Causal methods 2. Time-series methods 3. Judgmental methods

I.2b Causal methods
i) discover correlations between the target variable and other potentially related ones
ii) assume such relationships do not change
iii) apply such relationships to predict future prices
+ demonstrate a capacity to capture underlying relationships
- difficult to discover, also not static

I.2c Time-series methods
i) discover underlying patterns across past data
ii) assume such patterns remain valid
iii) extrapolate them into the future
+ perform generally well in relatively static problems
- incapable in dynamic environments (driving by the rear-view mirror issue)

I.2d Judgmental methods
i) expert opinions and intuitive judgments incorporated via surveys or similar tools
ii) typically a single expert
+ the cheapest and most common technique in practice
- hard for an expert to quantitatively assess knowledge
- difficult to identify experts, then elicit and weight opinions

I.3a Motivation

• a universal forecasting algorithm remains an elusive promise
• traditional approaches seem to have reached a ceiling
• any alternative mechanisms that organically perform the core forecasting tasks in dynamic environments?

I.3b Motivation
• human intelligence seems to excel at most parts of the forecasting problem
• yet, it remains hindered by endemic problems, such as biases
• how could one take advantage of the virtues of collective human intelligence and at the same time cancel out its limitations?
• relevant research field remains underexplored, serves as the context of this thesis

II. Background

1. Markets 2. Prediction markets 3. Literature review 4. Classification

II.0 Research Questions

• Why & how do prediction markets work?
• What are their fundamentals of operation?
• What is their evolution in recent years?
• What is the existing volume of publications?
• Which are the topics covered by literature?

II.1a Markets
• the best available mechanism for gathering and aggregating dispersed information from agents
• efficient markets hypothesis: markets reflect the sum of all available information about future events
• theory of rational expectations: markets convey information through the prices and volumes of assets
• experimental economics: markets may be created specifically to collect, aggregate and publish information

II.1b Markets
Financial markets are institutions that incorporate by their very nature and facilitate, in one way or another, all four fundamental functions of investment, hedging, speculation and information aggregation.

Primary function          Market institution
investment                stock market
hedging                   futures market
speculation               wagering market
information aggregation   ?

II.2a Prediction markets

Prediction markets are defined as markets designed and run for the primary purpose of mining and aggregating information scattered among traders; then transforming such information into market prices serving as predictions about specific future events.

II.2b Prediction markets
• an efficient way of arriving at a consensus about the possibility of a future event
• a well-working framework for belief aggregation
• an efficient mechanism of information incorporation for future events
• a mechanism with an accuracy that can be assessed
• largely an empirical (or social) science

II.3 Literature review

• surveys and examines the totality of relevant academic work at the time of writing
• resulted in identifying 155 articles, increasing trend
• a comprehensive basis for understanding prediction market research and its state of the art
• no other review available, a contribution

II.4a Classification

II.4b Classification
• a promising forecasting approach and largely untouched research field
• no standard terminology as of yet ('prediction markets' emerges as default)
• a growing amount of research in various practical applications
• no standard market mechanism in place

III. Theoretical Properties

1. Market mechanisms 2. Prediction markets mechanisms 3. Properties for a coherent price function 4. A coherent price function I 5. A coherent price function II

III.0 Research Questions

• What market mechanisms are used in general?
• What prediction market mechanisms are used?
• Which are the properties for a coherent (‘proper’) price function of a prediction market?
• Are there any coherent price functions?

III.1a Market mechanisms

1. Continuous double auction (e.g. stock markets) 2. Market-maker (e.g. wagering with bookmaker) 3. Pari-mutuel (e.g. horse racing)

III.1b Continuous double auction

• any case that $p_{bid} \geq p_{ask}$ results in a transaction
• for any transaction to occur, there must be a counterpart on the other side willing to accept the trade
• in the case that the highest bid price is less than the lowest ask, nothing happens
• illiquidity issues, 'thin market' problem
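
The matching condition above can be sketched in a few lines. This is a minimal illustration, assuming unit-size limit orders and ignoring time priority; the function name `match_orders` and the order prices are hypothetical, not taken from any particular exchange.

```python
import heapq

def match_orders(bids, asks):
    """Match unit-size limit orders: a trade occurs while best bid >= best ask."""
    bid_heap = [-p for p in bids]   # max-heap via negated prices
    ask_heap = list(asks)           # min-heap
    heapq.heapify(bid_heap)
    heapq.heapify(ask_heap)
    trades = []
    while bid_heap and ask_heap and -bid_heap[0] >= ask_heap[0]:
        bid = -heapq.heappop(bid_heap)
        ask = heapq.heappop(ask_heap)
        trades.append((bid, ask))
    # Whatever is left cannot trade: highest bid < lowest ask ("nothing happens")
    open_bids = sorted((-p for p in bid_heap), reverse=True)
    open_asks = sorted(ask_heap)
    return trades, open_bids, open_asks

trades, open_bids, open_asks = match_orders([0.60, 0.55, 0.40], [0.50, 0.58, 0.70])
print(trades, open_bids, open_asks)
```

With few participants the order book stays mostly unmatched, which is the 'thin market' problem in miniature.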

III.1c Market maker

• in most cases CDAwMM (continuous double auction with market maker)
• an agent who is nearly always ready to trade
• in the bookmaker case, the market institution sets the odds only for other players to buy and not to sell
• exposure to risk of losing considerable amounts of money

III.1d Pari-mutuel
• people place wagers on which of two or more mutually exclusive and exhaustive outcomes will occur
• when the true outcome becomes known, players who wagered on it split the total amount invested
• the cost of purchasing an equal share of the profits remains constant, unconditional to when the wager was placed
• no incentive for buying until either all information is revealed, or the market is about to close
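
A minimal sketch of the pari-mutuel payout rule described above, assuming no fees and a proportional split of the whole pool among winners. The function name `parimutuel_payouts`, the players and the amounts are hypothetical.

```python
def parimutuel_payouts(wagers, winner):
    """wagers: list of (player, outcome, amount). Returns payout per winning player."""
    pool = sum(amount for _, _, amount in wagers)
    winning_total = sum(amount for _, outcome, amount in wagers if outcome == winner)
    payouts = {}
    if winning_total == 0:
        return payouts  # nobody backed the winner; this case is left unspecified here
    for player, outcome, amount in wagers:
        if outcome == winner:
            # Each winner receives a share of the pool proportional to her stake
            payouts[player] = payouts.get(player, 0.0) + pool * amount / winning_total
    return payouts

wagers = [("ann", "A", 10.0), ("bob", "B", 30.0), ("cy", "A", 10.0)]
payouts = parimutuel_payouts(wagers, "A")
print(payouts)
```

The order in which wagers arrive plays no role in the result, which is why traders have no incentive to commit early.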

III.2a Prediction market mechanisms
Mechanism      No risk for the market institution   Infinite liquidity   Current information capture
CDA            yes                                  no                   yes
Market maker   no                                   yes                  yes
Pari-mutuel    yes                                  yes                  no

standard mechanisms not fully appropriate for prediction markets usage

III.2b Scoring rules
• $i = 1, \dots, n$: mutually exclusive and exhaustive outcomes of a future event
• $s$: a scoring rule such that $x_i = s_i(r_i)$, where $x_i$ is the monetary reward and $r_i$ the probability reported by an agent
• $s$ is proper if the agent maximises her expected return by reporting $r_i = p_i$, namely her true expectations
• example of a proper scoring rule (logarithmic): $x_i = a_i + b \log(r_i)$, where $a_i$, $b$ are constants
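
The properness of the logarithmic rule can be checked numerically. The sketch below assumes two outcomes, a single constant a for both outcomes, and made-up true beliefs p = (0.7, 0.3); it scans candidate reports on a grid and looks for the one that maximises the expected reward.

```python
import math

def expected_log_score(p, r, a=0.0, b=1.0):
    """Expected reward sum_i p_i * (a + b*log(r_i)) under true beliefs p and report r."""
    return sum(pi * (a + b * math.log(ri)) for pi, ri in zip(p, r))

p = [0.7, 0.3]  # hypothetical true beliefs of the agent
grid = [k / 100 for k in range(1, 100)]  # candidate reports r_1; r_2 = 1 - r_1
best_r1 = max(grid, key=lambda r1: expected_log_score(p, [r1, 1 - r1]))
print(best_r1)
```

Under these assumptions the maximiser coincides with the true belief, so misreporting can only lower the expected reward.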

III.2c Market scoring rules
• Any proper scoring rule can be interpreted as an automated market maker (Hanson 2003)
• The market maker sets initial expectations (e.g. all outcomes equal)
• Each new trader agrees to compensate the previous trader according to her previously submitted probability estimate
• She also receives the scoring rule payment associated with the probability estimate she submits

III.2d Market scoring rules
• let $j$ be the relevant security paying off for the event $i$
• also, $q_j$ the total quantity of security $j$ held by all traders combined
• for the logarithmic scoring rule, the cost and price functions become:
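
As a hedged illustration, assuming the slide refers to Hanson's logarithmic market scoring rule in its usual form, the cost function is $C(q) = b \log \sum_j e^{q_j / b}$ and the price of security $j$ is $p_j(q) = e^{q_j / b} / \sum_k e^{q_k / b}$. The liquidity parameter name `b`, its default value and the function names below are assumptions made for this sketch.

```python
import math

def lmsr_cost(q, b=100.0):
    """C(q) = b * log(sum_j exp(q_j / b)), computed with a max-shift for stability."""
    m = max(q)
    return m + b * math.log(sum(math.exp((qj - m) / b) for qj in q))

def lmsr_prices(q, b=100.0):
    """p_j = exp(q_j / b) / sum_k exp(q_k / b); prices are positive and sum to 1."""
    m = max(q)
    weights = [math.exp((qj - m) / b) for qj in q]
    total = sum(weights)
    return [w / total for w in weights]

def trade_cost(q, j, shares, b=100.0):
    """Amount a trader pays the market maker to buy `shares` of security j."""
    q_new = list(q)
    q_new[j] += shares
    return lmsr_cost(q_new, b) - lmsr_cost(q, b)

q = [0.0, 0.0]
print(lmsr_prices(q))        # equal initial expectations
print(trade_cost(q, 0, 10))  # cost of buying 10 shares of outcome 0
q[0] += 10
print(lmsr_prices(q))        # price of outcome 0 has moved up
```

A trader always finds a counterparty in the market maker, while a larger b makes prices move less per share bought.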

III.2e Dynamic pari-mutuel market
• a hybrid between pari-mutuel & CDA (Pennock 2004)
• it always enables purchases of each outcome (yet one can sell only what she owns)
• buying shares of an outcome results in its stock price increasing, while other stocks’ prices decrease
• all money invested gets redistributed to owners of the winning stock, proportionately to shares they hold
• price function may vary according to properties required

III.2f Adequacy
Property                      MSR        DPM
Market institution risk       limited    no
Infinite liquidity            no         yes
Direction of liquidity        buy/sell   buy
Current information capture   yes        yes

• both mechanisms adequate, only minor issues
• MSR pays a fixed monetary unit per share
• DPM pays per share an equal portion of the total amount outstanding in the market at the time of closing

III.3a DPM price functions
• Pennock (2004) suggests a couple: money ratio and share ratio
• in the simple case of $n = 2$ outcomes (A, B), these are:
i) money ratio: $p_A / p_B = M_A / M_B$, where $M$ is the total amount wagered
ii) share ratio: $p_A / p_B = k N_A / N_B$, where $N$ is the total number of shares and $k$ a constant
• both present a number of unappealing properties, further research required

III.3b Coherency properties

• a set of properties providing for clarity, brevity, intuitiveness and elegance is introduced
• their satisfaction will provide for a coherent DPM price function
• no such framework existed, a contribution
• no already suggested functions satisfy it

III.3c Coherency properties
simple case, n=2 outcomes
• reflexiveness
• summation to 1
• differentiability
• injection
• monotonicity (& inverse)
• convergence to 1
• convergence to 0
extension for n=k is straightforward

III.4a A coherent price function I

• simple case, n=2 outcomes

• satisfies all properties but inverse monotonicity

III.4b A coherent price function I

• extended case, n=k outcomes

• satisfies all properties but inverse monotonicity and convergence to 1 for n>2

III.4c A coherent price function II
• simple case, n=2 outcomes

• conditional logit model, similar to logarithmic scoring rule (b: sensitivity parameter)

• all properties satisfied

III.4d A coherent price function II
• extended case, n=k outcomes

• all properties satisfied
• in a sense, a "unification" of DPM & MSR, via the proposed price function
• an elegant solution, core contribution of this thesis
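
A few of the coherency properties can be probed numerically. The sketch below assumes the price function has the conditional-logit form $p_i = e^{q_i / b} / \sum_j e^{q_j / b}$, which for two outcomes and b = 1 matches the price function stated in IV.1c; the placement of b, the function name `logit_prices` and the reading of each property name are assumptions, not the thesis's formal definitions.

```python
import math

def logit_prices(q, b=1.0):
    """Assumed form: p_i = exp(q_i / b) / sum_j exp(q_j / b)."""
    m = max(q)  # shift by the maximum to avoid overflow
    w = [math.exp((qi - m) / b) for qi in q]
    s = sum(w)
    return [wi / s for wi in w]

p = logit_prices([2.0, 1.0, 0.0])
p_more = logit_prices([3.0, 1.0, 0.0])     # one more share of outcome 0
p_extreme = logit_prices([50.0, 0.0, 0.0])

checks = {
    "summation to 1": abs(sum(p) - 1.0) < 1e-12,
    "monotonicity": p_more[0] > p[0],                              # own price rises
    "inverse monotonicity": p_more[1] < p[1] and p_more[2] < p[2],  # others fall
    "convergence to 1": p_extreme[0] > 1 - 1e-12,
    "convergence to 0": p_extreme[1] < 1e-12,
}
for name, ok in checks.items():
    print(name, ok)
```

Such spot checks illustrate the properties for n = 3 but do not replace a proof.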

IV. Theoretical Evaluation

1. Market Design 2. Convergence Properties 3. Discussion

IV.0 Research Questions

• Does a ‘proper’ prediction market work in theory?
• Will a ‘proper’ prediction market converge to a consensus equilibrium?
• If yes, how fast is the convergence process?
• What is the best possible equilibrium?
• Will a ‘proper’ prediction market always converge?

IV.1a Market design

• model design balances between realistic detail and
meaningful simplicity

• this one is biased towards simplicity

IV.1b Information structure
• Boolean state space: $s \in \{0,1\}^m$ with common prior probability $P(s)$
• Traders' information space: $X = \{0,1\}^n$ ($n$ traders)
• initially, each trader is privy to one bit of information $x_i$ (input bit)
• conditional distribution of information, which is common knowledge to all traders: $Q(x|s): \{0,1\}^m \times \{0,1\}^n \to [0,1]$

IV.1c Market mechanism
• Market structure: pari-mutuel market, securities $\{k, l\}$, one for each binary outcome
• Target: predict the value of $f(s): \{0,1\}^m \times \{0,1\}^n \to [0,1]$, $f$ common knowledge to all traders
• Price function: $p_k = \exp(q_k) / (\exp(q_k) + \exp(q_l))$, where $q$ is the quantity of shares bought
• Pay-off: all money redistributed to the winning security (if $f(s) = 1$ to security $k$, and vice versa)

IV.1d Market mechanism

• The market is a multi-period Shapley-Shubik market game
• It proceeds in synchronous rounds; on each round, each agent buys one share of a security, either k or l
• After each round, new prices are announced; process continues until equilibrium
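
The round mechanics can be sketched as follows, using the two-security price function from the market mechanism slide. Traders' choices are supplied directly as input here rather than derived from beliefs, so this is only the bookkeeping of a round, not the full market game; the function names and the sequence of choices are hypothetical.

```python
import math

def price_k(q_k, q_l):
    """p_k = exp(q_k) / (exp(q_k) + exp(q_l)), in an algebraically equivalent form."""
    return 1.0 / (1.0 + math.exp(q_l - q_k))

def run_round(q_k, q_l, choices):
    """One synchronous round: every trader buys one share, 'k' or otherwise 'l'."""
    bought_k = sum(1 for c in choices if c == "k")
    bought_l = len(choices) - bought_k
    q_k, q_l = q_k + bought_k, q_l + bought_l
    return q_k, q_l, price_k(q_k, q_l)  # the new price is announced after the round

q_k, q_l = 0, 0
history = [price_k(q_k, q_l)]
for choices in (["k", "k", "l"], ["k", "l", "l"], ["k", "k", "k"]):
    q_k, q_l, p = run_round(q_k, q_l, choices)
    history.append(p)
print(history)
```

Only the difference between the two quantities matters for the announced price, so a round in which purchases offset earlier imbalances brings the price back to 0.5.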

IV.1e Trader behavior

• risk neutral, myopic, bidding truthfully rather than
strategically

• traders’ probability distribution is updated after each
trading round via Bayes’ rule

• the above are common knowledge
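
A single Bayesian update step can be sketched as below. In the model, traders update on announced prices; this simplified example instead conditions a one-bit state on a generic observed signal, with made-up prior and likelihood values, to show the mechanics of Bayes' rule only.

```python
def bayes_update(prior, likelihood, observation):
    """prior: state -> P(state); likelihood: state -> {obs: P(obs | state)}."""
    unnormalised = {s: prior[s] * likelihood[s][observation] for s in prior}
    evidence = sum(unnormalised.values())
    return {s: v / evidence for s, v in unnormalised.items()}

prior = {0: 0.5, 1: 0.5}            # common prior over a one-bit state (hypothetical)
likelihood = {0: {0: 0.8, 1: 0.2},  # P(signal | state = 0), hypothetical
              1: {0: 0.3, 1: 0.7}}  # P(signal | state = 1), hypothetical
posterior = bayes_update(prior, likelihood, observation=1)
print(posterior)
```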

IV.2 Convergence properties
• Price convergence: without any new information, the market converges to a consensus equilibrium in finite steps
• Convergence speed: at most n rounds, where n is the number of traders
• Best possible prediction: direct communication equilibrium, where all market traders directly reveal their private information to each other
• Convergence to the best possible prediction: not guaranteed

IV.3 Discussion
• the model described stands as an abstract departure
from reality, difficult to replicate in practice

• it validated the essential value of markets as a decision
support tool for prediction and information aggregation purposes

• the theoretical approach to the problem is by nature
rigid and increasingly complex

• one needs to turn to experimental approaches for a
more extended study of the underlying issues

V. Empirical Analysis

1. The askmarkets platform 2. Experiment A 3. Experiment B 4. Experiment C 5. Deployment Framework

V.0 Research Questions
• Does a ‘proper’ prediction market work in practice?
• Is there a software platform supporting a ‘proper’ prediction market mechanism?
• Is a ‘proper’ prediction market able to converge to equilibrium in practice?
• What is a practical framework for prediction markets deployment?

V.1a Context
• moving from theory to practice, largely a social science
• no definite results, indications instead of proofs
• alternative approach to equilibrium in real world settings: market converges (early enough & in a 'stable' way) to the correct outcome (market accuracy)
• experimental results are assessed on this basis

V.1b The askmarkets platform
• the proposed market framework had yet to be tested in practice
• a brand new software platform for prediction markets was developed from scratch, together with partner Efthimios Bothos (NTUA EE PhD 2010)
• a state of the art tool for the creation, participation and administration of prediction marketplaces
• it was utilized in a series of cases, a number of which are presented hereafter

V.1c The askmarkets platform

V.2a Experiment A
• academic context, testing information aggregation capacities
• Mech Eng NTUA graduate students, optional participation (assignment bonus)
• topic: Athens 2004 impact study, all data available but hard to access
• students also answered a survey on the same topic, before participating in the market

V.2b Experiment A

V.2c Experiment A
• >100 students, >200 markets, >4k play-money transactions in less than a month
• markets beat the survey, with a 0.33 vs 0.23 avg error
• every single market converged to the correct outcome
• significant improvement at a varying accuracy and convergence speed
• better results came at the cost of running a market vs a simple survey

V.3a Experiment B
• business context, testing predictive capacities
• facilitated in a Greek, publicly traded, automotive company
• a couple of markets available, related to sales estimates of core company products
• employees across the supply chain and organisational chart; no specific incentives, no active communication

V.3b Experiment B

V.3c Experiment B
• 6 months long, 265 employees registered, only 32 made at least 1 transaction, 664 play-money transactions in total
• sufficient acceptance and utilisation of the tool
• results were sufficiently close to the desirable outcome
• management considered the mechanism materially valuable and considerably improved, compared to other approaches

V.4a Experiment C
• social context, testing predictive capacities
• hosted in a popular news portal, themed under forthcoming Greek elections
• open marketplace, traders could participate or create market questions, no incentive other than a small gift to the winner
• lasted less than two weeks, no surveys available during this period to compare

V.4b Experiment C

V.4c Experiment C
• 45 random traders, 706 play-money transactions
• 14 markets, 7 with sufficient trading volume that were further analysed
• in all 7 cases, markets suggested the correct outcome; most under slow convergence and low confidence though
• sufficient volume of participation and results, granted lack of targeting, training and incentives

V.5a Deployment Framework
• a practicable framework for the deployment of prediction markets
• a few experiments cannot lead to universal principles
• that said, they suffice to provide a practical guide instead
• good practices apply to cases satisfying typical requirements of the experiments run

V.5b Design
• the constituents of a prediction market are essentially two: its stocks and traders, along with the mechanism that brings them together
• generic rule of simplicity and intuitiveness, highly recommended
• if a trader has second thoughts before making a transaction, she will most probably not participate
• exploit narratives and terminology that the vast majority of participants are familiar with

V.5c Stocks
• focus on the right choice and clear definition of a forecasting goal
• short, clear and easily understood stock description
• small number of stocks (i.e. less than 4 or 5) is suggested
• predictions of absolute numbers rather than relative numbers are preferred (e.g. ‘high (>80%)’ rather than ‘80%’)

V.5d Participants
• the tasks essentially are to a) find those who possess or are capable of discovering relevant information and b) incentivize and engage them to do so

• in play-money cases, incentives alone are of limited
impact

• turn the targeted audience to an active community of
people who take pride in participating and having their say in the market

• a market is successful when it becomes a daily topic
of conversation

V.5e Discussion
• developed a state-of-the-art prediction markets platform from scratch
• ran a diverse set of experiments, with sufficient participation
• all experiments suggested that the proposed mechanism is capable of performing the tasks at hand
• that said, empirical analysis largely stands as social science, no generalised results can be extracted
• a rough deployment framework was provided, various directions for further research

VI. Advanced Concepts: Event Detection

1. Background 2. Time Series Analysis 3. Volatility Modeling 4. Experimental Settings 5. Experimental Results

VI.0 Research Questions

• What is the state-of-the-art in text mining for market prediction and volatility analysis?
• What is a model of volatility for prediction markets?
• How can such a model serve to detect events?
• Does such a model work in practice?

VI.1a Background
• event detection: the task of monitoring a news corpus to discover stories that discuss a previously unidentified event
• requires as input a meta-mechanism that aggregates information from the news corpus by nature
• there exists a proven, yet unidentified, relationship between the release of relevant news stories and the fluctuation of prices
• relatively virgin research field; prediction markets may be a fit

VI.1b Background
• various attempts to correlate simple changes in market prices with a signal of an event, or vice versa, with ambiguous results
• volatility clustering: large changes tend to be followed by large changes, of either sign, same for small changes
• needs to be modelled appropriately, rendering previous approaches of limited value
• volatility analysis and GARCH modelling of prediction markets time series introduced as input for text mining

VI.2 Time Series Analysis
• CDA prediction market data from Intrade was used, regarding US presidential elections
• initially experimented with machine learning techniques, picked SVMs
• satisfactory error metrics at first sight, however predictions were reproducing past prices, approach abandoned
• then, a GARCH(1,1) model was demonstrated and verified as highly appropriate, adopted for further usage
• evidence for volatility clustering was also provided

VI.3 Volatility Modeling
• a GARCH(1,1) model for the prediction of market time series volatility on a daily basis
• actual volatility is also retroactively known
• diff between predicted and actual volatility can finally be computed
• time instances of significant divergence are isolated as of high information value
• the approach solves the volatility clustering issue, yet at a high computational cost
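
The divergence idea can be sketched with the standard GARCH(1,1) variance recursion $\sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2$. The sketch below is not the thesis's pipeline: the parameters are illustrative rather than fitted, the squared return stands in as the proxy for actual volatility, the divergence threshold is arbitrary, and the return series is made up.

```python
def garch11_variance(returns, omega, alpha, beta):
    """One-step-ahead conditional variance for each day, given past returns."""
    # Start from the unconditional variance omega / (1 - alpha - beta)
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2

def flag_divergences(returns, sigma2, threshold):
    """Days where the realised squared return exceeds threshold * predicted variance."""
    return [t for t, (r, s2) in enumerate(zip(returns, sigma2)) if r * r > threshold * s2]

returns = [0.01, -0.012, 0.008, 0.09, -0.01, 0.011]  # hypothetical daily price changes
sigma2 = garch11_variance(returns, omega=1e-5, alpha=0.10, beta=0.85)
flagged = flag_divergences(returns, sigma2, threshold=9.0)
print(flagged)
```

The large move on one day raises the predicted variance for the following day, so a subsequent large move would be less surprising; this is how the recursion accounts for volatility clustering.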

VI.4 Experimental Settings
• 116 real-money prediction markets contracts with significant transaction volume for a 4yr period
• 276 instances of high divergence between predicted and actual volatility were highlighted
• the number of occurrences of relevant keywords for each contract was tracked via Google News Archive, for a period of 3 days before up to 3 days after each instance
• important volatility signals versus important news occurrences were eventually identified
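
The windowing step can be sketched as a count of news items per day offset around a flagged instance. Retrieval from the news archive is not shown; the function name `news_counts_around` and all dates are hypothetical.

```python
from datetime import date

def news_counts_around(instance_day, news_days, window=3):
    """Count news items at each day offset in [-window, +window] around an instance."""
    counts = {offset: 0 for offset in range(-window, window + 1)}
    for day in news_days:
        offset = (day - instance_day).days
        if -window <= offset <= window:
            counts[offset] += 1
    return counts

instance = date(2000, 1, 10)  # hypothetical day of a volatility divergence
news_days = [date(2000, 1, 7), date(2000, 1, 9), date(2000, 1, 9),
             date(2000, 1, 10), date(2000, 1, 15)]  # hypothetical keyword hits
counts = news_counts_around(instance, news_days)
print(counts)
```

Averaging such per-offset counts over all instances gives the kind of before-and-after profile discussed in the results.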

VI.5 Experimental Results
• news spikes before the volatility signal suggest the use of news as a leading indicator of volatility, and vice versa
• average instance suggests a decreasing rate of relevant news the day before the error spike occurs
• it was followed by a smaller but growing increase of relevant news occurring the days after
• no clear and significant enough trend revealed
• however, the proposed approach essentially contributed the ability to highlight such phenomena, with a significantly improved confidence rate

VII.1 Conclusions
I. Introduction – Forecasting is far from a solved problem, and cannot be otherwise
II. Background – Markets emerge as an interesting approach to forecasting, extended literature review
III. Theoretical Properties – a coherent set of properties and a function satisfying them were contributed
IV. Theoretical Evaluation – theoretically and in the abstract, the proposed mechanism can work
V. Empirical Analysis – a set of real world experiments highlights the mechanism's adequacy

VII.2 Final remarks
• This Thesis attempted a 180 degree turn from the dominant trend of computationally expensive approaches
• It regressed towards a simpler, transparent and more elegant, yet highly sophisticated approach
• Prediction markets were studied in depth and width, from literature to theory to experiments to practice
• Ironically, should this Thesis infinitesimally serve the literature, only the future will tell

thank you
Georgios G. Tziralis