
Introduction to Predictive Learning

LECTURE SET 1 INTRODUCTION and OVERVIEW

Electrical and Computer Engineering

OUTLINE of Set 1
1.1 Overview: what is this course about:
- subject matter
- philosophical connections
- prerequisites and HW1
- expected outcome of this course
1.2 Historical Perspective
1.3 Motivation for Empirical Knowledge
1.4 General Experimental Procedure for Estimating Models from Data

1.1.1 Subject Matter


Uncertainty and Learning
- Decision making under uncertainty
- Biological learning (adaptation)
(examples and discussion)
Induction, Statistics and Philosophy
- Ex. 1: Many old men are bald
- Ex. 2: The sun rises in the East every day

(cont'd) Many old men are bald


Psychological Induction:
- inductive statement based on experience
- also has a certain predictive aspect
- no scientific explanation
Statistical View:
- the lack of hair = random variable
- estimate its distribution (depending on age) from past observations (training sample)
Philosophy of Science Approach:
- find scientific theory to explain the lack of hair
- explanation itself is not sufficient
- true theory needs to make non-trivial predictions

Conceptual Issues
Any theory (or model) has two aspects:
1. explanation of past data (observations)
2. prediction of future (unobserved) data

Achieving both goals perfectly is not possible
Important issues to be addressed:


- quality of explanation and prediction
- is good prediction possible at all?
- if two models explain past data equally well, which one is better?
- how to distinguish between true scientific and pseudoscientific theories?

Beliefs vs True Theories


Men have lower life expectancy than women
- Because they choose to do so
- Because they make more money (on average) and experience higher stress managing it
- Because they engage in risky activities
- Because ...
Demarcation problem in philosophy

1.1.2 Philosophical Connections


Oxford English Dictionary: Induction is the process of inferring a general law or principle from the observations of particular instances.
Clearly related to Predictive Learning.
All science and (most of) human knowledge involve induction
How to form good inductive theories?

Challenge of Predictive Learning


Explain the past and predict the future

Background: philosophy
William of Ockham: entities should not be multiplied beyond necessity

Epicurus of Samos: If more than one theory is consistent with the observations, keep all theories

Background: philosophy
Thomas Bayes: How to update/revise beliefs in light of new evidence

Karl Popper: Every true (inductive) theory prohibits certain events or occurrences, i.e. it should be falsifiable

Background: philosophy
George W. Bush: I am The Decider


Background: philosophy
Bill Clinton: I told the Truth


1.1.3 Prerequisites and HW1


Math: working knowledge of basic Probability + Linear Algebra
Statistical software (of your choice):
- MATLAB, also R-project, Mathematica etc.
Note: you will be using s/w implementations of learning algorithms (not writing programs)
Writing: no special requirements
Philosophy: no special requirements

Homework 1
Purpose: testing background on probability and computer skills
Goal: estimate pdf of a random variable X
Real Data: X = daily price changes of SP500, i.e.
$X(t) = \frac{Z(t) - Z(t-1)}{Z(t-1)} \times 100\%$, where Z(t) = closing price

Typical + Useful Statistics
- Histogram (empirical pdf)
- mean, standard deviation
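As a rough illustration of the quantities named in Homework 1, the sketch below computes the percent changes X(t) from a list of closing prices, then their mean, standard deviation and a frequency histogram. The function names `percent_changes` and `histogram`, the bin range, and the price list are all assumptions made for this example; the prices are made up and are not SP500 data.

```python
import statistics


def percent_changes(closing_prices):
    """Return X(t) = (Z(t) - Z(t-1)) / Z(t-1) * 100 for consecutive closing prices."""
    return [
        (curr - prev) / prev * 100.0
        for prev, curr in zip(closing_prices, closing_prices[1:])
    ]


def histogram(values, n_bins, lo, hi):
    """Fraction of all values in each of n_bins equal-width bins on [lo, hi].

    Values outside [lo, hi] are not counted, so the fractions sum to 1
    only when every value lies inside the range.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return [c / len(values) for c in counts]


# Hypothetical closing prices, for illustration only.
z = [100.0, 101.0, 100.5, 102.0, 101.0, 101.5]

x = percent_changes(z)
mean_x = statistics.mean(x)
std_x = statistics.stdev(x)  # sample standard deviation
freq = histogram(x, n_bins=4, lo=-2.0, hi=2.0)

print("X(t):", [round(v, 3) for v in x])
print("mean:", round(mean_x, 3), "std:", round(std_x, 3))
print("frequencies:", freq)
```

With real data, the list `z` would be replaced by one year of daily closing prices, and the bin range widened if some daily changes exceed it.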

(cont'd) Homework 1
Histogram = estimated pdf (from data)
Example: histograms of 5 and 30 bins to model N(0,1)
also mean and standard deviation (estimated from data)

[Figure: histograms of N(0,1) data with 5 bins and with 30 bins]
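The effect of the number of bins can be reproduced with a short sketch along these lines. The sample size of 1000, the range [-3, 3), the fixed seed and the helper name `bin_counts` are assumptions for this example, not values taken from the slide.

```python
import random
import statistics


def bin_counts(samples, n_bins, lo=-3.0, hi=3.0):
    """Count samples in n_bins equal-width bins over [lo, hi); others are ignored."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            idx = min(int((s - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # assumed sample size

coarse = bin_counts(samples, 5)   # few wide bins: smooth but crude shape
fine = bin_counts(samples, 30)    # many narrow bins: more detail, more noise

print("5 bins: ", coarse)
print("30 bins:", fine)
print("estimated mean:", round(statistics.mean(samples), 3))
print("estimated std: ", round(statistics.stdev(samples), 3))
```

Both histograms describe the same data; only the resolution changes, which is why the bin counts on the y-axis differ in scale.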

(cont'd) Homework 1
NOTE: histogram ~ empirical pdf, i.e. the y-axis scale is in % (frequency). See histogram of SP500 daily price changes in 1981:
[Figure: histogram of SP500 daily price changes in 1981; x-axis: daily price change from -2.00% to 2.00% in steps of 0.20%; y-axis: frequency from 0.00% to 7.00%]

Homework 1 (background)
Is the stock market truly random?
- model daily price changes of SP500 as a random coin toss process
- measure average trend duration, e.g. for UP, DOWN, UP, DOWN, ... ~ +1, -1, +1, -1, ... the average trend duration = 1 day and the process is not random
Calculate average trend duration for random coin toss process, compare it with the stock market, and draw conclusions
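One possible way to measure average trend duration is sketched below. It assumes a "trend" means a maximal run of consecutive days moving in the same direction; that definition, the function name `average_trend_duration`, the seed and the number of tosses are assumptions for this example.

```python
import random


def average_trend_duration(signs):
    """Average length of maximal runs of identical consecutive values."""
    if not signs:
        raise ValueError("need at least one observation")
    runs = 1
    for prev, curr in zip(signs, signs[1:]):
        if curr != prev:
            runs += 1  # a change of direction starts a new trend
    return len(signs) / runs


# Strictly alternating sequence from the slide: every trend lasts one day.
alternating = [+1, -1, +1, -1, +1, -1]
print("alternating:", average_trend_duration(alternating))

# Fair coin toss process with a fixed seed for repeatability.
rng = random.Random(0)
tosses = [rng.choice((+1, -1)) for _ in range(100_000)]
coin_avg = average_trend_duration(tosses)
print("coin toss:  ", round(coin_avg, 3))
```

For the stock market comparison, `signs` would hold +1 or -1 according to the sign of each daily price change X(t).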

1.1.4 Expected Outcome (of this course)


Scientific:
- Learning = generalization, concepts and issues
- Math theory: Statistical Learning Theory aka VC-theory
- Implications for Philosophy and Applications
Philosophical:
- Nature of human knowledge and intelligence
- Demarcation principle
- Psychology of learning
Practical:
- Financial engineering
- Security
- Genomics
- Predicting successful marriage, climate modeling etc., etc.

What is this course NOT about



OUTLINE of Set 1
1.1 Overview: what is this course about

1.2 Historical Perspective


- Handling uncertainty and risk: probabilistic vs risk minimization
1.3 Motivation for Empirical Knowledge
1.4 General Experimental Procedure for Estimating Models from Data

1.2.1 Handling Uncertainty and Risk


Ancient times
Probability for quantifying uncertainty
- degree-of-belief
- frequentist (Cardano-1525, Pascal, Fermat)

Newton and causal determinism
Probability theory and statistics (20th century)
Modern science (A. Einstein)
Goal of science: estimating a true model or system identification

Handling Uncertainty and Risk (2)


Making decisions under uncertainty = risk management
Probabilistic approach:
- estimate probabilities (of future events)
- assign costs and minimize expected risk
Risk minimization approach:
- apply decisions to known past events
- select one minimizing expected risk
Common in all living things: learning, generalization
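The two approaches can be contrasted on a toy decision problem. Everything in the sketch below (the cost table, the event record, the function names) is hypothetical and chosen only for illustration. In this simple discrete setting, with probabilities estimated as raw frequencies, both routes necessarily give the same answer; they differ in what is estimated along the way (a probability model versus the decision itself).

```python
from collections import Counter

# Hypothetical cost table: COSTS[decision][event]; all numbers are made up.
COSTS = {
    "carry umbrella": {"rain": 1.0, "dry": 1.0},
    "no umbrella": {"rain": 5.0, "dry": 0.0},
}

# Hypothetical record of past events.
past_events = ["rain", "dry", "dry", "rain", "dry",
               "dry", "dry", "rain", "dry", "dry"]


def probabilistic_choice(events, costs):
    """Estimate event probabilities first, then minimize expected cost."""
    counts = Counter(events)
    probs = {event: n / len(events) for event, n in counts.items()}
    expected = {
        decision: sum(p * row[event] for event, p in probs.items())
        for decision, row in costs.items()
    }
    return min(expected, key=expected.get), expected


def risk_minimization_choice(events, costs):
    """Apply each decision to every past event; keep the lowest average cost."""
    average = {
        decision: sum(row[event] for event in events) / len(events)
        for decision, row in costs.items()
    }
    return min(average, key=average.get), average


prob_decision, prob_risk = probabilistic_choice(past_events, COSTS)
rm_decision, rm_risk = risk_minimization_choice(past_events, COSTS)
print("probabilistic:    ", prob_decision, prob_risk)
print("risk minimization:", rm_decision, rm_risk)
```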

Human Generalization
All men by nature desire knowledge - Aristotle
Example 1: continue given sequence 6, 10, 14, 18, ...
Example 2: Sceitnitss osbevred: it is nt inptrant how lteters are msspled isnide the word. It is ipmoratnt that the fisrt and lsat letetrs do not chngae, tehn the txet is itneprted corrcetly

Human Generalization
Example 3:
The Dictator Emily Cherkassky


Scientific Example: Planetary Motions


How do planets move among the stars?
- Ptolemaic system (geocentric)
- Copernican system (heliocentric)
Tycho Brahe (16th century)
- measure positions of the planets in the sky
- use experimental data to support one's view
Johannes Kepler:
- used volumes of Tycho's data to discover three remarkably simple laws

Kepler's First Law


Sun lies in the plane of orbit, so we can represent positions as (x,y) pairs
An orbit is an ellipse, with the sun at a focus

$c_1 x^2 + c_2 y^2 + c_3 xy + c_4 x + c_5 y + c_6 = 0$
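To connect "ellipse with the sun at a focus" to the six-coefficient equation, the following sketch generates points from the polar form $r = p / (1 + e\cos\theta)$ with the focus at the origin and checks that they satisfy the general conic. Since $r + e x = p$, squaring $r = p - e x$ gives $(1 - e^2)x^2 + y^2 + 2pe\,x - p^2 = 0$. The orientation (major axis along x), the values of `a` and `e`, and all function names are assumptions for this example.

```python
import math


def conic_coefficients(a, e):
    """Coefficients (c1, ..., c6) for an ellipse with one focus at the origin.

    Assumes the major axis lies along the x-axis and the point closest to
    the focus is on the positive x side.
    """
    p = a * (1.0 - e * e)  # semi-latus rectum
    return (1.0 - e * e, 1.0, 0.0, 2.0 * p * e, 0.0, -p * p)


def orbit_point(a, e, theta):
    """Position (x, y) from the polar form r = p / (1 + e * cos(theta))."""
    p = a * (1.0 - e * e)
    r = p / (1.0 + e * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)


def conic_residual(coeffs, x, y):
    """Left-hand side of c1 x^2 + c2 y^2 + c3 xy + c4 x + c5 y + c6 = 0."""
    c1, c2, c3, c4, c5, c6 = coeffs
    return c1 * x * x + c2 * y * y + c3 * x * y + c4 * x + c5 * y + c6


a, e = 1.0, 0.5  # made-up orbit size and eccentricity
coeffs = conic_coefficients(a, e)
max_residual = max(
    abs(conic_residual(coeffs, *orbit_point(a, e, 2.0 * math.pi * k / 360)))
    for k in range(360)
)
print("largest |residual| over the orbit:", max_residual)
```

In the estimation setting of the slide the direction is reversed: the (x,y) observations are given and the coefficients are what must be estimated from them.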

Kepler's Second Law


The radius vector from the sun to the planet sweeps out equal areas in the same time intervals


Kepler's Third Law


Planet     P       D      P^2     D^3
Mercury    0.24    0.39   0.058   0.059
Venus      0.62    0.72   0.38    0.39
Earth      1.00    1.00   1.00    1.00
Mars       1.88    1.53   3.53    3.58
Jupiter   11.90    5.31   142.0   141.00
Saturn    29.30    9.55   870.0   871.00

P = orbit period
D = orbit size (half-diameter)

For any two planets: $P^2 \sim D^3$
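The relation can also be checked empirically from the P and D columns of the table. If $P^2 \sim D^3$, then $\log P$ is linear in $\log D$ with slope 3/2, so a least-squares fit on the log-log scale should return an exponent near 1.5. This is a sketch using only the tabulated values; squares and cubes recomputed from the rounded P and D entries need not match the table's own P^2 and D^3 columns exactly. The name `loglog_slope` is an assumption for this example.

```python
import math

# (planet, P, D) transcribed from the P and D columns of the table.
PLANETS = [
    ("Mercury", 0.24, 0.39),
    ("Venus", 0.62, 0.72),
    ("Earth", 1.00, 1.00),
    ("Mars", 1.88, 1.53),
    ("Jupiter", 11.90, 5.31),
    ("Saturn", 29.30, 9.55),
]


def loglog_slope(rows):
    """Least-squares slope of log(P) against log(D)."""
    xs = [math.log(d) for _, _, d in rows]
    ys = [math.log(p) for _, p, _ in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


slope = loglog_slope(PLANETS)
print(f"fitted exponent k in P ~ D^k: {slope:.3f}")

for name, p, d in PLANETS:
    print(f"{name:8s} P^2 / D^3 = {p ** 2 / d ** 3:.3f}")
```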

Empirical Scientific Theory


Kepler's Laws can
- explain experimental data
- predict new data (i.e., other planets)
- BUT do not explain why planets move.
Popular explanation
- planets move because there are invisible angels beating their wings behind them
First principle scientific explanation
- Galileo, Newton discovered laws of motion and gravity that explain Kepler's laws.

OUTLINE of Set 1
1.1 Overview: what is this course about
1.2 Historical Perspective
1.3 Motivation for Empirical Knowledge
- human (scientific) knowledge
- growth of empirical knowledge
- the nature of human knowledge
1.4 General Experimental Procedure for Estimating Models from Data

1.3.1 Human scientific knowledge


Knowledge is a relationship between facts and ideas (mental constructs)
Two types of scientific knowledge:
- First-principles vs empirical
Classical (first principles) knowledge:
- rich in ideas
- relatively few facts (amount of data)
- simple relationships
Examples/discussion

1.3.2 Growth of empirical knowledge


Huge growth of the amount of data in the 20th century (computers and sensors)
Complex systems (engineering, life sciences and social)
Classical first-principles science is inadequate for empirical knowledge
Need for Data-Analytic Modeling: How to estimate good predictive models from noisy data

1.3.3 Nature of human knowledge


Three types of knowledge
- scientific (first-principles, deterministic)
- empirical
- metaphysical (beliefs)

Boundaries are poorly understood



More on Empirical Knowledge Demarcation:


Empirical Knowledge vs Beliefs
- Examples
Empirical vs First Principle Knowledge
- Examples

Example: Empirical Knowledge vs Beliefs


From WSJ, January 13, 2007; Page B5, By Isabelle Lindemayer
Dollar Ends Three-Day Advance But Could Be Poised for Gains

"The dollar saw an after-the-fact selloff following better-than-expected retail-sales numbers for December," Naomi Fink, currency strategist at BNP Paribas, said. "More likely than not, investors felt as though the dollar's three-day rally preceding the release had compensated adequately for the upside surprise," she said.

Empirical vs First Principle Knowledge

Compare:

First Principle Knowledge
Example: Newton's Law of Gravity
Empirical Knowledge
Example: my wife's successful betting
Question for Discussion: What is the difference between the two approaches to knowledge discovery?

Summary
First-principles knowledge (taught at school): deterministic relationships between a few concepts (variables)
Importance of empirical knowledge:
- statistical in nature
- (possibly) many variables
Goal of modeling: to act/perform well, rather than system identification

1.4 General Experimental Procedure


1. Statement of the Problem
2. Hypothesis Formulation (Learning Problem Statement)
3. Data Generation/Experiment Design
4. Data Collection and Preprocessing
5. Model Estimation (learning)
6. Model Interpretation, Model Assessment and Drawing Conclusions
Note:
- each step is complex and usually involves several iterations
- estimated model depends on all previous steps
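A toy end-to-end run of steps 3 to 6 might look like the sketch below. It is only an illustration under stated assumptions: the data are synthetic (generated from the line y = 2x + 1 plus Gaussian noise), the model is a straight line fitted by ordinary least squares, and assessment uses mean squared error on data held out from estimation. None of these choices come from the slides.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Steps 3-4: data generation and collection (synthetic, assumed true line y = 2x + 1).
data = []
for _ in range(200):
    x = rng.uniform(0.0, 1.0)
    y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)
    data.append((x, y))
train, test = data[:150], data[150:]  # hold out data for assessment


# Step 5: model estimation (closed-form least squares for one input variable).
def fit_line(points):
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Step 6: model assessment on data not used for estimation.
def mse(points, slope, intercept):
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)


slope, intercept = fit_line(train)
train_mse = mse(train, slope, intercept)
test_mse = mse(test, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"train MSE={train_mse:.4f} test MSE={test_mse:.4f}")
```

Changing any earlier step (noise level, sample size, the train/test split) changes the estimated model, which is the point of the note that the estimate depends on all previous steps.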

Honest Disclosure of Results


Recall Tycho Brahe (16th century)
Modern drug studies
Review of studies submitted to FDA
- Of 74 studies reviewed, 38 were judged to be positive by the FDA. All but one were published.
- Most of the studies found to have negative or questionable results were not published, researchers found.
(Source: The New England Journal of Medicine, WSJ Jan 17, 2008)

Publication bias: widespread in modern research

