Introduction
Darpa GC (Stanley)
RoboCup
Math Programming
[Figures from Shotton et al., PAMI 2012]
Fig. 2. Synthetic vs. real data. Pairs of depth images and corresponding color-coded ground truth body part label images. The 3D body joint positions are also known (but not shown). Note wide variety in pose, shape, clothing, and crop. The synthetic images look remarkably similar to the real images, lacking primarily just the high-frequency texture.
Fig. 3. A single body part varies widely in its context.
Fig. 4. Example base character models.
Fig. 6. Randomized Decision Forests. A forest is an ensemble of T decision trees. Each tree consists of split nodes (blue) and leaf nodes (green). The red arrows indicate the different paths that might be taken by different trees for a particular input.
Climate Modeling
• Satellites measure sea-surface
temperature at sparse locations
Ø Noisy (atmosphere & sensors)
Ø Partial coverage of ocean surface
(satellite tracks)
Ø Sometimes hidden by clouds
• Would like to infer a dense
temperature field, and track
its temporal evolution
NASA Seasonal to Interannual Prediction Project
http://ct.gsfc.nasa.gov/annual.reports/ess98/nsipp.html
Types of prediction problems
• Supervised learning
• Unsupervised learning
– No known target values
– No targets = nothing to predict?
– Reward “patterns” or “explaining features”
– Often, data mining
[Figure: movies (Braveheart, The Color Purple, Amadeus) arranged by candidate “explaining features”: serious? escapist? romance? action?]
Human Tumor Microarray Data
[Heat-map figure. Sample labels are tumor types: BREAST, RENAL, MELANOMA, COLON, NSCLC, LEUKEMIA, CNS, OVARIAN, PROSTATE, UNKNOWN, plus labels such as MCF7A-repro and K562A-repro.]
FIGURE 1.3. DNA microarray data: expression matrix of 6830 genes (rows)
and 64 samples (columns), for the human tumor data. Only a random sample
of 100 rows are shown. The display is a heat map, ranging from bright green
(negative, under expressed) to bright red (positive, over expressed). Missing values
are gray. The rows and columns are displayed in a randomly chosen order.
Social Networks & Relationships
[Figure: social network graph whose nodes are named individuals, for example Jamie_Dimon, Tim_Geithner, Larry_Summers, and Paul_Volcker]
Course Logistics
Supervised Learning
CS178 Course Staff
Ø Instructor: Prof. Erik Sudderth
Research Interests: Statistical machine learning, computer vision, AI, …
Ø Teaching Assistants and Readers: Twaha Ibrahim, Daokun Jiang,
Zhenhan Li, Debora Sujono, Tamanna Hossain, Haoyu Ma
Course Information
Canvas page: https://canvas.eee.uci.edu/courses/24330
Course information, lecture slides, homework handouts.
Gradescope page: https://www.gradescope.com/courses/109546
Homework submission and grading.
Python Notebooks
• Present demos
• Questions about coding
• Hints for homeworks
• Led by TAs on Mondays starting April 6
• Video streaming and recording information
will be announced via Canvas
Homework Assignments
Homework 1 due April 16, released early next week.
5 Programming Assignments
• We will drop lowest grade
Objective
• Learn to apply ML techniques
• Submission is a “report”
Course Logistics
Supervised Learning
Data exploration
• Machine learning is a data science
– Look at the data; get a “feel” for what might work
• Matlab
– Octave (free)
• R
– Used mainly in statistics
• C++
– For performance, not prototyping
# Histograms in Matplotlib (X is the data array, one row per example)
import numpy as np
import matplotlib.pyplot as plt
X1 = X[:,0]                    # extract first feature
Bins = np.linspace(4,8,17)     # use explicit bin locations
plt.hist( X1, bins=Bins )      # generate the plot
Scatterplots
• Illustrate the relationship between two features
# Plotting in Matplotlib
plt.plot(X[:,0], X[:,1], 'b.')   # plot data points as blue dots
Scatterplots
• For more than two features we can use a pair plot:
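A pair plot can be sketched with plain Matplotlib as below. This is a minimal illustration rather than the course's own plotting code: the helper name `pair_plot` is hypothetical, and random data stands in for the feature matrix X.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

def pair_plot(X, bins=20):
    """Hypothetical helper: histograms on the diagonal, scatterplots elsewhere."""
    n = X.shape[1]
    fig, axes = plt.subplots(n, n, figsize=(2.5 * n, 2.5 * n), squeeze=False)
    for i in range(n):
        for j in range(n):
            ax = axes[i, j]
            if i == j:
                ax.hist(X[:, i], bins=bins)        # distribution of feature i
            else:
                ax.plot(X[:, j], X[:, i], 'b.')    # feature j (horizontal) vs. feature i (vertical)
    return fig, axes

# Stand-in data (assumption): 100 examples with 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
fig, axes = pair_plot(X)
```

Each off-diagonal panel is an ordinary two-feature scatterplot, so the grid shows every pair of features at once.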
Supervised learning and targets
• Supervised learning: predict target values
• For discrete targets, often visualize with color
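colors = ['b','g','r']
for c in np.unique(Y):
    plt.plot( X[Y==c,0], X[Y==c,1], 'o',
              color=colors[int(c)] )   # one color per class value
ml.histy(X[:,1], Y, bins=20)           # per-class histogram; ml is assumed to be the course's mltools package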
Machine Learning
Course Logistics
Supervised Learning
How does machine learning work?
• Meta-programming
– Predict – apply rules to examples
– Score – get feedback on performance
– Learn – change predictor to do better
Program (“Learner”)
– Characterized by some parameters θ
– Procedure (using θ) that outputs a prediction (“predict”)
Training data (examples)
– Features
– Feedback / target values
Score performance (“cost function”)
Learning algorithm (“train”)
– Change θ to improve performance
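The predict / score / learn cycle can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not fix a model or a learning rule, so a linear predictor, a mean squared error cost, and a gradient descent update are swapped in here, and all names and the synthetic data are hypothetical.

```python
import numpy as np

def predict(theta, x):
    """Predict: apply the current rule (assumed linear) to the examples."""
    return theta[0] + theta[1] * x

def score(theta, x, y):
    """Score: mean squared error, one possible cost function."""
    return np.mean((y - predict(theta, x)) ** 2)

def learn(theta, x, y, step=0.01):
    """Learn: change theta by one gradient descent step on the cost."""
    err = predict(theta, x) - y
    grad = np.array([2 * np.mean(err), 2 * np.mean(err * x)])
    return theta - step * grad

# Synthetic training data (assumption): noisy line y = 3 + 2x
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 + 2.0 * x + rng.normal(scale=0.5, size=50)

theta = np.zeros(2)
initial_cost = score(theta, x, y)
for _ in range(5000):
    theta = learn(theta, x, y)
final_cost = score(theta, x, y)
```

The loop only ever touches the parameters through `learn`, which mirrors the diagram: the learning algorithm changes the parameters, and the program uses them to predict.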
Supervised learning
• Notation
– Features x
– Targets y
– Predictions ŷ = f(x ; θ)
– Parameters θ
Regression; Scatter plots
[Figure: scatter plot of Target y (0 to 40) against Feature x (0 to 20). Given a new feature value x(new), what is y(new)?]
• Find training datum x(i) closest to x(new)
Predict y(i)
Nearest neighbor regression
Predictor: given new features x(new), find the closest training datum x(i) and return its target y(i).
[Figure: Target y against Feature x, showing the nearest neighbor prediction]
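A nearest neighbor predictor for a single scalar feature might look like the sketch below. The function name `nn_predict` and the toy training set are assumptions made for illustration; absolute difference is used as the distance because there is only one feature.

```python
import numpy as np

def nn_predict(x_train, y_train, x_new):
    """1-nearest-neighbor regression for scalar features (illustrative sketch)."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    # Distance from every new point (rows) to every training point (columns)
    dist = np.abs(x_train[None, :] - x_new[:, None])
    nearest = np.argmin(dist, axis=1)   # index i of the closest x(i)
    return y_train[nearest]             # predict the stored target y(i)

# Toy training data (assumption)
x_train = np.array([2.0, 7.0, 12.0, 18.0])
y_train = np.array([5.0, 14.0, 22.0, 35.0])

y_hat = nn_predict(x_train, y_train, [8.0, 16.0])
```

Here 8.0 is closest to the training point at 7.0 and 16.0 is closest to the one at 18.0, so the predictions are the stored targets 14.0 and 35.0. No parameters are fit in advance; the training data itself acts as the model.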
[Figure: for one data point, the observation, the prediction, and the error or residual between them]
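The residual is the difference between an observation and its prediction. The short sketch below computes residuals on made-up numbers and summarizes them with the mean squared error, which is one common cost function and an assumption here rather than something the slide specifies.

```python
import numpy as np

y = np.array([10.0, 20.0, 30.0])       # observations (made-up values)
y_hat = np.array([12.0, 18.0, 30.0])   # predictions (made-up values)

residual = y - y_hat                   # per-example error: observation minus prediction
mse = np.mean(residual ** 2)           # mean squared error over the examples
```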
Regression vs. Classification
Regression
– Features x
– Real-valued target y
– Predict continuous function ŷ(x)
Classification
– Features x
– Discrete class c (usually 0/1 or +1/-1)
– Predict discrete function ŷ(x)
Classification
[Figures: class-labeled data plotted along feature axes X1 and X2]
Summary
• What is machine learning?
– Types of machine learning
– How machine learning works
• Supervised learning
– Training data: features x, targets y
• Regression
– (x,y) scatterplots; predictor outputs f(x)
• Classification
– (x1,x2) scatterplots
– Decision boundaries, colors & symbols