
Cost-Sensitive Active learning in VW

Akshay Krishnamurthy

akshay@cs.umass.edu

With Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, John Langford
Cost-Sensitive Learning

Multi-class prediction where different predictions incur different costs.

• Data (x, c), where c(y) is the cost of predicting label y on x.

• Training: with data (x_i, c_i), fit one regressor per label:

    g_y = \operatorname{argmin}_{g \in \mathcal{G}} \sum_i (g(x_i) - c_i(y))^2

• Prediction: on example x,

    \hat{y} = \operatorname{argmin}_y g_y(x)

  (both rules are sketched in code below)

• In VW:

    vw --csoaa k
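To make the two rules above concrete, here is a minimal batch sketch in Python (a toy linear least-squares implementation of my own; VW's --csoaa learns the same per-label regressors online):

    import numpy as np

    class CSOAA:
        """Toy cost-sensitive one-against-all: one squared-loss regressor per label."""

        def __init__(self, k):
            self.k = k          # number of labels
            self.betas = None   # one linear regressor g_y per label

        def fit(self, X, C):
            # X: (n, d) features; C: (n, k) costs, C[i, y] = c_i(y).
            # Training rule: g_y = argmin_g sum_i (g(x_i) - c_i(y))^2.
            self.betas = [np.linalg.lstsq(X, C[:, y], rcond=None)[0]
                          for y in range(self.k)]
            return self

        def predict(self, x):
            # Prediction rule: yhat = argmin_y g_y(x).
            return int(np.argmin([x @ b for b in self.betas]))

In VW, the same data is written in the cost-sensitive label format, one label:cost pair per label, e.g. `1:0 2:1.5 3:1 | features`, and trained with `vw --csoaa 3`.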

Can we do active learning here?


Yes we can!
[Figure: RCV1-v2. Test cost (0.04–0.14) vs. number of queries (2^16 to 2^26) for Passive and COAL with mellowness 1e-1, 1e-2, 1e-3.]

vw --cs_active k --mellowness 0.01 --simulation --adax


Cost Overlapped Active Learning (COAL)

On each example x_i:

1. Compute the version space G_i(y) of good regressors for each label y.
2. Compute cost ranges:

    c_-(y) = \min_{g \in G_i(y)} g(x_i), \qquad c_+(y) = \max_{g \in G_i(y)} g(x_i)

3. Query y if its cost range is large and overlaps with the best label's range (a code sketch follows below).

[Figure: cost ranges for labels y_1, ..., y_4, with c_+(y_2) and the best cost g^*(y_1) marked.]
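A minimal sketch of the query rule in step 3, assuming the cost ranges from step 2 are already in hand (the width threshold is an illustrative knob, not COAL's exact test):

    def coal_queries(cost_ranges, width_threshold):
        """Pick which labels to query on one example.

        cost_ranges: dict label -> (c_minus, c_plus), the min/max cost that
        a good regressor in the version space G_i(y) predicts for label y.
        """
        # A label can still be the best one if its lower bound beats the
        # smallest upper bound over all labels.
        best_upper = min(hi for _, hi in cost_ranges.values())
        return [y for y, (lo, hi) in cost_ranges.items()
                if hi - lo > width_threshold    # cost range is large ...
                and lo <= best_upper]           # ... and overlaps with the best

For example, with ranges {1: (0.1, 0.3), 2: (0.2, 0.9), 3: (0.8, 0.9)} and threshold 0.5, only label 2 is queried: its range is wide and its lower bound 0.2 beats the best upper bound 0.3, while label 3 can no longer be the best.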
Properties

1. Guaranteed good generalization (adapts to easy data).
2. Logarithmic label complexity in favorable cases.
3. In theory, polynomial time.
Approximate cost ranges

In practice: one pass, linear time.

1. Use online least-squares optimization, maintaining a current regressor g_o.
2. Compute the cost range with a sensitivity analysis.
3. Look for the largest importance weight w such that, after adding a new weighted example at x with cost 0 or cost 1, the loss is still close to optimal; the extra loss terms are

    w (g_o(x) - 1)^2 \quad \text{and} \quad w (g_o(x) - 0)^2

(a sketch follows below).


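A toy batch analogue of this sensitivity analysis, assuming costs in [0, 1] (the helper names and the weight grid are mine; VW's one-pass version works directly off the online learner's state):

    import numpy as np

    def cost_range(X, c, x, slack):
        """Approximate [c_-, c_+] at a new point x for one label.

        X: (n, d) features seen so far; c: (n,) observed costs for this label;
        slack: extra squared loss a regressor may pay and still count as
        "good" (the radius of the version space).
        """
        def refit(w, target):
            # Least-squares refit with one pseudo-example (x, target) of weight w.
            Xw = np.vstack([X, np.sqrt(w) * x])
            cw = np.append(c, np.sqrt(w) * target)
            return np.linalg.lstsq(Xw, cw, rcond=None)[0]

        base = refit(0.0, 0.0)                       # fit without the pseudo-example
        base_loss = np.sum((X @ base - c) ** 2)

        def endpoint(target):
            # Grow w until the refit pays more than `slack` extra loss on the
            # old data; its prediction at x traces out the range endpoint.
            pred = float(x @ base)
            for w in np.logspace(-3, 6, 40):
                beta = refit(w, target)
                if np.sum((X @ beta - c) ** 2) - base_loss > slack:
                    break
                pred = float(x @ beta)
            return pred

        return endpoint(0.0), endpoint(1.0)          # (c_-, c_+)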
Execution

./vw --cs_active 3 -d ../test/train-sets/cs_test --cost_max 2 --mellowness 0.01 --simulation --adax

Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = ../test/train-sets/cs_test
num sources = 1
average    since    example  example  current  current  current
loss       last     counter  weight   label    predict  features
1.000000   1.000000    1       1.0    known       1        4
0.500000   0.000000    2       2.0    known       2        4

finished run
number of examples per pass = 3
passes used = 1
weighted example sum = 3.000000
weighted label sum = 0.000000
average loss = 0.333333
total feature number = 12
total queries = 3
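The datafile follows the same cost-sensitive label format. I don't know the actual contents of cs_test, but a hypothetical three-class file compatible with the command above (all costs within --cost_max 2) looks like:

    1:0 2:1 3:2 | a b
    1:2 2:0 3:1 | b c
    1:1 2:2 3:0 | a c

In --simulation mode, VW replays this fully labeled data and the "total queries" line reports how many labels COAL would actually have requested.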
Experiments

[Figure: ImageNet 40. Test cost (0.02–0.12) vs. number of queries (2^12 to 2^19) for Passive and COAL with mellowness 1e-1, 1e-2, 1e-3.]

• Hierarchical classification with tree-distance cost.

• COAL reaches a lower test cost than passive learning with ≈ 4x fewer queries.
