
Computational Learning Theory

(Based on Chapter 7 of Mitchell, T., Machine Learning, 1997)

Overview

Are there general laws that govern learning? In particular:

- Sample Complexity: How many training examples are needed for a learner to converge (with high probability) to a successful hypothesis?
- Computational Complexity: How much computational effort is needed for a learner to converge (with high probability) to a successful hypothesis?
- Mistake Bound: How many training examples will the learner misclassify before converging to a successful hypothesis?

These questions will be answered within two analytical frameworks:

- The Probably Approximately Correct (PAC) framework
- The Mistake Bound framework

Overview (Cont’d)

Rather than answering these questions for individual learners, we will answer them for broad classes of learners. In particular, we will consider:

- The size or complexity of the hypothesis space considered by the learner.
- The accuracy to which the target concept must be approximated.
- The probability that the learner will output a successful hypothesis.
- The manner in which training examples are presented to the learner.

The PAC Learning Model

Definition: Consider a concept class C defined over a set of instances X of length n, and a learner L using hypothesis space H. C is PAC-learnable by L using H if for all c ∈ C, all distributions D over X, all ε such that 0 < ε < 1/2, and all δ such that 0 < δ < 1/2, learner L will, with probability at least (1 − δ), output a hypothesis h ∈ H such that errorD(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c).

Sample Complexity for Finite Hypothesis Spaces

Given any consistent learner, the number of examples sufficient to assure that any hypothesis it outputs will be probably (with probability (1 − δ)) approximately (within error ε) correct is m ≥ (1/ε)(ln|H| + ln(1/δ)).

If the learner is not consistent (the agnostic case), m ≥ (1/(2ε²))(ln|H| + ln(1/δ)).

Conjunctions of Boolean literals are also PAC-learnable, with m ≥ (1/ε)(n·ln 3 + ln(1/δ)).

k-term DNF expressions are not PAC-learnable: even though they have polynomial sample complexity, their computational complexity is not polynomial. Surprisingly, however, k-CNF (which includes k-term DNF as a special case) is PAC-learnable.
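These finite-|H| bounds are easy to evaluate numerically. A minimal sketch (the helper names are mine, not from the slides):

```python
import math

def sample_complexity_consistent(h_size, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite
    hypothesis space H: m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return math.ceil((1.0 / epsilon) * (math.log(h_size) + math.log(1.0 / delta)))

def sample_complexity_conjunctions(n, epsilon, delta):
    """Conjunctions of n Boolean literals: |H| = 3^n, so
    m >= (1/eps) * (n * ln 3 + ln(1/delta))."""
    return math.ceil((1.0 / epsilon) * (n * math.log(3) + math.log(1.0 / delta)))

# e.g. 10 Boolean attributes, 90% accuracy with 95% confidence:
print(sample_complexity_conjunctions(10, epsilon=0.1, delta=0.05))  # 140
```

Note that the conjunction bound is just the generic bound with |H| = 3^n (each attribute appears positively, negatively, or not at all), so both functions agree on that input.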

Sample Complexity for Infinite Hypothesis Spaces I: VC-Dimension

The PAC learning framework as presented has two disadvantages:

- It can lead to weak bounds.
- Its sample-complexity bound cannot be applied when the hypothesis space H is infinite.

We introduce new ideas for dealing with these problems:

Definition: A set of instances S is shattered by hypothesis space H if and only if for every dichotomy of S there exists some hypothesis in H consistent with this dichotomy.

Definition: The Vapnik–Chervonenkis dimension, VC(H), of hypothesis space H defined over instance space X is the size of the largest finite subset of X shattered by H. If arbitrarily large finite subsets of X can be shattered by H, then VC(H) = ∞.
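Shattering can be checked by brute force for small cases. The sketch below (my own illustration, not from the slides) uses the class of closed intervals [a, b] on the real line, whose VC dimension is 2: every dichotomy of two points is realizable, but no three points can be shattered, because no interval can label only the two outer points positive.

```python
from itertools import combinations

def shatters_with_intervals(points):
    """Check whether closed intervals [a, b] on the real line shatter the
    given finite point set: every subset must equal the points falling
    inside some interval."""
    points = sorted(points)
    for r in range(len(points) + 1):
        for subset in combinations(points, r):
            if not subset:
                continue  # empty labeling: pick an interval left of all points
            lo, hi = subset[0], subset[-1]
            inside = [p for p in points if lo <= p <= hi]
            if list(subset) != inside:  # subset is not a contiguous run
                return False
    return True

print(shatters_with_intervals([1, 2]))     # True: intervals shatter 2 points
print(shatters_with_intervals([1, 2, 3]))  # False: {1, 3} is not realizable
```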

Sample Complexity for Infinite Hypothesis Spaces II

Upper bound on sample complexity, using the VC dimension:

m ≥ (1/ε)(4 log2(2/δ) + 8 VC(H) log2(13/ε))

Lower bound on sample complexity, using the VC dimension: Consider any concept class C such that VC(C) ≥ 2, any learner L, and any 0 < ε < 1/8 and 0 < δ < 1/100. Then there exists a distribution D and a target concept in C such that if L observes fewer examples than

max[(1/ε) log2(1/δ), (VC(C) − 1)/(32ε)]

then with probability at least δ, L outputs a hypothesis h having errorD(h) > ε.
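Plugging in numbers shows how far apart the sufficient (upper-bound) and necessary (lower-bound) sample sizes can be. A sketch, assuming base-2 logarithms throughout (the function names are mine):

```python
import math

def vc_sample_upper(vc_h, epsilon, delta):
    """Sufficient sample size for hypothesis space of VC dimension vc_h:
    m >= (1/eps) * (4*log2(2/delta) + 8*VC(H)*log2(13/eps))."""
    return math.ceil((1.0 / epsilon)
                     * (4 * math.log2(2.0 / delta)
                        + 8 * vc_h * math.log2(13.0 / epsilon)))

def vc_sample_lower(vc_c, epsilon, delta):
    """Necessary sample size for concept class of VC dimension vc_c:
    max[(1/eps)*log2(1/delta), (VC(C) - 1)/(32*eps)]."""
    return max((1.0 / epsilon) * math.log2(1.0 / delta),
               (vc_c - 1) / (32.0 * epsilon))

# e.g. VC dimension 3, epsilon = 0.1, delta = 0.05:
print(vc_sample_upper(3, 0.1, 0.05))   # 1899 examples suffice
print(vc_sample_lower(3, 0.1, 0.05))   # about 43 examples are necessary
```

The two-orders-of-magnitude gap for these parameters illustrates the "weak bounds" complaint from the previous slide.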

VC-Dimension for Neural Networks

Let G be a layered directed acyclic graph with n input nodes and s ≥ 2 internal nodes, each having at most r inputs. Let C be a concept class over R^r of VC dimension d, corresponding to the set of functions that can be described by each of the s internal nodes. Let C_G be the G-composition of C, corresponding to the set of functions that can be represented by G. Then VC(C_G) ≤ 2ds log(es), where e is the base of the natural logarithm.

This theorem can help us bound the VC dimension of a neural network, and thus its sample complexity (see [Mitchell, p. 219]).
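As a rough numerical illustration (the numbers are mine, and I take the log in the theorem as base 2): a layered network of s = 4 threshold units, each taking r = 3 inputs, has node class C = linear threshold functions on R^3 with VC dimension d = r + 1 = 4.

```python
import math

def vc_bound_layered_net(d, s):
    """VC(C_G) <= 2*d*s*log2(e*s): bound on the VC dimension of the
    G-composition of a node class of VC dimension d over a layered DAG
    with s internal nodes (theorem requires s >= 2)."""
    assert s >= 2
    return 2 * d * s * math.log2(math.e * s)

# s = 4 threshold units, each a perceptron over r = 3 inputs (d = 4):
print(vc_bound_layered_net(4, 4))  # about 110
```

This bound can then be fed into the VC-based sample-complexity bound of the previous slide.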

The Mistake Bound Model of Learning

The Mistake Bound framework is different from the PAC framework in that it considers learners that receive a sequence of training examples and that predict, upon receiving each example, what its target value is.

The question asked in this setting is: “How many mistakes will the learner make in its predictions before it learns the target concept?” This question is significant in practical settings where learning must be done while the system is in actual use.

Optimal Mistake Bounds

Definition: Let C be an arbitrary nonempty concept class. The optimal mistake bound for C, denoted Opt(C), is the minimum over all possible learning algorithms A of M_A(C):

Opt(C) = min_{A ∈ learning algorithms} M_A(C)

For any concept class C, the optimal mistake bound is bounded as follows:

VC(C) ≤ Opt(C) ≤ log2(|C|)

A Case Study: The Weighted-Majority Algorithm

a_i denotes the ith prediction algorithm in the pool A of algorithms; w_i denotes the weight associated with a_i.

- For all i, initialize w_i <-- 1
- For each training example <x, c(x)>:
  - Initialize q0 and q1 to 0
  - For each prediction algorithm a_i:
    - If a_i(x) = 0 then q0 <-- q0 + w_i
    - If a_i(x) = 1 then q1 <-- q1 + w_i
  - If q1 > q0 then predict c(x) = 1; if q0 > q1 then predict c(x) = 0; if q0 = q1 then predict 0 or 1 at random
  - For each prediction algorithm a_i in A: if a_i(x) ≠ c(x) then w_i <-- β·w_i
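The algorithm translates directly into a few lines. A minimal sketch (predictors are arbitrary 0/1-valued functions; ties are broken toward 0 here for determinism rather than at random):

```python
def weighted_majority(predictors, examples, beta=0.5):
    """Run the Weighted-Majority algorithm over a sequence of labeled
    examples and return the number of prediction mistakes it makes.
    predictors: list of functions x -> 0/1; examples: list of (x, label)."""
    weights = [1.0] * len(predictors)
    mistakes = 0
    for x, label in examples:
        # weighted vote: q0 and q1 accumulate the weight behind each outcome
        q0 = sum(w for w, a in zip(weights, predictors) if a(x) == 0)
        q1 = sum(w for w, a in zip(weights, predictors) if a(x) == 1)
        prediction = 1 if q1 > q0 else 0  # ties broken toward 0
        if prediction != label:
            mistakes += 1
        # down-weight every predictor that was wrong on this example
        for i, a in enumerate(predictors):
            if a(x) != label:
                weights[i] *= beta
    return mistakes
```

With β = 1/2, a predictor's influence is halved after every mistake it makes, so consistently wrong predictors are voted down quickly.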

Relative Mistake Bound for the Weighted-Majority Algorithm

Let D be any sequence of training examples, let A be any set of n prediction algorithms, and let k be the minimum number of mistakes made by any algorithm in A for the training sequence D. Then the number of mistakes over D made by the Weighted-Majority algorithm using β = 1/2 is at most 2.4(k + log2 n).

This theorem can be generalized to any 0 ≤ β < 1, where the bound becomes

(k log2(1/β) + log2 n) / log2(2/(1 + β))
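The generalized bound reduces to the 2.4(k + log2 n) form at β = 1/2, since log2(2/(1 + 1/2)) = log2(4/3) ≈ 0.415 and 1/0.415 ≈ 2.41. A quick numeric check (the function name is mine):

```python
import math

def wm_mistake_bound(k, n, beta=0.5):
    """Weighted-Majority mistake bound: the best of n predictors makes
    k mistakes, and wrong predictors are down-weighted by beta."""
    return ((k * math.log2(1.0 / beta) + math.log2(n))
            / math.log2(2.0 / (1.0 + beta)))

# At beta = 1/2 the multiplier is 1/log2(4/3) ~= 2.41, matching 2.4(k + log2 n):
print(wm_mistake_bound(10, 16))  # about 33.7
```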
