© All Rights Reserved


(Q0)

(1) C

(2) B

(a) http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

____________________________________________________________________________

(Q1)

Because we are using regularisation, we want all features to be penalised proportionately. Before running logistic regression, I therefore standardise each feature by subtracting the mean of that feature and dividing by its standard deviation, and use the standardised values in the regression formulation. I also remove the 'time' field before processing.
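The standardisation step described above could look roughly like the sketch below. The function name `standardise`, the toy data, and the choice to leave constant features unscaled are assumptions made for illustration; they are not taken from the original solution.

```python
import numpy as np

def standardise(X):
    """Subtract each column's mean and divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Assumption: a constant feature (std == 0) is left unscaled to avoid dividing by zero.
    std[std == 0.0] = 1.0
    return (X - mean) / std, mean, std

# Toy data (made up): two features on very different scales.
X = np.array([[1.0, 1000.0],
              [2.0, 2000.0],
              [3.0, 3000.0]])
Z, mean, std = standardise(X)
```

After this step every column of `Z` has zero mean and unit standard deviation, so an L2 penalty treats the corresponding coefficients on a comparable scale. The returned `mean` and `std` would be reused to transform any held-out data.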

(1)

Because there are many more negative examples than positive ones, training learns parameters that classify the negative class correctly. As a result, some positive samples are misclassified as negative, and the accuracy for the negative class (99%) is much better than that for the positive class (42%).
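To make the per-class figures concrete, here is a small sketch of how accuracy can be computed separately for each class. The labels and predictions are invented for illustration (they are not the assignment's data), and `per_class_accuracy` is a hypothetical helper.

```python
def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified examples within each true class."""
    correct, total = {}, {}
    for t, p in zip(y_true, y_pred):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    return {c: correct[c] / total[c] for c in total}

# Made-up imbalanced labels: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A hypothetical classifier that finds only 2 of the 5 positives.
y_pred = [0] * 95 + [1, 1, 0, 0, 0]

acc = per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the overall accuracy is high even though most positives are missed, which is why the two classes are best reported separately.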

(2)

1. Resample the data so that both classes have comparable populations, or collect more data for the under-represented class.

2. Change the objective function to weight each class's terms by the inverse of that class's probability. Even if one class occurs far more often, its terms then carry nearly the same total weight as those of the other class, so positive and negative samples are given equal importance in training.
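The first option (resampling) can be sketched as random oversampling of the smaller class. This is only one of several possible resampling schemes; the helper name `oversample_minority`, the fixed seed, and the toy data are assumptions for illustration.

```python
import random

def oversample_minority(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw extra rows with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for xi in rows + extra:
            X_out.append(xi)
            y_out.append(label)
    return X_out, y_out

# Toy data (made up): 8 negatives and 2 positives.
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = oversample_minority(X, y)
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.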

(3)

Increasing or decreasing C does not change the accuracy for the negative class much, but it strongly affects the accuracy for the positive class (the one with fewer samples).

By Bayes' rule, P(y | data) ∝ P(data | y) P(y). The LHS is the posterior, which can be thought of as the product of the likelihood and the prior. To maximise P(y | data) there is a trade-off between the likelihood and the prior.

Thinking of the P(y) term as a regulariser, it matters little when the LHS is almost entirely decided by the likelihood P(data | y). As the negative class has a large amount of data, the likelihood of the data given that class is high and dominates the product. In contrast, the regulariser controls the MAP estimate for the class with fewer samples.

(4)

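$$M = \sum_{n} \left[ \frac{1}{p_n}\, t_n \log y(\phi_n) + \frac{1}{p_n}\, (1 - t_n) \log\bigl(1 - y(\phi_n)\bigr) \right]$$

where the sum runs over all examples and $p_n$ is the probability of the class of example $n$. Basically, we calculate the probabilities of the two classes and maximise; in this case

$$M = \sum_{n:\, t_n = 1} \frac{1}{p_1} \log y(\phi_n) + \sum_{n:\, t_n = 0} \frac{1}{p_0} \log\bigl(1 - y(\phi_n)\bigr)$$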
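The class-weighted objective in this part can be evaluated numerically as in the sketch below. It assumes a logistic model y(φ) = sigmoid(w·φ + b) and estimates each class probability from the label frequencies; the function names and the toy data are hypothetical, and both classes must be present for the weights to be defined.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def weighted_log_likelihood(w, b, X, t):
    """Sum over examples of the log-likelihood term divided by the example's class probability."""
    n = len(t)
    p1 = sum(t) / n                  # empirical probability of class 1
    p = {1: p1, 0: 1.0 - p1}         # assumes both classes occur in t
    total = 0.0
    for x, tn in zip(X, t):
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        total += (tn * math.log(y) + (1 - tn) * math.log(1.0 - y)) / p[tn]
    return total

# Toy data (made up): three negatives and one positive.
X = [[-2.0], [-1.0], [-0.5], [2.0]]
t = [0, 0, 0, 1]
M0 = weighted_log_likelihood([0.0], 0.0, X, t)
```

With all-zero parameters every prediction is 0.5, and the three negatives together contribute the same amount to `M0` as the single positive, which is the balancing effect the weighting is meant to achieve.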

(5)

I calculated the weight of each class as the inverse of its probability of occurring. Specifically, I used

- pos_weight = total / pos
- neg_weight = total / neg

logreg = linear_model.LogisticRegression(penalty='l2', C=1e-8, class_weight={1: pos_weight, 0: neg_weight})
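The inverse-probability weights used above can be computed from the label counts as in this sketch. The helper name `inverse_frequency_weights` and the toy label counts are assumptions for illustration; the resulting dictionary has the shape expected by the `class_weight` argument shown in the answer.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight of each class = total number of samples / samples in that class."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / n for c, n in counts.items()}

# Toy labels (made up): 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
class_weight = inverse_frequency_weights(labels)
# The dictionary could then be passed as class_weight=class_weight
# to linear_model.LogisticRegression, as in the answer above.
```

With these weights, the summed weight of each class is the same (here 100 for both), so neither class dominates the objective by sheer count.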

(6)

We are minimising this function,

Now, if examples of one class (for example the negative class) are more prevalent, the formulation will learn parameters that maximise the second term, that is, parameters that almost always classify the negative class correctly. We would instead modify this function as follows.

where p_n is the probability of the class of example n, calculated as (number of samples of that class) / (total number of samples).
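The two equations this part refers to did not survive in the text. A plausible reconstruction is sketched below, assuming the usual cross-entropy error for logistic regression and omitting any regularisation term. Both forms are assumptions based on the surrounding description, not the author's exact expressions.

```latex
% Assumed unweighted error (standard cross-entropy; regulariser omitted):
E(\mathbf{w}) = -\sum_{n} \Bigl[ t_n \log y(\phi_n)
              + (1 - t_n) \log\bigl(1 - y(\phi_n)\bigr) \Bigr]

% Assumed modified error, with each example's term divided by p_n:
E'(\mathbf{w}) = -\sum_{n} \frac{1}{p_n} \Bigl[ t_n \log y(\phi_n)
               + (1 - t_n) \log\bigl(1 - y(\phi_n)\bigr) \Bigr],
\qquad
p_n = \frac{\text{number of samples in the class of example } n}{\text{total number of samples}}
```

Under this reading, the "second term" is the $(1 - t_n)\log(1 - y(\phi_n))$ part, which is the only part that is non-zero for negative examples.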

____________________________________________________________________________

(Q2)

(1)

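$\epsilon \sim N(0, \lambda^2 I)$, where $\lambda$ is taken to be the noise standard deviation.

As $y_i = x_i^T \beta + \epsilon_i$,

$$P(y_i \mid x_i, \beta) = N(y_i;\, x_i^T \beta,\, \lambda^2)$$

(2)

$$\hat{\beta}_{MAP} = \arg\max_{\beta}\; P(\{y_i, x_i\}_{i=1}^{n} \mid \beta)\, P(\beta)$$

Now, with $X$ the matrix whose columns are the $x_i$,

$$\prod_{i} P(y_i \mid \beta, x_i) = \prod_{i} N(y_i;\, x_i^T \beta,\, \lambda^2) \propto \lambda^{-n} \exp\Bigl(-\frac{1}{2\lambda^2} (y - X^T \beta)^T (y - X^T \beta)\Bigr)$$

So our MAP estimate is

$$\hat{\beta}_{MAP} = \arg\max_{\beta}\; \lambda^{-n} \exp\Bigl(-\frac{1}{2\lambda^2} (y - X^T \beta)^T (y - X^T \beta)\Bigr) \cdot \sigma^{-1} \exp\Bigl(-\frac{\beta^T \beta}{2\sigma^2}\Bigr)$$

$$= \arg\max_{\beta}\; \lambda^{-n} \sigma^{-1} \exp\Bigl(-\frac{\beta^T \beta}{2\sigma^2} - \frac{1}{2\lambda^2} (y - X^T \beta)^T (y - X^T \beta)\Bigr)$$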
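Maximising the expression above is equivalent to minimising βᵀβ/(2σ²) + (y - Xᵀβ)ᵀ(y - Xᵀβ)/(2λ²), and setting the gradient to zero gives the linear system (X Xᵀ + (λ²/σ²) I) β = X y. The sketch below solves that system with NumPy. It assumes X stores the x_i as columns, that λ and σ are standard deviations, and uses synthetic data; the function name `map_estimate` is hypothetical.

```python
import numpy as np

def map_estimate(X, y, lam, sigma):
    """MAP estimate of beta for y = X^T beta + noise with a zero-mean Gaussian prior.

    X: (d, n) array whose columns are the inputs x_i; y: (n,) targets.
    lam: noise standard deviation; sigma: prior standard deviation.
    """
    d = X.shape[0]
    A = X @ X.T + (lam ** 2 / sigma ** 2) * np.eye(d)
    return np.linalg.solve(A, X @ y)

# Synthetic data (made up) to exercise the formula.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
beta_true = np.array([1.0, -2.0, 0.5])
lam, sigma = 0.1, 10.0
y = X.T @ beta_true + lam * rng.normal(size=50)

beta_map = map_estimate(X, y, lam, sigma)
# Gradient of the negative log-posterior at the solution; it should be close to zero.
grad = beta_map / sigma ** 2 - X @ (y - X.T @ beta_map) / lam ** 2
```

This is the same closed form as ridge regression with penalty strength λ²/σ², which is one way to see the Gaussian prior acting as an L2 regulariser.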

____________________________________________________________________________

(Q3)

At the decision boundary, the Euclidean distances to $\mu_+$ and $\mu_-$ are the same.

So we have

$b = \|\mu_+\|^2 - \|\mu_-\|^2$
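Expanding ||x - μ₊||² = ||x - μ₋||² and cancelling xᵀx gives 2(μ₊ - μ₋)ᵀx = ||μ₊||² - ||μ₋||². The sketch below checks this numerically, assuming the boundary is written as wᵀx = b with w = 2(μ₊ - μ₋); other sign or scaling conventions for w and b are equally valid, and the example means are made up.

```python
import numpy as np

# Made-up class means for illustration.
mu_pos = np.array([2.0, 1.0])
mu_neg = np.array([0.0, -1.0])

# Assumed convention: the boundary is w^T x = b.
w = 2.0 * (mu_pos - mu_neg)
b = mu_pos @ mu_pos - mu_neg @ mu_neg

def classify(x):
    """Return +1 if x is closer to mu_pos, otherwise -1."""
    return 1 if w @ x > b else -1

# The midpoint of the two means is equidistant from both, so it lies on the boundary.
midpoint = 0.5 * (mu_pos + mu_neg)
```

Here `w @ midpoint` equals `b`, while `classify(mu_pos)` and `classify(mu_neg)` fall on opposite sides of the boundary.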

____________________________________________________________________________
