
Supervised Learning and k-Nearest Neighbors
Business Intelligence for Managers
Supervised learning and classification

• Given: a dataset of instances with known categories
• Goal: using the “knowledge” in the dataset, classify a given instance
  • Predict the category of the given instance that is rationally consistent with the dataset
Classifiers

[Diagram: feature values X1, X2, X3, …, Xn are fed into a Classifier, which outputs a category Y; the Classifier draws on a DB, a collection of instances with known categories.]
Algorithms

• k-Nearest Neighbors (kNN)
• Naïve Bayes
• Decision trees
• Many others (support vector machines, neural networks, genetic algorithms, etc.)
k-Nearest Neighbors

• For a given instance T, get the top k dataset instances that are “nearest” to T
  • Select a reasonable distance measure
• Inspect the categories of these k instances and choose the category C that represents the most instances
• Conclude that T belongs to category C (see the sketch below)
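A minimal brute-force sketch of these steps in Python (the function names here, such as classify_knn, are our own, not part of the slides or any standard library):

```python
import math
from collections import Counter

def euclidean(a, b):
    """Square root of the sum of squared feature differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_knn(dataset, query, k=3):
    """Classify `query` by majority vote among its k nearest instances.

    `dataset` is a list of (feature_tuple, category) pairs.
    """
    # Sort all known instances by their distance to the query instance.
    neighbors = sorted(dataset, key=lambda inst: euclidean(inst[0], query))
    # Take the categories of the k nearest instances.
    top_k = [category for _, category in neighbors[:k]]
    # Plain majority vote: the most common category wins.
    return Counter(top_k).most_common(1)[0][0]
```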
Example 1

• Determining a decision on a scholarship application based on the following features:
  • Household income (annual income in millions of pesos)
  • Number of siblings in the family
  • High school grade (on a QPI scale of 1.0 – 4.0)
• Intuition (reflected in the data set): award scholarships to high performers and to those with financial need (a toy run follows)
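As a hypothetical run of the sketch above (the dataset values below are invented for illustration; they are not the slides' actual data):

```python
# Each instance: ((annual income in millions of pesos, siblings, HS QPI), decision).
scholars = [
    ((0.3, 4, 3.8), "award"),  # low income, high grades
    ((0.5, 2, 3.9), "award"),
    ((2.5, 1, 2.1), "deny"),   # high income, low grades
    ((3.0, 0, 3.0), "deny"),
]

applicant = (0.4, 3, 3.7)
print(classify_knn(scholars, applicant, k=3))  # -> "award"
```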
Distance formula

• Euclidean distance: the square root of the sum of the squares of the differences
  • For two features: √((Δx)² + (Δy)²) (worked example below)
• Intuition: similar samples should be close to each other
  • May not always apply (example: quota and actual sales)
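Worked out with two made-up two-feature points, the formula gives:

```python
import math

# The points differ by 3 in one feature and 4 in the other,
# so the distance is sqrt(3**2 + 4**2) = 5.0.
a, b = (1.0, 2.0), (4.0, 6.0)
dist = math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)
print(dist)  # 5.0
```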
Incomparable ranges

• The Euclidean distance formula carries the implicit assumption that the different dimensions are comparable
• Features that span wider ranges affect the distance value more than features with limited ranges, as the illustration below shows
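A quick numeric illustration with assumed values: when one feature is monthly income in pesos and the other is a QPI grade, the income difference dominates the distance almost entirely:

```python
import math

# (monthly income in pesos, QPI grade) -- values assumed for illustration.
a = (25_000, 3.9)
b = (24_000, 1.2)
dist = math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)
print(dist)  # ~1000.0: the 2.7-point grade gap barely registers
```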
Example revisited

• Suppose household income was instead indicated in thousands of pesos per month and that grades were given on a 70–100 scale
• Note the different results produced by the kNN algorithm on the same dataset
Non-numeric data

• Feature values are not always numbers
• Examples:
  • Boolean values: yes or no, presence or absence of an attribute
  • Categories: colors, educational attainment, gender
• How do these values factor into the computation of distance?
Dealing with non-numeric data

• Boolean values => convert to 0 or 1
  • Applies to yes-no/presence-absence attributes
• Non-binary characterizations
  • Use a natural progression when applicable; e.g., educational attainment: GS, HS, College, MS, PhD => 1, 2, 3, 4, 5
  • Assign arbitrary numbers, but be careful about distances; e.g., color: red, yellow, blue => 1, 2, 3 (see the encoding sketch below)
• What about unavailable data? (A 0 value is not always the answer.)
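One possible set of encodings for the cases above, sketched in Python (the specific codes and mappings are illustrative choices, not fixed rules):

```python
# Boolean attributes: map to 0/1.
BOOLEAN = {"no": 0, "yes": 1}

# Ordered categories: follow the natural progression.
EDUCATION = {"GS": 1, "HS": 2, "College": 3, "MS": 4, "PhD": 5}

# Unordered categories: arbitrary codes distort distances
# (red=1 vs. blue=3 looks "farther apart" than red vs. yellow=2),
# so one-hot vectors are a common alternative.
COLOR = {"red": (1, 0, 0), "yellow": (0, 1, 0), "blue": (0, 0, 1)}
```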
Preprocessing your dataset

• The dataset may need to be preprocessed to ensure more reliable data mining results
  • Conversion of non-numeric data to numeric data
  • Calibration of numeric data to reduce the effects of disparate ranges, particularly when using the Euclidean distance metric (one common calibration is sketched below)
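One common calibration choice, sketched here, is min-max scaling of each feature column to the [0, 1] range; this is one option among several:

```python
def min_max_scale(column):
    """Rescale one feature column to [0, 1] so ranges become comparable."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1  # a constant column maps to all zeros
    return [(v - lo) / span for v in column]

# Example: incomes spanning thousands collapse to the same scale as grades.
print(min_max_scale([24_000, 25_000, 30_000]))  # [0.0, 0.166..., 1.0]
```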
k-NN variations

• Value of k
  • A larger k increases confidence in the prediction
  • Note that if k is too large, the decision may be skewed
• Weighted evaluation of nearest neighbors
  • A plain majority may unfairly skew the decision
  • Revise the algorithm so that closer neighbors have greater “vote weight” (a sketch follows this list)
• Other distance measures (see the next slide)
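A sketch of the distance-weighted variant, reusing the euclidean helper from the earlier sketch; the 1/(d + eps) weighting is one common choice, not the only one:

```python
def classify_knn_weighted(dataset, query, k=3, eps=1e-9):
    """Like classify_knn, but closer neighbors get larger vote weights."""
    neighbors = sorted(dataset, key=lambda inst: euclidean(inst[0], query))[:k]
    votes = {}
    for features, category in neighbors:
        # Weight each vote by inverse distance; eps avoids division by zero.
        weight = 1.0 / (euclidean(features, query) + eps)
        votes[category] = votes.get(category, 0.0) + weight
    return max(votes, key=votes.get)
```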
Other distance measures

• City-block distance (Manhattan distance)
  • Add the absolute values of the differences
• Cosine similarity
  • Measure the angle formed by the two samples (with the origin)
• Jaccard distance
  • Determine the percentage of exact matches between the samples (not including unavailable data)
• Others (sketches of the first three follow)
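Sketches of these measures in Python, following the descriptions above (note that the "Jaccard distance" here implements the slide's matching-percentage definition):

```python
import math

def manhattan(a, b):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle the two samples form with the origin."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms  # assumes neither sample is the zero vector

def jaccard_distance(a, b):
    """1 minus the fraction of exact matches, skipping unavailable (None) values."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    matches = sum(1 for x, y in pairs if x == y)
    return 1 - matches / len(pairs)  # assumes at least one comparable pair
```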
k-NN Time Complexity

• Suppose there are m instances and n features in the dataset
• The nearest-neighbor algorithm requires computing m distances
• Each distance computation involves scanning through each feature value
• The running time is therefore proportional to m × n
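As a quick check with assumed numbers: one query against m = 10,000 instances of n = 20 features requires 10,000 distance computations, each scanning 20 feature values, i.e., on the order of 200,000 operations, hence the m × n growth.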
