Document 1

Uploaded by

Zack Zack

1. In real-world data, tuples with missing values for some attributes are a common occurrence.

Methods for handling this problem are:

- Ignore the tuple(s): appropriate when the tuple has many missing values
- Fill in the missing value manually: time-consuming and not feasible for large data sets
- Use a global constant to fill in the missing value
- Use the attribute mean for all samples belonging to the same class as the given tuple
- Use the most probable value to fill in the missing value, determined using regression, inference-based tools, or
decision tree induction.
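The class-conditional mean strategy from the list above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name `fill_with_class_mean`, the attribute names `income` and `risk`, and the sample rows are all made up for the example, and missing values are assumed to be represented as `None`.

```python
from collections import defaultdict
from statistics import mean


def fill_with_class_mean(rows, attr, label):
    """Return copies of rows where a missing (None) value of `attr`
    is replaced by the mean of `attr` over rows with the same `label`."""
    # Collect the known values of the attribute, grouped by class label.
    by_class = defaultdict(list)
    for row in rows:
        if row[attr] is not None:
            by_class[row[label]].append(row[attr])
    class_means = {c: mean(vals) for c, vals in by_class.items()}

    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input is left untouched
        if new_row[attr] is None:
            # Assumes every class has at least one known value.
            new_row[attr] = class_means[new_row[label]]
        filled.append(new_row)
    return filled


# Hypothetical data: two classes, one missing income in each.
rows = [
    {"income": 40.0, "risk": "low"},
    {"income": 60.0, "risk": "low"},
    {"income": None, "risk": "low"},
    {"income": 20.0, "risk": "high"},
    {"income": None, "risk": "high"},
]
filled = fill_with_class_mean(rows, "income", "risk")
```

Filling with a global constant or the overall attribute mean would follow the same pattern without the grouping step.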
2. Issues to consider during data aggregation: potential loss of interesting details.
3. Some methods of dimensionality reduction:
- Principal component analysis (PCA):
+ Based on the requirement that data in a higher-dimensional space be mapped to a lower-dimensional
space such that the variance of the data in the lower-dimensional space is maximal.
+ Steps: construct the covariance matrix of the data -> compute the eigenvectors of this matrix ->
project the data onto the eigenvectors with the largest eigenvalues.
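The PCA steps can be sketched without any external library for the simplest case, extracting only the first principal component. This is a rough illustration under stated assumptions: the helper names are invented, the sample data is made up, and power iteration is swapped in as one simple way to find the eigenvector with the largest eigenvalue (a full eigendecomposition routine would normally be used instead).

```python
import math


def covariance_matrix(data):
    """Return (sample covariance matrix, mean-centered data) for a list of rows."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered


def dominant_eigenvector(matrix, iterations=200):
    """Approximate the eigenvector with the largest eigenvalue by power iteration.

    Assumes the all-ones start vector is not orthogonal to that eigenvector
    and that the matrix is not all zeros.
    """
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Made-up 2-D data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov, centered = covariance_matrix(data)
pc1 = dominant_eigenvector(cov)

# Project each centered point onto the first principal component (2-D -> 1-D).
projected = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
```

Because the sample points lie on a single line, the first component points along that line and the one-dimensional projection keeps all of the variance.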
4. A 'good' subset of the original attributes can be found through the following methods:
- Heuristic methods:
+ Stepwise forward selection: the procedure starts with an empty set of attributes as the reduced set.
At each iteration, the best of the remaining original attributes is determined and added to the reduced
set.

+ Stepwise backward elimination: the procedure starts with the full set of attributes. At each step, it
removes the worst attribute remaining in the set.

+ Combination of forward selection and backward elimination: at each step, the procedure selects the
best attribute and removes the worst from among the remaining attributes.

+ Decision tree induction: constructs a flowchart-like structure where each internal node denotes a
test on an attribute, each branch corresponds to an outcome of the test, and each external (leaf) node
denotes a class prediction. At each node, the algorithm chooses the best attribute to partition the
data into individual classes. When a tree is constructed from the given data, attributes that do
not appear in the tree are assumed to be irrelevant.
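Stepwise forward selection, the first heuristic above, can be sketched as a greedy loop. This is only an illustration: the `score` callback is an assumption standing in for whatever measure of "best" is used (for example, a statistical significance test or validation accuracy), and the attribute names and usefulness values are made up.

```python
def forward_selection(attributes, score):
    """Greedy stepwise forward selection.

    `score(subset)` is a caller-supplied function returning a number
    (higher is better). Attributes are added one at a time while the
    score keeps improving.
    """
    selected = []
    remaining = list(attributes)
    best_score = score(selected)
    while remaining:
        candidate, candidate_score = None, best_score
        for attr in remaining:
            s = score(selected + [attr])
            if s > candidate_score:
                candidate, candidate_score = attr, s
        if candidate is None:
            break  # no remaining attribute improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected


# Hypothetical per-attribute usefulness, with a small penalty per attribute
# so that irrelevant attributes are not worth adding.
usefulness = {"age": 0.5, "income": 0.3, "zip": 0.0, "id": -0.1}


def toy_score(subset):
    return sum(usefulness[a] for a in subset) - 0.05 * len(subset)


selected = forward_selection(usefulness.keys(), toy_score)
```

Backward elimination is the mirror image: start from the full attribute list and repeatedly drop the attribute whose removal improves the score the most.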
5. Purpose of normalization: to scale the data of an attribute so that it falls within a smaller range.
Some methods of data normalization:
- Decimal scaling: normalizes by moving the decimal point of the values. Each value is divided by
a power of 10 using the formula: vi' = vi / 10^j
where j is the smallest integer such that max(|vi'|) < 1.

- Min-max normalization: a linear transformation is performed on the original data. The minimum and
maximum values of attribute A are found, and each value is replaced according to the formula:
v' = (v - min(A)) * (new_max(A) - new_min(A)) / (max(A) - min(A)) + new_min(A)
- Z-score normalization: values are normalized based on the mean and standard deviation of attribute A.
The formula: new entry = (old entry - mean of A) / standard deviation of A
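The three normalization formulas can be tried out with a short sketch. The function names and sample values are made up for illustration, and the z-score version assumes the population standard deviation; the sample standard deviation is an equally common choice.

```python
from statistics import mean, pstdev


def decimal_scaling(values):
    """Divide by 10^j, with j the smallest integer such that max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly map [min(A), max(A)] onto [new_min, new_max].

    Assumes the values are not all equal (otherwise this divides by zero).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) * (new_max - new_min) / (hi - lo) + new_min for v in values]


def z_score(values):
    """Subtract the mean, then divide by the (population) standard deviation.

    Assumes the standard deviation is non-zero.
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


scaled = decimal_scaling([-986, 917, 200])           # j = 3 here
ranged = min_max([200, 300, 400, 600, 1000])         # mapped onto [0, 1]
standardized = z_score([200, 300, 400, 600, 1000])   # mean 0, std 1
```

After z-score normalization the values have mean 0 and standard deviation 1, which is a quick way to check that the mean is subtracted and the standard deviation is the divisor.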
