
Decision trees are a type of supervised machine learning. They use known training data to create a process that predicts the results of that data. That process can then be used to predict the results for unknown data.

A decision tree separates data into groups based on the values of the features it is provided. At each decision gate, the data gets split into two separate branches, which might be split again. The final groupings of the data are called leaf nodes.

Decision trees can be used for regression, to get a real numeric value, or for classification, to split data into distinct categories.
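A minimal sketch of that train-then-predict workflow, assuming scikit-learn; the dataset, depth, and split are arbitrary examples:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # known training data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=3)  # stop splitting at depth 3
tree.fit(X_train, y_train)                  # learn the splits from known data

print(tree.predict(X_test[:5]))    # predicted categories for unknown data
print(tree.score(X_test, y_test))  # accuracy on held-out data
```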

PICKING THE SPLITS

Decision trees look through every possible split and pick the best split between two adjacent points in a given feature. If the feature is real-valued, the split occurs halfway between the two points.

The splits are always selected on a single feature only, not on any interaction between multiple features. That means that if there is a relationship between multiple features, for instance, if density is the critical feature but the data is provided as mass and volume, the tree will split the data as a series of splits on single features. It will not do what a human might do.
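A minimal sketch of that candidate-split search on one feature, assuming NumPy; best_split is a hypothetical helper, with candidates taken halfway between adjacent sorted points and scored by the weighted MSE of the two resulting groups:

```python
import numpy as np

def best_split(feature, target):
    """Try every split between adjacent points on a single feature."""
    order = np.argsort(feature)              # sort the data on this feature
    x, y = feature[order], target[order]
    best_threshold, best_score = None, np.inf
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue                         # equal values allow no split here
        threshold = (x[i - 1] + x[i]) / 2    # halfway between the two points
        left, right = y[:i], y[i:]
        # size-weighted MSE of the two groups created by the split
        score = (left.size * left.var() + right.size * right.var()) / y.size
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score
```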

Additional splits can occur until a stopping criterion is reached. Common stopping criteria are:

- Maximum Depth: the maximum number of splits in a row has been reached.
- Maximum Leaves: do not allow any more leaf nodes beyond this count.
- Min Samples Split: only split a node if it has at least this many samples.
- Min Samples Leaf: only split a node if both resulting children have at least this many samples.
- Min Weight Fraction: only split a node if it has at least this percentage of the total samples.

Each criterion maps onto a tree hyperparameter, as in the sketch below.
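For illustration, scikit-learn's decision trees expose these criteria as constructor parameters; the values here are arbitrary examples, not recommendations:

```python
from sklearn.tree import DecisionTreeRegressor

tree = DecisionTreeRegressor(
    max_depth=5,                    # Maximum Depth
    max_leaf_nodes=20,              # Maximum Leaves
    min_samples_split=10,           # Min Samples Split
    min_samples_leaf=5,             # Min Samples Leaf
    min_weight_fraction_leaf=0.01,  # Min Weight Fraction
)
```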

Time Complexity

For a single decision tree, the most expensive operation is picking the best splits. Picking the best split requires evaluating every possible split, and finding every possible split requires sorting all the data on every feature. That means that for M features and N points, this takes M * N lg(N) time.

For machine learning algorithms that use multiple decision trees, those sorts can be done a single time and cached.
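A minimal sketch of that caching idea, assuming NumPy; the M * N lg(N) sorting work is done once up front, and every later split search reuses the stored orderings:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))    # N = 1000 points, M = 8 features

# The M * N lg(N) step: sort every feature once and cache the orderings.
sort_idx = np.argsort(X, axis=0)  # one column of sort order per feature

# Later split searches walk a cached ordering instead of re-sorting.
feature = 3
x_sorted = X[sort_idx[:, feature], feature]
```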

Overfitting

Overfitting, which means matching the training data too closely at the expense of the test data, is a key concern for decision trees. Different stopping criteria should be evaluated with cross-validation to mitigate overfitting.
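One way to run that evaluation, sketched with scikit-learn's GridSearchCV; the data and parameter grid are arbitrary examples:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Cross-validate several stopping criteria and keep the best combination.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": [2, 4, 8, None],
                "min_samples_leaf": [1, 5, 20]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_)  # the stopping criteria that generalized best
```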

RANDOMIZATION

When a single decision tree is run, it usually looks at every point in the available data. However, some algorithms combine multiple decision trees to capture their benefits while mitigating overfitting.

Two widely used algorithms that use multiple decision trees are Random Forests and Gradient Boosted Trees. Random Forests use a large number of slightly different decision trees run in parallel and averaged to get a final result. Gradient Boosted Trees use decision trees run in series, with later trees correcting errors of previous trees.

If multiple trees are combined, it can be advantageous to have a random element in the creation of the trees, which can mitigate overfitting.

Bagging (short for bootstrap aggregating) is drawing samples from the data set with replacement. If you had 100 data points, you would randomly draw 100 points and on average get 63.2% unique points and 36.8% repeats.
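A quick simulation of that bagging statistic, assuming NumPy; the unique fraction converges toward 1 - 1/e ≈ 63.2%:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 100, 10_000

# Draw n of n points with replacement and measure the unique fraction.
unique_fraction = np.mean(
    [np.unique(rng.integers(0, n, size=n)).size / n for _ in range(trials)]
)
print(unique_fraction)  # ~0.634 for n = 100, close to 1 - 1/e
```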

Sub-sampling is drawing a smaller set than your data set; for instance, if you have 100 points, randomly draw 60 of them (typically without replacement). Random Forests tend to use bagging; Gradient Boosted Trees tend to use sub-sampling.

Additionally, the features the tree splits on can be randomized. Instead of evaluating every feature for the optimum split, a subset of the features can be evaluated (frequently the square root of the number of features).
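For illustration, both forms of randomization appear as parameters on scikit-learn's ensemble estimators; the values are arbitrary examples:

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Random Forest: bagging plus randomized features per split.
forest = RandomForestRegressor(
    n_estimators=500,     # many slightly different trees, averaged
    bootstrap=True,       # bagging: draw N points with replacement
    max_features="sqrt",  # evaluate sqrt(M) features at each split
)

# Gradient Boosted Trees: trees in series, each fit on a sub-sample.
boosted = GradientBoostingRegressor(
    n_estimators=500,     # later trees correct earlier trees' errors
    subsample=0.6,        # sub-sampling: 60% of points, without replacement
)
```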

REGRESSION DECISION TREES

Regression trees default to splitting data based on mean squared error (MSE).

To calculate MSE: take the average value of all points in each group, and for each point in that group subtract the average value from the true value, square the error, sum all of the squares, and take the average.

The chart below shows data that was split in order to generate two new groups with the lowest possible MSE.
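That recipe written out as a small helper, assuming NumPy; group_mse is a hypothetical name:

```python
import numpy as np

def group_mse(values):
    """MSE of one group: the average squared deviation from the group mean."""
    values = np.asarray(values, dtype=float)
    errors = values - values.mean()  # subtract the average from each point
    return np.mean(errors ** 2)      # square, sum, and take the average

print(group_mse([1.0, 2.0, 3.0, 6.0]))  # 3.5
```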

INFORMATION GAIN

Decision trees pick their splits in order to maximize information gain. Information is a metric that tells you how much error you probably have in your decision tree, and information gain is how much less error you have after the split than before it.

MSE by default tends to focus on the points with the highest error and split them into their own groups, since their error has a squared effect.

Mean Absolute Error (MAE) is also an option for picking splits instead of MSE. To calculate MAE: take the median (not the mean) of all points in the group, and for each point subtract the median value from the true value, take the absolute value of that error, sum the results, and take the average.
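The MAE recipe as a helper mirroring group_mse above, again assuming NumPy:

```python
import numpy as np

def group_mae(values):
    """MAE of one group: the average absolute deviation from the median."""
    values = np.asarray(values, dtype=float)
    errors = values - np.median(values)  # subtract the median, not the mean
    return np.mean(np.abs(errors))       # absolute value, sum, and average

print(group_mae([1.0, 2.0, 3.0, 6.0]))  # 1.5
```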

EXAMPLE REGRESSION

This is an example of a regression tree attempting to match a sine wave using a single split. Below is an example of a depth 2 regression tree, which gets 3 splits (i.e. 4 final groups). Those results end up being fairly similar. The sketch below reproduces both fits.
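A sketch of both examples, assuming scikit-learn and synthetic sine data; each leaf predicts the mean of its group:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, size=(200, 1)), axis=0)
y = np.sin(X).ravel()

one_split = DecisionTreeRegressor(max_depth=1).fit(X, y)  # 1 split, 2 groups
depth_two = DecisionTreeRegressor(max_depth=2).fit(X, y)  # 3 splits, 4 groups

# Each prediction is the average of the training points in its leaf.
print(np.unique(one_split.predict(X)))  # 2 distinct leaf values
print(np.unique(depth_two.predict(X)))  # 4 distinct leaf values
```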

Regression Tree Information Gain

Regression trees use either MSE or MAE as their metric. The information gain is the MSE / MAE before the split minus the MSE / MAE after the split (weighted by the number of data points the split operated on). The best MSE / MAE is zero.
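Putting those pieces together, a sketch of the gain calculation for MSE (hypothetical helper names, assuming NumPy):

```python
import numpy as np

def group_mse(values):
    values = np.asarray(values, dtype=float)
    return np.mean((values - values.mean()) ** 2)

def mse_information_gain(parent, left, right):
    """MSE before the split minus the size-weighted MSE after the split."""
    n = len(parent)
    after = (len(left) * group_mse(left) + len(right) * group_mse(right)) / n
    return group_mse(parent) - after  # larger gain means a better split

parent = np.array([1.0, 2.0, 3.0, 6.0])
print(mse_information_gain(parent, parent[:3], parent[3:]))  # 3.0
```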

Classification Information Gain

Classification trees measure their information gain by either entropy or Gini impurity. The best result for those values is also zero, and the worst result occurs when every category has an equal likelihood of being at any given node.

Gini equation: G = 1 − Σᵢ pᵢ²

where pᵢ is the probability of having a given data class in your data set.

Entropy equation: H = −Σᵢ pᵢ log₂(pᵢ)
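Both measures as small helpers, assuming NumPy; p is the vector of class probabilities at a node:

```python
import numpy as np

def gini(p):
    """Gini impurity: zero for a pure node, largest for equal likelihoods."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)

def entropy(p):
    """Entropy: zero for a pure node, largest for equal likelihoods."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                    # treat 0 * log(0) as zero
    return -np.sum(p * np.log2(p))

print(gini([1.0, 0.0]), entropy([1.0, 0.0]))  # pure node: both 0 (best)
print(gini([0.5, 0.5]), entropy([0.5, 0.5]))  # uniform: 0.5 and 1.0 (worst)
```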

FEATURE IMPORTANCES

Decision trees lend themselves to identifying which features were the most important. This is done by calculating which features resulted in the most information gain (weighted by the number of data points), across all of the splits. The information gain can be MSE / MAE for regression trees or entropy / Gini for classification trees.
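For illustration, scikit-learn exposes this gain-weighted measure on fitted trees as feature_importances_:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Each importance is the feature's total information gain across all
# splits, weighted by the data points involved and normalized to sum to 1.
for name, importance in zip(data.feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")
```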
