Deepak George
Senior Data Scientist – Machine Learning
Decision Tree Ensembles
Bagging, Random Forest & Gradient Boosting Machines
December 2015
 Education
 Computer Science Engineering – College Of Engineering Trivandrum
 Business Analytics & Intelligence – Indian Institute Of Management Bangalore
 Career
 Mu Sigma
 Accenture Analytics
 Data Science
 1st Prize Best Data Science Project (BAI 5) – IIM Bangalore
 Top 10% finish (out of 1100) in the Kaggle Coupon Purchase Prediction competition (Recommender System)
 SAS Certified Statistical Business Analyst: Regression and Modeling Credentials
 Statistical Learning – Stanford University
 Passion
 Photography, Football, Data Science, Machine Learning
 Contact
 Deepak.george14@iimb.ernet.in
 linkedin.com/in/deepakgeorge7
Copyright @ Deepak George, IIM Bangalore
2
About Me
Copyright @ Deepak George, IIM Bangalore
3
Bias-Variance Tradeoff
Expected test MSE
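In its standard decomposition (with irreducible error Var(ε)):
E[(y0 − f̂(x0))²] = Var(f̂(x0)) + [Bias(f̂(x0))]² + Var(ε)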
 Bias
 Error introduced by approximating a complicated relationship with a much simpler model.
 The difference between the truth and what you expect to learn
 Underfitting
 Variance
 The amount by which the model would change if we estimated it using different training data.
 If a model has high variance, then small changes in the training data can result in large changes in the model.
 Overfitting
Copyright @ Deepak George, IIM Bangalore
4
Bias-Variance Tradeoff
[Figure: three fits of the same data; panels labeled Underfitting, Ideal Learner, and Overfitting.]
 Problem: Decision trees have low bias but suffer from high variance
 Goal: Reduce the variance of decision trees
 Hint: Given a set of n independent observations Z1, . . . , Zn, each with variance σ², the variance of the mean of the observations is σ²/n.
 In other words, averaging a set of observations reduces variance.
 Theoretically: Take multiple independent samples S1, S2, . . . , Sn from the population
 Fit “bushy”/deep decision trees on each of S1, S2, . . . , Sn
 Trees are grown deep and are not pruned
 Variance reduces linearly & bias remains unchanged
 Practically: We only have one sample/training set, not the population.
 So take bootstrap samples, i.e. multiple samples drawn with replacement from the single sample
 Variance reduces sub-linearly & bias often increases slightly, because bootstrap samples are correlated.
 Final classifier: Average of predictions for regression, or majority vote for classification.
 The high variance introduced by deep decision trees is mitigated by averaging predictions across the trees (a code sketch follows the illustration below).
Copyright @ Deepak George, IIM Bangalore
5
Bagging
[Figure: Bagging illustrated with toy data tables. Top: a population and n independent samples S1, S2, . . . , Sn drawn from it (the theoretical setting). Bottom: a single sample and n bootstrap samples S1, S2, . . . , Sn drawn from it with replacement (the practical setting). The annotation L(h) = E(x,y)~P(x,y)[ f(h(x), y) ] is the expected loss over the population.]
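As a concrete sketch (not from the original deck), here are bagged regression trees on the Boston data via randomForest, using the fact that bagging is the special case of a random forest in which every split may consider all predictors:

library(randomForest)
library(MASS)   # Boston data
set.seed(1861)
bag <- randomForest(medv ~ ., data = Boston,
                    mtry = ncol(Boston) - 1,  # all 13 predictors at every split = bagging
                    ntree = 500)
bag   # printed summary includes the OOB estimate of the MSE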
Copyright @ Deepak George, IIM Bangalore
6
Bootstrap sampling
A bootstrap sample has the same size as the original sample.
Sampling with replacement results in repeated values.
On average a bootstrap sample uses only about 2/3 of the observations in the original sample, since each observation is left out with probability (1 − 1/n)^n ≈ 0.368.
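A quick empirical check of the 2/3 claim (an illustrative sketch):

set.seed(42)
n <- 100000
idx <- sample(n, size = n, replace = TRUE)  # one bootstrap sample of row indices
length(unique(idx)) / n                     # fraction of distinct rows used, ~0.632
(1 - 1/n)^n                                 # P(a given row is never drawn), ~0.368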
Copyright @ Deepak George, IIM Bangalore
7
Random Forest
 Problem: Bagging still has relatively high variance, because the bagged trees are correlated
 Goal: Reduce the variance of bagging
 Solution: Along with sampling the data as in bagging, sample the features too!
 In other words, when building a random forest, at each split in a tree only a random subset of the features is considered instead of all the features.
 This de-correlates the trees.
 A standard rule of thumb is that √(number of predictors) is a good approximate value for the predictor subset size (mtry/max_features).
 Evaluation: A bootstrap sample uses only approximately 2/3 of the observations of the original sample.
 The remaining ~1/3 of the training data (out-of-bag, OOB) is used to estimate error and variable importance (see the snippet below)
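A small sketch of the OOB machinery in randomForest (the mtry and ntree values here are arbitrary examples):

library(randomForest)
library(MASS)   # Boston data
set.seed(1861)
rf <- randomForest(medv ~ ., data = Boston,
                   mtry = 4,            # random feature subset considered at each split
                   ntree = 500,
                   importance = TRUE)   # enables permutation-based importance
rf$mse[500]      # OOB mean squared error after 500 trees
importance(rf)   # OOB-based variable importance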
 Hyperparameters are knobs that control the bias-variance tradeoff of any machine learning algorithm.
 Key hyperparameters
 Max features – De-correlates the trees
 Number of trees in the forest – A higher number reduces variance further
Random Forest - Key Hyperparameters
8
Copyright @ Deepak George, IIM Bangalore
Copyright @ Deepak George, IIM Bangalore
9
Random Forest – R Implementation
library(randomForest)
library(MASS)    # contains the Boston data frame
library(caret)
View(Boston)
# Cross-validation: 5-fold, repeated twice
cv.ctrl <- trainControl(method = "repeatedcv", repeats = 2, number = 5, allowParallel = TRUE)
# Grid search over the feature-subset size mtry
rf.grid <- expand.grid(mtry = 2:13)
set.seed(1861)  ## make reproducible here, but not if generating many random samples
# Hyperparameter tuning
rf_tune <- train(medv ~ .,
                 data = Boston,
                 method = "rf",
                 trControl = cv.ctrl,
                 tuneGrid = rf.grid,
                 ntree = 1000,
                 importance = TRUE)
# Cross-validation results
rf_tune
plot(rf_tune)
# Variable importance
varImp(rf_tune)
plot(varImp(rf_tune), top = 10)
Copyright @ Deepak George, IIM Bangalore
10
Boosting
 Intuition: Ensemble many “weak” classifiers (typically decision trees) to produce a final “strong” classifier
 Weak classifier → error rate only slightly better than random guessing.
 Boosting is a forward stagewise additive model
 Boosting sequentially applies the weak classifiers, one by one, to repeatedly reweighted versions of the data.
 Each new weak learner in the sequence tries to correct the misclassifications/errors made by the previous weak learners.
 Initially all of the weights are set to Wi = 1/N
 At each successive step the observation weights are individually modified and a new weak learner is fitted on the reweighted observations.
 At step m, observations that were misclassified by the classifier Gm−1(x) induced at the previous step have their weights increased, whereas the weights are decreased for those that were classified correctly.
 The final “strong” classifier is a weighted vote of the weak classifiers
AdaBoost – Illustration
11
Copyright @ Deepak George, IIM Bangalore
[Figure: toy two-class data plotted on axes X1 and X2.]
Step 1
Input Data
Initially all observations are assigned equal weight (1/N)
Observations that are misclassified in the ith iteration are given higher weights in the (i+1)th iteration
Observations that are correctly classified in the ith iteration are given lower weights in the (i+1)th iteration
12
Copyright @ Deepak George, IIM Bangalore
Step 2
Step 3
AdaBoost – Illustration
13
Copyright @ Deepak George, IIM Bangalore
Final Ensemble/Model
AdaBoost – Illustration
AdaBoost - Algorithm
14
Copyright @ Deepak George, IIM Bangalore
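In outline, the standard AdaBoost.M1 procedure (as presented in Hastie, Tibshirani & Friedman) is:

1. Initialize the observation weights Wi = 1/N, i = 1, . . . , N.
2. For m = 1 to M:
 (a) Fit a weak classifier Gm(x) to the training data using weights Wi.
 (b) Compute the weighted error errm = Σi Wi · I(yi ≠ Gm(xi)) / Σi Wi.
 (c) Compute αm = log((1 − errm) / errm).
 (d) Set Wi ← Wi · exp(αm · I(yi ≠ Gm(xi))) and renormalize.
3. Output the final classifier G(x) = sign(Σm αm Gm(x)).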
 Generalizing AdaBoost to work with arbitrary loss functions resulted in GBM.
Gradient Boosting = Gradient Descent + Boosting
 GBM uses a gradient-descent procedure that can optimize any differentiable loss function.
 In AdaBoost, “shortcomings” are identified by high-weight data points.
 In Gradient Boosting, “shortcomings” are identified by negative gradients (also called pseudo-residuals).
 In GBM, instead of the reweighting used in AdaBoost, each new tree is fit to the negative gradient of the loss at the current model’s predictions (a minimal sketch follows below).
 Each tree in GBM is thus a successive gradient-descent step.
Gradient Boosting Machines
15
Copyright @ Deepak George, IIM Bangalore
 AdaBoost is equivalent to forward stagewise additive modeling using the
exponential loss function.
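To make “fit each new tree to the negative gradient” concrete, here is a minimal hand-rolled sketch for squared-error loss, where the negative gradient is simply the ordinary residual y − F(x); rpart and the Boston data are used purely for illustration:

library(rpart)
library(MASS)   # Boston data
set.seed(1)
F_hat <- rep(mean(Boston$medv), nrow(Boston))   # f0: best constant prediction
eta <- 0.1                                      # learning rate (shrinkage)
for (m in 1:100) {
  d <- Boston
  d$r <- Boston$medv - F_hat                    # pseudo-residuals = negative gradient
  tree <- rpart(r ~ . - medv, data = d, maxdepth = 3)  # small tree fit to the residuals
  F_hat <- F_hat + eta * predict(tree, d)       # one shrunken gradient-descent step
}
mean((Boston$medv - F_hat)^2)                   # training MSE of the boosted model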
Gradient Boosting - Algorithm
16
Copyright @ Deepak George, IIM Bangalore
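In outline, Friedman’s gradient tree boosting algorithm (as presented in Hastie, Tibshirani & Friedman) is:

1. Initialize f0(x) = argmin_γ Σi L(yi, γ).
2. For m = 1 to M:
 (a) Compute pseudo-residuals rim = −[∂L(yi, f(xi)) / ∂f(xi)] evaluated at f = fm−1.
 (b) Fit a regression tree to the rim, giving terminal regions Rjm, j = 1, . . . , Jm.
 (c) For each region compute γjm = argmin_γ Σxi∈Rjm L(yi, fm−1(xi) + γ).
 (d) Update fm(x) = fm−1(x) + Σj γjm I(x ∈ Rjm), optionally shrunk by a learning rate ν.
3. Output f̂(x) = fM(x).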
 GBM has 3 types of hyperparameters
 Tree structure
 Max depth of the trees – Controls the degree of feature interactions
 Min samples per leaf – Minimum number of samples in a leaf node.
 Number of trees
 Shrinkage
 Learning rate – Slows learning by shrinking tree predictions.
 Unlike fitting a single large decision tree to the data, which amounts to fitting the data hard and potentially overfitting, the boosting approach instead learns slowly
 Stochastic Gradient Boosting
 Subsample: Fit each tree on a random subset of the training set rather than on the complete training data.
 Max features: Select a random subset of features for each tree (see the gbm mapping below).
GBM – Key Hyperparameters
17
Copyright @ Deepak George, IIM Bangalore
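As an illustrative mapping of these knobs onto the gbm package’s arguments (the values here are arbitrary examples, not tuned settings):

library(gbm)
library(MASS)   # Boston data
set.seed(1)
fit <- gbm(medv ~ ., data = Boston, distribution = "gaussian",
           n.trees = 1000,           # number of trees
           interaction.depth = 4,    # tree depth: degree of feature interactions
           n.minobsinnode = 10,      # minimum samples in a leaf node
           shrinkage = 0.01,         # learning rate
           bag.fraction = 0.5)       # stochastic GBM: subsample fraction per tree
gbm.perf(fit, method = "OOB")        # suggests the number of trees to keep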
Copyright @ Deepak George, IIM Bangalore
18
Tree Ensembles – Interpretation
library(xgboost)
library(MASS)    # contains the Boston data frame
library(caret)
# Cross-validation: 5-fold, repeated twice
cv.ctrl <- trainControl(method = "repeatedcv", repeats = 2, number = 5, allowParallel = TRUE)
# Grid search over learning rate and tree depth
xgb.grid <- expand.grid(nrounds = 1000, eta = c(0.005, 0.01, 0.05, 0.1), max_depth = c(4, 5, 6, 7, 8))
set.seed(1860)
# Model training
xgb_tune <- train(medv ~ .,
                  data = Boston,
                  method = "xgbTree",
                  trControl = cv.ctrl,
                  tuneGrid = xgb.grid,
                  importance = TRUE,
                  subsample = 0.8)
# Cross-validation results
xgb_tune
plot(xgb_tune)
# Variable importance
plot(varImp(xgb_tune), top = 10)
Copyright @ Deepak George, IIM Bangalore
19
GBM – R Implementation
Copyright @ Deepak George, IIM Bangalore
20
End
Questions?