This Random Forest Algorithm Presentation will explain how Random Forest algorithm works in Machine Learning. By the end of this video, you will be able to understand what is Machine Learning, what is classification problem, applications of Random Forest, why we need Random Forest, how it works with simple examples and how to implement Random Forest algorithm in Python.
Below are the topics covered in this Machine Learning Presentation:
1. What is Machine Learning?
2. Applications of Random Forest
3. What is Classification?
4. Why Random Forest?
5. Random Forest and Decision Tree
6. Comparing Random Forest and Regression
7. Use case - Iris Flower Analysis
- - - - - - - -
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars.This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning.
- - - - - - -
Why learn Machine Learning?
Machine Learning is taking over the world- and with that, there is a growing need among companies for professionals to know the ins and outs of Machine Learning
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
- - - - - -
What skills will you learn from this Machine Learning course?
By the end of this Machine Learning course, you will be able to:
1. Master the concepts of supervised, unsupervised and reinforcement learning concepts and modeling.
2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project.
3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning.
4. Understand the concepts and operation of support vector machines, kernel SVM, naive Bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more.
5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems
- - - - - - -
Random Forest Algorithm - Random Forest Explained | Random Forest In Machine Learning | Simplilearn
1. What’s in it for you?
What is Machine Learning
Applications of Random Forest
What is Classification
Why Random Forest
What is Random Forest
Random Forest and Decision Tree
Use Case
How does Random Forest work
2. Remote
Sensing
Used in ETM devices to
acquire images of the
earth’s surface.
Accuracy is higher and
training time is less
Object Detection
Multiclass object
detection is done
using Random
Forest algorithms
Provides better
detection in
complicated
environments
Kinect
Random Forest is
used in a game
console called
Kinect
Tracks body
movements and
recreates it in the
game
Application of Random Forest
3. Remote
Sensing
Used in ETM devices to
acquire images of the
earth’s surface.
Accuracy is higher and
training time is less
Object Detection
Multiclass object
detection is done
using Random
Forest algorithms
Provides better
detection in
complicated
environments
Kinect
Random Forest is
used in a game
console called
Kinect
Tracks body
movements and
recreates it in the
game
Application of Random Forest
4. Kinect
Random Forest is
used in a game
console called
Kinect
Tracks body
movements and
recreates it in the
game
Application of Random Forest
5. Application of Random Forest
User performs a step Kinect registers the movement Marks the user based on accuracy
6. Application of Random Forest
User performs a step Kinect registers the movement Marks the user based on accuracy
Training set to identify body parts Random forest classifier
learns
Identifies the body parts
while dancing
Score game avatar based
on accuracy
7. What’s in it for you?
What is Machine Learning?
Applications of Random Forest
What is Classification?
Why Random Forest?
What is Random Forest?
Random Forest and Decision Tree
Comparing Random Forest and Regression
Use Case – Iris Flower Analysis
22. Why Random Forest?
No overfitting
Use of multiple trees
reduce the risk of
overfitting
Training time is less
Estimates missing data
Random Forest
can maintain
accuracy when a
large proportion of
data is missing
High accuracy
For large data, it
produces highly
accurate predictions
Runs efficiently on large
database
24. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
What is Random Forest?
25. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Output 1
What is Random Forest?
26. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Output 1
Output 2
What is Random Forest?
27. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Decision Tree 3
Output 1
Output 2
Output 3
What is Random Forest?
28. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Decision Tree 3
Random Forest
Output 1
Output 2
Output 3
What is Random Forest?
29. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Decision Tree 3
Random Forest
Output 1
Output 2
Output 3
What is Random Forest?
30. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Decision Tree 3
Random Forest
Output 1
Output 2
Output 3
Majority
What is Random Forest?
31. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Decision Tree 3
Random Forest
Output 1
Output 2
Output 3
Majority
Final Decision
What is Random Forest?
32. Random forest or Random Decision Forest is a method that operates by constructing multiple Decision Trees during training
phase.
The Decision of the majority of the trees is chosen by the random forest as the final decision
Decision Tree 1
Decision Tree 2
Decision Tree 3
Random Forest
Output 1
Output 2
Output 3
Majority
Final Decision
What is Random Forest?
34. Decision Tree
Decision Tree is a tree shaped diagram used to determine a course of action. Each branch of the tree represents a
possible decision, occurrence or reaction
Is diameter>=3
False True
Is color Orange?
False True
35. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
Entropy Information gain Leaf Node Decision
Node
Root Node
Entropy is the
measure of
randomness or
unpredictability in
the dataset
Decision Tree- Important Terms
36. Entropy
Entropy is the
measure of
randomness or
unpredictability in
the dataset
High entropy E1
Decision Tree - Important Terms
37. Entropy
Entropy is the
measure of
randomness or
unpredictability in
the dataset
Initial Dataset
Decision split
Set 1 Set 2
High entropy E1
Decision Tree - Important Terms
38. Entropy
Entropy is the
measure of
randomness or
unpredictability in
the dataset
Initial Dataset
Decision split
Set 1 Set 2
High entropy E1
Decision Tree - Important Terms
39. Entropy
Entropy is the
measure of
randomness or
unpredictability in
the dataset
Initial Dataset
Decision Split
Set 1 Set 2
High entropy
Lower entropy
After
Splitting
E1
E2
Decision Tree - Important Terms
40. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
Entropy Information gain Leaf Node Decision
Node
Root Node
It is the measure of
decrease in entropy
after the dataset is
split
Decision Tree - Important Terms
41. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
EntropyInformation gain
It is the measure of
decrease in entropy
after the dataset is
split
High entropy
Lower entropy
After
Splitting
E1
E2
Initial Dataset
Decision Split
Set 1 Set 2
E1
E2 E2
Decision Tree - Important Terms
42. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
EntropyInformation gain
It is the measure of
decrease in entropy
after the dataset is
split
High entropy
Lower entropy
After
splitting
E1
E2
Information gain= E1-E2
Initial Dataset
Decision Split
Set 1 Set 2
E1
E2 E2
Decision Tree - Important Terms
43. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
Entropy Information gain Leaf Node Decision
Node
Root Node
Leaf node
carries the
classification or
the decision
Decision Tree - Important Terms
44. EntropyLeaf Node
Leaf node
carries the
classification or
the decision
Initial Dataset
Decision Split
Set 1 Set 2
E1
E2 E2
Leaf Node
Decision Tree - Important Terms
45. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
Entropy Information gain Leaf Node Decision
Node
Root Node
Decision node
has two or more
branches
Decision Tree - Important Terms
47. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
Entropy Information gain Leaf Node Decision
Node
Root Node
The top most
Decision node is
known as the
Root node
Decision Tree - Important Terms
48. Decision Tree is a Graph that uses branching method to illustrate every possible outcome of a decision
EntropyRoot Node
The top most
Decision node is
known as the
Root node
Initial Dataset
Decision Split
Set 1 Set 2
E1
E2 E2
Root Node
Decision Tree - Important Terms
52. Problem statement
To classify the different types of fruits in
the bowl based on different features
How does a Decision Tree work?
53. Problem statement
To classify the different types of fruits in
the bowl based on different features
The dataset(bowl) is looking quite messy
and the entropy is high in this case
How does a Decision Tree work?
54. Problem statement
To classify the different types of fruits in
the bowl based on different features
The dataset(bowl) is looking quite messy
and the entropy is high in this case
Training Dataset
Color Diameter Label
Red
Yellow
Purple
Red
Yellow
3
1
1
3
Purple
3
3
Apple
Apple
Lemon
Lemon
Grapes
Grapes
How does a Decision Tree work?
55. How to split the data
We have to frame the conditions that split
the data in such a way that the
information gain is the highest
How does a Decision Tree work?
56. How to split the data
We have to frame the conditions that split
the data in such a way that the
information gain is the highest
Note
Gain is the measure of
decrease in entropy after
splitting
How does a Decision Tree work?
57. Now we will try to choose a
condition that gives us the
highest gain
How does a Decision Tree work?
58. Now we will try to choose a
condition that gives us the
highest gain
We will do that by splitting
the data using each condition
and checking the gain that
we get out them.
How does a Decision Tree work?
59. We will do that by splitting
the data using each condition
and checking the gain that
we get out them.
The condition that gives us
the highest gain will be used
to make the first split
How does a Decision Tree work?
60. Conditions
Training Dataset
Color Diameter Label
Red
Yellow
purple
Red
Yellow
3
1
1
3
purple
3
3
Apple
Apple
Lemon
Lemon
Grapes
Grapes
Color== purple?
Diameter=3
Color==Yellow?
Color== Red?
Diameter=1
How does a Decision Tree work?
61. Conditions
Training Dataset
Color Diameter Label
Red
Yellow
purple
Red
Yellow
3
1
1
3
purple
3
3
Apple
Apple
Lemon
Lemon
Grapes
Grapes
Color== purple?
Diameter=3
Color==Yellow?
Color== Red?
Diameter=1
Let’s say this condition gives us the
maximum gain
How does a Decision Tree work?
Color==Yellow?
62. We split the data
Is diameter >=3?
False True
How does a Decision Tree work?
63. Is diameter >=3?
False True
The entropy after splitting has
decreased considerably
How does a Decision Tree work?
64. Is diameter >=3?
False True
The entropy after splitting has
decreased considerablyThis NODE has already
attained an entropy value
of zero
As you can see, there is
only one kind of label left
for this branch
How does a Decision Tree work?
65. Is diameter >=3?
False True
So no further splitting is
required for this NODE
How does a Decision Tree work?
66. Is diameter >=3?
False True
So no further splitting is
required for this NODE
However, this node still
requires a split to decrease
the entropy further
How does a Decision Tree work?
67. Is diameter>=3
False True
Is color yellow?
False True
So, we split the right
node further based on
the color
How does a Decision Tree work?
68. Is diameter>=3
False True
Is color yellow?
False True
So the entropy in this
case is now zero
How does a Decision Tree work?
69. Is diameter>=3
False True
Is color yellow?
False True
Predict lemon with
100% accuracy
How does a Decision Tree work?
70. Is diameter>=3
False True
Is color yellow?
False True
Predict apple with
100% accuracy
How does a Decision Tree work?
71. Is diameter>=3
False True
Is color yellow?
False True
Predict grapes
with 100%
accuracy
How does a Decision Tree work?
74. Let this beTree 2
Is color=red
FalseTrue
Is shape==circle?
False true
How does a Random Forest work?
75. Let this beTree 3
Is diameter=1
FalseTrue
Grows in summer?
False True
How does a Random Forest work?
76. Is diameter>=3
False True
Is color Orange?
False True
Is color=red
FalseTrue
Is shape==circle?
False true
Is diameter=1
FalseTrue
Grows in summer?
False True
tree 1 tree 2 tree 3
How does a Random Forest work?
77. Now Lets try to classify this
fruit
How does a Random Forest work?
78. Is diameter>=3
False True
Is color Orange?
False True
Tree 1 classifies it as an
orange
How does a Random Forest work?
Diameter = 3
Colour = orange
Grows in summer = yes
SHAPE = CIRCLE
79. Tree 2 classifies it as
cherries
Is color=red
FalseTrue
Is shape==circle?
False True
How does a Random Forest work?
Diameter = 3
Colour = orange
Grows in summer = yes
SHAPE = CIRCLE
80. Tree 3 classifies it as
orange
Is diameter=1
FalseTrue
Grows in summer?
False True
How does a Random Forest work?
Diameter = 3
Colour = orange
Grows in summer = yes
SHAPE = CIRCLE
81. Is diameter>=3
False True
Is color Orange?
False True
Is color=red
FalseTrue
Is shape==circle?
False True
Is diameter=1
FalseTrue
Grows in summer?
False True
Tree 1 Tree 2 Tree 3
How does a Random Forest work?
82. How does a Random Forest work?
orange
Is diameter>=3
False True
Is color Orange?
False True
Is color=red
FalseTrue
Is shape==circle?
False True
Is diameter=1
FalseTrue
Grows in summer?
False True
Tree 1 Tree 2 Tree 3
83. How does a Random Forest work?
cherryorange
Is diameter>=3
False True
Is color Orange?
False True
Is color=red
FalseTrue
Is shape==circle?
False True
Is diameter=1
FalseTrue
Grows in summer?
False True
Tree 1 Tree 2 Tree 3
84. How does a Random Forest work?
orangecherryorange
Is diameter>=3
False True
Is color Orange?
False True
Is color=red
FalseTrue
Is shape==circle?
False True
Is diameter=1
FalseTrue
Grows in summer?
False True
Tree 1 Tree 2 Tree 3
85. How does a Random Forest work?
orangecherryorange
Is diameter>=3
False True
Is color Orange?
False True
Is color=red
FalseTrue
Is shape==circle?
False True
Is diameter=1
FalseTrue
Grows in summer?
False True
Tree 1 Tree 2 Tree 3
90. Wonder what species of Iris do these
flowers belong to?
Use Case - Problem Statement
91. Let’s try to predict the species of the
flowers using machine learning in Python
Use Case - Problem Statement
92. Let’s See how it can be done
Use Case - Implementation
93. # Loading the library with the iris dataset
from sklearn.datasets import load_iris
# Loading scikit's random forest classifier library
from sklearn.ensemble import RandomForestClassifier
# Loading pandas
import pandas as pd
# Loading numpy
import numpy as np
# Setting random seed
np.random.seed(0)
Use Case - Implementation
94. # Creating an object called iris with the iris data
iris = load_iris()
# Creating a dataframe with the four feature variables
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Viewing the top 5 rows
df.head()
Use Case - Implementation
95. # Adding a new column for the species name
df['species'] = pd.Categorical.from_codes(iris.target,
iris.target_names)
# Viewing the top 5 rows
df.head()
Use Case - Implementation
96. # Creating Test and Train Data
df['is_train'] = np.random.uniform(0, 1, len(df)) <= .75
# View the top 5 rows
df.head()
Use Case - Implementation
97. # Creating dataframes with test rows and training rows
train, test = df[df['is_train']==True], df[df['is_train']==False]
# Show the number of observations for the test and training dataframes
print('Number of observations in the training data:', len(train))
print('Number of observations in the test data:',len(test))
Use Case - Implementation
98. # Create a list of the feature column's names
features = df.columns[:4]
# View features
features
Use Case - Implementation
99. # Converting each species name into digits
y = pd.factorize(train['species'])[0]
# Viewing target
y
Use Case - Implementation
100. # Creating a random forest Classifier.
clf = RandomForestClassifier(n_jobs=2, random_state=0)
# Training the classifier
clf.fit(train[features], y)
Use Case - Implementation
101. # Applying the trained Classifier to the test
clf.predict(test[features])
Use Case - Implementation
102. # Viewing the predicted probabilities of the first 10
observations
clf.predict_proba(test[features])[0:10]
Use Case - Implementation
103. # mapping names for the plants for each predicted plant class
preds = iris.target_names[clf.predict(test[features])]
# View the PREDICTED species for the first five
observations
preds[0:5]
Use Case - Implementation
104. # Viewing the ACTUAL species for the first five observations
test['species'].head()
Use Case - Implementation
105. # Creating confusion matrix
pd.crosstab(test['species'], preds, rownames=['Actual Species'],
colnames=['Predicted Species'])
Use Case - Implementation
106. Use Case - Implementation
Total number of predictions = 32
107. Use Case - Implementation
Number of accurate predictions = 30
108. Use Case - Implementation
Number of inaccurate predictions = 2
109. Use Case - Implementation
ModelAccuracy
30
32
X 100 = 93
110. Use Case - Implementation
So the model accuracy
is 93%
ModelAccuracy
30
32
X 100 = 93