Getting Started with Machine Learning
1. Terry Taewoong Um (terry.t.um@gmail.com)
University of Waterloo
Department of Electrical & Computer Engineering
Terry Taewoong Um
INTRODUCTION TO
MACHINE LEARNING
AND DEEP LEARNING
T-robotics.blogspot.com
Facebook.com/TRobotics
2. Terry Taewoong Um (terry.t.um@gmail.com)
CAUTION
• I cannot explain everything
• You cannot get every detail
• Try to get the big picture
• Get some useful keywords
• Connect it with your research
3. Terry Taewoong Um (terry.t.um@gmail.com)
CONTENTS
1. What is Machine Learning?
2. What is Deep Learning?
4. Terry Taewoong Um (terry.t.um@gmail.com)
CONTENTS
1. What is Machine Learning?
5. Terry Taewoong Um (terry.t.um@gmail.com)
WHAT IS MACHINE LEARNING?
"A computer program is said to learn from experience E
with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured
by P, improves with experience E" – T. Mitchell (1997)
Example: A program for soccer tactics
T : Win the game
P : Goals
E : (x) Players’ movements, (y) Evaluation
6. Terry Taewoong Um (terry.t.um@gmail.com)
WHAT IS MACHINE LEARNING?
“Toward learning robot table tennis”, J. Peters et al. (2012)
https://youtu.be/SH3bADiB7uQ
7. Terry Taewoong Um (terry.t.um@gmail.com)
TASKS
classification
discrete target values
x : pixels (28×28)
y : 0, 1, 2, 3, …, 9
regression
real target values
x ∈ (0, 100)
y : real value
clustering
no target values
x ∈ (-3, 3) × (-3, 3)
8. Terry Taewoong Um (terry.t.um@gmail.com)
PERFORMANCE
classification : 0-1 loss function
regression : L2 loss function
clustering
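The two loss functions named on this slide can be written out as a minimal sketch. The function names and the sample labels and predictions are illustrative assumptions, and both losses are averaged over the samples, which is one common convention rather than something the slide specifies.

```python
def zero_one_loss(y_true, y_pred):
    """Average 0-1 loss: the fraction of predictions that differ from the label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t != p) / len(y_true)


def l2_loss(y_true, y_pred):
    """Average L2 (squared-error) loss between real-valued targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: hypothetical digit labels and predictions (one of four is wrong).
digit_labels = [3, 7, 1, 9]
digit_preds = [3, 1, 1, 9]
print(zero_one_loss(digit_labels, digit_preds))  # 0.25

# Regression: hypothetical real-valued targets and predictions.
targets = [80.0, 65.0, 72.0]
preds = [78.0, 66.0, 72.0]
print(l2_loss(targets, preds))
```

The 0-1 loss only counts mistakes, while the L2 loss also measures how far each prediction is from its target, which is why it suits real-valued outputs.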
9. Terry Taewoong Um (terry.t.um@gmail.com)
EXPERIENCE
classification : labeled data, (pixels) → (number)
regression : labeled data, (x) → (y)
clustering : unlabeled data, (x1, x2)
10. Terry Taewoong Um (terry.t.um@gmail.com)
A TOY EXAMPLE
[Figure: Weight (kg) [Output Y] versus Height (cm) [Input X], with an unknown value marked "?"]
11. Terry Taewoong Um (terry.t.um@gmail.com)
A TOY EXAMPLE
[Figure: Weight (kg) versus Height (cm) with the line Y = aX + b; marked values: 180 cm, 80 kg]
Model : Y = aX + b    Parameter : (a, b)
[Goal] Find (a, b) which best fits the given data
12. Terry Taewoong Um (terry.t.um@gmail.com)
A TOY EXAMPLE
[Analytic Solution]
Least squares problem
(from AX = b, X = A#b, where A# is the pseudo-inverse of A)
Not always available
[Numerical Solution]
1. Set a cost function
2. Apply an optimization method (e.g. Gradient Descent (GD) method)
[Figure: cost L as a function of (a, b)]
http://www.yaldex.com/game-development/1592730043_ch18lev1sec4.html
Local minima problem
http://mnemstudio.org/neural-networks-multilayer-perceptron-design.htm
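Both routes to (a, b) can be compared on a small example. This is a minimal sketch, not the presenter's code: the data are made up, the heights are expressed as offsets from 170 cm in units of 10 cm (an assumption that keeps gradient descent well conditioned), and the cost is assumed to be the mean squared error. For a line fit with distinct X values, the closed form below gives the same answer as the pseudo-inverse.

```python
def fit_line_closed_form(xs, ys):
    """Least-squares (a, b) for Y = aX + b, from the normal equations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def fit_line_gradient_descent(xs, ys, lr=0.05, steps=500):
    """Minimize the mean squared error L(a, b) by plain gradient descent."""
    n = len(xs)
    a, b = 0.0, 0.0
    for _ in range(steps):
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        grad_a = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical data: height offsets (150 cm .. 190 cm) and weights in kg.
height_offsets = [-2.0, -1.0, 0.0, 1.0, 2.0]
weights = [52.0, 61.0, 68.0, 77.0, 87.0]

a_exact, b_exact = fit_line_closed_form(height_offsets, weights)
a_gd, b_gd = fit_line_gradient_descent(height_offsets, weights)
print(a_exact, b_exact)
print(a_gd, b_gd)
```

This cost is convex, so gradient descent reaches the same minimum as the analytic solution. The local minima problem arises for non-convex costs, where the result depends on the starting point.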
13. Terry Taewoong Um (terry.t.um@gmail.com)
WHAT WOULD BE THE CORRECT MODEL?
[Figure: Running record (min) versus Age (year); marked values: 32 years, 140 min]
Select a model → Set a cost function → Optimization
14. Terry Taewoong Um (terry.t.um@gmail.com)
WHAT WOULD BE THE CORRECT MODEL?
[Figure: Y versus X with an unknown value marked "?", labeled “overfitting”]
1. Regularization 2. Nonparametric model
15. Terry Taewoong Um (terry.t.um@gmail.com)
L2 REGULARIZATION
(e.g. w = (a, b) where Y = aX + b)
Avoid a complicated model!
• Another interpretation: Maximum a Posteriori (MAP)
http://goo.gl/6GE2ix
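The effect of an L2 penalty on w = (a, b) can be sketched as follows. The cost sum((aX + b - Y)^2) + lam * (a^2 + b^2) is an assumed form, since the slide's own formula did not survive extraction; `lam` and the data are illustrative. Under the MAP reading, this penalty corresponds to a zero-mean Gaussian prior on w.

```python
def fit_line_ridge(xs, ys, lam):
    """Minimize sum((a*x + b - y)^2) + lam * (a^2 + b^2) in closed form."""
    # Entries of A^T A + lam * I, where each row of A is (x, 1).
    sxx = sum(x * x for x in xs) + lam
    sx = sum(xs)
    n_reg = len(xs) + lam
    # Entries of A^T y.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    # Solve the 2x2 linear system with Cramer's rule.
    det = sxx * n_reg - sx * sx
    a = (sxy * n_reg - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Hypothetical noisy samples of a line with slope close to 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.2, 2.1, 3.9]

a_plain, b_plain = fit_line_ridge(xs, ys, lam=0.0)
a_ridge, b_ridge = fit_line_ridge(xs, ys, lam=10.0)
print(a_plain, b_plain)  # ordinary least squares
print(a_ridge, b_ridge)  # parameters shrunk toward zero
```

A larger `lam` pulls the parameters toward zero, which is the sense in which regularization avoids a complicated model. This sketch penalizes the intercept b as well, following the slide's w = (a, b); many implementations leave the intercept unpenalized.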
16. Terry Taewoong Um (terry.t.um@gmail.com)
WHAT WOULD BE THE CORRECT MODEL?
1. Regularization 2. Nonparametric model
[Figure: training error and test error versus training time; we should stop where the test error starts to rise]
• training set : for training (parameter optimization)
• validation set : for early stopping (avoid overfitting)
• test set : for evaluation (measure the performance)
Keep watching the validation error
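Watching the validation error can be turned into a simple stopping rule. The sketch below is one common variant, not something the slide prescribes: the `patience` parameter and the list of validation errors are assumptions for illustration.

```python
def early_stopping_epoch(validation_errors, patience=2):
    """Return the index of the epoch whose parameters should be kept.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs.
    """
    best_epoch = 0
    best_error = float("inf")
    epochs_without_improvement = 0
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch


# Hypothetical validation errors recorded after each training epoch.
validation_errors = [0.90, 0.62, 0.48, 0.41, 0.43, 0.47, 0.55]
print(early_stopping_epoch(validation_errors))  # 3
```

The test set plays no part in this decision. It is used only once, after training, to measure the final performance.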
17. Terry Taewoong Um (terry.t.um@gmail.com)
NONPARAMETRIC MODEL
• It does not assume any parametric model (e.g. Y = aX + b, Y = aX² + bX + c, etc.)
• It often requires many more samples
• Kernel methods are frequently applied for modeling the data
• Gaussian Process Regression (GPR), a kind of kernel method, is a widely used nonparametric regression method
• Support Vector Machine (SVM), also a kind of kernel method, is a widely used nonparametric classification method
[Figure: a kernel function maps the input space to the feature space]
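The link between a kernel function and a feature space can be checked numerically. As an illustrative choice (the slide does not name a specific kernel), the sketch uses the degree-2 polynomial kernel k(x, z) = (x · z)² on 2-D inputs, whose explicit feature map is φ(x) = (x1², √2·x1·x2, x2²).

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(x, z):
    """k(x, z) = (x . z)^2, evaluated directly in the input space."""
    return dot(x, z) ** 2


def feature_map(x):
    """Explicit feature map phi for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x = (1.0, 2.0)
z = (3.0, -1.0)
k_input_space = polynomial_kernel(x, z)
k_feature_space = dot(feature_map(x), feature_map(z))
print(k_input_space, k_feature_space)  # equal up to rounding
```

The kernel returns the feature-space inner product without ever constructing φ(x), which is what makes kernel methods practical when the feature space is very high-dimensional.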
18. Terry Taewoong Um (terry.t.um@gmail.com)
SUPPORT VECTOR MACHINE (SVM)
“Myo”, Thalmic Labs (2013)
https://youtu.be/oWu9TFJjHaM
[Linear classifiers] [Maximum margin]
Support Vector Machine Tutorial, J. Weston, http://goo.gl/19ywcj
[Dual formulation, expressed with a kernel function]
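The maximum-margin idea can be sketched for the linear case. Note that this is a swapped-in technique: it trains a linear SVM in the primal by subgradient descent on the regularized hinge loss, not through the dual formulation with kernels that the slide refers to. The data, `lam`, `lr`, and `epochs` are illustrative assumptions.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on lam/2 * ||w||^2 + mean(max(0, 1 - y * (w . x + b)))."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 by the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical linearly separable data with labels in {+1, -1}.
points = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
labels = [1, 1, -1, -1]

w, b = train_linear_svm(points, labels)
print([predict(w, b, x) for x in points])
```

Only the points with margin below 1 contribute to the update, which mirrors the role of support vectors: samples far from the decision boundary do not influence it.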
19. Terry Taewoong Um (terry.t.um@gmail.com)
GAUSSIAN PROCESS REGRESSION (GPR)
https://youtu.be/YqhLnCm0KXY
https://youtu.be/kvPmArtVoFE
• Gaussian distribution
• Multivariate regression likelihood
• posterior ∝ likelihood × prior
• Prediction: condition the joint distribution of the observed and predicted values
https://goo.gl/EO54WN
http://goo.gl/XvOOmf
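Conditioning the joint Gaussian on the observed values gives a closed-form predictive mean, k*ᵀ (K + σ²I)⁻¹ y for a zero-mean prior. The sketch below computes only that mean. The squared-exponential kernel, its length scale, the noise level, and the data are all assumptions for illustration, and the predictive variance is omitted for brevity.

```python
import math


def rbf_kernel(x, z, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((x - z) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def gp_posterior_mean(train_x, train_y, query_x, noise=1e-6):
    """Posterior mean k_*^T (K + noise * I)^-1 y of a zero-mean GP."""
    n = len(train_x)
    gram = [[rbf_kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, train_y)
    k_star = [rbf_kernel(query_x, x) for x in train_x]
    return sum(k * a for k, a in zip(k_star, alpha))


# Hypothetical observations of an unknown function.
train_x = [0.0, 1.0, 2.0]
train_y = [0.0, 1.0, 0.5]
print(gp_posterior_mean(train_x, train_y, 1.0))   # close to the observed 1.0
print(gp_posterior_mean(train_x, train_y, 10.0))  # far from the data: close to the prior mean 0
```

With very small noise the mean nearly interpolates the observations, and far from the data it falls back to the prior mean, which is the behavior the conditioning formula implies.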
20. Terry Taewoong Um (terry.t.um@gmail.com)
DIMENSION REDUCTION
[Original space] → [Feature space]
low dim. → high dim.
high dim. → low dim.
X → φ(X)
• Principal Component Analysis
: Find the best orthogonal axes (= principal components) which maximize the variance of the data
Y = P X
* The rows of P are the m eigenvectors of $\frac{1}{N} X X^{T}$ (the covariance matrix) with the largest eigenvalues
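The PCA recipe can be sketched for 2-D data with m = 1. This is a minimal illustration under stated assumptions: the samples are made up, the data are centered before the covariance is formed, and the top eigenvector is found by power iteration rather than a full eigendecomposition.

```python
import math


def first_principal_component(samples, iterations=100):
    """Top eigenvector of the 2-D covariance matrix, found by power iteration."""
    n = len(samples)
    mean = [sum(s[j] for s in samples) / n for j in range(2)]
    centered = [[s[j] - mean[j] for j in range(2)] for s in samples]
    # Covariance matrix (1/N) * X X^T, with one centered sample per column of X.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(2)]
           for i in range(2)]
    v = [1.0, 0.0]
    for _ in range(iterations):
        w = [cov[i][0] * v[0] + cov[i][1] * v[1] for i in range(2)]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    return v, centered


# Hypothetical samples scattered around the diagonal direction.
samples = [(2.0, 1.9), (-2.0, -2.1), (1.0, 1.1), (-1.0, -0.9)]
pc, centered = first_principal_component(samples)

# Project each centered sample onto the principal axis (Y = P X with m = 1).
projected = [pc[0] * c[0] + pc[1] * c[1] for c in centered]
print(pc)
print(projected)
```

The recovered axis points along the diagonal, the direction of largest variance, so the single projected coordinate keeps most of the information in the two original ones.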
21. Terry Taewoong Um (terry.t.um@gmail.com)
DIMENSION REDUCTION
http://jbhuang0604.blogspot.kr/2013/04/miss-korea-2013-contestants-face.html
22. Terry Taewoong Um (terry.t.um@gmail.com)
SUMMARY - PART 1
• Machine Learning
- Tasks : Classification, Regression, Clustering, etc.
- Performance : 0-1 loss, L2 loss, etc.
- Experience : labeled data, unlabeled data
• Machine Learning Process
(1) Select a parametric / nonparametric model
(2) Set a performance measure, including a regularization term
(3) Train on the training data (optimizing parameters) until the validation error increases
(4) Evaluate the final performance using test set
• Nonparametric model : Support Vector Machine, Gaussian Process Regression
• Dimension reduction : used for pre-processing data