2. Outline
● Problem Definition
● Motivation
● Training a Regression DNN
● Training a Classification DNN
● Open Source Packages
● Summary + Questions
4. Tutorial
● Goal: Detect facial landmarks on (normal) face images
● Data set provided by Dr. Yoshua Bengio
● Tutorial code available: https://github.com/dnouri/kfkd-tutorial/blob/master/kfkd.py
8. Python DL Framework
Wrapper to Lasagne
Theano extension for Deep Learning
Define, optimize, and evaluate mathematical expressions
Efficient CUDA GPU for DNN
Low Level
High Level
HW support: GPU & CPU
OS: Linux, OS X, Windows
9. Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
3. Optimization
4. Training the DNN
10. Training a Deep Neural Network
1. Data Analysis
a. Exploration + Validation
b. Pre-Processing
c. Batch and Split
2. Architecture Engineering
3. Optimization
4. Training the DNN
11. Data Exploration + Validation
Data:
● 7K gray-scale images of detected faces
● 96x96 pixels per image
● 15 landmarks per image (?)
Data validation:
● Some Landmarks are missing
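A minimal sketch of the validation step, assuming the landmarks are stored as one row of 30 values (15 x/y pairs) per image and that a missing landmark shows up as NaN. The arrays below are synthetic stand-ins, not the real data set, and `keep_complete_rows` is a hypothetical helper name.

```python
import numpy as np

# Synthetic stand-in for the real data (assumption): 8 images of 96x96
# gray-scale pixels and 15 (x, y) landmarks per image, i.e. 30 targets.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 96 * 96)).astype(np.float32)
landmarks = rng.uniform(0, 96, size=(8, 30)).astype(np.float32)

# Simulate missing landmarks by blanking a few entries with NaN.
landmarks[1, 4] = np.nan
landmarks[5, 10:12] = np.nan


def keep_complete_rows(images, landmarks):
    """Return only the samples whose landmarks are all present."""
    complete = ~np.isnan(landmarks).any(axis=1)
    return images[complete], landmarks[complete]


# How many values are missing for each of the 30 target columns.
missing_per_landmark = np.isnan(landmarks).sum(axis=0)

clean_images, clean_landmarks = keep_complete_rows(images, landmarks)
print("missing values per target column:", missing_per_landmark)
print("samples kept:", len(clean_images), "of", len(images))
```

Dropping incomplete rows is the simplest option; it trades training data for clean targets, so the per-column counts are worth checking first.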
15. Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
a. Layers Definition
b. Layers Implementation
3. Optimization
4. Training
22. Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
3. Optimization
a. Back Propagation
b. Objective
c. SGD
d. Updates
e. Convergence Tuning
4. Training the DNN
32. Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
3. Optimization
4. Training the DNN
a. Fit
b. Fine Tune Pre-Trained
c. Learning Curves
42. Outline
● Problem Definition
● Motivation
● Training a regression DNN
● Training a classification DNN
● Improving the DNN
● Open Source Packages
● Summary
43. Matlab DL Framework
Open Source CNN Toolbox by
Numerical computing using Parallel Computing Toolbox
Efficient CUDA GPU for DNN
Low Level
High Level
HW support: GPU & CPU
OS: Linux, OS X, Windows
44. Problem Statement
Classify a, b, …, z images into 26 classes:
http://www.robots.ox.ac.uk/~vgg/practicals/cnn/
Bonus - OCR:
45. Training a Deep Neural Network
1. Data Analysis
2. Training the DNN
3. Architecture Engineering
4. Optimization
65. Beyond Training
1. Training a classification DNN
2. Improving the DNN
a. Analysis Capabilities
b. Augmentation
3. Open Source Packages
4. Summary
72. Deal with NaN
1. Within the first 100 iterations
a. Learning rate is too high
2. Beyond 100 iterations
a. Gradient explosion
i. Consider gradient clipping
b. Illegal math operation
i. SoftMax: inf/inf
ii. Division by zero by one of your customized layers
http://russellsstewart.com//notes/0.html
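Two of the remedies above can be sketched in plain NumPy. This is an illustration under assumptions, not code from the tutorial: `stable_softmax` and `clip_by_global_norm` are hypothetical helper names, and the threshold `max_norm=5.0` is an arbitrary example value. A naive softmax on logits around 1000 computes exp(1000) = inf and then inf/inf = NaN; subtracting the row maximum first avoids that.

```python
import numpy as np


def stable_softmax(logits):
    """Softmax over the last axis, shifted by the row maximum to avoid inf/inf."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total_norm <= max_norm:
        return grads
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# Logits this large overflow a naive exp(); the shifted version stays finite.
logits = np.array([[1000.0, 1001.0, 1002.0]])
probs = stable_softmax(logits)

# Exploding gradients (example values) are scaled down, keeping their direction.
grads = [np.full((2, 2), 100.0), np.full(3, -100.0)]
clipped = clip_by_global_norm(grads, max_norm=5.0)
clipped_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in clipped))

print("softmax is finite:", bool(np.all(np.isfinite(probs))))
print("gradient norm after clipping:", clipped_norm)
```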
73. The Net Doesn’t Learn Anything
1. Training loss does not decrease after the first 100 iterations
a. Reduce the training set to 10 instances (images) and overfit it
i. Achieve 100% training accuracy on a small portion of data
b. Change batch size to 1 and monitor the error per batch
c. Solve the simplest version of your problem
http://russellsstewart.com//notes/0.html
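The "overfit 10 instances" check can be sketched as follows. This is a toy under stated assumptions: random 20-dimensional features stand in for images, a linear softmax classifier stands in for the DNN, and the learning rate and step count are arbitrary example values. If even this tiny set cannot be memorized, the bug is likely in the data pipeline, the loss, or the update rule rather than in model capacity.

```python
import numpy as np

# Ten random "images" (assumption: 20 features each) with random labels.
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 10, 20, 3
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, n_classes, size=n_samples)
Y = np.eye(n_classes)[y]  # one-hot targets


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def loss_and_accuracy(W):
    """Mean cross-entropy and accuracy of the linear classifier on the 10 samples."""
    probs = softmax(X @ W)
    loss = -np.mean(np.log(probs[np.arange(n_samples), y] + 1e-12))
    accuracy = np.mean(probs.argmax(axis=1) == y)
    return loss, accuracy


W = np.zeros((n_features, n_classes))
initial_loss, _ = loss_and_accuracy(W)

# Full-batch gradient descent on the cross-entropy loss.
for step in range(2000):
    grad = X.T @ (softmax(X @ W) - Y) / n_samples
    W -= 0.2 * grad

final_loss, final_accuracy = loss_and_accuracy(W)
print("loss before:", initial_loss, "after:", final_loss)
print("training accuracy:", final_accuracy)
```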
74. Beyond Training
1. Training a classification DNN
2. Improving the DNN
3. Open Source Packages
a. DL Open Source Packages
b. Effort Estimation
4. Summary
75. Tips from Other Packages
● Torch: code organization
● Caffe: separation of configuration ↔ code
● NeuralNet → YAML text format defining the experiment's configuration
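The idea of keeping an experiment's configuration in a text file, apart from the code, might look like the sketch below. JSON from the Python standard library is swapped in for YAML here so the example needs no extra package; `ExperimentConfig`, `load_config`, and every field name and value are hypothetical and not part of any of the packages named above.

```python
import json
from dataclasses import dataclass


@dataclass
class ExperimentConfig:
    """Hyperparameters of one experiment (hypothetical field names)."""
    learning_rate: float
    batch_size: int
    max_epochs: int
    hidden_units: list


# In practice this text would live in its own file next to the code.
CONFIG_TEXT = """
{
  "learning_rate": 0.01,
  "batch_size": 128,
  "max_epochs": 400,
  "hidden_units": [100]
}
"""


def load_config(text):
    """Parse the configuration text into a typed config object."""
    return ExperimentConfig(**json.loads(text))


config = load_config(CONFIG_TEXT)
print(config)
```

With this split, a new experiment is a new text file rather than a code change, which makes runs easier to compare and reproduce.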
76. DL Open Source Packages
Caffe & MatConvNet for applications
Torch, TensorFlow and Theano for research on DL
http://fastml.com/torch-vs-theano/
[Figure axis: Simple DNN to Complex DNN]