Deep learning continues to push the state of the art in domains such as computer vision, natural language understanding and recommendation engines. One of the key reasons for this progress is the availability of highly flexible and developer friendly deep learning frameworks. Apache MXNet is a fully-featured, flexibly-programmable and ultra-scalable deep learning framework supporting innovative deep models including convolutional neural networks (CNNs), and long short-term memory networks (LSTMs). This Tech Talk will show you how to launch the deep learning cloud formation template and deploy the deep learning AMI to train your own deep neural network, using MNIST, to recognize handwritten digits and test it for accuracy.
Learning Objectives:
- Learn about the features and benefits of Apache MXNet
- Learn about the deep learning AMIs with the tools you need for DL
- Learn how to train a neural network using MXNet
2. Agenda
• Apache MXNet introduction
• Distributed Deep Learning with AWS Cloudformation
• Deep Learning motivation and basics
• MXNet programing model overview
• Train our first neural network using MXNet
3. Deep Learning Applications
Significantly improve many applications on multiple domains
image understanding speech recognition natural language
processing
autonomy
• Netflix – Recommendation Engine
• FINRA – Anonmaly detection, Sequence matching
• TuSimple - Computer Vision for Autonomous Driving
• Pinterest - Image recognition search
• Mapillary - Computer vision for crowd sourced maps
AI Customers on AWS
4. AI Services
AI Platform
AI Engines
Amazon
Rekognition
Amazon
Polly
Amazon
Lex
More to come
in 2017
Amazon
Machine Learning
Amazon Elastic
MapReduce
Spark &
SparkML
More to come
in 2017
Apache
MXNet
TensorFlow Caffe Theano KerasTorch CNTK
P2 ECS LambdaEMR/Spark GreenGrass FPGA
More to come
in 2017
Hardware
Democratizing Artificial Intelligence
5. Apache MXNet
Programmable Portable High Performance
Near linear scaling
across hundreds of GPUs
Highly efficient
models for mobile
and IoT
Simple syntax,
multiple languages
88% efficiency
on 256 GPUs
Resnet 1024 layer network
is ~4GB
12. Artificial Neuron
output
synaptic
weights
• Input
Vector of training data x
• Output
Linear function of inputs
• Nonlinearity
Transform output into desired range
of values, e.g. for classification we
need probabilities [0, 1]
• Training
Learn the weights w and bias b
13. Deep Neural Network
hidden layers
The optimal size of the hidden
layer (number of neurons) is
usually between the size of the
input and size of the output
layers
Input layer
output
14. The “Learning” in Deep Learning
0.4 0.3
0.2 0.9
...
back propogation (gradient descent)
X1 != X
0.4 ± 𝛿 0.3 ± 𝛿
new
weights
new
weights
0
1
0
1
1
.
.
-
-
X
input
label
...
X1
17. import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
• Straightforward and flexible.
• Take advantage of language
native features (loop,
condition, debugger)
• E.g. Numpy, Matlab, Torch, …
• Hard to optimize
PROS
CONS
d = c + 1c
Easy to tweak
with python codes
Imperative Programing
18. • More chances for optimization
• Cross different languages
• E.g. TensorFlow, Theano,
Caffe
• Less flexible
PROS
CONS
C can share memory with D
because C is deleted later
A = Variable('A')
B = Variable('B')
C = B * A
D = C + 1
f = compile(D)
d = f(A=np.ones(10),
B=np.ones(10)*2)
A B
1
+
X
Declarative Programing
19. IMPERATIVE
NDARRAY API
DECLARATIVE
SYMBOLIC
EXECUTOR
>>> import mxnet as mx
>>> a = mx.nd.zeros((100, 50))
>>> b = mx.nd.ones((100, 50))
>>> c = a + b
>>> c += 1
>>> print(c)
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, num_hidde
>>> net = mx.symbol.SoftmaxOutput(data=net)
>>> texec = mx.module.Module(net)
>>> texec.forward(data=c)
>>> texec.backward()
NDArray can be set
as input to the graph
MXNet: Mixed programming paradigm
21. MXNet Overview
• Founded by: U.Washington, Carnegie Mellon U. (~1.5yrs old)
• Recently Accepted to the Apache Incubator
• State of the Art Model Support: Convolutional Neural Networks (CNN), Long
Short-Term Memory (LSTM)
• Scalable: Near-linear scaling equals fastest time to model
• Multi-language: Support for Scala, Python, R, etc.. for legacy code leverage and
easy integration with Spark
• Ecosystem: Vibrant community from Academia and Industry
Open Source Project on Github | Apache-2 Licensed
22. Application Examples | Python notebooks
• https://github.com/dmlc/mxnet-notebooks
• Basic concepts
• NDArray - multi-dimensional array computation
• Symbol - symbolic expression for neural networks
• Module - neural network training and inference
• Applications
• MNIST: recognize handwritten digits
• Check out the distributed training results
• Predict with pre-trained models
• LSTMs for sequence learning
• Recommender systems
• Train a state of the art Computer Vision model (CNN)
• Lots more..
23. Call to Action
MXNet Resources:
• MXNet Blog Post | AWS Endorsement
• Read up on MXNet and Learn More: mxnet.io
• MXNet Github Repo
• MXNet Recommender Systems Talk | Leo Dirac
Developer Resources:
• Deep Learning AMI | Amazon Linux
• Deep Learning AMI | Ubuntu – NEW!!!
• P2 Instance Information
• CloudFormation Template Instructions
• Deep Learning Benchmark
• MXNet on Lambda
• MXNet on ECS/Docker
• MXNet on Raspberry Pi | Wine Detector