SlideShare une entreprise Scribd logo
1  sur  76
@PRACTICALDLBOOK@PRACTICALDLBOOK
Deep Learning On
Mobile
A PRACTITIONER’S GUIDE
@PRACTICALDLBOOK 2
@PRACTICALDLBOOK@PRACTICALDLBOOK
Deep Learning On
Mobile
A PRACTITIONER’S GUIDE
@PRACTICALDLBOOK
@SiddhaGanju
@MeherKasam
@AnirudhKoul
4
@PRACTICALDLBOOK
Why Deep Learning on Mobile?
Privacy Reliability
Cost Latency
5
@PRACTICALDLBOOK 6
https://media.giphy.com/media/fBzSGPMxD0isw/giphy.gif
@PRACTICALDLBOOK
Latency Is Expensive!
7
100 milliseconds 1% loss
[Amazon 2008]
@PRACTICALDLBOOK@PRACTICALDLBOOK
Latency Is Expensive!
8
>3 sec
load time
53%
bounce
Mobile Site Visits
[Google Research, Webpagetest.org]
@PRACTICALDLBOOK@PRACTICALDLBOOK
Power of 10
9
0.1s
Seamless Uninterrupted
flow of thought
1s 10s
Limit of
attention
[Miller 1968; Card et al. 1991; Nielsen 1993]
@PRACTICALDLBOOK 10
Efficient Mobile
Inference Engine
Efficient
Model+ = DL App
@PRACTICALDLBOOK
How to Train My
Model?
11
@PRACTICALDLBOOK 12
Learn to Play
Melodica
3 Months
@PRACTICALDLBOOK
Already
Play
Piano?
13
@PRACTICALDLBOOK 14
FINE-TUNE
your skills
3months
1week
@PRACTICALDLBOOK@PRACTICALDLBOOK
Fine-tuning
15
Assemble a
dataset
Find a pre-
trained
model
Fine-tune a
pre-trained
model
Run using
existing
frameworks
“Don’t Be A Hero”
- Andrej Karpathy
@PRACTICALDLBOOK@PRACTICALDLBOOK
CustomVision.ai
16
Use Fatkun Browser Extension to download images from Search Engine, or use Bing Image Search API to
programmatically download photos with proper rights
@PRACTICALDLBOOK 17
@PRACTICALDLBOOK 18
@PRACTICALDLBOOK
Demo
19
@PRACTICALDLBOOK
How Do I Run My Models?
20
@PRACTICALDLBOOK 21
Core ML TF Lite ML Kit
@PRACTICALDLBOOK@PRACTICALDLBOOK
Apple Ecosystem
22
Metal
• 2014
BNNS + MPS
• 2016
Core ML
• 2017
Core ML 2
• 2018
Core ML 3
• 2019
- Tiny models (~ KB)!
- 1 bit model quantization support
- Batch API for improved performance
- Conversion support for MXNet, ONNX
- tf-coreml
@PRACTICALDLBOOK@PRACTICALDLBOOK
Apple Ecosystem
23
Metal
• 2014
BNNS + MPS
• 2016
Core ML
• 2017
Core ML 2
• 2018
Core ML 3
• 2019
- On-device training
- Personalization
- Create ML UI
@PRACTICALDLBOOK
Core ML Benchmark
538
129
75
557
109
7877 44 3674 35 3071 33 2926 18 15
0
100
200
300
400
500
600
ResNet-50 MobileNet SqueezeNet
EXECUTION TIME (MS) ON APPLE
DEVICES
iPhone 5s (2013) iPhone 6 (2014) iPhone 6s (2015)
iPhone 7 (2016) iPhone X (2017) iPhone XS (2018)
24
https://heartbeat.fritz.ai/ios-12-core-ml-benchmarks-b7a79811aac1
GPUs became a
thing here!
@PRACTICALDLBOOK@PRACTICALDLBOOK
TensorFlow Ecosystem
25
TensorFlow
• 2015
TensorFlow Mobile
• 2016
TensorFlow Lite
• 2018
Smaller Faster Minimal
dependencies
Allows running
custom operators
@PRACTICALDLBOOK@PRACTICALDLBOOK
TensorFlow Lite is small
26
75KB
Core
Interpreter
1.5MB
TensorFlow
Mobile
400KB
Core Interpreter +
Supported
Operations
@PRACTICALDLBOOK@PRACTICALDLBOOK
TensorFlow Lite is Fast
27
Takes advantage of on-device
hardware acceleration
FlatBuffers
•Reduces code footprint,
memory usage
•Reduces CPU cycles on
serialization and
deserialization
•Improves startup time
Pre-fused activations
Combining batch
normalization layer with
previous Convolution
Static memory and static
execution plan
Decreases load time
@PRACTICALDLBOOK@PRACTICALDLBOOK
TensorFlow Ecosystem
28
TensorFlow
• 2015
TensorFlow Mobile
• 2016
TensorFlow Lite
• 2018
Smaller Faster Minimal
dependencies
Allows running
custom operators
@PRACTICALDLBOOK@PRACTICALDLBOOK
TensorFlow Ecosystem
29
TensorFlow
• 2015
TensorFlow Mobile
• 2016
TensorFlow Lite
• 2018
$ tflite_convert --keras_model_file = keras_model.h5 --output_file=foo.tflite
@PRACTICALDLBOOK@PRACTICALDLBOOK
TensorFlow Ecosystem
30
TensorFlow
• 2015
TensorFlow Mobile
• 2016
TensorFlow Lite
• 2018
Trained
TensorFlow Model
TF Lite Converter .tflite model
Android App
iOS App
@PRACTICALDLBOOK@PRACTICALDLBOOK
ML Kit
31
Simple Abstraction over
TensorFlow Lite
Built in APIs for
Image Labeling, OCR, Face
Detection, Barcode
scanning, Landmark
detection, Smart reply
Model management
with Firebase
Upload model on web
interface to distribute
A/B Testing
@PRACTICALDLBOOK 32
How Do I
Keep My IP
Safe?
@PRACTICALDLBOOK
Fritz
Full fledged mobile lifecycle support
Deployment, instrumentation, etc. from Python
33
@PRACTICALDLBOOK@PRACTICALDLBOOK
Recommendation for Product Development
34
Train a model
using
framework of
choice
Convert to
TensorFlow Lite
format
Upload to
Firebase
Deploy to
iOS/Android
apps with MLKit
@PRACTICALDLBOOK@PRACTICALDLBOOK
An Important Question
35
APP TOO BIG! WHAT DO?
Apple does not allow apps over 200 MB to be
downloaded over cellular network. Download on
demand, and interpret on device instead.
@PRACTICALDLBOOK 36
What Effect Does
Hardware have on
Performance?
@PRACTICALDLBOOK
Big Things Come In Small Packages
37
@PRACTICALDLBOOK
Effect of Hardware
L-R: iPhone XS,
iPhone X, iPhone 5
38
https://twitter.com/matthieurouif/status/1126575118812110854?s=11
@PRACTICALDLBOOK
TensorFlow Lite Benchmarks
Alpha Lab releases Numericcal: http://alpha.lab.numericcal.com/
@PRACTICALDLBOOK 40
TensorFlow Lite Benchmarks
Crowdsourcing AI Benchmark App by Andrey Ignatov from ETH Zurich. http://ai-benchmark.com/
@PRACTICALDLBOOK 41
Alchemy by Fritz
https://alchemy.fritz.ai/
Python library to analyze and
estimate mobile performance
No need to deploy on mobile
@PRACTICALDLBOOK 42
User
Experience
Standpoint
TO GET 95%+ USER COVERAGE,
SUPPORT PHONES RELEASED IN THE
PAST 3.5 YEARS
IF NOT POSSIBLE, OFFER GRACEFUL
DEGRADATION
@PRACTICALDLBOOK 43
Battery
Standpoint
Won’t AI inference kill the battery quickly?
Answers: You don’t usually run AI models constantly, you run
it for a few seconds.
With a modern flagship phone, running Mobilenet
at 30 fps should burn battery in 2-3 hours.
Bigger question - Do you really need to run it at 30
FPS? Or could it be run 1 FPS?
@PRACTICALDLBOOK@PRACTICALDLBOOK
Energy Reduction from 30 FPS to 1 FPS
44
iPad Pro 2017
@PRACTICALDLBOOK 45
What Exciting
Applications Can I
Build?
@PRACTICALDLBOOK
Seeing AI
Audible Barcode recognition
Aim: Help blind users identify products using barcode
Issue: Blind users don’t know where the barcode is
Solution: Guide user in finding a barcode with audio cues
46
@PRACTICALDLBOOK
AR Hand Puppets
Hart Woolery from 2020CV
Object Detection (Hand) + Key Point Estimation
47
[https://twitter.com/2020cv_inc/status/1093219359676280832]
AR Hand Puppets, Hart Woolery from 2020CV, Object Detection (Hand) + Key Point Estimation
@PRACTICALDLBOOK 48
Zero-Gravity Space, Takahiro Horikawa, Mask RCNN (segmentation) + PixMix (Image In-Painting) + Unity (Physics)
@PRACTICALDLBOOK 49
[HomeCourt.ai]Object Detection (Ball, Hoop, Player) + Body Pose + Perspective Transformation
@PRACTICALDLBOOK 50
Polarr, Machine Guided Composition, Automated cropping with highest aesthetic score
@PRACTICALDLBOOK
Remove objects
Brian Schulman, Adventurous Co.
Object Segmentation + Image In Painting
51
https://twitter.com/smashfactory/status/1139461813710442496
@PRACTICALDLBOOK
Magic Sudoku App
Edge Detection + Classification + AR Kit
52
https://twitter.com/braddwyer/status/910030265006923776
@PRACTICALDLBOOK
People Segmentation
AR Kit
Abound Labs https://www.aboundlabs.com/
53
https://twitter.com/nobbis/status/1135975245406515202
@PRACTICALDLBOOK@PRACTICALDLBOOK
Snapchat
54
Face Swap
GANs
@PRACTICALDLBOOK
Can I Make My Model Even More
Efficient?
55
@PRACTICALDLBOOK@PRACTICALDLBOOK
How To Find Efficient Pre-Trained Models
56
Papers with Code
https://paperswithcode.com/sota
Model Zoo
https://modelzoo.co
@PRACTICALDLBOOK@PRACTICALDLBOOK 57
What you can affordWhat you want
@PRACTICALDLBOOK
Model Pruning
Aim: Remove all connections with absolute weights below a threshold
58
Song Han, Jeff Pool, John Tran, William J. Dally, "Learning both Weights and Connections for Efficient Neural Networks", 2015
@PRACTICALDLBOOK
Pruning in Keras
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation=tf.nn.relu),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
59
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
prune.Prune(tf.keras.layers.Dense(512, activation=tf.nn.relu)),
tf.keras.layers.Dropout(0.2),
prune.Prune(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
])
@PRACTICALDLBOOK
So many techniques - so little time!
Quantization
Weight sharing
Channel pruning
Filter pruning (ThiNet)
Better Layers (Dilated Conv, HetConv, OctConv)
Knowledge Distillation
Binary networks (BNN, XNOR-Net)
Lottery Ticket Hypothesis
and many more ...
60
@PRACTICALDLBOOK 61
The one with
the best thing
ever
@PRACTICALDLBOOK
Pocket Flow – 1 Line to Make a Model Efficient
Tencent AI Labs created an Automatic Model Compression (AutoMC) framework
62
@PRACTICALDLBOOK 63
Can I design a better
architecture myself?
Maybe? But AI can
do it much better!
@PRACTICALDLBOOK@PRACTICALDLBOOK
AutoML – Let AI Design an Efficient Arch
64
Neural Architecture Search (NAS) - An
automated approach for designing models using
reinforcement learning while maximizing
accuracy.
Hardware Aware NAS = Maximizes accuracy
while minimizing run-time on device
Incorporates latency information into the reward
objective function
Measure real-world inference latency by
executing on a particular platform
1.5x faster than MobileNetV2 (MnasNet)
ResNet-50 accuracy with 19x less parameters
SSD300 mAP with 35x less FLOPs
@PRACTICALDLBOOK 65
Evolution of Mobile NAS Methods
Method Top-1 Acc (%) Pixel-1 Runtime Search Cost
(GPU Hours)
MobileNetV1 70.6 113 Manual
MobileNetV2 72.0 75 Manual
MnasNet 74.0 76 40,000 (4 years+)
ProxylessNas-R 74.6 78 200
Single-Path NAS 74.9 79.5 3.75 hours
@PRACTICALDLBOOK
ProxylessNAS – Per Hardware Tuned CNNs
66
Han Cai and Ligeng Zhu and Song Han, "ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware", ICLR 2019
@PRACTICALDLBOOK 67
@PRACTICALDLBOOK
Can I Train on a Mobile Device?
68
@PRACTICALDLBOOK
On-Device Training in Core ML
69
let updateTask = try MLUpdateTask(
forModelAt: modelUrl,
trainingData: trainingData,
configuration: configuration,
completionHandler: { [weak self]
self.model = context.model
context.model.write(to: newModelUrl)
})
⁻ Core ML 3 introduced on device learning
⁻ Never have to send training data to the server with the
help of MLUpdateTask.
⁻ Schedule training when device is charging to save power
@PRACTICALDLBOOK
Can I Train a Global Model
Without Access to User Data?
70
@PRACTICALDLBOOK 71
FEDERATED LEARNING!!!
https://federated.withgoogle.com/
@PRACTICALDLBOOK 72
TensorFlow
Federated
Train a global model using 1000s of devices
without access to data
Encryption + Secure Aggregation Protocol
Can take a few days to wait for aggregations to
build up
https://github.com/tensorflow/federated
@PRACTICALDLBOOK
What We Learnt Today
73
⁻ Why deep learning on mobile?
⁻ Building a model
⁻ Running a model
⁻ Hardware factors
⁻ Benchmarking
⁻ State-of-the-art applications
⁻ Making a model more efficient
⁻ Federated Learning
@PRACTICALDLBOOK@PRACTICALDLBOOK
How to Access the Slides
in 1 Second
HTTP://PRACTICALDL.AI
@PRACTICALDLBOOK
@PRACTICALDLBOOK
@SiddhaGanju
@MeherKasam
@AnirudhKoul
75
@PRACTICALDLBOOK

Contenu connexe

Tendances

OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroNumenta
 
Efficient and effective passage search via contextualized late interaction ov...
Efficient and effective passage search via contextualized late interaction ov...Efficient and effective passage search via contextualized late interaction ov...
Efficient and effective passage search via contextualized late interaction ov...taeseon ryu
 
AIと最適化の違いをうっかり聞いてしまう前に
AIと最適化の違いをうっかり聞いてしまう前にAIと最適化の違いをうっかり聞いてしまう前に
AIと最適化の違いをうっかり聞いてしまう前にMonta Yashi
 
AI Agent and Chatbot Trends For Enterprises
AI Agent and Chatbot Trends For EnterprisesAI Agent and Chatbot Trends For Enterprises
AI Agent and Chatbot Trends For EnterprisesTeewee Ang
 
Artificial Intelligence Virtual Assistants & Chatbots
Artificial Intelligence Virtual Assistants & ChatbotsArtificial Intelligence Virtual Assistants & Chatbots
Artificial Intelligence Virtual Assistants & ChatbotsaNumak & Company
 
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...Deep Learning JP
 
IOT and Application Performance Monitoring
IOT and Application Performance MonitoringIOT and Application Performance Monitoring
IOT and Application Performance MonitoringSupongkiba Kichu
 
LLM 모델 기반 서비스 실전 가이드
LLM 모델 기반 서비스 실전 가이드LLM 모델 기반 서비스 실전 가이드
LLM 모델 기반 서비스 실전 가이드Tae Young Lee
 
Anaconda navigatorのアップデートが終わらないときの対処方法メモ
Anaconda navigatorのアップデートが終わらないときの対処方法メモAnaconda navigatorのアップデートが終わらないときの対処方法メモ
Anaconda navigatorのアップデートが終わらないときの対処方法メモayohe
 
AIによるアニメ生成の挑戦
AIによるアニメ生成の挑戦AIによるアニメ生成の挑戦
AIによるアニメ生成の挑戦Koichi Hamada
 
複数のGNSSを用いたポーズグラフ最適化
複数のGNSSを用いたポーズグラフ最適化複数のGNSSを用いたポーズグラフ最適化
複数のGNSSを用いたポーズグラフ最適化TaroSuzuki15
 
Machine Learning + Analytics
Machine Learning + AnalyticsMachine Learning + Analytics
Machine Learning + AnalyticsSplunk
 
AndroidでARの夢を再び 〜ARCoreの導入から応用まで
AndroidでARの夢を再び 〜ARCoreの導入から応用までAndroidでARの夢を再び 〜ARCoreの導入から応用まで
AndroidでARの夢を再び 〜ARCoreの導入から応用までKenichi Takahashi
 
Protocols for internet of things
Protocols for internet of thingsProtocols for internet of things
Protocols for internet of thingsCharles Gibbons
 
【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...
【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...
【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...Deep Learning JP
 
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデルDeep Learning JP
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMsJim Steele
 
非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料Takuya Minagawa
 

Tendances (20)

OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
Efficient and effective passage search via contextualized late interaction ov...
Efficient and effective passage search via contextualized late interaction ov...Efficient and effective passage search via contextualized late interaction ov...
Efficient and effective passage search via contextualized late interaction ov...
 
Iot
IotIot
Iot
 
AIと最適化の違いをうっかり聞いてしまう前に
AIと最適化の違いをうっかり聞いてしまう前にAIと最適化の違いをうっかり聞いてしまう前に
AIと最適化の違いをうっかり聞いてしまう前に
 
AI Agent and Chatbot Trends For Enterprises
AI Agent and Chatbot Trends For EnterprisesAI Agent and Chatbot Trends For Enterprises
AI Agent and Chatbot Trends For Enterprises
 
Artificial Intelligence Virtual Assistants & Chatbots
Artificial Intelligence Virtual Assistants & ChatbotsArtificial Intelligence Virtual Assistants & Chatbots
Artificial Intelligence Virtual Assistants & Chatbots
 
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
 
IOT and Application Performance Monitoring
IOT and Application Performance MonitoringIOT and Application Performance Monitoring
IOT and Application Performance Monitoring
 
LLM 모델 기반 서비스 실전 가이드
LLM 모델 기반 서비스 실전 가이드LLM 모델 기반 서비스 실전 가이드
LLM 모델 기반 서비스 실전 가이드
 
Anaconda navigatorのアップデートが終わらないときの対処方法メモ
Anaconda navigatorのアップデートが終わらないときの対処方法メモAnaconda navigatorのアップデートが終わらないときの対処方法メモ
Anaconda navigatorのアップデートが終わらないときの対処方法メモ
 
AIによるアニメ生成の挑戦
AIによるアニメ生成の挑戦AIによるアニメ生成の挑戦
AIによるアニメ生成の挑戦
 
複数のGNSSを用いたポーズグラフ最適化
複数のGNSSを用いたポーズグラフ最適化複数のGNSSを用いたポーズグラフ最適化
複数のGNSSを用いたポーズグラフ最適化
 
Machine Learning + Analytics
Machine Learning + AnalyticsMachine Learning + Analytics
Machine Learning + Analytics
 
What Does Interoperability Mean for the IoT?
What Does Interoperability Mean for the IoT?What Does Interoperability Mean for the IoT?
What Does Interoperability Mean for the IoT?
 
AndroidでARの夢を再び 〜ARCoreの導入から応用まで
AndroidでARの夢を再び 〜ARCoreの導入から応用までAndroidでARの夢を再び 〜ARCoreの導入から応用まで
AndroidでARの夢を再び 〜ARCoreの導入から応用まで
 
Protocols for internet of things
Protocols for internet of thingsProtocols for internet of things
Protocols for internet of things
 
【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...
【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...
【DL輪読会】NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics vi...
 
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料非技術者でもわかる(?)コンピュータビジョン紹介資料
非技術者でもわかる(?)コンピュータビジョン紹介資料
 

Similaire à Practical Guide to Deep Learning on Mobile

Siddha Ganju. Deep learning on mobile
Siddha Ganju. Deep learning on mobileSiddha Ganju. Deep learning on mobile
Siddha Ganju. Deep learning on mobileLviv Startup Club
 
Siddha Ganju, NVIDIA. Deep Learning for Mobile
Siddha Ganju, NVIDIA. Deep Learning for MobileSiddha Ganju, NVIDIA. Deep Learning for Mobile
Siddha Ganju, NVIDIA. Deep Learning for MobileIT Arena
 
Dog Breed Classification using PyTorch on Azure Machine Learning
Dog Breed Classification using PyTorch on Azure Machine LearningDog Breed Classification using PyTorch on Azure Machine Learning
Dog Breed Classification using PyTorch on Azure Machine LearningHeather Spetalnick
 
Data science lab enabling flexibility
Data science lab   enabling flexibilityData science lab   enabling flexibility
Data science lab enabling flexibilityKognitio
 
And Then There Are Algorithms
And Then There Are AlgorithmsAnd Then There Are Algorithms
And Then There Are AlgorithmsInfluxData
 
Kusto (Azure Data Explorer) Training for R&D - January 2019
Kusto (Azure Data Explorer) Training for R&D - January 2019 Kusto (Azure Data Explorer) Training for R&D - January 2019
Kusto (Azure Data Explorer) Training for R&D - January 2019 Tal Bar-Zvi
 
Deep learning for Industries
Deep learning for IndustriesDeep learning for Industries
Deep learning for IndustriesRahul Kumar
 
Scaling Face Recognition with Big Data
Scaling Face Recognition with Big DataScaling Face Recognition with Big Data
Scaling Face Recognition with Big DataBogdan Bocse
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Jason Dai
 
Going deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkusGoing deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkusRed Hat Developers
 
Real Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik RamasamyReal Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik RamasamyData Con LA
 
JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...
JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...
JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...Daniel Bryant
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesSrinath Perera
 
Need to-know patterns building microservices - java one
Need to-know patterns building microservices - java oneNeed to-know patterns building microservices - java one
Need to-know patterns building microservices - java oneVincent Kok
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
Machine learning the next revolution or just another hype
Machine learning   the next revolution or just another hypeMachine learning   the next revolution or just another hype
Machine learning the next revolution or just another hypeJorge Ferrer
 

Similaire à Practical Guide to Deep Learning on Mobile (20)

Siddha Ganju. Deep learning on mobile
Siddha Ganju. Deep learning on mobileSiddha Ganju. Deep learning on mobile
Siddha Ganju. Deep learning on mobile
 
Siddha Ganju, NVIDIA. Deep Learning for Mobile
Siddha Ganju, NVIDIA. Deep Learning for MobileSiddha Ganju, NVIDIA. Deep Learning for Mobile
Siddha Ganju, NVIDIA. Deep Learning for Mobile
 
Dog Breed Classification using PyTorch on Azure Machine Learning
Dog Breed Classification using PyTorch on Azure Machine LearningDog Breed Classification using PyTorch on Azure Machine Learning
Dog Breed Classification using PyTorch on Azure Machine Learning
 
General Learning.pptx
General Learning.pptxGeneral Learning.pptx
General Learning.pptx
 
Data science lab enabling flexibility
Data science lab   enabling flexibilityData science lab   enabling flexibility
Data science lab enabling flexibility
 
And Then There Are Algorithms
And Then There Are AlgorithmsAnd Then There Are Algorithms
And Then There Are Algorithms
 
Kusto (Azure Data Explorer) Training for R&D - January 2019
Kusto (Azure Data Explorer) Training for R&D - January 2019 Kusto (Azure Data Explorer) Training for R&D - January 2019
Kusto (Azure Data Explorer) Training for R&D - January 2019
 
Deep learning for Industries
Deep learning for IndustriesDeep learning for Industries
Deep learning for Industries
 
Scaling Face Recognition with Big Data
Scaling Face Recognition with Big DataScaling Face Recognition with Big Data
Scaling Face Recognition with Big Data
 
Trends in DNN compression
Trends in DNN compressionTrends in DNN compression
Trends in DNN compression
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
 
Going deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkusGoing deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkus
 
Real Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik RamasamyReal Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik Ramasamy
 
JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...
JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...
JAX London 22: Debugging Microservices "Remocally" in Kubernetes with Telepre...
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple Spaces
 
Need to-know patterns building microservices - java one
Need to-know patterns building microservices - java oneNeed to-know patterns building microservices - java one
Need to-know patterns building microservices - java one
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier Dominguez
 
Microsoft Dryad
Microsoft DryadMicrosoft Dryad
Microsoft Dryad
 
Machine learning the next revolution or just another hype
Machine learning   the next revolution or just another hypeMachine learning   the next revolution or just another hype
Machine learning the next revolution or just another hype
 

Dernier

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 

Dernier (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Practical Guide to Deep Learning on Mobile

Notes de l'éditeur

  1. [M] Let’s just say that in the 21st century, the human attention span is a bit low.
  2. Amazon published the results of an A/B test they did way back in 2008! They found that every 100ms increase in latency, correlated with a 1% decrease in profits. Imagine what a couple of seconds can do? Actually we don’t need to imagine.
  3. We have the results from a Google study about a decade later where they found that a loading time of 3 seconds or more on a mobile website resulted in 53% probability of a user leaving the webpage.
  4. If you ever had to tell someone the Moore’s law equivalent for human attention span, you can refer them to these numbers that have been to be true as early as 1968 and as late as 1993. It’s very possible it’s gotten worse since push notifications became a thing. The study found that: 0.1 second is about the limit where the user feels that the system is reacting instantly, like typing characters on the keyboard and having them appear on the screen. no special feedback is necessary except to display the result. 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data. Or about the amount of time I spend thinking before buying the next Apple product. 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.
  5. At a high level, the recipe for a great deep learning app consists of two things- an efficient inference engine and an efficient model. So how do we train a model?
  6. You don't need Microsoft s or Google’s ocean boiling GPU cluster
  7. We’ve looked at how to do fine-tuning, but we figured it’d cool to demo it live. They say doing a live demo is a bad idea, but what are we if not full of bad ideas. In the spirit of adventure, let’s do it.
  8. [S] We’ve discussing fine-tuning in the context of images, but let’s try it out on a harder input – audio. In the interest of time, we won’t show the process of actually collecting the data, but all we did was to collect or record some audio files for each class. We collected applause because we know there’s going to be a lot of it in this session.
  9. It’s the same revolution that desktop gpus brought about. Core ML models run 38% faster on iOS 12 compared to iOS 11. We’re just at the beginning of an incredible wave of mobile experiences powered by on-device machine learning. Processors like the A12 are going to make it happen.
  10. Minimal dependencies -> Easier to package and deploy
  11. Just one line of code! On 2 billion devices. Google assistant is on 1 billion devices. Photos, Gboard, Gmail, Nest!
  12. Develop one tflite model for both ios and android apps
  13. [s] Even though an iPhone 10s looks smaller than a macbook air, guess what, its stronger.
  14. I could give you the numbers, but they say showing is better than telling. what's the result of all that powerful hardware
  15. Additionally also gives a layer by layer breakdown of how much processing power it needs
  16. 2015 was the year of gpu - Aim for 10 FPS on oldest phones for real-time UX
  17. [s]
  18. As we all have painfully experienced, in real life, what you really want, is not what you can always afford. And that's the same in machine learning, We all know deep learning works if you have large GPU servers, what about when you want to run it on a 3 year old device. What's the number 1 limitation, it turns out to be memory. If you look at image net in the last couple of years, it started with 240 megabytes. VGG was over half a gig. So the question we will solve now is how to get these neural networks do these amazing things yet have a very small memory footprint
  19. Pruning redundant, non-informative weights in a previously trained network reduces the size of the network at inference time. Take a network, prune, and then retrain the remaining connections Train, prune, retrain all in a loop
  20. Art Vandelay here asks a very important question.