A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning Pipelines by Jules Damji, Spark Community Evangelist, Databricks
We all know what they say: the bigger the data, the better. But when the data gets really big, how do you use it? This talk will cover three of the most popular deep learning frameworks, TensorFlow, Keras, and Deep Learning Pipelines, and when, where, and how to use them. We'll also discuss their integration with distributed computing engines such as Apache Spark (which can handle massive amounts of data), and help you answer questions such as:
– As a developer, how do I pick the right deep learning framework for me?
– Do I want to develop my own model, or should I employ an existing one?
– How do I strike a trade-off between productivity and control through low-level APIs?
In this session, we will show you how easy it is to build an image classifier with TensorFlow, Keras, and Deep Learning Pipelines in under 30 minutes. After this session, you will walk away with the confidence to evaluate which framework is best for you, and perhaps with a better sense for how to fool an image classifier!
1. A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learning Pipelines
Brooke Wenig
Jules S. Damji
Spark + AI Summit, SF 6/5/2018
2. About Us . . .
Brooke Wenig
• Databricks Machine Learning Instructor
• Data Science Solution Consultant @ Databricks
• Software Engineering @ Splunk & MyFitnessPal
• MS Machine Learning (UCLA)
• Fluent in Chinese
• https://www.linkedin.com/in/brookewenig/
Jules S. Damji
• Apache Spark Developer & Community Advocate @ Databricks
• Program Chair, Spark + AI Summit
• Software engineering @ Sun Microsystems, Netscape, @Home, VeriSign, Scalix, Centrify, LoudCloud/Opsware, ProQuest
• https://www.linkedin.com/in/dmatrix
• @2twitme
3. Agenda for Today’s Talk
• Impact of Big Data
• Why Apache Spark?
• Short Survey of 3 DL Frameworks
• TensorFlow
• Keras
• Deep Learning Pipelines
• Demo
• Q&A
4. What has Big Data Done to Us?
Permeated our lives
Source: MIT
5. Hardest Part of AI isn’t AI, it’s Data
[Figure: an "ML Code" box surrounded by Configuration, Data Collection, Data Verification, Feature Extraction, Machine Resource Management, Analysis Tools, Process Management Tools, Serving Infrastructure, and Monitoring]
Figure 1: Only a small fraction of real-world ML systems is composed of the ML code. The required surrounding infrastructure is vast and complex.
Source: “Hidden Technical Debt in Machine Learning Systems,” Google, NIPS 2015
9. What’s TensorFlow?
• Open source from Google, 2015
• Current v1.8 API
• Fast: Backend C/C++
• Data flow graphs
• Nodes are functions/operators
• Edges are input or data (tensors)
• Lazy execution
• Eager execution (1.7)
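The graph bullets above (nodes as operators, edges carrying values, lazy execution) can be illustrated with a toy sketch in plain Python. This is not TensorFlow's API: the `Node` class and the `constant`, `placeholder`, `add`, and `mul` helpers are hypothetical names invented for this illustration, and plain floats stand in for tensors.

```python
# Toy data flow graph in plain Python (illustrative only, not TensorFlow's API).
import operator


class Node:
    """A graph node: an operator plus the upstream nodes that feed it."""

    def __init__(self, op, inputs=(), name=""):
        self.op = op          # function applied when the node is evaluated
        self.inputs = inputs  # edges: upstream nodes whose values flow in
        self.name = name

    def run(self, feed=None, log=None):
        """Evaluate this node by first evaluating everything upstream."""
        feed = feed or {}
        if self in feed:                  # value supplied by the caller
            return feed[self]
        args = [n.run(feed, log) for n in self.inputs]
        if log is not None:
            log.append(self.name)         # record that this operator executed
        return self.op(*args)


def constant(value, name="const"):
    return Node(lambda: value, name=name)


def placeholder(name="placeholder"):
    def missing():
        raise ValueError("no value fed for " + name)
    return Node(missing, name=name)


def add(a, b):
    return Node(operator.add, (a, b), "add")


def mul(a, b):
    return Node(operator.mul, (a, b), "mul")


# Building the graph computes nothing yet: this is the lazy part.
x = placeholder("x")
w = constant(3.0, "w")
b = constant(1.0, "b")
y = add(mul(w, x), b)                     # describes y = w * x + b

# Only run() walks the graph and executes the operators.
evaluated = []
result = y.run(feed={x: 2.0}, log=evaluated)
print(result, evaluated)
```

Eager execution, by contrast, would compute `w * x + b` immediately at the line where it is written, with no separate run step.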
13. TensorFlow: We Get it … So What?
• Steep learning curve, but powerful!!
• Low-level APIs, but offers control!!
• Expert in Machine Learning, just learn!!
• Yet, high-level Estimators help, you bet!!
• Better, Keras integration helps, indeed!!
14. What’s Keras?
• Open source Python Library APIs for Deep Learning
• Current v2.1.6 APIs, by François Chollet (Google)
• API spec: TensorFlow, CNTK and Theano
• Easy to Use High-Level Declarative APIs!
• Build layers
– Great for Neural Network Applications
• Fast Experimentation, Modular & Extensible!
15. Keras Programming Stack
[Figure: layered stack, listed in extraction order: hardware targets (CPU, GPU, Android, iOS, …, TPU); canned estimators and specific model implementations; Keras API Specification; backends (TF-Keras, Theano-Keras, CNTK); TensorFlow workflow]
16. Why Keras?
• Focuses on Developer Experience
• Popular & Broader Community
• Supports multiple backends
• Modularity
• Sequential Layers
• Multi-layer input networks
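# Imports assume the standalone Keras 2.x package
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential()
model.add(Dense(32, input_dim=784))         # fully connected layer over 784 inputs
model.add(Activation('relu'))
model.add(Dense(32, activation='softmax'))  # softmax output over 32 units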
...
18. What’s Transfer Learning?
• Training from scratch requires
• Enormous amounts of data
• A lot of compute resources & time
IDEA: Intermediate representations learned for one task may be useful for other related tasks
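The idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the demo from the talk: `BASE_WEIGHTS` is a hypothetical stand-in for layers already learned on an earlier task, and `base_features`, `train_head`, and `predict` are names invented here. The base stays frozen; only a small new logistic-regression head is trained on the new task's few examples.

```python
import math

# Hypothetical "pretrained" base: fixed weights standing in for layers
# learned on an earlier task. They are frozen (never updated) below.
BASE_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def base_features(x):
    """Frozen intermediate representation: ReLU(W x)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in BASE_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=100, lr=0.5):
    """Fit only a new logistic-regression head on the frozen features."""
    head, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = base_features(x)
            p = sigmoid(sum(h * fi for h, fi in zip(head, f)) + bias)
            err = p - y                       # gradient of the log loss
            head = [h - lr * err * fi for h, fi in zip(head, f)]
            bias -= lr * err
    return head, bias


def predict(x, head, bias):
    f = base_features(x)
    z = sum(h * fi for h, fi in zip(head, f)) + bias
    return 1 if z >= 0.0 else 0


# Tiny new task: label is 1 when the first input exceeds the second.
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]

head, bias = train_head(samples, labels)
print([predict(x, head, bias) for x in samples])
```

Four examples are enough here only because the frozen features already separate the two classes, which is the point of reusing them.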
21. When to use Transfer Learning?
• Dataset is small & similar
• Dataset is large & similar
• Dataset is small but different
• Dataset is large and different
Source: Andrej Karpathy’s Transfer Learning
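The slide lists the four cases without the matching strategies. The sketch below pairs each case with the rule of thumb usually given in the cited transfer learning notes; the function name `transfer_strategy` and the wording of the returned strings are assumptions made here for illustration, so check the source before relying on them.

```python
def transfer_strategy(dataset_is_large, similar_to_original):
    """Return a rule-of-thumb strategy for one of the four cases on the slide."""
    if similar_to_original:
        if dataset_is_large:
            # Enough data to update all weights without overfitting.
            return "fine-tune the whole pretrained network"
        # Too little data to fine-tune safely; reuse the top-level features.
        return "freeze the network and train a linear classifier on its final features"
    if dataset_is_large:
        return "train from scratch, or initialize from pretrained weights and fine-tune"
    # Later layers are too specific to the original task; use earlier ones.
    return "train a linear classifier on activations from an earlier layer"


for large in (False, True):
    for similar in (True, False):
        size = "large" if large else "small"
        kind = "similar" if similar else "different"
        print(size, "&", kind, "->", transfer_strategy(large, similar))
```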
22. What & Why Deep Learning Pipelines (DLP)?
• Open source from Databricks, 2017
• Current v1.0 APIs w/ Apache Spark 2.3
• Primarily in Python
• Ease of Use & Integration
• Spark MLlib Pipelines & DataFrames
• TensorFlow & Keras
• SQL
– Deploying & Evaluating
• Distributed Hyperparameter Tuning
• Easy for Transfer Learning
25. Takeaways: When to Use TF, Keras or DLP
TensorFlow
• Low-level APIs & Control
• Visualize with TensorBoard
• Train Models or Transfer Learning
• Model Serving
Keras
• High-level APIs
• TensorFlow Backend
• Love Python
• Train models or transfer learning
Deep Learning Pipelines
• Integration with Spark MLlib Pipelines & DataFrames
• Integrated with TF & Keras
• Transfer Learning
26. Resources
Blog posts, talks, & webinars (http://databricks.com/blog)
• Deep Learning Pipelines
• GPU acceleration in Databricks
• Deep Learning and Apache Spark
• Build Scalable Deep Learning Pipelines
• Deep Learning course: fast.ai
• TensorFlow Tutorials
• TensorFlow Dev Summit
• Keras/TensorFlow Tutorials
• MLFlow.org
Docs for Deep Learning on Databricks (http://docs.databricks.com)
• Deep Learning Pipelines Example
• Apache Spark integration