1. The document discusses the anatomy of a machine learning application, from defining the problem as a machine learning task, through collecting and preparing data and building and evaluating models, to applying predictions.
2. It provides examples of real-world machine learning applications like predicting the success of startups and reducing turbulence on flights.
3. A key point is that machine learning applications involve more than just algorithms and require properly formulating the problem, engineering features from raw data, and iterative evaluation and improvement of models.
4. BigML, Inc #MLSEV: Anatomy of an ML Application
Real-world ML Applications
• Should you sign that NDA?
• Upload the NDA to the website
• The service uses Machine Learning to decide if the terms are fair
https://ndalynn.com/
5. BigML, Inc #MLSEV: Anatomy of an ML Application
Real-world ML Applications
• Gathers over 500 features about companies:
• Crunchbase / Tweets / Patents / LinkedIn / etc.
• Creates a label for success/failure:
• IPO or acquisition = success
• Bankruptcy or irrelevance = failure
• Uses Machine Learning to build a model that predicts the success or failure of startups
• And puts all of the information together into an investor dashboard
https://preseries.com
6. BigML, Inc #MLSEV: Anatomy of an ML Application
ML Adoption
“The gap for most companies isn’t that machine learning doesn’t work, but that they struggle to actually use it”
• Why?
• Too much focus on algorithms
• Not enough focus on applying Machine Learning
7. BigML, Inc #MLSEV: Anatomy of an ML Application
Real-world ML Applications
https://thepointsguy.com/news/this-is-the-reason-you-arent-feeling-as-much-turbulence-on-delta-flights/
…collecting and analyzing “hundreds of thousands of data points,” with a plan to boost that to “millions,” creating a model that forecasts turbulence with a level of confidence heretofore unseen.
Not Important: the algorithm!
8. BigML, Inc #MLSEV: Anatomy of an ML Application
Machine Learning Evolution
[Figure: Machine Learning evolution chart.
Stages: Genesis, Custom built, Product, Service, Utility, Commodity.
Users: Academics & Researchers, Scientists, Developers, Analysts, Everyone.
Timeline: 1950s, 1980 (1st Workshop on Machine Learning), 2000s, 2011, 2020, 2030.
Tools: Weka, Scikit; BigML, Azure ML, Amazon ML, Google Cloud ML.
Axes: Certainty (Unknown to Defined), Ubiquity (Novel to Common).]
• Machine Learning algorithms are fun to talk about: GPUs, NNs, etc
• But the algorithms are largely a commodity already
• Difficulty is knowing how to apply ML
9. BigML, Inc #MLSEV: Anatomy of an ML Application
What is an ML Application
AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
AS       ANC     SEA          -11              1448      -22
AA       LAX     PBI          -8               2330      -9
US       SFO     CLT          -2               2296      5
AA       LAX     MIA          -5               2342      -9
AS       SEA     ANC          -1               1448      -21
DL       SFO     MSP          -5               1589      8
NK       LAS     MSP          -6               1299      -17
US       LAX     CLT          14               2125      -10
AA       SFO     DFW          -11              1464      -13
DL       LAS     ATL          3                1747      -15
Consider: ML Definition
Finding patterns in data that can be used to make inferences…
Predictive Models
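As a rough illustration of "finding patterns in data that can be used to make inferences", the sketch below fits a straight line to the ten rows of the flight table and uses it to predict arrival delay from departure delay. Choosing these two columns, and a linear model, are assumptions made for the example; the slides do not say which model was built, and ten rows are far too few for a real application.

```python
# (departure delay, arrival delay) pairs copied from the flight table.
flights = [(-11, -22), (-8, -9), (-2, 5), (-5, -9), (-1, -21),
           (-5, 8), (-6, -17), (14, -10), (-11, -13), (3, -15)]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# "Training": find the pattern in the historical rows.
slope, intercept = fit_line(flights)


def predict_arrival_delay(departure_delay):
    """"Inference": apply the learned pattern to a new flight."""
    return slope * departure_delay + intercept


print(predict_arrival_delay(10))
```

The questions on the next slide (where the data comes from, whether it is formatted correctly, whether the model will work) are exactly what this toy example skips.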
10. BigML, Inc #MLSEV: Anatomy of an ML Application
What is an ML Application
Predictive Models
• Where does this data come from?
• How do you know what data?
• Is the data formatted correctly?
• What do you do with these models?
• How do you combine them?
• Will it work?
11. BigML, Inc #MLSEV: Anatomy of an ML Application
Reality of a ML Application
[Diagram labels: Data Collection; Data Transformations; Feature Engineering; Evaluation & Retraining; Seen / Unseen data; Predictive App]
12. BigML, Inc #MLSEV: Anatomy of an ML Application
Where to Start?
Step 1: “Let’s predict customer churn!”
Step 2: ???
Finish: “Here are the customers we predict will leave our service”
13. BigML, Inc #MLSEV: Anatomy of an ML Application
Where to Start?
Step 1: “Let’s detect fraud!”
Step 2: ???
Finish: “Here are the transactions we should stop immediately.”
14. BigML, Inc #MLSEV: Anatomy of an ML Application
ML Application Guide
State the problem as an ML Task
• Remember: ML finds patterns in data enabling predictions about future events
• This means you need data
• What data depends on what you want to predict
• And the data you have or can collect
• Data needs to have patterns related to what you want to predict
• Not magic: still can’t predict random events, lotteries, etc
• Your problem statement needs to be specific
• Not “Let’s predict churn”
• But “Let’s predict churn by looking at the profile data of all previous customers of our service who have/have not churned”
• This can be tricky…
15. BigML, Inc #MLSEV: Anatomy of an ML Application
Where to Start?
Step 1: “Let’s predict the Oscars!”
Step 2: ???
Finish: “Here are the predicted winners”
• Statement is not specific enough!!!
• What data can we collect that predicts Oscar wins?
16. BigML, Inc #MLSEV: Anatomy of an ML Application
Predicting the Oscars
BigML Scoresheet
2018
• 6 out of 6 right!
• 8 out of 8 actually, but the probability of the predictions was “too low”
• Adapted Screenplay
• Original Screenplay
2019
• 4 out of 8 major awards correctly predicted
• Probabilities were lower this year
• This is still significantly better than guessing
How is this possible? Isn't the winner random?
17. BigML, Inc #MLSEV: Anatomy of an ML Application
How an Oscar is Won
[Diagram: 7,000+ members; voting intention?]
Insight: winning awards is not a random event!
18. BigML, Inc #MLSEV: Anatomy of an ML Application
Let’s Predict Best Picture
London Critics    Win
Writers Guild     Lose
Directors Guild   Win
Golden Globe      Win
Bafta             Win
Oscar             ?Win?
• These events are *not* independent
• Similar, but not identical, factors contribute to each win…
• We can expect a higher probability for Shape of Water to win
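One way to see why dependent events help is to compare the overall Oscar win rate with the win rate among films that already won other awards. The sketch below does this on a small made-up history; the rows, the choice of two precursor awards, and the helper name `win_rate` are all hypothetical and are not taken from the Oscars dataset.

```python
# Hypothetical history, invented for illustration only.
# Each row: (won_directors_guild, won_golden_globe, won_oscar)
history = [
    (True, True, True),
    (True, False, True),
    (True, True, True),
    (False, True, False),
    (False, False, False),
    (True, False, False),
    (False, False, False),
    (False, True, True),
]


def win_rate(rows, condition=lambda row: True):
    """Share of Oscar wins among the rows that satisfy `condition`."""
    matching = [row for row in rows if condition(row)]
    if not matching:
        return None
    return sum(row[2] for row in matching) / len(matching)


base_rate = win_rate(history)                            # ignore other awards
given_dga = win_rate(history, lambda r: r[0])            # won Directors Guild
given_both = win_rate(history, lambda r: r[0] and r[1])  # won both precursors

print(base_rate, given_dga, given_both)
```

If the awards were independent, the three rates would be roughly equal. When they differ, as they do in this toy data, earlier awards carry information a model can use.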
20. BigML, Inc #MLSEV: Anatomy of an ML Application
Oscars Dataset
DATASET is publicly available:
https://bigml.com/user/academy_awards/gallery/dataset/5a94302592fb565ed400103b
21. BigML, Inc #MLSEV: Anatomy of an ML Application
Oscars Example
• When specifying the problem, be as specific as possible
• Not: “Let’s predict the Oscars”
• Instead: “Let’s Predict the Oscars by correlating a series
of award wins with the final Oscar win.”
• The statement of the problem will guide the data required
• Be aware of the cost of collecting the data versus the ROI:
Tidbits and Lessons Learned….
22. BigML, Inc #MLSEV: Anatomy of an ML Application
Ranking ML Applications
Thinking about an ML Application?
[Quadrant chart. Axes: FEASIBILITY (increasing data availability / decreasing complexity) and ROI (impact and cost), each running from - to +. Quadrants: NO-BRAINERS (START HERE), BRAINERS, POSTPONABLE, NO-GO.]
23. BigML, Inc #MLSEV: Anatomy of an ML Application
Oscars Example
• When specifying the problem, be as specific as possible
• Not: “Let’s predict the Oscars”
• Instead: “Let’s Predict the Oscars by correlating a series
of award wins with the final Oscar win.”
• The statement of the problem will guide the data required
• Be aware of the cost of collecting the data versus the ROI:
• IMDB data is readily available
• We’re done, right?
• Nope. You can’t escape Feature Engineering
• Items: BAFTA_won_categories = list of nominations
• Aggregations: Nomination and Award counts
• You can’t escape Feature Selection
• Full user reviews costly to collect and not useful
Tidbits and Lessons Learned….
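The two feature engineering bullets (an items field of won categories, and nomination and award counts) can be sketched as a small aggregation over raw nomination records. The record layout, the film names, and the function name `engineer_features` are assumptions for illustration; the actual Oscars dataset may be structured differently.

```python
# Hypothetical raw records: (film, award_body, category, won).
raw = [
    ("Film A", "BAFTA", "Best Film", True),
    ("Film A", "BAFTA", "Best Director", False),
    ("Film A", "Golden Globe", "Best Drama", True),
    ("Film B", "BAFTA", "Best Film", False),
]


def engineer_features(records):
    """Turn one-row-per-nomination records into one feature row per film."""
    features = {}
    for film, body, category, won in records:
        row = features.setdefault(film, {"nominations": 0, "awards": 0})
        row["nominations"] += 1      # aggregation: nomination count
        row["awards"] += int(won)    # aggregation: award count
        if won:
            # items field, e.g. BAFTA_won_categories
            row.setdefault(f"{body}_won_categories", []).append(category)
    return features


for film, row in engineer_features(raw).items():
    print(film, row)
```

A model cannot use the raw records directly because each film spans several rows; the aggregation produces the one-row-per-film shape that training expects.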
Wait: How were you confident in the predictions?
24. BigML, Inc #MLSEV: Anatomy of an ML Application
Evaluating the Model
[Diagram: Original Dataset (2000-2016, 119 variables) split into Train Dataset (2000-2012, 119 variables) and Test Dataset (2013-2016, 119 variables)]
• Ultimately, we want to use all the history to predict the winner
for the current year
• In order to evaluate success, we use a model built from
2000-2012 data to predict the winners for 2013-2016
• Built a separate Deepnet for each award category
• Evaluation obtained a ROC AUC over 0.98 across all award
categories
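The evaluation described here has two parts: a split by year rather than at random, and a ROC AUC score on the held-out years. The sketch below shows both in plain Python, assuming each row is a dict with a `year` key (a hypothetical layout). The AUC is computed from its pairwise definition, which is fine for small data but is not how a platform would compute it at scale.

```python
def temporal_split(rows, last_train_year):
    """Train on years up to `last_train_year`, test on the later years."""
    train = [r for r in rows if r["year"] <= last_train_year]
    test = [r for r in rows if r["year"] > last_train_year]
    return train, test


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Placeholder rows, one per year, standing in for the real dataset.
rows = [{"year": year} for year in range(2000, 2017)]
train, test = temporal_split(rows, last_train_year=2012)
print(len(train), len(test))
```

Splitting by time mirrors how the model is used: it only ever sees past ceremonies when predicting the current one, so a random split would overstate its performance.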
Great: The model seems OK, what next?
25. BigML, Inc #MLSEV: Anatomy of an ML Application
Effort of a ML Application
[Diagram: Task vs. Effort.
Tasks: State the problem as an ML task; Data wrangling; Feature engineering; Modeling and Evaluations; Predictions; Measure Results.
Effort labels: Data transformations ~80% effort; ~5% effort; ~5% effort; ~10% effort.
Callouts: “This is only such low effort because of platforms like [logo]”; “This is an area where [logo] is currently innovating”.]
26. BigML, Inc #MLSEV: Anatomy of an ML Application
Reality Check
Three Important Concepts in Applying ML…
• All Machine Learned models are wrong
• Real-world Machine Learning is iterative
• End-to-end Machine Learning is compositional
27. BigML, Inc #MLSEV: Anatomy of an ML Application
End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
28. BigML, Inc #MLSEV: Anatomy of an ML Application
Basic Workflow
SOURCE → DATASET → MODEL → PREDICTION
29. BigML, Inc #MLSEV: Anatomy of an ML Application
Feature Engineering
[Workflow diagram, two branches. Training branch: SOLD HOMES, FILTER, NEW FEATURES, DATASET, MODEL. Prediction branch: FOR SALE HOMES, FILTER, NEW FEATURES, BATCH PREDICTION, DEALS DATASET.]
30. BigML, Inc #MLSEV: Anatomy of an ML Application
End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
31. BigML, Inc #MLSEV: Anatomy of an ML Application
Anomaly Filter and Evaluate
[Workflow diagram: DIABETES SOURCE → DIABETES DATASET → TRAIN SET and TEST SET; an ANOMALY DETECTOR and FILTER produce a CLEAN DATASET; models built on ALL data and on CLEAN data are each evaluated (ALL EVALUATION, CLEAN EVALUATION); COMPARE EVALUATIONS]
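The filtering step of this workflow can be sketched as follows. A plain z-score stands in for the anomaly detector, which is an assumption made to keep the example self-contained; the slide does not say which detector is used. The field name `glucose` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def anomaly_scores(values):
    """Stand-in detector: absolute z-score of each value."""
    mu, sigma = mean(values), pstdev(values)
    return [abs(v - mu) / sigma if sigma else 0.0 for v in values]


def filter_anomalies(rows, field, threshold=2.0):
    """Keep rows whose anomaly score on `field` is at or below `threshold`."""
    scores = anomaly_scores([row[field] for row in rows])
    return [row for row, score in zip(rows, scores) if score <= threshold]


# Hypothetical training rows; the 900 reading is an obvious outlier.
train_set = [{"glucose": g} for g in (90, 95, 100, 105, 110, 98, 102, 900)]
clean_set = filter_anomalies(train_set, "glucose")
print(len(train_set), len(clean_set))
```

To complete the workflow, one model would be trained on `train_set` and another on `clean_set`, both evaluated on the same untouched test set, and the two evaluations compared to check that the filtering actually helped.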
32. BigML, Inc #MLSEV: Anatomy of an ML Application
Fixing Missing Values
Fix Missing Values in a “Meaningful” Way
[Workflow diagram: Original Dataset → Filter Zeros → Clean Dataset → Model insulin → Predict insulin → Amended Dataset → Select insulin → Fixed Dataset]
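A minimal sketch of this idea follows: rows where insulin is zero are treated as missing, a model is trained on the remaining rows to predict insulin, and its predictions replace the zeros. Using a one-variable linear fit on `glucose` is an assumption for brevity; the slide does not specify the model or its inputs, and the sample rows are invented.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (xs must not all be equal)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def fix_missing_insulin(rows):
    """Replace insulin == 0 with a value predicted from glucose."""
    clean = [r for r in rows if r["insulin"] != 0]               # Filter Zeros
    slope, intercept = fit_line([r["glucose"] for r in clean],
                                [r["insulin"] for r in clean])   # Model insulin
    fixed = []
    for r in rows:
        if r["insulin"] == 0:                                    # Predict insulin
            r = {**r, "insulin": slope * r["glucose"] + intercept}
        fixed.append(r)                                          # Select insulin
    return fixed


# Hypothetical rows; a zero stands for a missing measurement.
original = [
    {"glucose": 100, "insulin": 50},
    {"glucose": 120, "insulin": 70},
    {"glucose": 140, "insulin": 90},
    {"glucose": 130, "insulin": 0},
]
fixed = fix_missing_insulin(original)
print(fixed[3])
```

Compared with filling in a global mean, a predicted value keeps the relationship between the missing field and the other fields, which is one reading of what "meaningful" means on this slide.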
33. BigML, Inc #MLSEV: Anatomy of an ML Application
End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
34. BigML, Inc #MLSEV: Anatomy of an ML Application
Ensemble Tuning
[Workflow diagram: SOURCE → DATASET → TRAINING and TEST; ENSEMBLE N=10, ENSEMBLE N=20, and ENSEMBLE N=1000 are each built on TRAINING and given an EVALUATION on TEST; CHOOSE the best]
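The tuning loop in this diagram can be sketched with a toy ensemble. Here each ensemble is a bag of one-dimensional threshold "stumps" voting by majority, which is an assumption chosen so the example runs without any library; the data is synthetic and the function names are hypothetical.

```python
import random


def train_stump(rows):
    """Pick the threshold on x that best separates the labels in `rows`."""
    best = None
    for threshold, _ in rows:
        correct = sum((x >= threshold) == bool(y) for x, y in rows)
        if best is None or correct > best[0]:
            best = (correct, threshold)
    return best[1]


def train_ensemble(rows, n, seed=0):
    """Bagging: n stumps, each trained on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    return [train_stump(rng.choices(rows, k=len(rows))) for _ in range(n)]


def predict(ensemble, x):
    """Majority vote of the stumps."""
    votes = sum(x >= threshold for threshold in ensemble)
    return int(2 * votes >= len(ensemble))


def accuracy(ensemble, rows):
    return sum(predict(ensemble, x) == y for x, y in rows) / len(rows)


def choose_ensemble_size(training, test, sizes=(10, 20, 1000)):
    """Train one ensemble per candidate size and keep the best evaluation."""
    evaluations = {n: accuracy(train_ensemble(training, n), test) for n in sizes}
    return max(evaluations, key=evaluations.get), evaluations


# Synthetic data: the label is 1 when a noisy copy of x exceeds 5.
rng = random.Random(42)
points = [(x / 4, int(x / 4 + rng.gauss(0, 1) > 5)) for x in range(40)]
rng.shuffle(points)
training, test = points[:30], points[30:]
best_n, evaluations = choose_ensemble_size(training, test)
print(best_n, evaluations)
```

Because the same test split is used to choose N, the winning score is slightly optimistic; a separate validation split is the usual remedy when an unbiased final estimate is needed.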
35. BigML, Inc #MLSEV: Anatomy of an ML Application
End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
• Finding the best features
36. BigML, Inc #MLSEV: Anatomy of an ML Application
Best-first Feature Selection
Round 1: evaluate {F1} {F2} {F3} {F4} … {Fn} → CHOOSE BEST → S = {Fa}
Round 2: evaluate S+{F1} S+{F2} S+{F3} S+{F4} … S+{Fn-1} → CHOOSE BEST → S = {Fa, Fb}
Round 3: evaluate S+{F1} S+{F2} S+{F3} S+{F4} … S+{Fn-1} → CHOOSE BEST → S = {Fa, Fb, Fc}
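The rounds above amount to a greedy forward search, sketched below. The `score` callback is a placeholder: in a real application it would train and evaluate a model on the candidate feature subset, whereas here it just sums made-up usefulness weights. Stopping after `max_features` rounds is an assumption, since the slide does not show a stopping rule.

```python
def best_first_selection(features, score, max_features):
    """Greedily grow the selected set S, adding the best-scoring feature each round."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < max_features:
        # Evaluate S + {F} for every remaining feature F and keep the best.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical usefulness of each feature, for illustration only.
weights = {"F1": 0.1, "F2": 0.5, "F3": 0.3, "F4": 0.2}


def toy_score(subset):
    """Stand-in for "train a model on this subset and evaluate it"."""
    return sum(weights[f] for f in subset)


print(best_first_selection(weights, toy_score, max_features=3))
```

Each round costs one evaluation per remaining feature, so the search needs on the order of n × k model evaluations for n features and k rounds, far fewer than trying every subset.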
37. BigML, Inc #MLSEV: Anatomy of an ML Application
End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
• Finding the best features
• May require models for several domains of knowledge
• Multiple Training / Scoring
38. BigML, Inc #MLSEV: Anatomy of an ML Application
Multiple Domains
[Workflow diagram: TRANSACTIONS are AGGREGATED BY CARD, BY USER, and BY PROFILE; each aggregation feeds an anomaly detector (ANOMALY BY CARD, ANOMALY BY USER, ANOMALY BY PROFILE); a NEW TRANSACTION receives an ANOMALY SCORE from each, and the three scores decide APPROVED?]
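A simplified sketch of this multi-domain scoring follows. The per-domain detector is a z-score of the transaction amount against that group's history, the approval rule is "every score at or below a threshold", and the field names are invented; all three are assumptions, since the slide only shows that three anomaly scores feed the decision.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate(transactions, key):
    """Group historical amounts by one domain key (card, user, or profile)."""
    groups = defaultdict(list)
    for t in transactions:
        groups[t[key]].append(t["amount"])
    return groups


def anomaly_score(amount, history):
    """Stand-in detector: z-score of the amount against the group's history."""
    if len(history) < 2 or pstdev(history) == 0:
        return 0.0
    return abs(amount - mean(history)) / pstdev(history)


def approved(new, transactions, domains=("card", "user", "profile"), threshold=3.0):
    """Score the new transaction in every domain; approve if none is anomalous."""
    scores = {
        d: anomaly_score(new["amount"], aggregate(transactions, d).get(new[d], []))
        for d in domains
    }
    return all(s <= threshold for s in scores.values()), scores


# Hypothetical transaction history.
history = [
    {"card": "c1", "user": "u1", "profile": "student", "amount": a}
    for a in (20, 25, 30, 22, 28)
] + [
    {"card": "c2", "user": "u2", "profile": "student", "amount": a}
    for a in (15, 18, 21)
]

usual = {"card": "c1", "user": "u1", "profile": "student", "amount": 24}
unusual = {"card": "c1", "user": "u1", "profile": "student", "amount": 500}
print(approved(usual, history)[0], approved(unusual, history)[0])
```

Scoring in several domains matters because a transaction can look normal for the card yet unusual for the user's profile, or the reverse.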
39. BigML, Inc #MLSEV: Anatomy of an ML Application
End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
• Finding the best features
• May require models for several domains of knowledge
• Multiple Training / Scoring
• Even after deploying a model
• Workflow to monitor performance, know when to retrain
40. BigML, Inc #MLSEV: Anatomy of an ML Application
Model Retraining
[Workflow diagram labels: TRAINING; INPUT DATA; PREDICTIONS; ANOMALY SCORES; OUTCOMES; RETRAIN DATA]
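One simple form of such a monitoring workflow compares logged predictions with the outcomes observed later and triggers retraining when accuracy falls below a floor. The threshold value and function names below are hypothetical, and the sketch ignores the anomaly-score path shown in the diagram.

```python
def needs_retraining(predictions, outcomes, min_accuracy=0.8):
    """True when accuracy on recently observed outcomes drops below the floor."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes) < min_accuracy


def build_retrain_data(training_data, inputs, outcomes):
    """Original training rows plus the newly labeled (input, outcome) pairs."""
    return training_data + list(zip(inputs, outcomes))


# Hypothetical log: what the model predicted versus what actually happened.
predictions = [1, 0, 1, 1]
outcomes = [1, 0, 0, 0]
inputs = ["a", "b", "c", "d"]
training_data = [("x", 1), ("y", 0)]

if needs_retraining(predictions, outcomes):
    retrain_data = build_retrain_data(training_data, inputs, outcomes)
    print(len(retrain_data))
```

Running this check on a schedule turns "know when to retrain" into an automated step rather than a judgment call.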
41. BigML, Inc #MLSEV: Anatomy of an ML Application
Reality Check
Three Important Concepts in Applying ML…
• All Machine Learned models are wrong
• Real-world Machine Learning is iterative
• End-to-end Machine Learning is compositional
42. BigML, Inc #MLSEV: Anatomy of an ML Application
Tenets of Machine Learning
• All Machine Learned models are wrong, but some are useful
• Real-world Machine Learning is iterative
• End-to-end Machine Learning is compositional
• Better features always beat better algorithms
• Good algorithms already exist and are good enough
• Tools like OptiML exist which can help optimize performance
• The data is never good enough
• Automation is better than hand tuning - you need an API!
• When data changes quickly, training speed is more important than accuracy
• Repeatability is superior to a single strong result
• Problems are solved with workflows of algorithms
• An ML solution is not real until it is in production
• ML is here: Now we need 100,000x people applying ML