1st edition
March 7-8, 2019
BigML, Inc
Anatomy of an ML Application
Machine Learning End-to-End
Poul Petersen
CIO, BigML
BigML, Inc #MLSEV
Examples of ML Applications
BigML, Inc #MLSEV: Anatomy of an ML Application
Real-world ML Applications
• Should you sign that NDA?
• Upload the NDA to the website
• The service uses Machine Learning to decide if the terms are fair
https://ndalynn.com/

Real-world ML Applications
• Gathers over 500 features about companies:
• Crunchbase / Tweets / Patents / LinkedIn / etc.
• Creates a label for success/failure:
• IPO or acquisition = success
• Bankruptcy or irrelevance = failure
• Uses Machine Learning to build a model that predicts the success
or failure of startups
• And puts all of the information together into an investor dashboard
https://preseries.com

ML Adoption
“The gap for most companies isn’t that machine learning doesn’t work, but that they struggle to actually use it”
• Why?
• Too much focus on algorithms
• Not enough focus on applying Machine Learning

Real-world ML Applications
https://thepointsguy.com/news/this-is-the-reason-you-arent-feeling-as-much-turbulence-on-delta-flights/
…collecting and analyzing “hundreds of thousands of data points,” with a plan to boost that to “millions,” creating a model that forecasts turbulence with a level of confidence heretofore unseen.
Not Important: the algorithm!

Machine Learning Evolution
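[Chart: evolution of Machine Learning through the stages Genesis, Custom built, Product, Service, and Utility / Commodity. Axes: Certainty (Unknown to Defined) and Ubiquity (Novel to Common). Audience labels: Academics & Researchers, Scientists, Developers, Analysts, Everyone. Timeline labels: 1950s, 1980 (1st Workshop on Machine Learning), 2000s, 2011, 2020, 2030. Tools named: Weka, Scikit; BigML, Azure ML, Amazon ML, Google Cloud ML.]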
• Machine Learning algorithms are fun to talk about: GPUs, NNs, etc
• But the algorithms are largely a commodity already
• Difficulty is knowing how to apply ML

What is an ML Application
AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
AS       ANC     SEA          -11              1448      -22
AA       LAX     PBI          -8               2330      -9
US       SFO     CLT          -2               2296      5
AA       LAX     MIA          -5               2342      -9
AS       SEA     ANC          -1               1448      -21
DL       SFO     MSP          -5               1589      8
NK       LAS     MSP          -6               1299      -17
US       LAX     CLT          14               2125      -10
AA       SFO     DFW          -11              1464      -13
DL       LAS     ATL          3                1747      -15
Finding patterns in data that can be used to
make inferences…
Predictive Models
Consider: ML Definition

What is an ML Application
Predictive Models
• Where does this data come from?
• How do you know what data?
• Is the data formatted correctly?
• What do you do with these models?
• How do you combine them?
• Will it work?

Reality of an ML Application
Data Transformations
Feature Engineering
Data Collection
Evaluation & Retraining
Seen
Unseen
Predictive App

Where to Start?
Step 1: “Let’s predict customer churn!”
Step 2 - - - - - - - - ???
Finish: “Here are the customers we predict will leave our service”

Where to Start?
Step 1: “Let’s detect fraud!”
Step 2 - - - - - - - - ???
Finish: “Here are the transactions we should stop immediately.”

ML Application Guide
• Remember: ML finds patterns in data enabling predictions about
future events
• This means you need data
• What data depends on what you want to predict
• And the data you have or can collect
• Data needs to have patterns related to what you want to predict
• Not magic: still can’t predict random events, lotteries, etc
• Your problem statement needs to be specific
• Not “Let’s predict churn”
• But “Let’s predict churn by looking at the profile data of all
previous customers of our service who have/have not
churned”
• This can be tricky…
State the problem as an ML Task

Where to Start?
Step 1: “Let’s predict the Oscars!”
Step 2 - - - - - - - - ???
Finish: “Here are the predicted winners”
• Statement is not specific enough!!!
• What data can we collect that predicts Oscar wins?

Predicting the Oscars
BigML Scoresheet
2018
• 6 out of 6 right!
• 8 out of 8 actually, but the probability of the predictions was “too low” for:
• Adapted Screenplay
• Original Screenplay
2019
• 4 out of 8 major awards correctly predicted
• Probabilities were lower this year
• This is still significantly better than guessing
How is this possible? Isn't the winner random?

How an Oscar is Won
voting intention?
7,000+ members
Insight: winning awards is not a random event!

Let’s Predict Best Picture
Win
London
Critics
Lose
Writers
Guild
Win
Directors
Guild
Win
Golden
Globe
Win
Bafta
• These events are *not* independent
• Similar, but not identical, factors contribute to
each win…
• We can expect a higher probability for Shape of
Water to win
Oscar
?Win?

The Features
MOVIES AWARDS
OBJECTIVE
FIELDS
• year
• movie
• movie_id
• certificate
• duration
• genre
• rate
• metascore
• synopsis
• votes
• gross
• release_date
• user_reviews
• critic_reviews
• popularity
• awards_wins
• awards_nominations
• release_date.year
• release_date.month
• release_date.day-of-month
• release_date.day-of-week
• Oscar_Best_Picture_nominated
• Oscar_Best_Director_nominated
• Oscar_Best_Actor_nominated
• Oscar_Best_Actress_nominated
• Oscar_Best_Supporting_Actor_nominated
• Oscar_Best_Supporting_Actress_nominated
• Oscar_Best_AdaScreen_nominated
• Oscar_Best_OriScreen_nominated
• Oscar_nominated
• Oscar_nominated_categories
• Golden_Globes_won
• Golden_Globes_won_categories
• Golden_Globes_nominated
• Golden_Globes_nominated_categories
• BAFTA_won
• BAFTA_won_categories
• BAFTA_nominated
• BAFTA_nominated_categories
• Screen_Actors_Guild_won
• Screen_Actors_Guild_won_categories
• Screen_Actors_Guild_nominated
• Screen_Actors_Guild_nominated_categories
• Critics_Choice_won
• Critics_Choice_won_categories
• Critics_Choice_nominated
• Critics_Choice_nominated_categories
• Directors_Guild_won
• Directors_Guild_won_categories
• Directors_Guild_nominated
• Directors_Guild_nominated_categories
• Producers_Guild_won
• Producers_Guild_won_categories
• Producers_Guild_nominated
• Producers_Guild_nominated_categories
• Art_Directors_Guild_won
• Art_Directors_Guild_won_categories
• Art_Directors_Guild_nominated
• Art_Directors_Guild_nominated_categories
• Writers_Guild_won
• Writers_Guild_won_categories
• Writers_Guild_nominated
• Writers_Guild_nominated_categories
• Costume_Designers_Guild_won
• Costume_Designers_Guild_won_categories
• Costume_Designers_Guild_nominated
• Costume_Designers_Guild_nominated_categories
• Online_Film_Television_Association_won
• Online_Film_Television_Association_won_categories
• Online_Film_Television_Association_nominated
• Online_Film_Television_Association_nominated_categories
• Online_Film_Critics_Society_won
• Online_Film_Critics_Society_won_categories
• Online_Film_Critics_Society_nominated
• Online_Film_Critics_Society_nominated_categories
• People_Choice_won
• People_Choice_won_categories
• People_Choice_nominated
• People_Choice_nominated_categories
• London_Critics_Circle_Film_won
• London_Critics_Circle_Film_won_categories
• London_Critics_Circle_Film_nominated
• London_Critics_Circle_Film_nominated_categories
• American_Cinema_Editors_won
• American_Cinema_Editors_won_categories
• American_Cinema_Editors_nominated
• American_Cinema_Editors_nominated_categories
• Hollywood_Film_won
• Hollywood_Film_won_categories
• Hollywood_Film_nominated
• Hollywood_Film_nominated_categories
• Austin_Film_Critics_Association_won
• Austin_Film_Critics_Association_won_categories
• Austin_Film_Critics_Association_nominated
• Austin_Film_Critics_Association_nominated_categories
• Denver_Film_Critics_Society_won
• Denver_Film_Critics_Society_won_categories
• Denver_Film_Critics_Society_nominated
• Denver_Film_Critics_Society_nominated_categories
• Boston_Society_of_Film_Critics_won
• Boston_Society_of_Film_Critics_won_categories
• Boston_Society_of_Film_Critics_nominated
• Boston_Society_of_Film_Critics_nominated_categories
• New_York_Film_Critics_Circle_won
• New_York_Film_Critics_Circle_won_categories
• New_York_Film_Critics_Circle_nominated
• New_York_Film_Critics_Circle_nominated_categories
• Los_Angeles_Film_Critics_Association_won
• Los_Angeles_Film_Critics_Association_won_categories
• Los_Angeles_Film_Critics_Association_nominated
• Los_Angeles_Film_Critics_Association_nominated_categories
• Oscar_Best_Picture_won
• Oscar_Best_Director_won
• Oscar_Best_Actor_won
• Oscar_Best_Actress_won
• Oscar_Best_Supporting_Actor_won
• Oscar_Best_Supporting_Actress_won
Data pulled from IMDB…
Engineered Features:
Award items field

Nomination Counts

Awards Counts
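The nomination and award counts listed above could be derived roughly as follows. This is a minimal sketch, not the method used to build the published dataset: it assumes each movie is a plain dict whose `*_won` and `*_nominated` fields hold truthy/falsy flags, and the output names `precursor_wins` and `precursor_nominations` are hypothetical. Fields starting with `Oscar_` are skipped so that the objective fields do not leak into the engineered features.

```python
def add_precursor_counts(movie):
    """Return a copy of `movie` with two aggregate count features added.

    Assumption: award fields are flags named like `BAFTA_won` or
    `BAFTA_nominated`. Oscar fields are ignored to avoid target leakage.
    """
    wins = 0
    nominations = 0
    for field, value in movie.items():
        if field.startswith("Oscar_"):
            continue  # objective fields must not feed the features
        if field.endswith("_won") and value:
            wins += 1
        elif field.endswith("_nominated") and value:
            nominations += 1
    enriched = dict(movie)
    enriched["precursor_wins"] = wins                # hypothetical field name
    enriched["precursor_nominations"] = nominations  # hypothetical field name
    return enriched


# Illustrative record only; the values are made up for the example.
example = {
    "movie": "Hypothetical Film",
    "BAFTA_won": True,
    "BAFTA_nominated": True,
    "Golden_Globes_won": False,
    "Golden_Globes_nominated": True,
    "Oscar_Best_Picture_won": True,
}
enriched = add_precursor_counts(example)
print(enriched["precursor_wins"], enriched["precursor_nominations"])
```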

Oscars Dataset
DATASET is publicly available:
https://bigml.com/user/academy_awards/gallery/dataset/5a94302592fb565ed400103b

Oscars Example
• When specifying the problem, be as specific as possible
• Not: “Let’s predict the Oscars”
• Instead: “Let’s Predict the Oscars by correlating a series
of award wins with the final Oscar win.”
• The statement of the problem will guide the data required
• Be aware of the cost of collecting the data versus the ROI:
Tidbits and Lessons Learned….

Ranking ML Applications
[Quadrant chart: FEASIBILITY (increasing data availability / decreasing complexity) versus ROI (impact and cost), each axis running from - to +. Quadrant labels: NO-BRAINERS (START HERE), BRAINERS, POSTPONABLE, NO-GO.]
Thinking about an ML Application?

Oscars Example
• When specifying the problem, be as specific as possible
• Not: “Let’s predict the Oscars”
• Instead: “Let’s Predict the Oscars by correlating a series
of award wins with the final Oscar win.”
• The statement of the problem will guide the data required
• Be aware of the cost of collecting the data versus the ROI:
• IMDB data is readily available
• We’re done right?
• Nope. You can’t escape Feature Engineering
• Items: BAFTA_won_categories = list of nominations
• Aggregations: Nomination and Award counts
• You can’t escape Feature Selection
• Full user reviews costly to collect and not useful
Tidbits and Lessons Learned….
Wait: How were you confident in the predictions?

Evaluating the Model
Original Dataset: 2000-2016, 119 variables
Train Dataset: 2000-2012, 119 variables
Test Dataset: 2013-2016, 119 variables
• Ultimately, we want to use all the history to predict the winner
for the current year
• In order to evaluate success, we use a model built from
2000-2012 data to predict the winners for 2013-2016
• Built a separate Deepnet for each award category
• Evaluation obtained a ROC AUC over 0.98 across all award
categories
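The evaluation described above (train on 2000-2012, test on 2013-2016, then measure ROC AUC) can be sketched in plain Python. This is an illustrative sketch only: the `year` key and the helper names are assumptions, and the AUC here is computed by direct pairwise comparison rather than by whatever the platform does internally.

```python
def split_by_year(rows, last_train_year):
    """Temporal split: rows up to `last_train_year` train, later rows test."""
    train = [r for r in rows if r["year"] <= last_train_year]
    test = [r for r in rows if r["year"] > last_train_year]
    return train, test


def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Quadratic in the number of examples, which is
    fine for a small evaluation set.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# One placeholder row per year, just to show the split sizes.
rows = [{"year": y} for y in range(2000, 2017)]
train, test = split_by_year(rows, 2012)
print(len(train), len(test))  # 13 training years, 4 test years

# Toy labels and scores, made up for illustration.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))  # 0.75
```

Splitting by year rather than at random keeps the test set strictly in the "future" of the training set, which matches how the model will be used for the current year.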
Great: The model seems OK, what next?

Effort of an ML Application
State the problem as an ML task
Data wrangling
Feature engineering
Modeling and Evaluations
Predictions
Measure Results
Data transformations ~80% effort
~5% effort
~5% effort
This is only such low
effort because of
platforms like
This is an area where
is currently
innovating
Task
~10% effort
Effort

Reality Check
• All Machine Learned models are wrong
• Real-world Machine Learning is iterative
• End-to-end Machine Learning is compositional
Three Important Concepts in Applying ML…

End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done

Basic Workflow
SOURCE → DATASET → MODEL → PREDICTION

Feature Engineering
[Workflow diagram. Nodes: SOLD HOMES, FILTER, NEW FEATURES, DATASET, MODEL; FORSALE HOMES, FILTER, NEW FEATURES, DATASET, BATCH PREDICTION, DEALS.]

End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data

Anomaly Filter and Evaluate
DIABETES
SOURCE
DIABETES
DATASET
TRAIN SET
TEST SET
ALL
MODEL
CLEAN
DATASET
FILTER
ALL
MODEL
ALL
EVALUATION
CLEAN
EVALUATION
COMPARE
EVALUATIONS
ANOMALY
DETECTOR

Fixing Missing Values
Fix Missing Values in a “Meaningful” Way
Filter Zeros
Model insulin
Predict insulin
Select insulin
Fixed Dataset
Amended Dataset
Original Dataset
Clean Dataset
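One way to read the steps named on this slide (filter zeros, model insulin, predict insulin, select insulin) is sketched below. It is an assumption-laden illustration: the `glucose` input field is hypothetical, a zero is treated as "missing", and a one-variable least-squares line stands in for whatever model the real workflow trains.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # constant input: fall back to the mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def fix_missing_insulin(rows):
    """Replace zero insulin values with predictions from a stand-in model."""
    # Filter Zeros: train only on rows where insulin was actually measured.
    clean = [r for r in rows if r["insulin"] != 0]
    if not clean:
        raise ValueError("no rows with a measured insulin value")
    # Model insulin: hypothetical single-input model (glucose -> insulin).
    slope, intercept = fit_line(
        [r["glucose"] for r in clean], [r["insulin"] for r in clean]
    )
    fixed = []
    for r in rows:
        row = dict(r)
        # Predict + Select: keep measured values, fill in only the zeros.
        if row["insulin"] == 0:
            row["insulin"] = slope * row["glucose"] + intercept
        fixed.append(row)
    return fixed


# Toy rows, made up for illustration.
rows = [
    {"glucose": 100, "insulin": 50},
    {"glucose": 200, "insulin": 150},
    {"glucose": 150, "insulin": 0},  # missing value encoded as zero
]
print(fix_missing_insulin(rows)[2]["insulin"])  # 100.0
```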

End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance

Ensemble Tuning
ENSEMBLE
N=20
EVALUATION
SOURCE DATASET
TRAINING
TEST
EVALUATION
EVALUATION
ENSEMBLE
N=10
ENSEMBLE
N=1000
CHOOSE
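The tuning loop in this diagram (build ensembles of several sizes on the training split, evaluate each on the test split, choose the best) might look like the sketch below. The ensemble itself is a deliberately tiny stand-in, bagged one-dimensional threshold stumps, and all function names are hypothetical; only the sizes 10, 20, and 1000 come from the slide.

```python
import random


def train_stump(points):
    """Pick the threshold on x that best separates labels in [(x, label), ...]."""
    best_threshold, best_correct = None, -1
    for threshold, _ in points:
        correct = sum((x >= threshold) == bool(label) for x, label in points)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def train_ensemble(points, n, seed=0):
    """Bagging: each of the n stumps is trained on a bootstrap resample."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(points) for _ in points]) for _ in range(n)]


def predict(ensemble, x):
    """Majority vote over the stumps."""
    votes = sum(x >= threshold for threshold in ensemble)
    return votes * 2 >= len(ensemble)


def accuracy(ensemble, points):
    hits = sum(predict(ensemble, x) == bool(label) for x, label in points)
    return hits / len(points)


def tune_ensemble_size(sizes, training, test):
    """Train one ensemble per size, evaluate on the test split, pick the best."""
    scores = {n: accuracy(train_ensemble(training, n), test) for n in sizes}
    best = max(scores, key=scores.get)  # ties go to the first size listed
    return best, scores


# Toy, linearly separable data, made up for illustration.
training = [(x, x >= 5) for x in range(10)]
test = [(x + 0.5, x + 0.5 >= 5) for x in range(10)]
best, scores = tune_ensemble_size([10, 20, 1000], training, test)
print(best, scores)
```

Listing the cheaper sizes first means a tie is resolved in favour of the smaller, faster ensemble.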

End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
• Finding the best features

Best-first Feature Selection
{F1}
CHOOSE BEST
S = {Fa}
{F2} {F3} {F4} Fn
S+{F1} S+{F2} S+{F3} S+{F4} S+{Fn-1}
CHOOSE BEST
S = {Fa, Fb}
S+{F1} S+{F2} S+{F3} S+{F4} S+{Fn-1}
CHOOSE BEST
S = {Fa, Fb, Fc}
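The rounds pictured above (score every candidate added to the current set S, keep the best, repeat) amount to a greedy forward search. A minimal sketch follows; `score_fn` is a hypothetical callable that would train and evaluate a model on a feature subset, and the stopping rule (stop when no candidate improves the score) is an assumption, since the slide does not say when the search ends.

```python
def best_first_selection(features, score_fn, max_features=None):
    """Greedy forward selection.

    Each round evaluates S + {F} for every remaining feature F and keeps the
    best one, stopping once no candidate beats the current score.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        candidate_scores = {f: score_fn(selected + [f]) for f in remaining}
        best_feature = max(candidate_scores, key=candidate_scores.get)
        if candidate_scores[best_feature] <= best_score:
            break  # assumed stopping rule: no improvement this round
        selected.append(best_feature)
        best_score = candidate_scores[best_feature]
        remaining.remove(best_feature)
    return selected, best_score


# Toy scoring function, made up for illustration: each feature contributes a
# fixed amount, so "c" can only hurt the score.
weights = {"a": 3, "b": 2, "c": -1}
selected, score = best_first_selection(
    ["a", "b", "c"], lambda subset: sum(weights[f] for f in subset)
)
print(selected, score)  # ['a', 'b'] 5
```

In practice `score_fn` is the expensive part, because every call means training and evaluating a model, which is why this kind of search is usually automated rather than run by hand.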

End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
• Finding the best features
• May require models for several domains of knowledge
• Multiple Training / Scoring

Multiple Domains
TRANSACTIONS → AGGREGATED BY CARD → ANOMALY BY CARD
TRANSACTIONS → AGGREGATED BY USER → ANOMALY BY USER
TRANSACTIONS → AGGREGATED BY PROFILE → ANOMALY BY PROFILE
NEW TRANSACTION → ANOMALY SCORE (BY CARD, BY USER, BY PROFILE) → APPROVED?
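The slide does not say how the three anomaly scores are combined into the final "APPROVED?" decision. One simple possibility, shown purely as an assumption, is to approve only when every domain's score stays below a threshold. The threshold value, the score range of 0 to 1, and the function name are all hypothetical.

```python
def approve_transaction(scores, threshold=0.6):
    """Approve only if no domain flags the transaction as anomalous.

    `scores` maps a domain name ("card", "user", "profile") to an anomaly
    score, assumed here to lie between 0 (normal) and 1 (anomalous).
    Returns (approved, flagged_domains).
    """
    flagged = sorted(d for d, s in scores.items() if s >= threshold)
    return (not flagged), flagged


# Toy scores, made up for illustration.
print(approve_transaction({"card": 0.2, "user": 0.3, "profile": 0.1}))
print(approve_transaction({"card": 0.2, "user": 0.7, "profile": 0.1}))
```

Returning the flagged domains alongside the decision makes it easier to explain a rejection, for example "unusual for this user, but normal for this card".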

End-to-end ML is Compositional
• Real-world problems
• Solved by applying a combination of algorithms
• Very rarely is it one-and-done
• Each “step” is often multi-stage as well
• Filtering/Cleaning data
• Tuning a model for optimum performance
• Finding the best features
• May require models for several domains of knowledge
• Multiple Training / Scoring
• Even after deploying a model
• Workflow to monitor performance, know when to retrain

Model Retraining
TRAINING
INPUT DATA
PREDICTIONS
ANOMALY

SCORES
OUTCOMES
RETRAIN DATA
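A monitoring step of the kind this slide implies (compare predictions with the outcomes that arrive later, and retrain when performance slips) could be sketched as below. The class name, the rolling-window design, and both default values are assumptions for illustration, not part of the original workflow.

```python
from collections import deque


class RetrainMonitor:
    """Track accuracy over the most recent outcomes and signal a retrain."""

    def __init__(self, window=100, min_accuracy=0.8):
        self.results = deque(maxlen=window)  # True where prediction == outcome
        self.min_accuracy = min_accuracy

    def record(self, prediction, outcome):
        """Store whether a past prediction matched its observed outcome."""
        self.results.append(prediction == outcome)

    def accuracy(self):
        """Accuracy over the current window, or None before any outcome."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def should_retrain(self):
        """Only decide once the window is full, to avoid noisy early signals."""
        full = len(self.results) == self.results.maxlen
        return full and self.accuracy() < self.min_accuracy


monitor = RetrainMonitor(window=4, min_accuracy=0.75)
for prediction, outcome in [("a", "a"), ("b", "b"), ("a", "a"), ("b", "b")]:
    monitor.record(prediction, outcome)
print(monitor.should_retrain())  # False: all four recent predictions matched

monitor.record("a", "b")
monitor.record("b", "a")
print(monitor.should_retrain())  # True: window accuracy dropped to 0.5
```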

Reality Check
• All Machine Learned models are wrong
Three Important Concepts in Applying ML…
• Real-world Machine Learning is iterative
• End-to-end Machine Learning is compositional

Tenets of Machine Learning
• Better features always beat better algorithms
• Good algorithms already exist and are good enough
• Tools like OptiML exist which can help optimize performance
• The data is never good enough
• All Machine Learned models are wrong, but some are useful
• Real-world Machine Learning is iterative
• End-to-end Machine Learning is compositional
• Automation is better than hand tuning - you need an API!
• When data changes quickly, training speed is more important than accuracy
• Repeatability is superior to a single strong result
• Problems are solved with workflows of algorithms
• An ML solution is not real until it is in production
• ML is here: Now we need 100,000x people applying ML
Anatomy of an ML Application: From Problem to Prediction

Contenu connexe

Tendances

MLSEV. Association Discovery and Topic Modeling
MLSEV. Association Discovery and Topic ModelingMLSEV. Association Discovery and Topic Modeling
MLSEV. Association Discovery and Topic ModelingBigML, Inc
 
MLSD18. End-to-End Machine Learning
MLSD18. End-to-End Machine LearningMLSD18. End-to-End Machine Learning
MLSD18. End-to-End Machine LearningBigML, Inc
 
The Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIsThe Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIsBigML, Inc
 
Machine Learning: Past, Present and Future - by Tom Dietterich
Machine Learning: Past, Present and Future - by Tom DietterichMachine Learning: Past, Present and Future - by Tom Dietterich
Machine Learning: Past, Present and Future - by Tom DietterichBigML, Inc
 
DataRobot - 머신러닝 자동화 플랫폼
DataRobot - 머신러닝 자동화 플랫폼DataRobot - 머신러닝 자동화 플랫폼
DataRobot - 머신러닝 자동화 플랫폼Sutaek Kim
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine LearningSri Ambati
 
MLSD18. Automating Machine Learning Workflows
MLSD18. Automating Machine Learning WorkflowsMLSD18. Automating Machine Learning Workflows
MLSD18. Automating Machine Learning WorkflowsBigML, Inc
 
MLSD18. Supervised Workshop
MLSD18. Supervised WorkshopMLSD18. Supervised Workshop
MLSD18. Supervised WorkshopBigML, Inc
 
FrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and CheaplyFrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and CheaplyDatabricks
 
AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...
AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...
AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...BigML, Inc
 
Explainability for Natural Language Processing
Explainability for Natural Language ProcessingExplainability for Natural Language Processing
Explainability for Natural Language ProcessingYunyao Li
 
MLSD18. OptiML and Fusions
MLSD18. OptiML and FusionsMLSD18. OptiML and Fusions
MLSD18. OptiML and FusionsBigML, Inc
 
MLSD18. Supervised Summary
MLSD18. Supervised SummaryMLSD18. Supervised Summary
MLSD18. Supervised SummaryBigML, Inc
 
Building A Feature Factory
Building A Feature FactoryBuilding A Feature Factory
Building A Feature FactoryDatabricks
 
MLSEV Virtual. Applying Topic Modelling to improve Operations
MLSEV Virtual. Applying Topic Modelling to improve OperationsMLSEV Virtual. Applying Topic Modelling to improve Operations
MLSEV Virtual. Applying Topic Modelling to improve OperationsBigML, Inc
 

Tendances (15)

MLSEV. Association Discovery and Topic Modeling
MLSEV. Association Discovery and Topic ModelingMLSEV. Association Discovery and Topic Modeling
MLSEV. Association Discovery and Topic Modeling
 
MLSD18. End-to-End Machine Learning
MLSD18. End-to-End Machine LearningMLSD18. End-to-End Machine Learning
MLSD18. End-to-End Machine Learning
 
The Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIsThe Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIs
 
Machine Learning: Past, Present and Future - by Tom Dietterich
Machine Learning: Past, Present and Future - by Tom DietterichMachine Learning: Past, Present and Future - by Tom Dietterich
Machine Learning: Past, Present and Future - by Tom Dietterich
 
DataRobot - 머신러닝 자동화 플랫폼
DataRobot - 머신러닝 자동화 플랫폼DataRobot - 머신러닝 자동화 플랫폼
DataRobot - 머신러닝 자동화 플랫폼
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine Learning
 
MLSD18. Automating Machine Learning Workflows
MLSD18. Automating Machine Learning WorkflowsMLSD18. Automating Machine Learning Workflows
MLSD18. Automating Machine Learning Workflows
 
MLSD18. Supervised Workshop
MLSD18. Supervised WorkshopMLSD18. Supervised Workshop
MLSD18. Supervised Workshop
 
FrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and CheaplyFrugalML: Using ML APIs More Accurately and Cheaply
FrugalML: Using ML APIs More Accurately and Cheaply
 
AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...
AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...
AIIA - Charting the Path to Intelligent Operations with Machine Learning - At...
 
Explainability for Natural Language Processing
Explainability for Natural Language ProcessingExplainability for Natural Language Processing
Explainability for Natural Language Processing
 
MLSD18. OptiML and Fusions
MLSD18. OptiML and FusionsMLSD18. OptiML and Fusions
MLSD18. OptiML and Fusions
 
MLSD18. Supervised Summary
MLSD18. Supervised SummaryMLSD18. Supervised Summary
MLSD18. Supervised Summary
 
Building A Feature Factory
Building A Feature FactoryBuilding A Feature Factory
Building A Feature Factory
 
MLSEV Virtual. Applying Topic Modelling to improve Operations
MLSEV Virtual. Applying Topic Modelling to improve OperationsMLSEV Virtual. Applying Topic Modelling to improve Operations
MLSEV Virtual. Applying Topic Modelling to improve Operations
 

Similaire à Anatomy of an ML Application: From Problem to Prediction

DutchMLSchool. Machine Learning End-to-End
DutchMLSchool. Machine Learning End-to-EndDutchMLSchool. Machine Learning End-to-End
DutchMLSchool. Machine Learning End-to-EndBigML, Inc
 
BSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBigML, Inc
 
VSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised LearningVSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised LearningBigML, Inc
 
DutchMLSchool. Machine Learning: Why Now?
DutchMLSchool. Machine Learning: Why Now? DutchMLSchool. Machine Learning: Why Now?
DutchMLSchool. Machine Learning: Why Now? BigML, Inc
 
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineSTQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineAlbert Gareev
 
Outside the Comfort Zone: Cross Industry Use Cases in Big Data Analytics
Outside the Comfort Zone: Cross Industry Use Cases in Big Data AnalyticsOutside the Comfort Zone: Cross Industry Use Cases in Big Data Analytics
Outside the Comfort Zone: Cross Industry Use Cases in Big Data AnalyticsRising Media Ltd.
 
The Sky’s the Limit – The Rise of Machine Learnin
The Sky’s the Limit – The Rise of Machine LearninThe Sky’s the Limit – The Rise of Machine Learnin
The Sky’s the Limit – The Rise of Machine LearninInside Analysis
 
Vsm Voc Brownbag Webinar 0610009
Vsm Voc Brownbag Webinar 0610009Vsm Voc Brownbag Webinar 0610009
Vsm Voc Brownbag Webinar 0610009Daniel Walker
 
User Story Maps: Secrets for Better Backlogs and Planning
 User Story Maps: Secrets for Better Backlogs and Planning User Story Maps: Secrets for Better Backlogs and Planning
User Story Maps: Secrets for Better Backlogs and PlanningAaron Sanders
 
Spark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark Summit
 
Competitive Analysis for SEO - SEMNE
Competitive Analysis for SEO - SEMNE Competitive Analysis for SEO - SEMNE
Competitive Analysis for SEO - SEMNE Casie Gillette
 
Brighttalk what should we be monitoring - final
Brighttalk   what should we be monitoring - finalBrighttalk   what should we be monitoring - final
Brighttalk what should we be monitoring - finalAndrew White
 
Future Lawyers Speak Data
Future Lawyers Speak DataFuture Lawyers Speak Data
Future Lawyers Speak DataIFLP
 
Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...
Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...
Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...Birst
 
Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...
Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...
Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...CA Technologies
 
Your API: A Big Enough Box of Crayons?
Your API: A Big Enough Box of Crayons?Your API: A Big Enough Box of Crayons?
Your API: A Big Enough Box of Crayons?Peter Coffee
 
Design Patterns Every ISV Needs to Know (October 15, 2014)
Design Patterns Every ISV Needs to Know (October 15, 2014)Design Patterns Every ISV Needs to Know (October 15, 2014)
Design Patterns Every ISV Needs to Know (October 15, 2014)Salesforce Partners
 

Similaire à Anatomy of an ML Application: From Problem to Prediction (20)

DutchMLSchool. Machine Learning End-to-End
DutchMLSchool. Machine Learning End-to-EndDutchMLSchool. Machine Learning End-to-End
DutchMLSchool. Machine Learning End-to-End
 
BSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, Evaluations
 
VSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised LearningVSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised Learning
 
DutchMLSchool. Machine Learning: Why Now?
DutchMLSchool. Machine Learning: Why Now? DutchMLSchool. Machine Learning: Why Now?
DutchMLSchool. Machine Learning: Why Now?
 
How to write maintainable software
How to write maintainable softwareHow to write maintainable software
How to write maintainable software
 
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineSTQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
 
AI & AWS DeepComposer
AI & AWS DeepComposerAI & AWS DeepComposer
AI & AWS DeepComposer
 
Outside the Comfort Zone: Cross Industry Use Cases in Big Data Analytics
Outside the Comfort Zone: Cross Industry Use Cases in Big Data AnalyticsOutside the Comfort Zone: Cross Industry Use Cases in Big Data Analytics
Outside the Comfort Zone: Cross Industry Use Cases in Big Data Analytics
 
The Sky’s the Limit – The Rise of Machine Learnin
The Sky’s the Limit – The Rise of Machine LearninThe Sky’s the Limit – The Rise of Machine Learnin
The Sky’s the Limit – The Rise of Machine Learnin
 
LTV Predictions: How do real-life companies use them & what can you learn fro...
LTV Predictions: How do real-life companies use them & what can you learn fro...LTV Predictions: How do real-life companies use them & what can you learn fro...
LTV Predictions: How do real-life companies use them & what can you learn fro...
 
Vsm Voc Brownbag Webinar 0610009
Vsm Voc Brownbag Webinar 0610009Vsm Voc Brownbag Webinar 0610009
Vsm Voc Brownbag Webinar 0610009
 
User Story Maps: Secrets for Better Backlogs and Planning
 User Story Maps: Secrets for Better Backlogs and Planning User Story Maps: Secrets for Better Backlogs and Planning
User Story Maps: Secrets for Better Backlogs and Planning
 
Spark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony Baer
 
Competitive Analysis for SEO - SEMNE
Competitive Analysis for SEO - SEMNE Competitive Analysis for SEO - SEMNE
Competitive Analysis for SEO - SEMNE
 
Brighttalk what should we be monitoring - final
Brighttalk   what should we be monitoring - finalBrighttalk   what should we be monitoring - final
Brighttalk what should we be monitoring - final
 
Future Lawyers Speak Data
Future Lawyers Speak DataFuture Lawyers Speak Data
Future Lawyers Speak Data
 
Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...
Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...
Birst Webinar Slides: "Build vs. Buy - Making the Right Choice for a Great Da...
 
Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...
Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...
Stop the Blame Game with Increased Visibility of your Mobile-to-Mainframe IT ...
 
Your API: A Big Enough Box of Crayons?
Your API: A Big Enough Box of Crayons?Your API: A Big Enough Box of Crayons?
Your API: A Big Enough Box of Crayons?
 
Design Patterns Every ISV Needs to Know (October 15, 2014)
Design Patterns Every ISV Needs to Know (October 15, 2014)Design Patterns Every ISV Needs to Know (October 15, 2014)
Design Patterns Every ISV Needs to Know (October 15, 2014)
 

Anatomy of an ML Application: From Problem to Prediction

  • 2. BigML, Inc. Anatomy of an ML Application: Machine Learning End-to-End. Poul Petersen, CIO, BigML
  • 3. #MLSEV: Examples of ML Applications
  • 4. Real-world ML Applications (https://ndalynn.com/)
      • Should you sign that NDA?
      • Upload the NDA to the website
      • The service uses Machine Learning to decide if the terms are fair
  • 5. Real-world ML Applications (https://preseries.com)
      • Gathers over 500 features about companies:
          • Crunchbase / Tweets / Patents / LinkedIn / etc.
      • Creates a label for success/failure:
          • IPO or acquisition = success
          • Bankruptcy or irrelevance = failure
      • Uses Machine Learning to build a model that predicts the success or failure of startups
      • And puts all of the information together into an investor dashboard
  • 6. ML Adoption: “The gap for most companies isn’t that machine learning doesn’t work, but that they struggle to actually use it”
      • Why?
          • Too much focus on algorithms
          • Not enough focus on applying Machine Learning
  • 7. Real-world ML Applications (https://thepointsguy.com/news/this-is-the-reason-you-arent-feeling-as-much-turbulence-on-delta-flights/)
      • …collecting and analyzing “hundreds of thousands of data points,” with a plan to boost that to “millions,” creating a model that forecasts turbulence with a level of confidence heretofore unseen.
      • Not important: the algorithm!
  • 8. Machine Learning Evolution
      • [Figure: timeline of ML evolution. Stages: Genesis, Custom built, Product, Service, Utility, Commodity. Users: Academics & Researchers, Scientists, Developers, Analysts, Everyone. Dates: 1950s, 1980 (1st Workshop on Machine Learning), 2000s, 2011, 2020, 2030. Axes: Certainty (Unknown to Defined) and Ubiquity (Novel to Common). Tools: Weka, Scikit; BigML, Azure ML, Amazon ML, Google Cloud ML.]
      • Machine Learning algorithms are fun to talk about: GPUs, NNs, etc.
      • But the algorithms are largely a commodity already
      • Difficulty is knowing how to apply ML
  • 9. What is an ML Application? ML definition: finding patterns in data that can be used to make inferences… Consider this data, which leads to Predictive Models:

      AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
      AS       ANC     SEA          -11              1448,0    -22
      AA       LAX     PBI          -8               2330,0    -9
      US       SFO     CLT          -2               2296,0    5
      AA       LAX     MIA          -5               2342,0    -9
      AS       SEA     ANC          -1               1448,0    -21
      DL       SFO     MSP          -5               1589      8
      NK       LAS     MSP          -6               1299      -17
      US       LAX     CLT          14               2125,0    -10
      AA       SFO     DFW          -11              1464,0    -13
      DL       LAS     ATL          3                1747,0    -15

  • 10. What is an ML Application? (same flight-delay table as slide 9, leading to Predictive Models)
      • Where does this data come from?
      • How do you know what data?
      • Is the data formatted correctly?
      • What do you do with these models?
      • How do you combine them?
      • Will it work?
  • 11. Reality of an ML Application: [Figure: Seen: Predictive App. Unseen: Data Collection, Data Transformations, Feature Engineering, Evaluation & Retraining.]
  • 12. Where to Start? Step 1: “Let’s predict customer churn!” Step 2: ??? Finish: “Here are the customers we predict will leave our service.”
  • 13. Where to Start? Step 1: “Let’s detect fraud!” Step 2: ??? Finish: “Here are the transactions we should stop immediately.”
  • 14. ML Application Guide: State the problem as an ML Task
      • Remember: ML finds patterns in data, enabling predictions about future events
      • This means you need data
          • What data depends on what you want to predict
          • And the data you have or can collect
      • Data needs to have patterns related to what you want to predict
          • Not magic: still can’t predict random events, lotteries, etc.
      • Your problem statement needs to be specific
          • Not “Let’s predict churn”
          • But “Let’s predict churn by looking at the profile data of all previous customers of our service who have/have not churned”
      • This can be tricky…
  • 15. Where to Start? Step 1: “Let’s predict the Oscars!” Step 2: ??? Finish: “Here are the predicted winners.”
      • Statement is not specific enough!
      • What data can we collect that predicts Oscar wins?
  • 16. Predicting the Oscars: BigML Scoresheet
      • 2018:
          • 6 out of 6 right!
          • 8 out of 8 actually, but the probability of the predictions was “too low” for Adapted Screenplay and Original Screenplay
      • 2019:
          • 4 out of 8 major awards correctly predicted
          • Probabilities were lower this year
          • This is still significantly better than guessing
      • How is this possible? Isn’t the winner random?
  • 17. How an Oscar is Won: 7,000+ members; voting intention? Insight: winning awards is not a random event!
  • 18. Let’s Predict Best Picture: London Critics: Win. Writers Guild: Lose. Directors Guild: Win. Golden Globe: Win. BAFTA: Win. Oscar: ?Win?
      • These events are *not* independent
      • Similar, but not identical, factors contribute to each win…
      • We can expect a higher probability for Shape of Water to win the Oscar
  • 19. The Features (data pulled from IMDB…)
      • MOVIES: year, movie, movie_id, certificate, duration, genre, rate, metascore, synopsis, votes, gross, release_date, user_reviews, critic_reviews, popularity, awards_wins, awards_nominations, release_date.year, release_date.month, release_date.day-of-month, release_date.day-of-week
      • AWARDS:
          • Oscar_Best_Picture_nominated, Oscar_Best_Director_nominated, Oscar_Best_Actor_nominated, Oscar_Best_Actress_nominated, Oscar_Best_Supporting_Actor_nominated, Oscar_Best_Supporting_Actress_nominated, Oscar_Best_AdaScreen_nominated, Oscar_Best_OriScreen_nominated, Oscar_nominated, Oscar_nominated_categories
          • Golden_Globes_won, Golden_Globes_won_categories, Golden_Globes_nominated, Golden_Globes_nominated_categories
          • BAFTA_won, BAFTA_won_categories, BAFTA_nominated, BAFTA_nominated_categories
          • Screen_Actors_Guild_won, Screen_Actors_Guild_won_categories, Screen_Actors_Guild_nominated, Screen_Actors_Guild_nominated_categories
          • Critics_Choice_won, Critics_Choice_won_categories, Critics_Choice_nominated, Critics_Choice_nominated_categories
          • Directors_Guild_won, Directors_Guild_won_categories, Directors_Guild_nominated, Directors_Guild_nominated_categories
          • Producers_Guild_won, Producers_Guild_won_categories, Producers_Guild_nominated, Producers_Guild_nominated_categories
          • Art_Directors_Guild_won, Art_Directors_Guild_won_categories, Art_Directors_Guild_nominated, Art_Directors_Guild_nominated_categories
          • Writers_Guild_won, Writers_Guild_won_categories, Writers_Guild_nominated, Writers_Guild_nominated_categories
          • Costume_Designers_Guild_won, Costume_Designers_Guild_won_categories, Costume_Designers_Guild_nominated, Costume_Designers_Guild_nominated_categories
          • Online_Film_Television_Association_won, Online_Film_Television_Association_won_categories, Online_Film_Television_Association_nominated, Online_Film_Television_Association_nominated_categories
          • Online_Film_Critics_Society_won, Online_Film_Critics_Society_won_categories, Online_Film_Critics_Society_nominated, Online_Film_Critics_Society_nominated_categories
          • People_Choice_won, People_Choice_won_categories, People_Choice_nominated, People_Choice_nominated_categories
          • London_Critics_Circle_Film_won, London_Critics_Circle_Film_won_categories, London_Critics_Circle_Film_nominated, London_Critics_Circle_Film_nominated_categories
          • American_Cinema_Editors_won, American_Cinema_Editors_won_categories, American_Cinema_Editors_nominated, American_Cinema_Editors_nominated_categories
          • Hollywood_Film_won, Hollywood_Film_won_categories, Hollywood_Film_nominated, Hollywood_Film_nominated_categories
          • Austin_Film_Critics_Association_won, Austin_Film_Critics_Association_won_categories, Austin_Film_Critics_Association_nominated, Austin_Film_Critics_Association_nominated_categories
          • Denver_Film_Critics_Society_won, Denver_Film_Critics_Society_won_categories, Denver_Film_Critics_Society_nominated, Denver_Film_Critics_Society_nominated_categories
          • Boston_Society_of_Film_Critics_won, Boston_Society_of_Film_Critics_won_categories, Boston_Society_of_Film_Critics_nominated, Boston_Society_of_Film_Critics_nominated_categories
          • New_York_Film_Critics_Circle_won, New_York_Film_Critics_Circle_won_categories, New_York_Film_Critics_Circle_nominated, New_York_Film_Critics_Circle_nominated_categories
          • Los_Angeles_Film_Critics_Association_won, Los_Angeles_Film_Critics_Association_won_categories, Los_Angeles_Film_Critics_Association_nominated, Los_Angeles_Film_Critics_Association_nominated_categories
      • OBJECTIVE FIELDS: Oscar_Best_Picture_won, Oscar_Best_Director_won, Oscar_Best_Actor_won, Oscar_Best_Actress_won, Oscar_Best_Supporting_Actor_won, Oscar_Best_Supporting_Actress_won
      • Engineered features: award items fields, nomination counts, award counts
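The engineered count features named on slide 19 could be derived roughly as follows. This is a minimal sketch, assuming each `*_categories` items field is stored as a comma-separated string; the separator, the sample record, and the helper names `split_items` and `add_count_features` are hypothetical and not taken from the published dataset.

```python
def split_items(value, sep=","):
    """Split an items field such as 'Best Film, Best Director' into a list."""
    if not value:
        return []
    return [item.strip() for item in value.split(sep) if item.strip()]


def add_count_features(record, suffix="_categories"):
    """Return a copy of the record with one '<prefix>_count' per items field."""
    out = dict(record)
    for name, value in record.items():
        if name.endswith(suffix):
            # e.g. BAFTA_won_categories -> BAFTA_won_count
            out[name[: -len(suffix)] + "_count"] = len(split_items(value))
    return out


# Hypothetical record: the values are made up for illustration only.
movie = {
    "movie": "Example Film",
    "BAFTA_won_categories": "Best Film, Best Director",
    "BAFTA_nominated_categories": "Best Film, Best Director, Best Sound",
    "Golden_Globes_won_categories": "",
}
features = add_count_features(movie)
```

Keeping the original items field next to the derived count lets a model use both the specific categories and the aggregate.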
  • 20. Oscars Dataset: the dataset is publicly available: https://bigml.com/user/academy_awards/gallery/dataset/5a94302592fb565ed400103b
  • 21. Oscars Example: Tidbits and Lessons Learned…
      • When specifying the problem, be as specific as possible
          • Not: “Let’s predict the Oscars”
          • Instead: “Let’s predict the Oscars by correlating a series of award wins with the final Oscar win.”
      • The statement of the problem will guide the data required
      • Be aware of the cost of collecting the data versus the ROI
  • 22. Ranking ML Applications: Thinking about an ML Application? [Figure: quadrant chart. Axes: FEASIBILITY (increasing data availability / decreasing complexity) and ROI (impact and cost), each running from - to +. Quadrants: NO-BRAINERS (START HERE), BRAINERS, POSTPONABLE, NO-GO.]
  • 23. Oscars Example: Tidbits and Lessons Learned…
      • When specifying the problem, be as specific as possible
          • Not: “Let’s predict the Oscars”
          • Instead: “Let’s predict the Oscars by correlating a series of award wins with the final Oscar win.”
      • The statement of the problem will guide the data required
      • Be aware of the cost of collecting the data versus the ROI:
          • IMDB data is readily available
      • We’re done, right?
          • Nope. You can’t escape Feature Engineering
              • Items: BAFTA_won_categories = list of nominations
              • Aggregations: nomination and award counts
          • You can’t escape Feature Selection
              • Full user reviews are costly to collect and not useful
      • Wait: how were you confident in the predictions?
  • 24. Evaluating the Model: [Figure: Original Dataset (2000-2016, 119 variables) split into Train Dataset (2000-2012, 119 variables) and Test Dataset (2013-2016, 119 variables).]
      • Ultimately, we want to use all the history to predict the winner for the current year
      • In order to evaluate success, we use a model built from 2000-2012 data to predict the winners for 2013-2016
      • Built a separate Deepnet for each award category
      • Evaluation obtained a ROC AUC over 0.98 across all award categories
      • Great: the model seems OK, what next?
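The year-based evaluation on slide 24 can be illustrated with a short sketch. It assumes each row carries a `year` field, and the labels and scores passed to `roc_auc` are made-up placeholders, not results from the talk. The `roc_auc` helper uses the pairwise definition of AUC (the fraction of positive/negative pairs ranked correctly, ties counting half); it is not the evaluation code the author used.

```python
def temporal_split(rows, last_train_year, year_field="year"):
    """Train on rows up to last_train_year, test on everything after it."""
    train = [r for r in rows if r[year_field] <= last_train_year]
    test = [r for r in rows if r[year_field] > last_train_year]
    return train, test


def roc_auc(labels, scores):
    """Pairwise ROC AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# One placeholder row per year, 2000 through 2016.
rows = [{"year": y, "movie": f"movie-{y}"} for y in range(2000, 2017)]
train, test = temporal_split(rows, last_train_year=2012)

# Made-up labels and predicted probabilities, for illustration only.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
```

Splitting by year rather than at random keeps the evaluation honest: the model never sees ceremonies that happened after the ones it is asked to predict.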
  • 25. Effort of an ML Application: [Figure: table of Task versus Effort. Tasks: State the problem as an ML task; Data wrangling; Feature engineering; Data transformations; Modeling and Evaluations; Predictions; Measure Results. Effort labels: ~80%, ~10%, ~5%, ~5%. Annotations: “This is only such low effort because of platforms like …” and “This is an area where … is currently innovating”.]
  • 26. Reality Check: Three Important Concepts in Applying ML…
      • All Machine Learned models are wrong
      • Real-world Machine Learning is iterative
      • End-to-end Machine Learning is compositional
  • 27. End-to-end ML is Compositional
      • Real-world problems
          • Solved by applying a combination of algorithms
          • Very rarely is it one-and-done
  • 28. Basic Workflow: SOURCE, DATASET, MODEL, PREDICTION
  • 29. Feature Engineering: [Figure: workflow with nodes DATASET, FILTER, SOLD HOMES, FORSALE HOMES, NEW FEATURES (shown twice), MODEL, BATCH PREDICTION, DEALS DATASET.]
  • 30. End-to-end ML is Compositional
      • Real-world problems
          • Solved by applying a combination of algorithms
          • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
          • Filtering/Cleaning data
  • 31. Anomaly Filter and Evaluate: [Figure: workflow with nodes DIABETES SOURCE, DIABETES DATASET, TRAIN SET, TEST SET, ANOMALY DETECTOR, FILTER, CLEAN DATASET, ALL MODEL (shown twice), ALL EVALUATION, CLEAN EVALUATION, COMPARE EVALUATIONS.]
  • 32. Fixing Missing Values: Fix Missing Values in a “Meaningful” Way. [Figure: workflow with nodes Original Dataset, Filter Zeros, Clean Dataset, Model insulin, Predict insulin, Select insulin, Amended Dataset, Fixed Dataset.]
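One way to read the Filter Zeros, Model, Predict, Select steps of slide 32 is sketched below. It assumes a zero in `insulin` means the value is missing and, purely for illustration, models insulin from a single hypothetical `glucose` field with a hand-written least-squares line. The slide does not say which model or input fields were used, and the sample rows are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fix_missing(rows, target="insulin", predictor="glucose"):
    # Filter Zeros: keep only rows where the target was actually measured.
    clean = [r for r in rows if r[target] != 0]
    # Model: learn the target from the clean rows.
    slope, intercept = fit_line(
        [r[predictor] for r in clean], [r[target] for r in clean]
    )
    fixed = []
    for r in rows:
        # Predict: estimate the target for every row.
        predicted = slope * r[predictor] + intercept
        # Select: keep the measured value when present, else the prediction.
        value = r[target] if r[target] != 0 else predicted
        fixed.append({**r, target: value})
    return fixed


# Made-up rows; the last one has a missing (zero) insulin value.
rows = [
    {"glucose": 100, "insulin": 50},
    {"glucose": 150, "insulin": 100},
    {"glucose": 200, "insulin": 150},
    {"glucose": 120, "insulin": 0},
]
fixed = fix_missing(rows)
```

Filling a gap with a prediction from the other fields is what makes the fix “meaningful” compared with substituting a single global mean.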
  • 33. End-to-end ML is Compositional
      • Real-world problems
          • Solved by applying a combination of algorithms
          • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
          • Filtering/Cleaning data
          • Tuning a model for optimum performance
  • 34. Ensemble Tuning: [Figure: workflow with nodes SOURCE, DATASET, TRAINING, TEST, ENSEMBLE N=10, ENSEMBLE N=20, ENSEMBLE N=1000, an EVALUATION for each ensemble, and CHOOSE.]
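The tuning loop on slide 34 (train several ensemble sizes on the same training set, evaluate each on the same test set, choose the best) can be sketched generically. The `train_fn` and `evaluate_fn` callables below are hypothetical stand-ins and the scores are invented placeholders; a real workflow would plug in an actual ensemble trainer and evaluation metric.

```python
def choose_best(candidates, train_fn, evaluate_fn, train_set, test_set):
    """Train one model per candidate setting and keep the best-scoring one."""
    results = {}
    for n in candidates:
        model = train_fn(train_set, n)
        results[n] = evaluate_fn(model, test_set)
    best = max(results, key=results.get)
    return best, results


# Stand-ins so the sketch runs: the "model" is just its size, and the
# scores are made-up numbers, not measurements.
PLACEHOLDER_SCORES = {10: 0.81, 20: 0.86, 1000: 0.84}


def fake_train(train_set, n):
    return {"n_models": n}


def fake_evaluate(model, test_set):
    return PLACEHOLDER_SCORES[model["n_models"]]


best_n, scores = choose_best([10, 20, 1000], fake_train, fake_evaluate, [], [])
```

Because every candidate is scored on the same held-out test set, the comparison is like for like.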
  • 35. End-to-end ML is Compositional
      • Real-world problems
          • Solved by applying a combination of algorithms
          • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
          • Filtering/Cleaning data
          • Tuning a model for optimum performance
          • Finding the best features
  • 36. Best-first Feature Selection: [Figure: round 1 evaluates {F1}, {F2}, {F3}, {F4}, through {Fn} and chooses the best, giving S = {Fa}; round 2 evaluates S+{F1}, S+{F2}, S+{F3}, S+{F4}, through S+{Fn-1} and chooses the best, giving S = {Fa, Fb}; round 3 repeats this, giving S = {Fa, Fb, Fc}.]
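The rounds shown on slide 36 amount to greedy forward selection: in each round, try adding every remaining feature to the current set S and keep the one that scores best. A minimal sketch follows. The stopping rule (`max_features`) and the toy scoring function are assumptions for illustration, since the slide does not specify either; in practice `score_fn` would train and evaluate a model on the candidate feature set.

```python
def best_first_selection(features, score_fn, max_features):
    """Greedily grow the selected set, one best feature per round."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < max_features:
        # Score S + {f} for every remaining feature and choose the best.
        best = max(remaining, key=lambda f: score_fn(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy scoring: made-up usefulness weights standing in for a real evaluation.
WEIGHTS = {"F1": 0.1, "F2": 0.5, "F3": 0.3, "F4": 0.2}


def toy_score(feature_set):
    return sum(WEIGHTS[f] for f in feature_set)


chosen = best_first_selection(["F1", "F2", "F3", "F4"], toy_score, max_features=3)
```

With n candidate features, round k evaluates n - k + 1 feature sets, which is far cheaper than scoring every possible subset.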
  • 37. End-to-end ML is Compositional
      • Real-world problems
          • Solved by applying a combination of algorithms
          • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
          • Filtering/Cleaning data
          • Tuning a model for optimum performance
          • Finding the best features
      • May require models for several domains of knowledge
          • Multiple Training / Scoring
  • 38. Multiple Domains: [Figure: TRANSACTIONS are AGGREGATED BY CARD, AGGREGATED BY USER, and AGGREGATED BY PROFILE; each aggregation has its own detector (ANOMALY BY CARD, ANOMALY BY USER, ANOMALY BY PROFILE); a NEW TRANSACTION gets an ANOMALY SCORE from each, leading to APPROVED?]
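The multi-domain idea on slide 38 can be sketched as three independent anomaly scores combined into one decision. Everything specific here is an assumption made for illustration: the z-score-based `anomaly_score`, the rule of approving only when the highest score stays under a threshold, the threshold value, and the sample histories. The slide does not say which detector or combination rule was used.

```python
from statistics import mean, pstdev


def anomaly_score(history, amount):
    """Score in [0, 1): how far the amount is from the history, via z-score."""
    mu = mean(history)
    sd = pstdev(history)
    if sd == 0:
        return 0.0 if amount == mu else 1.0
    z = abs(amount - mu) / sd
    return z / (1.0 + z)


def approve(tx, histories, threshold=0.6):
    """Score the transaction per domain; approve if no domain looks anomalous."""
    scores = {
        domain: anomaly_score(histories[domain][tx[domain]], tx["amount"])
        for domain in ("card", "user", "profile")
    }
    return max(scores.values()) < threshold, scores


# Made-up past amounts, aggregated by card, by user, and by profile.
histories = {
    "card": {"card-1": [10, 12, 11, 9, 13]},
    "user": {"user-1": [10, 12, 11, 9, 13, 40]},
    "profile": {"retail": [5, 20, 35, 50, 15]},
}
base = {"card": "card-1", "user": "user-1", "profile": "retail"}
ok_small, small_scores = approve({**base, "amount": 11}, histories)
ok_large, large_scores = approve({**base, "amount": 500}, histories)
```

Scoring each domain separately means a transaction that looks normal for the card can still be flagged when it is unusual for the user or the profile.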
  • 39. End-to-end ML is Compositional
      • Real-world problems
          • Solved by applying a combination of algorithms
          • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
          • Filtering/Cleaning data
          • Tuning a model for optimum performance
          • Finding the best features
      • May require models for several domains of knowledge
          • Multiple Training / Scoring
      • Even after deploying a model
          • Workflow to monitor performance, know when to retrain
  • 40. Model Retraining: [Figure: workflow with nodes TRAINING, INPUT DATA, PREDICTIONS, ANOMALY SCORES, OUTCOMES, RETRAIN DATA.]
  • 41. Reality Check: Three Important Concepts in Applying ML…
      • All Machine Learned models are wrong
      • Real-world Machine Learning is iterative
      • End-to-end Machine Learning is compositional
  • 42. Tenets of Machine Learning
      • All Machine Learned models are wrong, but some are useful
      • Real-world Machine Learning is iterative
      • End-to-end Machine Learning is compositional
      • Better features always beat better algorithms
      • Good algorithms already exist and are good enough
      • Tools like OptiML exist which can help optimize performance
      • The data is never good enough
      • Automation is better than hand tuning: you need an API!
      • When data changes quickly, training speed is more important than accuracy
      • Repeatability is superior to a single strong result
      • Problems are solved with workflows of algorithms
      • An ML solution is not real until it is in production
      • ML is here: now we need 100,000x people applying ML