Cohort Analysis at Scale
BLAKE IRVINE
STRATA SAN JOSE
2019.03.06
World Markets
Partners help us Grow
Partners are companies that make it easier
for people to sign up and engage with our
service and help us retain members.
BLAKE IRVINE | STRATA SAN JOSE 2018
Let’s take a (virtual) trip!
Trip to Strata San Jose
Multiple Partner Associations
Evaluating Partners
● Are partners helping us acquire members?
● Through which channels?
● How does a partner impact the regional market?
● Do members use our service differently on partner devices?
● How do partners compare to each other?
Cohorts are the collections of members
associated with a partner that are relevant to
the business question.
Cohorts can be complex
● Our trip example showed how one member can be associated with many partners.
● Business teams want to explore the nuance of cohorts...
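As a minimal sketch of the point above, the snippet below groups members into cohorts keyed by partner and channel, so one member can land in several cohorts at once. The association rows, the channel labels and the partner names ExampleISP and ExamplePay are hypothetical and not taken from the talk.

```python
from collections import defaultdict

# Hypothetical (member_id, partner, channel) associations; all values are invented.
associations = [
    (1213, "Amazon", "device"),
    (1213, "ExampleISP", "isp"),
    (1213, "ExamplePay", "billing"),
    (7623, "Google", "device"),
    (7623, "ExampleISP", "isp"),
]


def build_cohorts(rows):
    """Group member ids into cohorts keyed by (partner, channel)."""
    cohorts = defaultdict(set)
    for member_id, partner, channel in rows:
        cohorts[(partner, channel)].add(member_id)
    return dict(cohorts)


def partners_for(member_id, rows):
    """List every partner a single member is associated with."""
    return sorted({partner for m, partner, _ in rows if m == member_id})


cohorts = build_cohorts(associations)
member_partners = partners_for(1213, associations)
```

Here member 1213 belongs to three cohorts, which is the kind of overlap a business question has to account for.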
… for example
Evaluating Cohorts
● 100+ million members
● Dozens of partners
● Many combinations of members and partners
● Leads to…
○ High dimensionality, high cardinality datasets
○ Very large datasets of member-level time-series activity
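One rough way to see how "dozens of partners" can lead to high cardinality: if a member may be associated with any non-empty subset of n partners, the upper bound on distinct partner combinations is 2**n - 1. The partner counts below are illustrative assumptions; the slides give no exact figure.

```python
def nonempty_partner_combinations(n_partners: int) -> int:
    """Upper bound on distinct non-empty partner subsets: 2**n - 1."""
    return 2 ** n_partners - 1


# Illustrative partner counts only; the talk says "dozens".
bounds = {n: nonempty_partner_combinations(n) for n in (12, 24, 36)}
for n, bound in bounds.items():
    print(f"{n} partners -> up to {bound:,} combinations")
```

In practice only a small fraction of these combinations will occur, but the bound shows why the dimensional space grows quickly.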
Cohort Analysis at Scale
● Data platform
● Data construction
● Data product for cohort analysis
Data Platform
Simplified Overview
Big Data Portal
Data construction: Data Model
member
device
partner
playback events
isp
billing events
billing processor
Data construction: Cohort Dataset
signup_events
cohort
playback_events
billing_events
x_events
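A minimal sketch of how a cohort dataset like this might be assembled: events from each source are kept only when the member belongs to the cohort, and each row is tagged with its source. The event field names (member_id, time) and the sample values are assumptions for illustration.

```python
def build_cohort_dataset(cohort_members, event_sources):
    """Keep events whose member is in the cohort, tagged with their source."""
    rows = [
        {"source": source, **event}
        for source, events in event_sources.items()
        for event in events
        if event["member_id"] in cohort_members
    ]
    # Order by member, then time, so each member's activity reads as a time series.
    return sorted(rows, key=lambda row: (row["member_id"], row["time"]))


# Hypothetical sample data; member ids and timestamps are invented.
cohort = {1213, 7623}
event_sources = {
    "signup_events": [{"member_id": 1213, "time": 100}, {"member_id": 4291, "time": 105}],
    "playback_events": [{"member_id": 1213, "time": 250}, {"member_id": 7623, "time": 180}],
    "billing_events": [{"member_id": 7623, "time": 400}],
}
cohort_rows = build_cohort_dataset(cohort, event_sources)
```

Member 4291 is outside the cohort, so its signup event is dropped.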
Data construction: Flat Tables
playback_f cohort_playback_s
device_d
isp_d
geo_d
partner_d
cohort_d
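The sketch below shows one way a fact table could be denormalized against dimension tables into a flat row. It assumes the _f and _d suffixes mean fact and dimension tables, which the slide does not state. The lookup keys, including partner_id, are hypothetical.

```python
# Hypothetical dimension lookups keyed by id; partner_id is an invented join key.
device_d = {674: {"device_name": "Amazon Fire TV", "device_category": "Set Top Box", "partner_id": 1}}
partner_d = {1: {"partner_name": "Amazon"}}
geo_d = {"US": {"region": "Americas"}}

# Hypothetical fact rows: one row per playback record.
playback_f = [{"member_id": 1213, "device_id": 674, "country": "US"}]


def flatten(fact_rows, device_d, partner_d, geo_d):
    """Join each fact row to its dimensions and emit one wide, flat row."""
    flat = []
    for key, fact in enumerate(fact_rows, start=1):
        device = device_d[fact["device_id"]]
        flat.append({
            "key": key,
            "member_id": fact["member_id"],
            "device_id": fact["device_id"],
            "device_name": device["device_name"],
            "device_category": device["device_category"],
            "partner_name": partner_d[device["partner_id"]]["partner_name"],
            "country": fact["country"],
            "region": geo_d[fact["country"]]["region"],
        })
    return flat


flat_rows = flatten(playback_f, device_d, partner_d, geo_d)
```

Resolving the joins once at build time means the serving layer can answer queries without joining at read time.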
Data for consumption: Flat Table
key | member_id | device_id | device_name    | device_category | partner_name | country | region   | data_payload
1   | 1213      | 674       | Amazon Fire TV | Set Top Box     | Amazon       | US      | Americas | [{"id":5025945823792539,"sequence":41,"time":1491962092955},{"id":5025947899236389,"sequence":95,"time":1491962104824}]
2   | 7623      | 1172      | Chromecast     | Streaming Stick | Google       | DE      | EMEA     | …
3   | 4291      | 129       | PS3            | Game Console    | Sony         | ES      | EMEA     | …
4   | 9013      | 447       | iPad 4         | Tablet          | Apple        | CA      | Americas | …
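As a small sketch, the data_payload value from the first row can be parsed as JSON and summarized. The interpretation of the fields is an assumption: "sequence" is treated as an ordering key and "time" as epoch milliseconds, neither of which the slide confirms.

```python
import json

# Payload copied from row 1 of the flat table above.
payload = (
    '[{"id":5025945823792539,"sequence":41,"time":1491962092955},'
    '{"id":5025947899236389,"sequence":95,"time":1491962104824}]'
)


def summarize_payload(raw):
    """Parse a JSON event array and return simple per-row summary values."""
    events = sorted(json.loads(raw), key=lambda e: e["sequence"])
    if not events:
        return {"event_count": 0, "span_ms": 0}
    times = [e["time"] for e in events]
    return {
        "event_count": len(events),
        "first_sequence": events[0]["sequence"],
        "last_sequence": events[-1]["sequence"],
        # Assumes "time" is epoch milliseconds.
        "span_ms": max(times) - min(times),
    }


summary = summarize_payload(payload)
```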
Data construction: Copy forward
cohort_playback_s
Big Data Portal
Data Product for Cohort Analysis
● Goals
○ Serve dozens of users
○ Provide interactive / low-latency tool
○ Provide many different perspectives
● Challenges
○ Manage high dimensionality
○ Very large time-series datasets
Analytic Tool Choices
                   | Choice 1           | Choice 2     | Choice 3
Analytic Tool      | Tool 1             | Tool 2       | Tool 3
Data Engine        | MPP                | Cloud        | In memory
Data Size          | 1B rows            | 10B rows     | 100M rows
Performance (SWAG) | Up to many minutes | Many minutes | Several seconds
Choice 4...
● Data stored in Druid
● Custom app built with JavaScript
● An open source data store for analytic applications
● Distributed, column-oriented, indexed architecture
● Well suited to serve our “flat” tables
Druid white paper: http://static.druid.io/docs/druid.pdf
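To illustrate how a flat table might be queried once it is loaded into Druid, here is a sketch that builds a Druid native groupBy query as a JSON document. Using cohort_playback_s as the datasource name, and the dimension and filter columns from the flat table, are assumptions. The talk does not show its actual queries. The snippet only constructs and prints the query; a real client would typically POST it to a Druid broker.

```python
import json


def partner_playback_query(datasource, start, end, region=None):
    """Build a Druid native groupBy query (as a dict) over a flat table."""
    query = {
        "queryType": "groupBy",
        "dataSource": datasource,
        "intervals": [f"{start}/{end}"],  # ISO-8601 interval string
        "granularity": "day",
        "dimensions": ["partner_name", "device_category"],
        # Counts stored Druid rows per group, not necessarily raw events.
        "aggregations": [{"type": "count", "name": "row_count"}],
    }
    if region is not None:
        # Selector filter: keep rows where the dimension equals the given value.
        query["filter"] = {"type": "selector", "dimension": "region", "value": region}
    return query


# Hypothetical datasource name and date range.
query = partner_playback_query("cohort_playback_s", "2019-01-01", "2019-02-01", region="EMEA")
print(json.dumps(query, indent=2))
```

Because the table is already flat, slicing by another dimension is a matter of changing the dimensions list rather than adding joins.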
● Built with Express, React, Redux, D3
● Custom UX / UI to manage views and dimensionality
● Enabled access to data served by Druid
● Enabled management of query execution and caching
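The last bullet mentions managing query execution and caching. The sketch below shows one common approach, not the app's actual implementation (which the slides say was built with Express, React, Redux and D3). It caches results under a hash of the canonicalized query, so equivalent queries with different key order share one entry. The QueryCache class and the fake executor are hypothetical.

```python
import hashlib
import json


class QueryCache:
    """Cache query results keyed by a hash of the canonical query JSON."""

    def __init__(self, execute):
        self._execute = execute  # callable that actually runs a query
        self._results = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(query):
        # sort_keys makes the key independent of dict insertion order.
        canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def run(self, query):
        key = self.key_for(query)
        if key in self._results:
            self.hits += 1
        else:
            self.misses += 1
            self._results[key] = self._execute(query)
        return self._results[key]


# Hypothetical executor standing in for a call to the data store.
executed = []


def fake_execute(query):
    executed.append(query)
    return {"rows": 42}


cache = QueryCache(fake_execute)
first = cache.run({"dataSource": "cohort_playback_s", "granularity": "day"})
second = cache.run({"granularity": "day", "dataSource": "cohort_playback_s"})
```

A production cache would also need expiry, since cached results go stale when the underlying data is restated.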
● Video demo with simulated data
● PED DEMO
Challenges
● Dimensions (aka slice-n-dice)
○ More is always better
○ Changes require restatement
● “Typical” use cases must be met also
○ Not a solution for every data question
○ Analysts and other tools are still needed
Challenges
● Data volume always increases…
○ More members, more partners, more devices, more metrics
● Custom app development time is longer, and ongoing
○ But for the right use cases, worthwhile
Partners help Netflix grow.
We measure partner value through cohorts.
Big data tools enable efficient analysis.
Thank you!
Blake Irvine - Growth Data Products
birvine@netflix.com
@blakeirvine
linkedin.com/in/blakeirvine/

More Related Content

What's hot

Contextualization at Netflix
Contextualization at NetflixContextualization at Netflix
Contextualization at NetflixLinas Baltrunas
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated RecommendationsHarald Steck
 
Homepage Personalization at Spotify
Homepage Personalization at SpotifyHomepage Personalization at Spotify
Homepage Personalization at SpotifyOguz Semerci
 
Context Aware Recommendations at Netflix
Context Aware Recommendations at NetflixContext Aware Recommendations at Netflix
Context Aware Recommendations at NetflixLinas Baltrunas
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsYves Raimond
 
Personalizing the listening experience
Personalizing the listening experiencePersonalizing the listening experience
Personalizing the listening experienceMounia Lalmas-Roelleke
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsEnrico Palumbo
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixJaya Kawale
 
Personalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningPersonalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningAnoop Deoras
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Anoop Deoras
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsJustin Basilico
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectiveJustin Basilico
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the WorldYves Raimond
 
Personalization at Netflix - Making Stories Travel
Personalization at Netflix -  Making Stories Travel Personalization at Netflix -  Making Stories Travel
Personalization at Netflix - Making Stories Travel Sudeep Das, Ph.D.
 
Data council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixData council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixGrace T. Huang
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix ScaleJustin Basilico
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsJustin Basilico
 

What's hot (20)

Contextualization at Netflix
Contextualization at NetflixContextualization at Netflix
Contextualization at Netflix
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Homepage Personalization at Spotify
Homepage Personalization at SpotifyHomepage Personalization at Spotify
Homepage Personalization at Spotify
 
Context Aware Recommendations at Netflix
Context Aware Recommendations at NetflixContext Aware Recommendations at Netflix
Context Aware Recommendations at Netflix
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Personalizing the listening experience
Personalizing the listening experiencePersonalizing the listening experience
Personalizing the listening experience
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
 
Personalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningPersonalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep Learning
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
 
Personalization at Netflix - Making Stories Travel
Personalization at Netflix -  Making Stories Travel Personalization at Netflix -  Making Stories Travel
Personalization at Netflix - Making Stories Travel
 
Data council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixData council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at Netflix
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
 
Session-Based Recommender Systems
Session-Based Recommender SystemsSession-Based Recommender Systems
Session-Based Recommender Systems
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 

Similar to Cohort Analysis at Scale

GiveSignup | RunSignup CRM Integrations
GiveSignup | RunSignup CRM IntegrationsGiveSignup | RunSignup CRM Integrations
GiveSignup | RunSignup CRM Integrationsrunsignup
 
Twin Cities Eloqua User Group 092413
Twin Cities Eloqua User Group 092413Twin Cities Eloqua User Group 092413
Twin Cities Eloqua User Group 092413Ron Corbisier
 
Before vs After: Redesigning a Website to be Useful and Informative for Devel...
Before vs After: Redesigning a Website to be Useful and Informative for Devel...Before vs After: Redesigning a Website to be Useful and Informative for Devel...
Before vs After: Redesigning a Website to be Useful and Informative for Devel...Teresa Giacomini
 
How to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics Cloud
How to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics CloudHow to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics Cloud
How to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics CloudWiiisdom
 
Measuring Results And Demonstrating Value.V1
Measuring Results And Demonstrating Value.V1Measuring Results And Demonstrating Value.V1
Measuring Results And Demonstrating Value.V1TechSoup Canada
 
Irina Pashina - UX Strategy Spanning Marketing and Technical Content at SAP
Irina Pashina - UX Strategy Spanning Marketing and Technical Content at SAPIrina Pashina - UX Strategy Spanning Marketing and Technical Content at SAP
Irina Pashina - UX Strategy Spanning Marketing and Technical Content at SAPLavaConConference
 
The Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
The Scout24 Data Landscape Manifesto: Building an Opinionated Data PlatformThe Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
The Scout24 Data Landscape Manifesto: Building an Opinionated Data PlatformRising Media Ltd.
 
SAP Process Mining in Action: Hear from Two Customers
SAP Process Mining in Action: Hear from Two CustomersSAP Process Mining in Action: Hear from Two Customers
SAP Process Mining in Action: Hear from Two CustomersCelonis
 
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data VisualizationsTatvic Analytics
 
How to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboardsHow to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboardsWiiisdom
 
Discover SAP BusinessObjects BI 4.3
Discover SAP BusinessObjects BI 4.3Discover SAP BusinessObjects BI 4.3
Discover SAP BusinessObjects BI 4.3Wiiisdom
 
#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...
#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...
#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...SAP Analytics
 
SPS Cambs 07-09-18 - Getting started with Dodel Driven PowerApps
SPS Cambs 07-09-18 - Getting started with Dodel Driven PowerAppsSPS Cambs 07-09-18 - Getting started with Dodel Driven PowerApps
SPS Cambs 07-09-18 - Getting started with Dodel Driven PowerAppsPeter Baddeley
 
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...confluent
 
Analytics in Your Enterprise
Analytics in Your EnterpriseAnalytics in Your Enterprise
Analytics in Your EnterpriseWSO2
 
BI4.2 SP06 and Beyond: The Future of SAP BusinessObjects Webi
BI4.2 SP06 and Beyond: The Future of SAP BusinessObjects WebiBI4.2 SP06 and Beyond: The Future of SAP BusinessObjects Webi
BI4.2 SP06 and Beyond: The Future of SAP BusinessObjects WebiWiiisdom
 
Supplier Success on the Ariba Network
Supplier Success on the Ariba NetworkSupplier Success on the Ariba Network
Supplier Success on the Ariba NetworkSAP Ariba
 
PayPal Real Time Analytics
PayPal  Real Time AnalyticsPayPal  Real Time Analytics
PayPal Real Time AnalyticsAnil Madan
 
UNIT 4 Social-Media-Marketing
UNIT 4 Social-Media-MarketingUNIT 4 Social-Media-Marketing
UNIT 4 Social-Media-MarketingMohammadAsim91
 
#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...
#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...
#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...SAP Analytics
 

Similar to Cohort Analysis at Scale (20)

GiveSignup | RunSignup CRM Integrations
GiveSignup | RunSignup CRM IntegrationsGiveSignup | RunSignup CRM Integrations
GiveSignup | RunSignup CRM Integrations
 
Twin Cities Eloqua User Group 092413
Twin Cities Eloqua User Group 092413Twin Cities Eloqua User Group 092413
Twin Cities Eloqua User Group 092413
 
Before vs After: Redesigning a Website to be Useful and Informative for Devel...
Before vs After: Redesigning a Website to be Useful and Informative for Devel...Before vs After: Redesigning a Website to be Useful and Informative for Devel...
Before vs After: Redesigning a Website to be Useful and Informative for Devel...
 
How to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics Cloud
How to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics CloudHow to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics Cloud
How to Convert Your SAP BusinessObjects Unused Licenses to SAP Analytics Cloud
 
Measuring Results And Demonstrating Value.V1
Measuring Results And Demonstrating Value.V1Measuring Results And Demonstrating Value.V1
Measuring Results And Demonstrating Value.V1
 
Irina Pashina - UX Strategy Spanning Marketing and Technical Content at SAP
Irina Pashina - UX Strategy Spanning Marketing and Technical Content at SAPIrina Pashina - UX Strategy Spanning Marketing and Technical Content at SAP
Irina Pashina - UX Strategy Spanning Marketing and Technical Content at SAP
 
The Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
The Scout24 Data Landscape Manifesto: Building an Opinionated Data PlatformThe Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
The Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
 
SAP Process Mining in Action: Hear from Two Customers
SAP Process Mining in Action: Hear from Two CustomersSAP Process Mining in Action: Hear from Two Customers
SAP Process Mining in Action: Hear from Two Customers
 
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
[Webinar Deck] Google Data Studio for Mastering the Art of Data Visualizations
 
How to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboardsHow to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboards
 
Discover SAP BusinessObjects BI 4.3
Discover SAP BusinessObjects BI 4.3Discover SAP BusinessObjects BI 4.3
Discover SAP BusinessObjects BI 4.3
 
#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...
#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...
#askSAP Analytics Innovations Community Call: SAP 2018 strategy and Roadmap f...
 
SPS Cambs 07-09-18 - Getting started with Dodel Driven PowerApps
SPS Cambs 07-09-18 - Getting started with Dodel Driven PowerAppsSPS Cambs 07-09-18 - Getting started with Dodel Driven PowerApps
SPS Cambs 07-09-18 - Getting started with Dodel Driven PowerApps
 
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
 
Analytics in Your Enterprise
Analytics in Your EnterpriseAnalytics in Your Enterprise
Analytics in Your Enterprise
 
BI4.2 SP06 and Beyond: The Future of SAP BusinessObjects Webi
BI4.2 SP06 and Beyond: The Future of SAP BusinessObjects WebiBI4.2 SP06 and Beyond: The Future of SAP BusinessObjects Webi
BI4.2 SP06 and Beyond: The Future of SAP BusinessObjects Webi
 
Supplier Success on the Ariba Network
Supplier Success on the Ariba NetworkSupplier Success on the Ariba Network
Supplier Success on the Ariba Network
 
PayPal Real Time Analytics
PayPal  Real Time AnalyticsPayPal  Real Time Analytics
PayPal Real Time Analytics
 
UNIT 4 Social-Media-Marketing
UNIT 4 Social-Media-MarketingUNIT 4 Social-Media-Marketing
UNIT 4 Social-Media-Marketing
 
#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...
#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...
#askSAP Analytics Innovations Community Call: Become an Intelligent Enterpris...
 

Recently uploaded

Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 

Recently uploaded (20)

Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 

Cohort Analysis at Scale

  • 1. Cohort Analysis at Scale BLAKE IRVINE STRATA SAN JOSE 2019.03.06
  • 5.
  • 6.
  • 8. Partners are companies that make it easier for people to sign up and engage with our service and help us retain members. BLAKE IRVINE | STRATA SAN JOSE 2018
  • 9. BLAKE IRVINE | STRATA SAN JOSE 2018
  • 10. Let’s take a (virtual) trip!
  • 11. Trip to Strata San Jose
  • 12. Trip to Strata San Jose
  • 13. Trip to Strata San Jose
  • 15. Evaluating Partners
      ● Are partners helping us acquire members?
      ● Through which channels?
      ● How does a partner impact the regional market?
      ● Do members use our service differently on partner devices?
      ● How do partners compare to each other?
  • 16. Cohorts are the collections of members associated with a partner that are relevant to the business question.
  • 17. Cohorts can be complex
      ● Our trip example showed how one member can be associated with many partners.
      ● Business teams want to explore the nuance of cohorts...
  • 18. … for example
  • 19. Evaluating Cohorts
      ● 100+ million members
      ● Dozens of partners
      ● Many combinations of members and partners
      ● Leads to…
          ○ High dimensionality, high cardinality datasets
          ○ Very large datasets of member-level time-series activity
  • 21. Cohort Analysis at Scale
      ● Data platform
      ● Data construction
      ● Data product for cohort analysis
  • 23. Data construction: Data Model (diagram entities: member, device, partner, playback events, isp, billing events, billing processor)
  • 24. Data construction: Cohort Dataset (diagram entities: signup_events, cohort, playback_events, billing_events, x_events)
  • 25. Data construction: Flat Tables (diagram tables: playback_f, cohort_playback_s, device_d, isp_d, geo_d, partner_d, cohort_d)
  • 26. Data for consumption: Flat Table
        key  member_id  device_id  device_name     device_category  partner_name  country  region    data_payload
        1    1213       674        Amazon Fire TV  Set Top Box      Amazon        US       Americas  (shown under the table)
        2    7623       1172       Chromecast      Streaming Stick  Google        DE       EMEA      …
        3    4291       129        PS3             Game Console     Sony          ES       EMEA      …
        4    9013       447        iPad 4          Tablet           Apple         CA       Americas  …
        data_payload for key 1: [{"id":5025945823792539,"sequence":41,"time":1491962092955}, {"id":5025947899236389,"sequence":95,"time":1491962104824}]
  • 27. Data construction: Copy forward (diagram labels: cohort_playback_s, Big Data Portal)
  • 28. Data Product for Cohort Analysis
      ● Goals
          ○ Serve dozens of users
          ○ Provide interactive / low-latency tool
          ○ Provide many different perspectives
      ● Challenges
          ○ Manage high dimensionality
          ○ Very large time-series datasets
  • 29. Analytic Tool Choices
                            Choice 1            Choice 2      Choice 3
        Analytic Tool       Tool 1              Tool 2        Tool 3
        Data Engine         MPP                 Cloud         In memory
        Data Size           1B rows             10B rows      100M rows
        Performance (SWAG)  Up to many minutes  Many minutes  Several seconds
  • 30. Choice 4...
      ● Data stored in Druid
      ● Custom app built with JavaScript
  • 31.
      ● An open source data store for analytic applications
      ● Distributed, column-oriented, indexed architecture
      ● Well suited to serve our “flat” tables
      Druid white paper: http://static.druid.io/docs/druid.pdf
  • 32.
      ● Built with Express, React, Redux, D3
      ● Custom UX / UI to manage views and dimensionality
      ● Enabled access to data served by Druid
      ● Enabled management of query execution and caching
  • 33. ● Video demo with simulated data
  • 34. ● PED DEMO
  • 36. Challenges
      ● Dimensions (aka slice-n-dice)
          ○ More is always better
          ○ Changes require restatement
      ● “Typical” use cases must be met also
          ○ Not a solution for every data question
          ○ Analysts and other tools are still needed
  • 37. Challenges
      ● Data volume always increases…
          ○ More members, more partners, more devices, more metrics
      ● Custom app development time is longer, and ongoing
          ○ But for the right use cases, worthwhile
  • 38. Partners help Netflix grow. We measure partner value through cohorts. Big data tools enable efficient analysis. Thank you!
  • 39. Blake Irvine - Growth Data Products birvine@netflix.com @blakeirvine linkedin.com/in/blakeirvine/
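
The "flat table" idea from slides 25 and 26 can be illustrated with a minimal sketch. It is not the pipeline from the talk: the table names (playback_f, device_d, partner_d, geo_d) and the column names come from the slides, while the join keys, the way attributes are split across dimension tables, and the helper functions flatten and members_by are assumptions made only for illustration. The sample values are the first two rows of the slide 26 table.

```python
from collections import defaultdict

# Dimension tables keyed by id. Names follow the slides; the partner_id join key
# and the placement of attributes in each table are assumptions.
device_d = {
    674: {"device_name": "Amazon Fire TV", "device_category": "Set Top Box", "partner_id": 1},
    1172: {"device_name": "Chromecast", "device_category": "Streaming Stick", "partner_id": 2},
}
partner_d = {1: {"partner_name": "Amazon"}, 2: {"partner_name": "Google"}}
geo_d = {"US": {"region": "Americas"}, "DE": {"region": "EMEA"}}

# Fact rows, one per playback event (a hypothetical shape for playback_f).
playback_f = [
    {"member_id": 1213, "device_id": 674, "country": "US"},
    {"member_id": 1213, "device_id": 674, "country": "US"},
    {"member_id": 7623, "device_id": 1172, "country": "DE"},
]


def flatten(fact_rows):
    """Denormalize each fact row by copying dimension attributes onto it."""
    flat = []
    for row in fact_rows:
        device = device_d[row["device_id"]]
        flat.append({
            "member_id": row["member_id"],
            "device_id": row["device_id"],
            "device_name": device["device_name"],
            "device_category": device["device_category"],
            "partner_name": partner_d[device["partner_id"]]["partner_name"],
            "country": row["country"],
            "region": geo_d[row["country"]]["region"],
        })
    return flat


def members_by(flat_rows, dimension):
    """Count distinct members per value of one dimension (a simple cohort cut)."""
    members = defaultdict(set)
    for row in flat_rows:
        members[row[dimension]].add(row["member_id"])
    return {value: len(ids) for value, ids in members.items()}


flat_rows = flatten(playback_f)
cohort_sizes = members_by(flat_rows, "partner_name")
```

Because every dimension attribute already sits on the flat row, a cut by region or device_category is the same call with a different dimension name and needs no join at query time, which is roughly what makes such tables a good fit for a column-oriented store.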

Editor's Notes

  1. In 2016, Netflix became available globally in almost all countries!