What Are OpenTable 
Diners Talking About? 
Sudeep Das, Ph.D. @datamusing 
Data Scientist 
OpenTable
OpenTable in numbers 
• Our network connects diners with more than 31,000 restaurants worldwide.
• Our diners have spent more than $20 billion at our partner restaurants.
• OpenTable seats more than 15 million diners each month.
• Every month, OpenTable diners write more than 400,000 restaurant reviews.
We’ve got reviews! 
• Our network connects diners with more than 31,000 restaurants worldwide.
• Our diners have provided more than 25 million reviews.
• For our first set of reviews, we looked at the San Francisco Bay Area, where we have technology in about half of all reservation-taking restaurants and seated about 40% of all diners in this market.
Reviews come in all shapes and sizes!
This really is a hidden gem and I'm not sure I want to share but I will. :) The owner, Claude, has 
been here for 47 years and is all about quality, taste, and not overcharging for what he loves. My 
husband and I don't often get into the city at night, but when we do this is THE place. The Grand 
Marnier Souffle' is the best I've had in my life - and I have a few years on the life meter. The custard 
is not over the top and the texture of the entire dessert is superb. This is the only family style French 
restaurant I'm aware of in SF. It also doesn't charge you an arm and a leg for their excellent quality 
and that also goes for the wine list. Soup, salad, choice of main (try the lamb shank) and choice of 
“SUPERB!” 
Bay Area Reviews 
Post Jan 2013
Learning Topics 
From diner-talk to insights! 
• Analyze the global corpus of reviews in a geographic region to learn topics
• Classify topics into categories (food, ambiance, service, etc.)
• Map topics back to restaurants
• For each restaurant and a topic, surface relevant reviews
• Use learned topic weights to find restaurants relevant to a topic/category
Topics can be leveraged to power global to local feature generation
ULTRA LOCAL
Most talked about menu items at a restaurant: “If you are at Slanted Door in San Francisco, try their Cellophane Noodles with Dungeness Crab…”
LOCAL
Regional Dishes: “If you are in London, be sure to try a Meat Fruit Tipsy Cake …”
GLOBAL
Dish Tags: “Find all restaurants that serve Lemon Ricotta Pancakes”
We used a purely Python stack…
During the exploratory stage, we applied the Python stack:
pandas, scikit-learn, gensim, py-nltk
We applied two main topic modeling methods:
• Latent Dirichlet Allocation (LDA) (Blei et al. 2003)
• Non-negative Matrix Factorization (NMF) (Arora et al. 2012)
We are currently porting the techniques over to Apache Spark.
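from sklearn import decomposition
from sklearn.feature_extraction.text import CountVectorizer

# Build a bag-of-words count matrix from the review text
vectorizer = CountVectorizer(max_df=0.9,
                             max_features=10000,
                             stop_words="english")
counts = vectorizer.fit_transform(the_reviews.revs.ReviewText.values)
# Factorize the counts into n_topics non-negative topic components
decomp = decomposition.NMF(n_components=self.n_topics).fit(counts)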
Some topics from the SF Bay Area
Not all topics are created equal! 
The global topics cluster into six major categories: 
Food Drinks Ambiance 
Value Service Special occasion 
We found 200 to be a good number of topics to learn.
We manually label the first crop of topics, and subsequently use a
classifier to relabel any new crop of topics against the labeled ones.
Generic
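The slides do not say which classifier is used to relabel a new crop of topics. One simple possibility, shown here purely as an assumption, is a nearest-neighbor rule: represent each topic by its term weights and give a new topic the category of the most similar hand-labeled topic under cosine similarity. The topics, weights, and function names below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical hand-labeled topics: (category, term weights).
labeled_topics = [
    ("Drinks", {"wine": 0.9, "list": 0.6, "sommelier": 0.4}),
    ("Ambiance", {"view": 0.9, "bay": 0.5, "sunset": 0.4}),
    ("Service", {"server": 0.8, "attentive": 0.6, "friendly": 0.5}),
]


def label_topic(new_topic, labeled):
    """Return the category of the most similar labeled topic."""
    best_category, _ = max(
        ((category, cosine(new_topic, weights)) for category, weights in labeled),
        key=lambda pair: pair[1],
    )
    return best_category


# A topic from a hypothetical new crop.
new_topic = {"wine": 0.7, "pairing": 0.5, "list": 0.3}
predicted = label_topic(new_topic, labeled_topics)
print(predicted)
```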
Topics and categories 
Food Drinks Ambiance 
Topics and categories 
Value Service Special occasion 
We ignore very generic topics that have to do with just going
to a restaurant.
Regional nuances in topics 
We ran topic analyses separately for each of four 
metropolitan areas: 
SAN FRANCISCO 
CHICAGO 
NEW YORK 
LONDON
Bay view, ocean view, city view …
SAN FRANCISCO CHICAGO 
NEW YORK 
LONDON
Sunday Brunch or Sunday Roast?
US CITIES LONDON
Yorkshire Pudding, Curry 
and Afternoon Teas! 
Some topics that are dominant only in London 
Reassigning topics back to restaurants
Each review for a given restaurant has a certain topic distribution. Combining them, we identify the top topics for that restaurant.
[Figure: bar charts of topic weights (0 to 1) over Topic 01 to Topic 05 for review 1, review 2, …, review N, combined into a single chart for the restaurant]
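The slides do not specify how the per-review distributions are combined. A minimal sketch, assuming a plain average over the restaurant's reviews, could look like this. The weights and function names are illustrative only.

```python
def restaurant_topic_profile(review_topic_weights):
    """Average per-review topic weights into one restaurant-level profile."""
    n_reviews = len(review_topic_weights)
    n_topics = len(review_topic_weights[0])
    return [
        sum(review[k] for review in review_topic_weights) / n_reviews
        for k in range(n_topics)
    ]


def top_topics(profile, n=2):
    """Indices of the n highest-weighted topics, strongest first."""
    return sorted(range(len(profile)), key=lambda k: profile[k], reverse=True)[:n]


# Hypothetical topic distributions for three reviews over five topics.
reviews = [
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.60, 0.05, 0.25, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.05, 0.05],
]

profile = restaurant_topic_profile(reviews)
best = top_topics(profile, n=2)
print(profile, best)
```

Other combinations (for example, weighting recent or longer reviews more heavily) would fit the same interface; the average is only the simplest choice.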
From Restaurant 
Topics to Reviews 
Once we have identified the top topics for 
a restaurant, we score reviews according 
to the weight of that topic, and surface the 
top reviews. 
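As a sketch of this step, assuming each review already carries its topic weights, surfacing the top reviews reduces to a sort on the weight of the chosen topic. The review texts, topic names, and weights below are invented for illustration.

```python
def top_reviews_for_topic(reviews, topic, n=2):
    """Return the texts of the n reviews with the highest weight on `topic`."""
    ranked = sorted(
        reviews,
        key=lambda review: review["topic_weights"].get(topic, 0.0),
        reverse=True,
    )
    return [review["text"] for review in ranked[:n]]


# Hypothetical reviews of one restaurant with their topic weights.
reviews = [
    {"text": "The wine list is deep and fairly priced.",
     "topic_weights": {"wine list": 0.80, "service": 0.10}},
    {"text": "Our server was attentive all night.",
     "topic_weights": {"wine list": 0.05, "service": 0.85}},
    {"text": "Good pairing suggestions from a friendly server.",
     "topic_weights": {"wine list": 0.50, "service": 0.40}},
]

wine_reviews = top_reviews_for_topic(reviews, "wine list", n=2)
print(wine_reviews)
```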
Time for a little demo! 
http://restaurant-topics.herokuapp.com/ 
Python Flask + D3.js 
Deployed with Docker
Using Topics to Find 
Relevant Restaurants! 
We can now turn the topic analysis on its head!
Once a topic is chosen, we can rank restaurants according to their relevance to that topic!
That, now, becomes a way to find restaurants according to a topic!
“Find me all restaurants with a great wine list”
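A minimal sketch of this inversion, assuming each restaurant has a topic profile (topic name to weight) and that relevance is simply the weight of the chosen topic in that profile. The restaurant names and numbers are placeholders.

```python
def rank_restaurants(profiles, topic, n=3):
    """Rank restaurants by the weight of `topic` in their topic profile."""
    ranked = sorted(
        profiles.items(),
        key=lambda item: item[1].get(topic, 0.0),
        reverse=True,
    )
    return [name for name, _ in ranked[:n]]


# Hypothetical restaurant-level topic profiles.
profiles = {
    "Restaurant A": {"wine list": 0.45, "ocean view": 0.05},
    "Restaurant B": {"wine list": 0.10, "ocean view": 0.60},
    "Restaurant C": {"wine list": 0.30, "ocean view": 0.20},
}

best_for_wine = rank_restaurants(profiles, "wine list", n=2)
print(best_for_wine)
```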
From N-grams and topics 
to complex dish tags … 
An n-gram is a contiguous sequence of n items from a given sequence of text (e.g. the 2-grams of “to be or not to be” are: to be, be or, or not, not to, to be).
Most dishes are 1-grams (“tiramisu”), 2-grams (“pork cutlets”), or 3-grams (“lemon ricotta pancake”).
For each restaurant, we perform an N-gram analysis of the reviews within the scope of food topics and surface candidate dish tags.
We were able to generate several thousand dish tags using this methodology!
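The n-gram step can be sketched in a few lines. This is an illustration under assumptions, not the pipeline from the talk: it assumes the reviews have already been restricted to food topics, uses whitespace tokenization, and applies a made-up stop-word list and frequency cutoff (`min_count`) to pick candidates.

```python
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the talk).
STOP_WORDS = {"the", "a", "were", "was", "and", "loved", "we", "i"}


def ngrams(tokens, n):
    """All contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_dish_tags(reviews, max_n=3, min_count=2):
    """Count 1- to max_n-grams and keep frequent ones not bounded by stop words."""
    counts = Counter()
    for text in reviews:
        tokens = text.lower().split()
        for n in range(1, max_n + 1):
            for gram in ngrams(tokens, n):
                words = gram.split()
                if words[0] in STOP_WORDS or words[-1] in STOP_WORDS:
                    continue
                counts[gram] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


# The 2-gram example from the slide.
bigrams = ngrams("to be or not to be".split(), 2)

# Hypothetical food-topic reviews for one restaurant.
reviews = [
    "the lemon ricotta pancakes were fluffy",
    "loved the lemon ricotta pancakes",
]
tags = candidate_dish_tags(reviews)
print(bigrams, tags)
```

A real pipeline would still need to drop sub-phrases such as "lemon ricotta" when the longer tag explains them; that step is left out here.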
From dish tags to most 
talked about menu items 
For each restaurant, we find the dish tags with the strongest signal.
We perform a fuzzy matching with the menu data (both name and description) from SinglePlatform.
This lets us automagically disambiguate the various ways people refer to the same dish!
For example, “crab noodles”, “noodles with dungeness”, and “cellphone noodles with crab” all refer to the same dish, “Cellophane Noodles with Dungeness Crab”, on the menu.
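The talk does not name the fuzzy-matching method. One way to sketch it with only the standard library is to score each dish tag against each menu item token by token with `difflib.SequenceMatcher`, so that misspellings like "cellphone" still land on "cellophane". The scoring rule, the 0.8 threshold, and the two extra menu items are assumptions for illustration, and the sketch matches on dish names only, not descriptions.

```python
from difflib import SequenceMatcher


def token_similarity(a, b):
    """Character-level similarity of two tokens, between 0 and 1."""
    return SequenceMatcher(None, a, b).ratio()


def tag_to_dish_score(tag, dish):
    """Average, over tag tokens, of the best similarity to any dish token."""
    tag_tokens = tag.lower().split()
    dish_tokens = dish.lower().split()
    best = [max(token_similarity(t, d) for d in dish_tokens) for t in tag_tokens]
    return sum(best) / len(best)


def match_dish(tag, menu, threshold=0.8):
    """Return the best-matching menu item, or None if nothing is close enough."""
    score, dish = max((tag_to_dish_score(tag, d), d) for d in menu)
    return dish if score >= threshold else None


# One dish from the slide plus two hypothetical menu items.
menu = [
    "Cellophane Noodles with Dungeness Crab",
    "Grilled Lamb Chops",
    "Lemon Ricotta Pancakes",
]

for tag in ["crab noodles", "noodles with dungeness", "cellphone noodles with crab"]:
    print(tag, "->", match_dish(tag, menu))
```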
Most talked about menu Items 
Finding regional dishes! 
Because we performed dish tag discovery by metro area, we were able 
to also surface the most unique dishes per metro. 
[Figure labels: ALL DISH TAGS; DISH TAGS SPECIAL TO LONDON]
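How "most unique" is measured is not stated in the slides. One plausible reading, offered only as an assumption, is to compare a tag's share of mentions within a metro to its share across all metros, and rank by that ratio. The counts and tag names below are invented.

```python
from collections import Counter


def distinctive_tags(tag_counts_by_metro, metro, n=2):
    """Rank a metro's dish tags by local share divided by overall share."""
    local = tag_counts_by_metro[metro]
    overall = Counter()
    for counts in tag_counts_by_metro.values():
        overall.update(counts)
    local_total = sum(local.values())
    overall_total = sum(overall.values())

    def lift(tag):
        local_share = local[tag] / local_total
        overall_share = overall[tag] / overall_total
        return local_share / overall_share

    return sorted(local, key=lift, reverse=True)[:n]


# Hypothetical dish-tag mention counts per metro.
tag_counts_by_metro = {
    "London": {"tipsy cake": 40, "sunday roast": 60, "burger": 50},
    "San Francisco": {"burger": 70, "cellophane noodles": 30},
}

london_specials = distinctive_tags(tag_counts_by_metro, "London", n=2)
print(london_specials)
```

Tags that appear everywhere, such as "burger" in this toy data, get a ratio below 1 and drop out of the top of the list.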
Evaluating Spark+MLlib… 
Future Directions 
• Sentiment Analysis on Reviews
• Trending Topics and Sentiments
And a lot more …
