Recommenders
Shallow / Deep
SUDEEP DAS
Frontiers and Advances in Data Sciences Conference,
Xi’an, China 2017
Recommendations
guide our
experiences almost
everywhere!
Personalization in my
typical day
Morning: News/ Workout/ Getting ready
Commute hours:
Music/ YouTube Lectures/ Books
Now and then:
Social Media/ Shopping online
Evenings are for Netflix, of course!
ORIGINS
● 2006-2009: Netflix Prize:
○ >10% improvement, win $1,000,000
● Top performing model(s) ended up being a
variation of Matrix Factorization (SVD++,
Koren et al.)
● Although Netflix’s rec system has moved on,
MF is still the foundational method on which
most collaborative filtering systems are
based
Background
Matrix
Factorization
Singular Value Decomposition (Origins)
$R = U \Sigma V^T$
● $R$: ratings matrix (users × items)
● $U$, $V^T$: left/right singular vectors (orthonormal basis)
● $\Sigma$: singular values (scaling)
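SVD: Largest singular values for approximation
● Low-rank approximation: $R \approx U' \Sigma' V'^T$
● Eckart-Young theorem: $[U', \Sigma', V'^T] = \arg\min_{U, \Sigma, V} \lVert R - U \Sigma V^T \rVert_F^2$, where $\lVert \cdot \rVert_F$ is the Frobenius norm and the minimum is taken over factors of the chosen low rank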
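A minimal NumPy sketch of the low-rank approximation above, assuming a small dense toy matrix (real ratings matrices are sparse and mostly unobserved, which plain SVD does not handle). The helper name `truncated_svd` and the toy data are illustrative, not from the talk.

```python
import numpy as np

def truncated_svd(R, k):
    """Return the rank-k factors that keep the k largest singular values."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # np.linalg.svd returns singular values in descending order.
    return U[:, :k], s[:k], Vt[:k, :]

rng = np.random.default_rng(0)
R = rng.random((6, 5))                      # toy dense "ratings" matrix
U_k, s_k, Vt_k = truncated_svd(R, k=2)
R_k = U_k @ np.diag(s_k) @ Vt_k             # rank-2 approximation of R

# Eckart-Young: the squared Frobenius error equals the sum of the
# squared singular values that were discarded.
s_all = np.linalg.svd(R, compute_uv=False)
err = np.linalg.norm(R - R_k, "fro") ** 2
```

Comparing `err` with `np.sum(s_all[2:] ** 2)` is a quick way to check the theorem numerically.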
Low-rank Matrix Factorization
● No orthogonality requirement
● Weighted least squares (or others)
$R \approx P Q$, where the inner dimension of $P$ and $Q$ is the size of the latent space.
Compared with $U \Sigma V^T$, the scaling factor is absorbed into both matrices (not normalized).
Low-rank MF (cont…)
● Bias terms: overall bias, user bias, item bias
● Regularization, e.g. L2, L1, etc.
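One common way to combine the bias terms and L2 regularization is the prediction $\hat{r}_{ui} = \mu + b_u + b_i + p_u^T q_i$ trained with stochastic gradient descent. The sketch below assumes that form; the function names, hyperparameters, and the six toy ratings are made up for illustration.

```python
import numpy as np

def train_biased_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
                    epochs=500, seed=0):
    """SGD on squared error with L2 regularization (illustrative settings)."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))   # user factors
    Q = 0.1 * rng.standard_normal((n_items, k))   # item factors
    b_u = np.zeros(n_users)                       # user biases
    b_i = np.zeros(n_items)                       # item biases
    mu = np.mean([r for _, _, r in ratings])      # overall bias
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            p_old = P[u].copy()                   # use the pre-update user vector
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return mu, b_u, b_i, P, Q

def predict(model, u, i):
    mu, b_u, b_i, P, Q = model
    return mu + b_u[u] + b_i[i] + P[u] @ Q[i]

# (user, item, rating) triples; only observed entries are used.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
model = train_biased_mf(ratings, n_users=3, n_items=3)
rmse = np.sqrt(np.mean([(r - predict(model, u, i)) ** 2 for u, i, r in ratings]))
```

Only observed ratings enter the loss, which is the practical difference from the SVD formulation.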
From Olivier Grisel, dotAI 2017
The FeedForward View
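MF Extensions
Asymmetric Matrix Factorization
● Replace user-vector with sum of item vectors
$R \approx (I(R)\, Y)\, Q$
● $N(u)$ is all items user $u$ rated/viewed/clicked; row $u$ of the indicator $I(R)$ marks exactly those items
● $Y$ and $Q$ are both indexed by items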
AMF, relation to Neural Network
1-hot encoding of a user’s
play history
Single hidden layer is
equivalent to learning
a Y and Q matrix (aka
weights)
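The equivalence can be sketched in a few lines of NumPy: multiplying the binary play-history matrix by Y is the hidden layer, and multiplying by Q is the output layer. Shapes, random weights, and variable names here are assumptions for illustration; no training is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 6, 3

# I(R): binary indicator of which items each user interacted with (toy data).
I_R = (rng.random((n_users, n_items)) < 0.5).astype(float)
Y = rng.standard_normal((n_items, k))   # input-side item vectors (layer 1 weights)
Q = rng.standard_normal((k, n_items))   # output-side item vectors (layer 2 weights)

# Network view: the hidden activation is the sum of the Y rows for items in N(u).
hidden = I_R @ Y            # shape (n_users, k); plays the role of the user vector
scores = hidden @ Q         # shape (n_users, n_items); approximates R

# The same user vector written as an explicit sum over N(u) for one user.
u = 0
N_u = np.flatnonzero(I_R[u])
user_vec = Y[N_u].sum(axis=0)
```

Because the user vector is built from item vectors, a new user with some play history can be scored without learning any user-specific parameters.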
SLIM
● SLIM replaces the low-rank approximation with a sparse item-item matrix. Sparsity comes from the L1 regularizer.
● Equivalent to constructing a regression that uses the user’s play history to predict ratings.
● NB: It is important that the diagonal is excluded; otherwise the solution is trivial.
$R \approx I(R)\, Y$, where $Y$ is an items × items matrix whose diagonal is replaced with zeros.
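A small sketch of fitting such a sparse item-item matrix, assuming an elastic-net objective with non-negative weights solved by proximal gradient descent. This solver is chosen here for brevity and is not necessarily the one used in the SLIM paper; `fit_slim`, the penalties, and the toy matrix are illustrative.

```python
import numpy as np

def fit_slim(A, l1=0.1, l2=0.1, n_iter=500):
    """Minimize 0.5*||A - A W||_F^2 + 0.5*l2*||W||_F^2 + l1*||W||_1
    subject to W >= 0 and diag(W) = 0, by proximal gradient descent."""
    n_items = A.shape[1]
    G = A.T @ A                                        # item-item Gram matrix
    step = 1.0 / (np.linalg.eigvalsh(G)[-1] + l2)      # 1 / Lipschitz constant
    W = np.zeros((n_items, n_items))
    for _ in range(n_iter):
        grad = G @ W - G + l2 * W                      # gradient of the smooth part
        W = W - step * grad
        W = np.maximum(W - step * l1, 0.0)             # L1 shrinkage + non-negativity
        np.fill_diagonal(W, 0.0)                       # exclude the trivial solution
    return W

# Toy binary interaction matrix: rows are users, columns are items.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1],
              [1, 1, 0, 0]], dtype=float)
W = fit_slim(A)
scores = A @ W      # predicted affinity of every user for every item
```

Without the zeroed diagonal the identity matrix would reproduce A exactly, which is the trivial solution the slide warns about.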
Clustering and
PGM
Example / Motivation
Classic Example / Motivation
Users now belong to multiple “topics”, with some proportion (0.88 and 0.12 in the illustrated example).
Purchases are a mix proportional to the user’s affinity for each topic and each item’s affinity within the topic.
[Figure: LDA plate diagram with variables α, θ, z, w, φ, β and plates K, D, W]
Latent Dirichlet Allocation (LDA)
LDA as a generative model
What topics look like:
[Figure: example with values 0.15, 0.63, 0.22]
Final step: Recommending from topics
● Once we’ve learnt a user’s distribution over topics, $\theta_u$, and each topic’s distribution over items, $\phi_k$, producing a recommendation is easy.
● Score every item $i$ by its probability under the user’s topic mixture, $p(i \mid u) = \sum_k \theta_{u,k}\, \phi_{k,i}$, and recommend the items with the highest probability (discarding items the user has already purchased).
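The scoring step can be sketched as a single vector-matrix product, assuming the topic model has already been fitted. The numbers below are made-up toy values (the user mixture echoes the 0.88 / 0.12 proportions of the example), and `recommend` is a hypothetical helper name.

```python
import numpy as np

def recommend(theta_u, phi, purchased, n):
    """Rank items by sum_k theta_u[k] * phi[k, i], skipping purchased items."""
    scores = theta_u @ phi                              # p(item | user), one per item
    mask = np.array(sorted(purchased), dtype=int)
    scores[mask] = -np.inf                              # never re-recommend purchases
    return np.argsort(-scores)[:n]

theta_u = np.array([0.88, 0.12])                        # user's distribution over 2 topics
phi = np.array([[0.50, 0.30, 0.15, 0.05],               # topic 0 distribution over 4 items
                [0.10, 0.10, 0.30, 0.50]])              # topic 1 distribution over 4 items
item_probs = theta_u @ phi
top = recommend(theta_u, phi, purchased={0}, n=2)       # items 1 and 2, in that order
```

Since each row of `phi` and `theta_u` itself sum to one, `item_probs` is a proper distribution over items.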
Deep Learning
in Recommender Systems
Why deep?
Deep
Learning
Is Making
Waves
Everywhere!
In many domains, deep learning is achieving near-human
or super-human accuracy!
However, applications of Deep Learning in Recommender Systems are still in their infancy.
So, what is Deep Learning?
A class of machine learning algorithms:
● that use a cascade of multiple non-linear processing layers
● and complex model structures
● to learn different representations of the data in each layer
● where higher level features are derived from lower level features to form a
hierarchical representation.
Balázs Hidasi, RecSys 2016
Traditional vs Deep
Traditional ML: Handcrafted Features → Trainable Classifier → “Socrates”
Deep Learning: Learned/Trainable Features → Trainable Classifier → “Socrates”
Learning hierarchical representations of data
Each layer learns progressively more complex representations from its predecessor:
Raw Pixels → Edges → Parts of Objects composed from edges → Object models → Trainable Classifier → “Socrates”
Earliest adaptation: Restricted Boltzmann Machines
From recent presentation by
Alexandros Karatzoglou
One hidden layer. User feedback on the items interacted with is propagated back to all items. Very similar to an autoencoder!
There are many ways to make this deep.
From Olivier Grisel, dotAI 2017
Deep Triplet Networks
From Olivier Grisel,
dotAI 2017
Wide + Deep Models for Recommendations
In a recommender setting, you may want to train with a wide set of
cross-product feature transformations, so that the model essentially
memorizes these sparse feature combinations (rules).
Cheng et al., Google Inc. (2016)
On the other hand, you may want the ability to generalize using the
representational power of a deep network. But deep nets can
over-generalize.
Best of both worlds:
Jointly train a deep + wide
network. The cross-feature
transformation in the wide
model component can
memorize all those sparse,
specific rules, while the
deep model component can
generalize to similar items
via embeddings.
Wide + Deep Model
for app
recommendations.
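A forward-pass sketch of the joint wide + deep idea, assuming one cross-product feature (installed app × impression app) on the wide side and two embedded categorical features on the deep side. All sizes, feature names, and the random weights are illustrative; training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_apps, n_langs, emb_dim, hidden = 5, 3, 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Wide part: one weight per (installed_app, impression_app) combination,
# so each specific pair can be memorized on its own.
w_wide = 0.1 * rng.standard_normal(n_apps * n_apps)

# Deep part: embeddings followed by one ReLU hidden layer.
app_emb = 0.1 * rng.standard_normal((n_apps, emb_dim))
lang_emb = 0.1 * rng.standard_normal((n_langs, emb_dim))
W1 = 0.1 * rng.standard_normal((2 * emb_dim, hidden))
w_deep = 0.1 * rng.standard_normal(hidden)
bias = 0.0

def predict(installed_app, impression_app, lang):
    cross_id = installed_app * n_apps + impression_app   # index of the cross-product feature
    wide_logit = w_wide[cross_id]
    x = np.concatenate([app_emb[impression_app], lang_emb[lang]])
    h = np.maximum(x @ W1, 0.0)                          # ReLU hidden layer
    deep_logit = h @ w_deep
    # Joint model: both logits are summed before a single sigmoid,
    # so one loss would update both parts together.
    return sigmoid(wide_logit + deep_logit + bias)

p = predict(installed_app=1, impression_app=3, lang=0)
```

Summing the logits before the sigmoid is what distinguishes joint training from an ensemble of two separately trained models.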
The YouTube Recommendation model
A two-stage approach with two deep networks:
● The candidate generation network takes events
from the user’s YouTube activity history as input and
retrieves a small subset (hundreds) of videos from a
large corpus. These candidates are intended to be
generally relevant to the user with high precision. The
candidate generation network only provides broad
personalization via collaborative filtering.
● The ranking network scores each video according to
a desired objective function using a rich set of
features describing the video and user. The highest
scoring videos are presented to the user, ranked by
their score.
Covington et al., Google Inc. (2016)
Deep candidate generation model architecture
● Embedded sparse features are concatenated with
dense features. Embeddings are averaged
before concatenation to transform variable-sized
bags of sparse IDs into fixed-width
vectors suitable for input to the hidden layers.
● All hidden layers are fully connected.
● In training, a cross-entropy loss is minimized
with gradient descent on the output of the
sampled softmax.
● At serving, an approximate nearest neighbor
lookup is performed to generate hundreds of
candidate video recommendations.
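The input and serving steps above can be sketched as follows. This assumes a single ReLU hidden layer and uses an exact brute-force dot-product search as a stand-in for the approximate nearest neighbor lookup; sampled-softmax training is omitted, and all names, sizes, and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_videos, emb_dim, n_dense, hidden = 200, 16, 3, 8

video_emb = 0.1 * rng.standard_normal((n_videos, emb_dim))        # input embeddings
W_hidden = 0.1 * rng.standard_normal((emb_dim + n_dense, hidden))
out_emb = 0.1 * rng.standard_normal((n_videos, hidden))           # softmax (output) embeddings

def user_vector(watched_ids, dense_features):
    # Averaging turns a variable-sized bag of IDs into a fixed-width vector.
    bag = video_emb[watched_ids].mean(axis=0)
    x = np.concatenate([bag, dense_features])
    return np.maximum(x @ W_hidden, 0.0)                          # fully connected ReLU layer

def retrieve(user_vec, n):
    # Exact search over all videos; a production system would use an
    # approximate nearest neighbor index here instead.
    scores = out_emb @ user_vec
    return np.argsort(-scores)[:n]

dense = np.array([0.2, 1.0, 0.0])
short = user_vector([3, 17], dense)                 # 2 watched videos
long = user_vector([3, 17, 42, 99, 150], dense)     # 5 watched videos
candidates = retrieve(long, n=10)
```

Both watch histories produce a vector of the same width, which is the point of averaging before concatenation.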
Stage One
Stage Two
Deep ranking network
architecture
● uses embedded categorical
features (both univalent and
multivalent) with shared
embeddings and powers of
normalized continuous
features.
● All layers are fully connected.
In practice, hundreds of
features are fed into the
network.
Autoencoders
Collaborative
Denoising
Auto-Encoder
Collaborative Denoising Auto-Encoders for Top-N Recommender Systems, Wu et al., WSDM 2016
● Treats the feedback on items
y that the user U has
interacted with (input layer)
as a noisy version of the
user’s preferences on all
items (output layer)
● Introduces a user specific
input node and hidden bias
node, while the item weights
are shared across all users.
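A forward-pass sketch of this architecture, assuming sigmoid activations and dropout-style input corruption. The parameter names, sizes, and corruption rate are illustrative, and training (reconstructing the uncorrupted vector from the corrupted one) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 3, 6, 4

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W_in = 0.1 * rng.standard_normal((n_items, k))     # item weights, shared across users
V_user = 0.1 * rng.standard_normal((n_users, k))   # weights of the user-specific input node
b_hidden = np.zeros(k)                             # hidden bias
W_out = 0.1 * rng.standard_normal((k, n_items))    # output item weights, shared across users
b_out = np.zeros(n_items)

def corrupt(y, drop_prob, rng):
    """Randomly drop observed interactions and rescale the survivors."""
    keep = rng.random(y.shape) >= drop_prob
    return y * keep / (1.0 - drop_prob)

def cdae_forward(user, y_in):
    z = sigmoid(y_in @ W_in + V_user[user] + b_hidden)   # hidden layer with user node
    return sigmoid(z @ W_out + b_out)                    # predicted preference for all items

y_u = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])     # items user 0 interacted with
y_tilde = corrupt(y_u, drop_prob=0.5, rng=rng)     # noisy version fed to the network
y_hat = cdae_forward(0, y_tilde)
```

At recommendation time the uncorrupted `y_u` would be fed in and the highest-scoring unseen items returned.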
Recurrent Neural Networks - Sequence Modeling
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
A recurrent neural network can be thought of as multiple copies of the same
network, each passing a message to a successor.
Session-based recommendation with Recurrent
Neural Networks (GRU4Rec)
Hidasi et al.
ICLR (2016)
● Treat each user session as a sequence of clicks
● Predict the next item in the session sequence
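The session-as-sequence idea can be sketched with a hand-written GRU cell in NumPy. A softmax output is used here for simplicity and is an assumption, as are the sizes and random weights; training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, hidden = 10, 6

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init(shape):
    return 0.1 * rng.standard_normal(shape)

# The input is a one-hot item ID, so input weights are stored per item
# and a row lookup replaces the matrix product.
Wz, Wr, Wh = init((n_items, hidden)), init((n_items, hidden)), init((n_items, hidden))
Uz, Ur, Uh = init((hidden, hidden)), init((hidden, hidden)), init((hidden, hidden))
W_out = init((hidden, n_items))

def gru_step(h, item):
    z = sigmoid(Wz[item] + h @ Uz)                 # update gate
    r = sigmoid(Wr[item] + h @ Ur)                 # reset gate
    h_cand = np.tanh(Wh[item] + (r * h) @ Uh)      # candidate state
    return (1.0 - z) * h + z * h_cand

def next_item_probs(session):
    h = np.zeros(hidden)                           # state is reset for every session
    for item in session:
        h = gru_step(h, item)
    logits = h @ W_out
    e = np.exp(logits - logits.max())              # numerically stable softmax
    return e / e.sum()

probs = next_item_probs([2, 5, 1])                 # clicks so far in this session
```

No user ID is needed: the hidden state summarizes the current session only.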
Adding Item metadata to GRU4Rec: Parallel RNN
Hidasi et al.
Recsys (2016)
● Separate RNNs for each input
type
○ Item ID
○ Image feature vector
obtained from CNN (last
avg. pooling layer)
Convolutional Neural Nets
Fei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 7 (2016) http://cs231n.stanford.edu/
VBPR: Visual Bayesian Personalized Ranking from
Implicit Feedback
He et al., AAAI (2015)
Helping cold start by augmenting item factors with visual factors
● Create an item factor that is the sum of two terms: an item visual factor, which is an embedding of the item image produced by a deep CNN, and the usual collaborative item factor.
Deep content based music recommendations
http://benanne.github.io/2014/08/05/spotify-cnns.html
Cold Starting New or Less Popular
Music
● Take the Mel Spectrogram of
the song and run it through
several convolutional and
MaxPooling layers to a
compressed 1d representation.
● The training objective is to
minimize the squared error
between the collaborative item
factors of a known item and the
item factor predicted from the
CNN.
● Then for a new item, the model
can predict the item factor, and
make recommendations.
Aäron van den Oord, Sander Dieleman and Benjamin Schrauwen, NIPS 2013
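The training objective can be illustrated with a much simpler model. The sketch below swaps the CNN for a ridge regression from fixed content features to item factors, purely to show the "regress onto collaborative factors, then predict for new items" pattern; the synthetic data and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_feat, k = 40, 10, 3

# Stand-in content features (a real system would derive these from the audio).
X = rng.standard_normal((n_items, n_feat))
# Synthetic "collaborative" item factors that depend linearly on content, plus noise.
true_map = rng.standard_normal((n_feat, k))
F = X @ true_map + 0.01 * rng.standard_normal((n_items, k))

# Minimize ||X W - F||^2 + lam * ||W||^2 (squared error to the known item factors).
lam = 1e-3
W = np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ F)
mse = np.mean((X @ W - F) ** 2)

# Cold start: a new item has content features but no interaction data.
x_new = rng.standard_normal(n_feat)
f_new = x_new @ W                              # predicted item factor
user_factors = rng.standard_normal((5, k))     # factors from the collaborative model
scores = user_factors @ f_new                  # affinity of each user for the new item
```

The predicted factor lives in the same latent space as the collaborative factors, so existing user factors can score the new item directly.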
The Pinterest Application: Pin2Vec Related Pins
Liu et al (2017)
https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e
Learn a 128-dimensional compressed representation of each item (embedding). Then use a similarity function (cosine) between them to find similar items.
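The lookup step can be sketched with random vectors standing in for learned embeddings. The helper `most_similar` is hypothetical, and the brute-force search is only practical for small catalogs.

```python
import numpy as np

def most_similar(emb, query, n):
    """Return the n items with the highest cosine similarity to `query`."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)   # L2-normalize rows
    sims = unit @ unit[query]                                  # cosine similarities
    sims[query] = -np.inf                                      # exclude the query itself
    return np.argsort(-sims)[:n]

rng = np.random.default_rng(0)
emb = rng.standard_normal((100, 128))              # 100 items, 128-dimensional embeddings
emb[7] = emb[3] + 0.01 * rng.standard_normal(128)  # make item 7 a near-duplicate of item 3
neighbors = most_similar(emb, query=3, n=5)
```

After normalization, cosine similarity reduces to a dot product, so the same code works with any nearest neighbor index that supports inner products.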
Co-occurrence Pin2Vec
Some concluding thoughts
● Deep Learning is augmenting shallow model based recommender systems.
The main draws for DL in RecSys seem to be:
● Better generalization beyond linear models for user-item interactions.
● Embeddings: Unified representation of heterogeneous signals (e.g. add
image/audio/textual content as side information to item embeddings via
convolutional NNs).
● Exploitation of sequential information in actions leading up to recommendation
(e.g. LSTM on viewing/purchase/search history to predict what will be
watched/purchased/searched next).
● DL toolkits provide unprecedented flexibility in experimenting with loss
functions (e.g. in toolkits like TensorFlow/MXNet/Keras, switching the loss
from a classification loss to a ranking loss is trivial).
THANKS!
sdas@netflix.com
@datamusing
@netflixresearch

RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 

Deep Learning Advances in Recommender Systems

  • 1. Recommenders Shallow / Deep SUDEEP DAS Frontiers and Advances in Data Sciences Conference, Xi’an, China 2017
  • 4. Morning: News/ Workout/ Getting ready
  • 6. Now and then: Social Media/ Shopping online
  • 7. Evenings are for Netflix, of course!
  • 10. Background ● 2006-2009: Netflix Prize: ○ >10% improvement, win $1,000,000 ● Top performing model(s) ended up being variations of Matrix Factorization (SVD++, Koren et al.) ● Although Netflix’s rec system has moved on, MF is still the foundational method on which most collaborative filtering systems are based
  • 12. Singular Value Decomposition (Origins): $R = U \Sigma V^T$, where $R$ is the ratings matrix (users × items), $U$ and $V$ hold the left/right singular vectors (orthonormal bases), and $\Sigma$ holds the singular values (scaling).
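  • 13. SVD: Largest SV’s for approximation ● Low-rank approximation: $R \approx U' \Sigma' V'^T$ ● Eckart-Young theorem: $[U', \Sigma', V'^T] = \arg\min_{U, \Sigma, V^T} \lVert R - U \Sigma V^T \rVert_F^2$ (squared Frobenius norm), with the minimum taken over factorizations of the chosen rank.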
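  • 14. Low-rank Matrix Factorization: $R \approx P Q$, where the shared inner dimension of $P$ and $Q$ is the size of the latent space ● No orthogonality requirement ● Weighted least squares (or others) ● Compared with $U \Sigma V^T$, the scaling factor is absorbed into both matrices (not normalized)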
  • 15. Low-rank MF (cont…) ● Bias terms: overall bias, user bias, item bias ● Regularization, e.g. L2, L1, etc
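The bias and regularization terms on the two slides above can be illustrated with a small stochastic gradient descent loop. This is a minimal sketch, not the method used in the talk: the function names, the toy ratings, the learning rate and the L2 strength are all assumptions made for illustration, and `Q` is stored as items × k (so the prediction is `P[u] @ Q[i]`, i.e. $R \approx P Q^T$ in this layout).

```python
import numpy as np

def fit_biased_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Fit r_ui ~ mu + b_u[u] + b_i[i] + P[u] . Q[i] by SGD on the observed ratings."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))      # user factors
    Q = 0.1 * rng.standard_normal((n_items, k))      # item factors
    b_u = np.zeros(n_users)                          # user biases
    b_i = np.zeros(n_items)                          # item biases
    mu = float(np.mean([r for _, _, r in ratings]))  # overall bias
    for _ in range(epochs):
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
            # Gradient steps with L2 regularization on biases and factors.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            p_old = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return mu, b_u, b_i, P, Q

def predict(model, u, i):
    mu, b_u, b_i, P, Q = model
    return mu + b_u[u] + b_i[i] + P[u] @ Q[i]

# Hypothetical toy data: (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
model = fit_biased_mf(ratings, n_users=3, n_items=3)
rmse = np.sqrt(np.mean([(r - predict(model, u, i)) ** 2 for u, i, r in ratings]))
```

Only observed entries contribute to the loss, which is one simple form of the weighted least squares mentioned on the previous slide (weight 1 for observed ratings, 0 otherwise).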
  • 16. The FeedForward View (from Olivier Grisel, dotAI 2017)
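  • 18. Asymmetric Matrix Factorization ● Replace user-vector with sum of item vectors: $R \approx I(R)\, Y Q$, where $I(R)$ is the users × items indicator of observed interactions, $Y$ and $Q$ map items into and out of the latent space, and N(u) is all items user u rated/viewed/clicked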
  • 19. AMF, relation to Neural Network: the input is a 1-hot encoding of a user’s play history. A single hidden layer is equivalent to learning a Y and Q matrix (aka weights)
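To make the equivalence concrete, the sketch below runs the forward pass of a linear single-hidden-layer network on a binary play-history vector and checks that the hidden activation is exactly the sum of the `Y` rows for the items in N(u). The sizes and the random weights are assumptions for illustration; nothing is trained here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, k = 6, 3                            # hypothetical sizes
Y = 0.1 * rng.standard_normal((n_items, k))  # input-to-hidden weights (item -> latent)
Q = 0.1 * rng.standard_normal((k, n_items))  # hidden-to-output weights (latent -> item)

# Binary encoding of one user's play history: items 0, 2 and 5.
x = np.zeros(n_items)
x[[0, 2, 5]] = 1.0

hidden = x @ Y       # user vector, built only from item vectors
scores = hidden @ Q  # predicted affinity for every item

# The hidden layer equals the explicit sum of item vectors over N(u).
explicit_sum = Y[0] + Y[2] + Y[5]
```

Because the user vector is assembled from item vectors, a new user with some play history can be scored without learning a dedicated user embedding.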
  • 20. SLIM: $R \approx I(R)\, Y$, with $Y$ an items × items matrix whose diagonal is replaced with zeros ● SLIM replaces low-rank approx by a sparse item-item matrix. Sparsity comes from L1 regularizer. ● Equivalent to constructing a regression using user’s play history to predict ratings ● NB: Important that diagonal is excluded. Otherwise solution is trivial.
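A rough sketch of fitting such a sparse item-item matrix follows. It uses proximal gradient descent (ISTA), which is a solver chosen here for brevity and not necessarily what the SLIM authors use. The non-negativity constraint and the extra L2 term come from the original SLIM formulation rather than from the slide, and the toy matrix and hyperparameters are made up.

```python
import numpy as np

def fit_slim(R, l1=0.1, l2=0.1, n_iter=500):
    """Minimise 0.5*||R - R Y||_F^2 + 0.5*l2*||Y||_F^2 + l1*||Y||_1
    subject to diag(Y) = 0 and Y >= 0, by proximal gradient descent."""
    n_items = R.shape[1]
    G = R.T @ R                                    # item-item Gram matrix
    step = 1.0 / (np.linalg.eigvalsh(G)[-1] + l2)  # 1 / Lipschitz constant of the smooth part
    Y = np.zeros((n_items, n_items))
    for _ in range(n_iter):
        grad = G @ Y - G + l2 * Y                  # gradient of the smooth part
        # Soft-threshold (L1) and clip at zero (non-negativity) in one step.
        Y = np.maximum(Y - step * grad - step * l1, 0.0)
        np.fill_diagonal(Y, 0.0)                   # exclude the trivial identity solution
    return Y

# Hypothetical binary play matrix: 4 users x 4 items.
R = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)
Y = fit_slim(R)
scores = R @ Y  # each score is a regression on the user's other played items
```

Without the `fill_diagonal` step the loss is minimised by letting every item predict itself, which is the trivial solution the slide warns about.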
  • 23. Classic Example / Motivation
  • 24. Users now belong to multiple “topics”, with some proportion (0.88 and 0.12 in the illustrated example). Purchases are a mix proportional to user’s affinity for topic, and item affinity within topic
  • 25. Latent Dirichlet Allocation (LDA) [plate diagram with variables α, θ, z, w, φ, β and plates K, D, W]
  • 26. LDA as a generative model
  • 27. What topics look like [figure; values shown: 0.15, 0.63, 0.22]
  • 28. Final step: Recommending from topics ● Once we’ve learnt a user’s distribution over topics, and each topic’s distribution over items, producing a recommendation is easy. ● Score every item, i, by its probability under the user’s topic mixture, and recommend items with highest probability (discarding items the user has already purchased)
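The scoring step can be written as a single matrix-vector product, $p(i \mid u) = \sum_z p(z \mid u)\, p(i \mid z)$. The sketch below assumes the topic model has already been fitted; `theta_u` (user's topic proportions) and `phi` (each topic's distribution over items) are hand-made toy values, and the function name is hypothetical.

```python
import numpy as np

def recommend_from_topics(theta_u, phi, purchased, top_n=2):
    """Rank items by p(i | u) = sum_z p(z | u) * p(i | z), skipping purchased items."""
    scores = theta_u @ phi                # mixture of topic-item distributions
    scores[list(purchased)] = -np.inf     # discard items the user already has
    return np.argsort(-scores)[:top_n]

theta_u = np.array([0.88, 0.12])              # user's distribution over 2 topics
phi = np.array([[0.5, 0.3, 0.1, 0.1, 0.0],    # topic 0: distribution over 5 items
                [0.0, 0.1, 0.1, 0.3, 0.5]])   # topic 1: distribution over 5 items
recs = recommend_from_topics(theta_u, phi, purchased={0})
```

With these toy numbers the user's dominant topic drives the ranking, while the minority topic still lifts items that would otherwise score low.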
  • 31. In many domains, deep learning is achieving near-human or super-human accuracy! However, applications of Deep Learning in Recommender Systems are still in their infancy.
  • 32. So, what is Deep Learning? A class of machine learning algorithms: ● that use a cascade of multiple non-linear processing layers ● and complex model structures ● to learn different representations of the data in each layer ● where higher level features are derived from lower level features to form a hierarchical representation. Balázs Hidasi, RecSys 2016
  • 33. Traditional vs Deep ● Traditional ML: handcrafted features → trainable classifier → “Socrates” ● Deep Learning: learned/trainable features → trainable classifier → “Socrates”
  • 34. Learning hierarchical representations of data: each layer learns progressively more complex representations from its predecessor. Learned features: raw pixels → edges → parts of objects composed from edges → object models → trainable classifier → “Socrates”
  • 35. Earliest adaptation: Restricted Boltzmann Machines (from a recent presentation by Alexandros Karatzoglou). One hidden layer. User feedback on the items interacted with is propagated back to all items. Very similar to an autoencoder!
  • 36. There are many ways to make this deep (from Olivier Grisel, dotAI 2017).
  • 40. Deep Triplet Networks (from Olivier Grisel, dotAI 2017)
  • 41. Wide + Deep Models for Recommendations: In a recommender setting, you may want to train with a wide set of cross-product feature transformations, so that the model essentially memorizes these sparse feature combinations (rules). [Figure labels: Meh! / Yay!] Cheng et al, Google Inc. (2016)
  • 42. Wide + Deep Models for Recommendations On the other hand, you may want the ability to generalize using the representational power of a deep network. But deep nets can over-generalize. Cheng et al, Google Inc. (2016)
  • 43. Wide + Deep Models for Recommendations Best of both worlds: Jointly train a deep + wide network. The cross-feature transformation in the wide model component can memorize all those sparse, specific rules, while the deep model component can generalize to similar items via embeddings. Cheng et al, Google Inc. (2016)
  • 44. Wide + Deep Models for Recommendations Cheng et al, Google Inc. (2016) Wide + Deep Model for app recommendations.
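The joint wide + deep idea can be sketched as two logits that are summed before a single sigmoid, so that one loss trains both components. The forward pass below is an illustration only: the parameter names, layer sizes and random weights are assumptions, and no training loop is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def wide_and_deep_predict(x_wide, emb_ids, params):
    """Probability from the summed wide and deep logits."""
    # Wide part: linear model over sparse cross-product features (memorization).
    wide_logit = x_wide @ params["w_wide"]
    # Deep part: embeddings -> ReLU hidden layer -> linear output (generalization).
    deep_in = params["emb"][emb_ids].reshape(-1)  # concatenate the looked-up embeddings
    hidden = np.maximum(deep_in @ params["W1"] + params["b1"], 0.0)
    deep_logit = hidden @ params["w_deep"]
    # Joint model: both logits feed one sigmoid, so both parts share one loss.
    return sigmoid(wide_logit + deep_logit + params["bias"])

rng = np.random.default_rng(0)
n_cross, vocab, d, h = 8, 10, 4, 5  # hypothetical sizes
params = {
    "w_wide": rng.standard_normal(n_cross),
    "emb": rng.standard_normal((vocab, d)),
    "W1": rng.standard_normal((2 * d, h)),
    "b1": np.zeros(h),
    "w_deep": rng.standard_normal(h),
    "bias": 0.0,
}
x_wide = np.zeros(n_cross)
x_wide[[1, 6]] = 1.0                # two active cross-product features
p = wide_and_deep_predict(x_wide, emb_ids=[2, 7], params=params)
```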
  • 45. The YouTube Recommendation model: a two-stage approach with two deep networks: ● The candidate generation network takes events from the user’s YouTube activity history as input and retrieves a small subset (hundreds) of videos from a large corpus. These candidates are intended to be generally relevant to the user with high precision. The candidate generation network only provides broad personalization via collaborative filtering. ● The ranking network scores each video according to a desired objective function using a rich set of features describing the video and user. The highest scoring videos are presented to the user, ranked by their score. Covington et al., Google Inc. (2016)
  • 46. The YouTube Recommendation model, Stage One: Deep candidate generation model architecture ● Embedded sparse features are concatenated with dense features. Embeddings are averaged before concatenation to transform variable sized bags of sparse IDs into fixed-width vectors suitable for input to the hidden layers. ● All hidden layers are fully connected. ● In training, a cross-entropy loss is minimized with gradient descent on the output of the sampled softmax. ● At serving, an approximate nearest neighbor lookup is performed to generate hundreds of candidate video recommendations. Covington et al., Google Inc. (2016)
  • 47. The YouTube Recommendation model, Stage Two: Deep ranking network architecture ● Uses embedded categorical features (both univalent and multivalent) with shared embeddings and powers of normalized continuous features. ● All layers are fully connected. In practice, hundreds of features are fed into the network. Covington et al., Google Inc. (2016)
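The averaging step in the candidate generation stage is what lets watch histories of any length feed a fixed-width network. Here is a hedged sketch of that step and of serving-time retrieval; it uses untrained random weights, a single hidden layer and an exact top-N search in place of the approximate nearest neighbor lookup, and every name and size is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_videos, d, hidden = 50, 8, 8                     # hypothetical sizes
video_emb = rng.standard_normal((n_videos, d))     # input embeddings of video IDs
W1 = 0.1 * rng.standard_normal((d + 2, hidden))    # one fully connected layer
out_emb = rng.standard_normal((n_videos, hidden))  # output (softmax) video vectors

def user_vector(watch_ids, dense_feats):
    watch_avg = video_emb[watch_ids].mean(axis=0)  # variable-size bag -> fixed width
    x = np.concatenate([watch_avg, dense_feats])   # embedded sparse + dense features
    return np.maximum(x @ W1, 0.0)                 # ReLU hidden layer

def top_candidates(u, n=5):
    scores = out_emb @ u            # softmax logits for every video
    return np.argsort(-scores)[:n]  # exact search here; production would use approximate NN

dense = np.array([0.5, 1.0])        # stand-ins for dense features
u_short = user_vector([3, 7], dense)
u_long = user_vector([3, 7, 11, 20, 42], dense)
candidates = top_candidates(u_long)
```

Both users end up with vectors of the same width even though their histories differ in length.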
  • 49. Collaborative Denoising Auto-Encoder (Collaborative Denoising Auto-Encoders for Top-N Recommender Systems, Wu et al., WSDM 2016) ● Treats the feedback on the items y that user u has interacted with (input layer) as a noisy version of the user’s preferences on all items (output layer) ● Introduces a user-specific input node and hidden bias node, while the item weights are shared across all users.
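A forward pass in the spirit of this architecture might look like the sketch below. It is an illustration under assumptions: dropout-style corruption and sigmoid activations are one of several options, the weights are random and untrained, and all names and sizes are invented for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cdae_forward(y_u, user_id, params, drop_prob, rng):
    """Corrupt the user's feedback vector, encode it with a user-specific node, decode."""
    keep = rng.random(y_u.shape) >= drop_prob
    y_noisy = y_u * keep / (1.0 - drop_prob)       # dropout corruption, rescaled
    hidden = sigmoid(y_noisy @ params["W"]         # item weights shared by all users
                     + params["V"][user_id]        # user-specific input node
                     + params["b"])                # hidden bias
    return sigmoid(hidden @ params["W_out"] + params["b_out"])  # preferences on all items

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 6, 3                      # hypothetical sizes
params = {
    "W": 0.1 * rng.standard_normal((n_items, k)),
    "V": 0.1 * rng.standard_normal((n_users, k)),
    "b": np.zeros(k),
    "W_out": 0.1 * rng.standard_normal((k, n_items)),
    "b_out": np.zeros(n_items),
}
y_u = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])     # items the user interacted with
y_hat = cdae_forward(y_u, user_id=2, params=params, drop_prob=0.3, rng=rng)
```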
  • 50. Recurrent Neural Networks - Sequence Modeling http://colah.github.io/posts/2015-08-Understanding-LSTMs/ A recurrent neural network can be thought of as multiple copies of the same network, each passing a message to a successor.
  • 51. Session-based recommendation with Recurrent Neural Networks (GRU4Rec) Hidasi et al. ICLR (2016) ● Treat each user session as a sequence of clicks ● Predict next item in the session sequence
  • 52. Adding Item metadata to GRU4Rec: Parallel RNN Hidasi et al. RecSys (2016) ● Separate RNNs for each input type ○ Item ID ○ Image feature vector obtained from CNN (last avg. pooling layer)
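As a rough illustration of next-item prediction over a click sequence, the sketch below runs a standard GRU cell over a session and scores every item from the final hidden state. It is not the GRU4Rec implementation: item embeddings as input, the absence of a ranking loss and of any training, and all names and sizes are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, p):
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"])             # update gate
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"])             # reset gate
    h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])  # candidate state
    return (1.0 - z) * h + z * h_cand

def next_item_scores(session, p):
    h = np.zeros(p["Uz"].shape[0])
    for item in session:               # one GRU step per click
        h = gru_step(p["emb"][item], h, p)
    return h @ p["out"]                # one score per candidate next item

rng = np.random.default_rng(0)
n_items, d, hid = 10, 4, 5             # hypothetical sizes
p = {"emb": rng.standard_normal((n_items, d)),
     "out": rng.standard_normal((hid, n_items))}
for name in ("Wz", "Wr", "Wh"):
    p[name] = 0.1 * rng.standard_normal((d, hid))
for name in ("Uz", "Ur", "Uh"):
    p[name] = 0.1 * rng.standard_normal((hid, hid))

scores = next_item_scores([3, 1, 4], p)  # a session of three clicks
```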
  • 53. Convolutional Neural Nets Fei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 7 (2016) http://cs231n.stanford.edu/
  • 54. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback He et al., AAAI (2015) Helping cold start by augmenting item factors with visual factors ● Create an item factor that is a sum of two terms: an item visual factor, which is an embedding of the item image obtained from a deep CNN, and the usual collaborative item factor.
  • 55. Deep content based music recommendations http://benanne.github.io/2014/08/05/spotify-cnns.html Cold Starting New or Less Popular Music ● Take the Mel Spectrogram of the song and run it through several convolutional and MaxPooling layers to a compressed 1D representation. ● The training objective is to minimize the squared error between the collaborative item factors of a known item and the item factor predicted from the CNN. ● Then for a new item, the model can predict the item factor, and make recommendations. Aäron van den Oord, Sander Dieleman and Benjamin Schrauwen, NIPS 2013
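One way to read the two-term item representation is as a score with a collaborative part and a visual part. The sketch below uses random stand-ins for the CNN image features and for all learned parameters, omits the bias terms of the full model, and uses symbol names chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
k, k_vis, f_dim = 4, 3, 16               # hypothetical dimensions
gamma_u = rng.standard_normal(k)         # user collaborative factor
gamma_i = rng.standard_normal(k)         # item collaborative factor
theta_u = rng.standard_normal(k_vis)     # user visual factor
E = rng.standard_normal((k_vis, f_dim))  # projects CNN features into the visual space
f_i = rng.standard_normal(f_dim)         # stand-in for the item's deep CNN feature

theta_i = E @ f_i                        # item visual factor
score = gamma_u @ gamma_i + theta_u @ theta_i  # collaborative term + visual term

# Equivalent view: one inner product over concatenated factors.
score_concat = np.concatenate([gamma_u, theta_u]) @ np.concatenate([gamma_i, theta_i])

# Cold start: with no interactions the collaborative factor is uninformative
# (zero here), but the visual term still yields a usable score.
cold_score = gamma_u @ np.zeros(k) + theta_u @ theta_i
```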
  • 56. The Pinterest Application: Pin2Vec Related Pins Liu et al (2017) https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e Learn a 128 dimensional compressed representation of each item (embedding). Then use a similarity function (cosine) between them to find similar items.
  • 57. The Pinterest Application: Pin2Vec Related Pins Liu et al (2017) https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e [Figure panels: Co-occurrence; Pin2Vec]
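The retrieval step described above (cosine similarity over 128-dimensional embeddings) can be sketched in a few lines. The embeddings here are random placeholders rather than learned Pin2Vec vectors, the near-duplicate item is planted for illustration, and the function name is hypothetical.

```python
import numpy as np

def most_similar(emb, query, top_n=3):
    """Indices of the top_n items closest to `query` by cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # L2-normalise each row
    sims = unit @ unit[query]                                # cosine similarity to the query
    sims[query] = -np.inf                                    # do not return the query itself
    return np.argsort(-sims)[:top_n]

rng = np.random.default_rng(0)
emb = rng.standard_normal((100, 128))                # placeholder 128-d item embeddings
emb[1] = emb[0] + 0.01 * rng.standard_normal(128)    # planted near-duplicate of item 0
related = most_similar(emb, query=0)
```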
  • 59. Some concluding thoughts ● Deep Learning is augmenting shallow model based recommender systems. The main draws for DL in RecSys seem to be: ● Better generalization beyond linear models for user-item interactions. ● Embeddings: Unified representation of heterogeneous signals (e.g. add image/audio/textual content as side information to item embeddings via convolutional NNs). ● Exploitation of sequential information in actions leading up to recommendation (e.g. LSTM on viewing/purchase/search history to predict what will be watched/purchased/searched next). ● DL toolkits provide unprecedented flexibility in experimenting with loss functions (e.g. in toolkits like TensorFlow/MXNet/Keras etc., switching the loss from classification loss to ranking loss is trivial).