Personalizing LinkedIn Feed
Guy Lebanon
September 24, 2015
Collaborators: Deepak Agarwal, Kevin Chang, Bee-Chung Chen,
Boyi Chen, Qi He, Zhenhao Hua, Yiming Ma, Mikhail Obukhov,
Pannagadatta Shivaswamy, Liang Tang, Hsiao-Ping Tseng, Jaewon
Yang, Lin Yang, and Liang Zhang
LinkedIn Feed: Value and Impact
Main gateway to LinkedIn’s website and mobile app.
Help users stay informed by showing news and other posts
related to their career
Help users stay connected to their professional network by
showing updates from connections
Help users establish their brand/reputation via sharing and
posting.
In addition, help users find jobs (JYMBII), grow their network
(PYMK), and generate ad revenue.
Small tweaks have a huge impact on hundreds of millions of
LinkedIn members.
LinkedIn Feed: A Machine Learning Problem
Recommendation system with a few twists:
Heterogeneous inventory: updates from my connections, job
recommendations, news recommendations, ads, people you
may know, etc.
Explicit and implicit signals: clicks, likes, shares, comments, VPT
Different value for different member segments: job seeker,
novice, consumer
High quality profile
Social graph data
Spam and unprofessional content
Online scoring of a very large number of candidate updates
Feed Relevance System
multiple homogeneous sources ⇒
each source ranks its top-k ⇒
combine top-k lists into a single ranked list ⇒
reranking stage
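The flow above could be sketched roughly as follows. This is a minimal illustration, not LinkedIn's implementation: the source names, the `(update_id, score)` tuple layout, and the assumption that scores from different sources are directly comparable are all hypothetical, and the reranking stage is left out.

```python
import heapq
from itertools import chain


def top_k(candidates, k):
    """Return the k highest-scoring (update_id, score) pairs, best first."""
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])


def mix_sources(sources, k):
    """Take the top-k of each source, then merge into one list ranked by score.

    Assumes (hypothetically) that scores are comparable across sources.
    """
    per_source = [top_k(cands, k) for cands in sources.values()]
    merged = chain.from_iterable(per_source)
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


# Hypothetical sources and scores, for illustration only.
sources = {
    "connection_updates": [("u1", 0.30), ("u2", 0.10), ("u3", 0.25)],
    "job_recommendations": [("j1", 0.20), ("j2", 0.05)],
    "news": [("n1", 0.40), ("n2", 0.15), ("n3", 0.02)],
}
mixed = mix_sources(sources, k=2)
print([update_id for update_id, _ in mixed])
```

Ranking each source separately keeps the per-source work small and independent; only the short top-k lists need to be compared against each other.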
Feed Mixer System
Feed mixer combines top-k ranked lists
Online component responds to web server request
Offline component prepares training data, trains the model, and
processes features
Activities
Most feed activities are formulated as an (actor, verb, object) triplet,
whose type (actor type, verb type, object type) is the activity type.
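One way to picture the triplet and its derived activity type is the small data structure below. The class names, field names, and the example labels ("member", "share", "article") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str   # entity type, e.g. "member" or "article" (hypothetical labels)
    ident: str  # entity identifier


@dataclass(frozen=True)
class Activity:
    actor: Entity
    verb: str   # verb type, e.g. "share"
    obj: Entity

    @property
    def activity_type(self):
        """The (actor type, verb type, object type) triple."""
        return (self.actor.kind, self.verb, self.obj.kind)


activity = Activity(Entity("member", "m42"), "share", Entity("article", "a7"))
print(activity.activity_type)
```

Because the activity type drops the identifiers, many distinct activities map to the same type, which is what makes it usable as a categorical feature.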
Model
Ranking is based on logistic regression model capturing
interactions (clicks, likes, etc.). If y_{it} is a binary variable
capturing whether user i interacted with update t, then
P(y_{it} = 1 | user, update) = \left( 1 + \exp\left( \sum_j \beta_j [X_{it}]_j \right) \right)^{-1}
where X_{it} is a vector describing the user i and update t and β
is the model parameter vector.
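As a rough illustration of this scoring step, the sketch below computes an interaction probability for a few candidate updates and sorts them. The weights, feature values, and update names are made up. The code uses the conventional logistic form 1 / (1 + exp(-score)), so that a larger weighted sum gives a higher probability; that sign convention is an assumption about the slide's intent, not something the slide states.

```python
import math


def interaction_probability(beta, x):
    """Logistic model: 1 / (1 + exp(-sum_j beta_j * x_j)).

    Not hardened against overflow for extremely negative scores.
    """
    score = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical parameter vector and per-update feature vectors.
beta = [0.5, -1.0, 2.0]
candidates = {
    "update_a": [1.0, 0.0, 0.5],
    "update_b": [1.0, 1.0, 0.0],
    "update_c": [0.0, 0.0, 0.0],
}

# Rank candidate updates by predicted interaction probability, best first.
ranked = sorted(
    candidates,
    key=lambda update_id: interaction_probability(beta, candidates[update_id]),
    reverse=True,
)
print(ranked)
```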
MLE training based on “random bucket” data that has a
reduced serving bias. Big data and high dimensionality lead to
distributed training procedures e.g., ADMM, Spark.
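To make the MLE step concrete, here is a toy single-machine sketch that fits the logistic model by batch gradient ascent on the log-likelihood. It is not the distributed ADMM or Spark procedure mentioned above, the data set is invented, and the learning rate and epoch count are arbitrary choices. It also assumes the conventional sign inside the exponential.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_mle(rows, labels, learning_rate=1.0, epochs=1000):
    """Batch gradient ascent on the average Bernoulli log-likelihood."""
    dim = len(rows[0])
    n = len(rows)
    beta = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, y in zip(rows, labels):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            # Gradient of the log-likelihood for one example: (y - p) * x
            for j, v in enumerate(x):
                grad[j] += (y - p) * v
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


# Invented data: column 0 is a bias term, column 1 is a single binary feature.
rows = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
labels = [0, 0, 1, 1, 1, 0]
beta_hat = fit_logistic_mle(rows, labels)

# At the optimum the fitted probabilities match the empirical rates per group.
p_feature_off = sigmoid(beta_hat[0])
p_feature_on = sigmoid(beta_hat[0] + beta_hat[1])
```

On this data one third of the examples with the feature off are positive and two thirds with it on, so the fitted probabilities should approach 1/3 and 2/3.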
Features
X_{it} = (X_i, X_t, X_{ik}, X_{ij}, X_{ijk}, X_{io})
X_i: viewer features
e.g., profile information
X_t: update features
e.g., actor type, verb type, etc.
X_{ik}: viewer-activity type features
“viewer-activity type affinity” (computed offline)
X_{ij}: viewer-actor features
“viewer-actor affinity” (computed offline)
X_{ijk}: viewer-actor-activity type features
“contextual viewer-actor affinity” (partially computed offline)
X_{io}: viewer-object features
e.g., viewer-object content features
Affinity Features
Double or triple interaction features such as X_{ij} or X_{ijk} are hard to
scale due to the explosion of feature dimensionality.
Solution: Build a separate model for viewer-actor affinity α_{ij},
viewer-activity type affinity α_{it}, and contextual viewer-actor
affinity α_{ijk}.
Affinity models are trained via a separate pipeline and stored
offline
At serve time, relevant affinity scores are loaded from the data
store and are used as features in the CTR prediction model
Decouples a very complex problem into several simpler
problems and, at the same time, separates heavy offline
computation from fast online loading.
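The serve-time lookup described above might look something like the sketch below. The dictionaries stand in for whatever data store holds the precomputed scores; the keys, the fallback value for unseen pairs, and the function names are all hypothetical, and the contextual viewer-actor affinity is omitted for brevity.

```python
# Hypothetical in-memory stand-ins for the offline-computed affinity stores.
VIEWER_ACTOR_AFFINITY = {("viewer_1", "actor_9"): 0.8}
VIEWER_ACTIVITY_TYPE_AFFINITY = {
    ("viewer_1", ("member", "share", "article")): 0.6,
}
DEFAULT_AFFINITY = 0.0  # assumed fallback when no score was precomputed


def affinity_features(viewer_id, actor_id, activity_type):
    """Load precomputed affinity scores; each becomes one scalar feature."""
    return [
        VIEWER_ACTOR_AFFINITY.get((viewer_id, actor_id), DEFAULT_AFFINITY),
        VIEWER_ACTIVITY_TYPE_AFFINITY.get(
            (viewer_id, activity_type), DEFAULT_AFFINITY
        ),
    ]


def build_feature_vector(viewer_features, update_features,
                         viewer_id, actor_id, activity_type):
    """Concatenate viewer, update, and affinity features for the CTR model."""
    return (
        list(viewer_features)
        + list(update_features)
        + affinity_features(viewer_id, actor_id, activity_type)
    )


known = build_feature_vector(
    [1.0], [0.0, 1.0], "viewer_1", "actor_9", ("member", "share", "article")
)
unseen = build_feature_vector(
    [1.0], [0.0, 1.0], "viewer_2", "actor_9", ("member", "share", "article")
)
```

Each affinity contributes a single number to the feature vector, instead of one indicator per viewer-actor pair, which is how the dimensionality stays manageable.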
Joint Feed Effects
Some joint feed effects are hard to model using a “pointwise”
model such as a CTR model.
For example, we observe that engagement drops for non-diverse
feeds
Joint Feed Effects
We also observe that engagement drops for previously seen
updates.
Reranking
Ranking based on a CTR scoring model cannot take the entire feed into
account. The reranker adjusts the CTR-ranked list to improve the
ranking and reduce negative “joint feed effects”.
Reranking stages include
ensuring actor diversity (within session)
ensuring actor diversity (cross session)
ensuring update type diversity
impression discounting
additional “secret sauce”
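Two of these stages can be sketched as follows, under loudly stated assumptions: the exponential discount form, the `decay` and `min_gap` parameters, and the greedy spacing rule are all illustrative guesses, not the stages actually used. Cross-session diversity and update type diversity are not covered.

```python
def discount_for_impressions(score, times_seen, decay=0.5):
    """Scale the score by decay**times_seen (a hypothetical discount form)."""
    return score * decay ** times_seen


def rerank_actor_diversity(ranked, min_gap=2):
    """Greedy within-session actor diversity.

    `ranked` holds (update_id, actor_id, score) tuples, best first. At each
    position, pick the best remaining update whose actor did not appear in
    the previous (min_gap - 1) positions; if every remaining update violates
    the gap, fall back to the best remaining one.
    """
    remaining = list(ranked)
    result = []
    while remaining:
        if min_gap > 1:
            recent = {actor for _, actor, _ in result[-(min_gap - 1):]}
        else:
            recent = set()
        pick = next((c for c in remaining if c[1] not in recent), remaining[0])
        remaining.remove(pick)
        result.append(pick)
    return result


# Hypothetical CTR-ranked list dominated by one actor.
ranked = [
    ("u1", "alice", 0.9),
    ("u2", "alice", 0.8),
    ("u3", "bob", 0.7),
    ("u4", "alice", 0.6),
]
reranked = rerank_actor_diversity(ranked, min_gap=2)
print([update_id for update_id, _, _ in reranked])
```

In a fuller pipeline the impression discount would be applied to the scores before sorting, so that updates the member has already seen start lower in the list that the diversity pass receives.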
Open Problems
AB tests for social networks
Explore/Exploit with minimal negative effect on members’
experience
Likes, comments, shares create viral updates that propagate in
a network. How do we measure the impact of a social gesture on
the eco-system?
Tradeoff between optimizing for consumers and producers
Tradeoff between value to members and value to eco-system
Thank You
Additional details can be found in the KDD 2014 (Activity Ranking
in LinkedIn Feed) and KDD 2015 (Personalizing LinkedIn Feed)
papers on feed relevance.
