3. What is ranking?
The main algorithm in a search engine
Based on ML algorithms
Computes a relevance score for each query-document pair
The best-kept secret of search companies
Today ranking quality depends on:
Evaluation of ranking quality
The method of data set construction
Search engine features
The ML algorithm
4. How to evaluate ranking quality?
Classical approach
Select a set of queries $Q = \{q_1, q_2, \dots, q_{|Q|}\}$ from the logs
For each $q \in Q$ there is a set of documents $q \to D = \{d_1, d_2, \dots, d_{N_q}\}$
For each pair $(q, d)$ ask experts for a grade $\in \{0, 1, 2, 3, 4, 5\}$
Discounted Cumulative Gain
$$DCG = \sum_{q \in Q} \sum_{i=1}^{N_q} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$$
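A minimal sketch of this metric in Python; the per-query grade lists are hypothetical, and the summation follows the formula above:

```python
import math

def dcg(grades):
    """DCG for one query: grades[i-1] is the expert mark rel_i of the
    document shown at position i of the ranked list."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(grades, start=1))

# Total DCG over the query set Q, summing per-query DCG as above.
grades_by_query = {"q1": [5, 3, 0, 1], "q2": [2, 2, 4]}  # hypothetical marks
total_dcg = sum(dcg(g) for g in grades_by_query.values())
```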
5. How to evaluate ranking quality with clickthrough data?
Evaluation with absolute metrics
Users are shown results from different rankings
Measure statistics about user responses (two of these are sketched after this list)
• Abandonment rate
• Reformulation rate
• Position of first click
• Time to first click
• Etc.
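A minimal sketch of two of these statistics, assuming a hypothetical session record that stores (position, time) pairs for each click:

```python
def abandonment_rate(sessions):
    """Share of sessions with no clicks at all.
    Each session is a dict like {"clicks": [(position, time_sec), ...]}."""
    return sum(1 for s in sessions if not s["clicks"]) / len(sessions)

def first_click_position(session):
    """Rank of the earliest click in a session, or None if abandoned."""
    if not session["clicks"]:
        return None
    return min(session["clicks"], key=lambda c: c[1])[0]

# Hypothetical log: one abandoned session, one with clicks at ranks 2 and 5.
sessions = [{"clicks": []}, {"clicks": [(2, 1.4), (5, 9.0)]}]
print(abandonment_rate(sessions))         # 0.5
print(first_click_position(sessions[1]))  # 2
```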
Evaluation using paired comparisons
Show a combination of results from 2 rankings
Infer relative preferences (a team-draft sketch follows this list)
• Balanced interleaving
• Team-draft interleaving
• Etc.
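A minimal sketch of team-draft interleaving, the second method above; it assumes both rankings are permutations of the same document set, as in this deck, and follows the standard algorithm rather than anything deck-specific:

```python
import random

def team_draft_interleave(ranking_a, ranking_b):
    """Merge two rankings of the same documents into one SERP.
    Teams alternately pick their highest-ranked not-yet-shown doc;
    ties in pick counts are broken by a coin flip."""
    combined, team_of = [], {}
    picks_a = picks_b = 0
    while len(combined) < len(ranking_a):
        a_turn = picks_a < picks_b or (picks_a == picks_b and random.random() < 0.5)
        source = ranking_a if a_turn else ranking_b
        doc = next(d for d in source if d not in team_of)
        team_of[doc] = "A" if a_turn else "B"
        combined.append(doc)
        picks_a, picks_b = picks_a + a_turn, picks_b + (not a_turn)
    return combined, team_of
```

Clicks on the interleaved SERP are then credited to the team that contributed the clicked document, and the ranking with more credited clicks wins the comparison.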
8. Typical problems of the classical approach
Problems with documents
The search index is constantly changing, so we have to rebuild the
ranking model often.
Problems with experts
Experts make mistakes
A group of experts is not equivalent to millions of users
Experts do not ask the queries themselves
We fit the ranking to the assessment instructions (100 pages), not to users
Problems with queries
Queries become irrelevant
Ratings are always outdated
9. Advantages and disadvantages of clickthrough data
Expert judgements                           | Clickthrough data
Thousands per day                           | Millions per day
Expensive                                   | Cheap
Low speed of obtaining                      | High speed of obtaining
Noisy data                                  | Extremely noisy data
Fresh only at the moment of assessment      | Always fresh data
Can evaluate any query (not always correct) | Can't evaluate queries that nobody asks in the SE
Judgements are biased                       | Unbiased (in terms of our flow of queries)
10. How can we use clickthrough data for optimizing TDI?
Simple approach
SERP 1 vs SERP 2
From the 2 rankings, select only the SERPs that win the TDI experiment
11. Optimal SERP construction
Given
A query $q$
A set of documents for $q$: $q \to D = \{d_1, d_2, \dots, d_{N_q}\}$
User sessions with different permutations of the docs from $D$
Idea
Let's construct a permutation of the docs (the optimal permutation, OP) that on
average beats any other permutation of these documents in TDI experiments
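The construction slides themselves are not in this transcript; purely as a hedged illustration, one simple way to build such a permutation is to sort documents by how many pairwise preferences they win in the logged sessions (this sorting heuristic is an assumption, not necessarily the method the deck uses):

```python
from collections import Counter

def optimal_permutation(docs, preferences):
    """Order docs by the number of pairwise-preference wins.
    `preferences` is a list of (winner, loser) pairs inferred from
    user sessions, as on the next slides."""
    wins = Counter(w for w, _ in preferences)
    return sorted(docs, key=lambda d: wins[d], reverse=True)

# Hypothetical preferences: url1 beat url2 twice, url2 beat url3 once.
docs = ["url1", "url2", "url3"]
prefs = [("url1", "url2"), ("url1", "url2"), ("url2", "url3")]
print(optimal_permutation(docs, prefs))  # ['url1', 'url2', 'url3']
```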
12. Information from user session
Example (Case 1)
Query q; SERP with results url1, url2, …, url10; the user clicks url1.
What information have we received from this session?
13. Information from user session
Example (Case 1, continued)
The same SERP for query q, with the click on url1.
Inferred preferences: $url_1 \succ url_i$ for every other shown result, $i = 2, \dots, 10$
Remark:
It is obviously possible to use a more complex click model (CCM, DBN, etc.);
a minimal version of the extraction is sketched below.
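A minimal sketch of this preference extraction under the simple click model the slide uses (a clicked result is preferred to every shown-but-unclicked result); the session structure is hypothetical:

```python
def session_preferences(shown, clicked):
    """Pairwise preferences from one session: each clicked url beats
    every shown-but-unclicked url. `shown` is the SERP in display
    order; `clicked` is the set of clicked urls."""
    return [(c, u) for c in clicked for u in shown if u not in clicked]

# Case 1: a single click on url1.
shown = [f"url{i}" for i in range(1, 11)]
prefs = session_preferences(shown, {"url1"})
# -> [('url1', 'url2'), ('url1', 'url3'), ..., ('url1', 'url10')]
```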
14. Information from user session
Example (Case 2)
The same SERP for query q with results url1, url2, …, url10; this time the user makes three clicks.
What information have we received from this session?
20. Results
Computed Optimized SERPs for the 200,000 most frequent queries (7% of the query flow)
+14% quality for these frequent queries
+1% overall search quality
NOT BAD
Let's try using Optimized SERPs for machine learning to rank
22. Learning from top results
Problems with learning from top results (Example)
23. Learning from top results
Problems with learning from top results
Outside the top there are many documents with quite a different feature distribution.
In all top documents the word “barcelona” appears in the title, so a feature that
describes the presence of query words in the title will be useless for this query.
Solution
Let's sample from the set of unlabeled urls
We need sampling because we can't add all the unlabeled data to the training data
(Diagram: the urls that should be on top, followed by the long tail of unlabeled urls)
24. Semi-supervised learning to rank
Sampling from the unlabeled urls (sketched below)
Pipeline: unlabeled docs → build a self-organizing map → take one sampled url from each cluster
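A minimal sketch of this one-document-per-cluster sampling; k-means stands in for the self-organizing map the slide uses, and the feature matrix is hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans

def sample_one_per_cluster(features, urls, n_clusters=5, seed=0):
    """Cluster the unlabeled docs and keep the doc closest to each
    centroid, so the sample covers the whole feature distribution."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(features)
    sampled = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        sampled.append(urls[members[np.argmin(dists)]])
    return sampled

# Hypothetical feature vectors for 100 unlabeled docs.
X = np.random.rand(100, 20)
urls = [f"url{i}" for i in range(100)]
print(sample_one_per_cluster(X, urls))  # 5 urls, one per cluster
```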
25. Semi-supervised learning to rank
Add the sampled docs as “irrelevant” to the training set
(Diagram: final training data for query q = Optimized SERP urls + sampled urls from the unlabeled set)
26. Semi-supervised learning to rank
Training data set
Training data for query $q_1$, for query $q_2$, …, for query $q_{|Q|}$
Each set consists of Optimized SERP urls plus unlabeled urls (marked as irrelevant), as in the sketch below
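A minimal sketch of assembling one query's training rows; the binary label scheme is an assumption (the slides only say that sampled docs are marked irrelevant):

```python
def training_data_for_query(query, op_urls, sampled_urls):
    """Training rows for one query: Optimized-SERP urls as relevant,
    sampled unlabeled urls as pseudo-irrelevant negatives."""
    return ([(query, url, 1) for url in op_urls] +
            [(query, url, 0) for url in sampled_urls])

# Hypothetical inputs for one query.
rows = training_data_for_query(
    "q1",
    op_urls=["url1", "url2", "url3"],  # from the Optimized SERP
    sampled_urls=["url57", "url91"],   # one per cluster, as above
)
```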
28. Final Results
We obtained an automatic search improvement method
This method can learn an improved ranking function without any explicit
feedback from experts
(Chart: timeline of a TDI experiment against our old ranking based on expert judgments; y-axis from -0.01 to 0.05)
30. Using clickthrough data for online learning to rank
Typical problems with constructing a new ranking formula
We need a large dataset (5-10 million points)
Usually we use active learning to obtain this data
About 10-15 iterations of active learning are needed to obtain a
new ranking formula with the same quality as the current model
We can't use all the available clickthrough data for training our ranking formula
Can we improve the current formula using new clickthrough data?
Can we improve the current formula using ALL the new clickthrough data?
31. Typical ranking formula
Typical ranking formula specification
An ensemble of tens of thousands of decision trees
Trained with a gradient boosting algorithm
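In the standard gradient-boosting form (a general fact about such ensembles, not something stated on the slide), the formula scores a feature vector $x$ as a sum of tree outputs:

$$F(x) = \sum_{m=1}^{M} \gamma_m h_m(x), \qquad M \sim 10^4,$$

where each $h_m$ is a small decision tree and $\gamma_m$ its learned weight.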
33. Typical ranking formula
Typical ranking formula specification
The ranking formula can return only a finite set of values
Each decision tree in the ensemble contains only several predicates
Each query-document pair is described by the set of ensemble predicates it satisfies
Let's use the partition of the multidimensional feature space generated by the
ranking formula as a clustering
Let's remap all the clickthrough data onto this clustering (see the sketch below)
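A minimal sketch of the remapping, with a scikit-learn ensemble standing in for the production formula: the tuple of leaf indices a query-document pair reaches identifies its cell of the partition, and click statistics are aggregated per cell without retraining the trees. All data here is hypothetical:

```python
import numpy as np
from collections import defaultdict
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical stand-in for the production formula: 50 small trees.
X_train, y_train = np.random.rand(500, 20), np.random.rand(500)
formula = GradientBoostingRegressor(n_estimators=50, max_depth=3).fit(X_train, y_train)

def cluster_id(features):
    """Cell of the space partition: the tuple of leaf indices this
    query-document pair reaches in every tree of the ensemble."""
    leaves = formula.apply(features.reshape(1, -1))  # shape (1, n_estimators)
    return tuple(leaves[0].astype(int))

# Remap clickthrough data onto the clustering: per-cell click counts
# give an empirical relevance that every new session can update.
click_log = [(np.random.rand(20), np.random.rand() < 0.3) for _ in range(1000)]
stats = defaultdict(lambda: [0, 0])  # cell -> [clicks, shows]
for features, clicked in click_log:
    cell = cluster_id(features)
    stats[cell][1] += 1
    stats[cell][0] += int(clicked)
```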
36. Online learning to rank results
We obtained an online learning to rank method
The method allows us to use ALL the clickthrough feedback from users
We don't need to retrain the model
The method keeps the current ranking formula up to date with current user behavior