This document summarizes a presentation about fast matrix computations and their applications. It discusses using matrix methods to solve problems related to recommendation systems, network analysis, and ranking aggregation more efficiently. Specific algorithms discussed include PageRank, heat kernels, and nuclear norm ranking. It also describes theoretical guarantees for localization and runtime of these algorithms, as well as empirical evaluations demonstrating their accuracy and performance on real-world datasets.
3. Models and algorithms for high-performance matrix and network computations
David Gleich · Purdue
Netflix

Previous work from the PI tackled network alignment with matrix methods.
SIMAX '09, SISC '11, MapReduce '11, ICASSP '12

Massive matrix computations (big data methods): Ax = b, min ‖Ax − b‖, Ax = λx
SC '05, WAW '07, SISC '10, WWW '10, …

Fast & scalable network centrality

Tensor eigenvalues and a power method: this proposal is for matching triangles using tensor methods on multi-threaded and distributed architectures.
ICDM '09, SC '11, TKDE '13

    maximize Σ_{ijk} T_{ijk} x_i x_j x_k   subject to ‖x‖₂ = 1

with the iteration

    [x^(next)]_i = ρ · ( Σ_{jk} T_{ijk} x_j x_k + x_i ),

where ρ ensures the 2-norm constraint; the SSHOPM method is due to Kolda and Mayo.

Data clustering
WSDM '12, KDD '12, CIKM '13, …

[Figure: network alignment between graphs A and B through the sparse overlap matrix L; a matched pair of edges (i, j) and (i′, j′) gives edge overlap, and matched triangles give triangle overlap.]
4. The talk ends, you believe -- whatever you want to.
Everything in the world can be explained by a matrix, and we see how deep the rabbit hole goes.
Image from rockysprings, deviantart, CC share-alike
5. Matrix computations in a red-pill
Solve a problem better by exploiting its structure!
6. Problem 1 – (Faster)! Recommendation as link prediction
WHY NO PREPROCESSING? Top-k predicted "links" are movies to watch! Pairwise scores give user similarity.
David F. Gleich (Purdue) · Emory Math/CS Seminar
7. Problem 2 – (Better)! Best movies
8. Matrix computations in a red-pill
Solve a problem better by exploiting its structure!
9. Matrix structure
Netflix graph: movies "liked" (>3 stars?)
Problem 1: adjacency matrix, normalized Laplacian matrix, random walk matrix.
Netflix matrix: star ratings, e.g., 1, 1, 4, 5, 5.
Problem 2: pairwise comparison matrix.
10. Problem 1 – (Faster)! Recommendation as link prediction
Top-k predicted "links" are movies to watch! Pairwise scores give user similarity.
11. Matrix based link predictors
The Katz score (edge-based) is

    k = Σ_{ℓ=1}^{∞} α^ℓ A^ℓ e_i,

where e_i is the user indicator vector, the ℓth term counts the number of paths of length ℓ from the user to each movie (damped by α^ℓ), and k is the movie prediction vector.
12. Matrix based link predictors

    k = Σ_{ℓ=1}^{∞} α^ℓ A^ℓ e_i

By the Neumann series (Carl Neumann),

    Σ_{k=0}^{∞} (αA)^k = (I − αA)^{-1},

so the Katz vector solves (I − αA) k = e_i. The analogous series in (tA)^k / k! gives the matrix exponential used by the heat kernel.
13. Matrix based link predictors
Katz:                    (I − αA) k = e_i
PageRank:                (I − αP) x = e_i
Semi-supervised learning: (I − αL) x = e_i
Heat kernel:             exp{αP} x = e_i
They all look at sums of damped paths, but change the details, slightly.
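The damped linear systems above can all be solved with simple stationary iterations. A minimal sketch for the Katz system (I − αA)k = e_i via Richardson iteration, on a hypothetical dict-of-lists graph; this is an illustration, not the solvers the talk's papers describe.

```python
def katz(adj, src, alpha=0.1, iters=200):
    """Richardson iteration x <- alpha*A*x + e_src for (I - alpha*A) x = e_src.
    Converges when alpha < 1/||A||; the entries accumulate damped path counts."""
    x = {u: 0.0 for u in adj}
    for _ in range(iters):
        nxt = {u: (1.0 if u == src else 0.0) for u in adj}
        for u in adj:
            for v in adj[u]:          # undirected edges, so A is symmetric
                nxt[v] += alpha * x[u]
        x = nxt
    return x

# Toy triangle graph.
scores = katz({0: [1, 2], 1: [0, 2], 2: [0, 1]}, src=0)
```

On the triangle with α = 0.1, the exact solution is x₀ = 45/44 and x₁ = x₂ = 5/44, which the iteration reaches to machine precision.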
14. Matrix based link predictors are localized!
PageRank scores for one node: crawl of flickr from 2006, ~800k nodes, 6M edges, alpha = 1/2.
[Left plot: plot(x), the PageRank vector itself. Right plot: error ‖x_true − x_nnz‖₁ against the number of nonzeros retained, falling to ~10^−15.]
15. Matrix based link predictors are localized!
Katz scores are localized: up to 50 neighbors is 99.65% of the total mass.
16. Matrix computations in a red-pill
Solve a problem better by exploiting its structure!
17. How do we compute them fast?
PageRank: x_j = α Σ_{i neigh. of j} x_i / deg(i) + 1 if j is the target user.

PageRankPull (w/ access to in-links & degrees): solve for x_j^(k+1), j = blue node:
    x_j^(k+1) = α Σ_{i→j} x_i^(k) / deg(i) = f_j
    e.g., with in-neighbors a, b, c of degrees 6, 2, 3:
    x_j^(k+1) = α x_a^(k)/6 + α x_b^(k)/2 + α x_c^(k)/3 = f_j

PageRankPush (w/ access to out-links): let j = blue node, then
    x_j^(k+1) = x_j^(k) + r_j^(k),   r_j^(k+1) = 0,
and update the residuals of j's three out-neighbors a, b, c:
    r_a^(k+1) = r_a^(k) + α r_j^(k)/3
    r_b^(k+1) = r_b^(k) + α r_j^(k)/3
    r_c^(k+1) = r_c^(k) + α r_j^(k)/3
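The push procedure above can be sketched with a residual queue. This is a minimal illustrative version, not the authors' implementation; the graph, tolerance, and function names are assumptions.

```python
from collections import deque

def pagerank_push(adj, target, alpha=0.5, tol=1e-9):
    """Push method for x_j = alpha * sum_{i->j} x_i/deg(i) + 1 if j == target.
    Pushing at u moves r[u] into x[u] and sends alpha*r[u]/deg(u) to each
    out-neighbor -- the residual update rule on the slide."""
    x = {u: 0.0 for u in adj}
    r = {u: 0.0 for u in adj}
    r[target] = 1.0
    queue = deque([target])
    while queue:
        u = queue.popleft()
        ru = r[u]
        if ru <= tol:
            continue                  # stale queue entry
        r[u] = 0.0
        x[u] += ru
        share = alpha * ru / len(adj[u])
        for v in adj[u]:
            r[v] += share
            if r[v] > tol:
                queue.append(v)
    return x

# Toy triangle graph, target node 0, alpha = 1/2.
x = pagerank_push({0: [1, 2], 1: [0, 2], 2: [0, 1]}, target=0)
```

Each push retires a (1 − α) fraction of the residual it touches, so only nodes with meaningful residual are ever visited, which is the source of the method's locality.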
18. We have good theory for this algorithm … and even better empirical performance.
19. Theory
Andersen, Chung, Lang (2006): for PageRank, "fast runtimes" and "localization".
Bonchi, Esfandiar, Gleich, et al. (2010/2013): for Katz, "fast runtimes".
Kloster, Gleich (2013): for Katz and the heat kernel, "fast runtimes" and "localization" (assuming power-law degrees).
20. Accuracy vs. work (heat kernel)
dblp collaboration graph (dblp-cc), 225k vertices.
For the dblp collaboration graph, we study the precision in finding the 100 largest nodes as we vary the work. This set of 100 does not include the node's immediate neighbors. (One column, but representative.)
[Plot: precision (0–1) vs. effective matrix-vector products (10^−2 to 10^0), for tol = 10^−4 and 10^−5, measured at precision @10, @25, @100, and @1000.]
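The heat-kernel column exp{tP}·e_seed can be approximated by truncating its Taylor series. A sketch under stated assumptions: P spreads a node's mass evenly over its neighbors, and this is plain truncation for illustration, not the adaptive push-style method the runtime results refer to.

```python
import math

def heat_kernel(adj, seed, t=1.0, terms=20):
    """Truncated Taylor series x ~= sum_{k=0..terms} (t^k/k!) P^k e_seed,
    where one application of P spreads mass evenly to neighbors."""
    v = {u: (1.0 if u == seed else 0.0) for u in adj}   # holds P^k e_seed
    x = dict(v)                                          # k = 0 term
    coef = 1.0
    for k in range(1, terms + 1):
        nxt = {u: 0.0 for u in adj}
        for u in adj:
            if v[u]:
                share = v[u] / len(adj[u])
                for w in adj[u]:
                    nxt[w] += share
        v = nxt
        coef *= t / k                                    # now t^k / k!
        for u in adj:
            x[u] += coef * v[u]
    return x

# Toy triangle graph.
x = heat_kernel({0: [1, 2], 1: [0, 2], 2: [0, 1]}, seed=0)
```

Since P preserves total mass, the entries of x sum to Σ t^k/k! ≈ e^t, a handy sanity check for the truncation level.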
22. Never got to try it …
Need to test matrix analytics on HelloMovies.com now.
Ran out of money once we had the algorithms … promising initial results though!
I collaborate with the company behind HelloMovies.
23. Problem 2 – (Better)! Best movies
24. Which is a better list of good DVDs?

Standard rank aggregation (the mean rating):
Lord of the Rings 3: The Return of …
Lord of the Rings 1: The Fellowship
Lord of the Rings 2: The Two Towers
Lost: Season 1
Battlestar Galactica: Season 1
Fullmetal Alchemist
Trailer Park Boys: Season 4
Trailer Park Boys: Season 3
Tenchi Muyo!
Shawshank Redemption

Nuclear norm based rank aggregation (not matrix completion on the Netflix rating matrix):
Lord of the Rings 3: The Return of …
Lord of the Rings 1: The Fellowship
Lord of the Rings 2: The Two Towers
Star Wars V: Empire Strikes Back
Raiders of the Lost Ark
Star Wars IV: A New Hope
Shawshank Redemption
Star Wars VI: Return of the Jedi
Lord of the Rings 3: Bonus DVD
The Godfather
25. Rank aggregation
Given partial orders on subsets of items, rank aggregation is the problem of finding an overall ordering.
Voting: find the winning candidate.
Program committees: find the best papers given reviews.
Dining: find the best restaurant in Chicago.
26. Ranking is really hard
Ken Arrow: all rank aggregations involve some measure of compromise.
John Kemeny: a good ranking is the "average" ranking under a permutation distance.
Kemeny's ranking is NP-hard to compute (Dwork, Kumar, Naor, Sivakumar).
27. Suppose we had scores
Let s_i be the score of the ith movie/song/paper/team to rank.
Suppose we can compare the ith to the jth: Y_ij = s_i − s_j.
Then Y is skew-symmetric and rank 2.
Also works for ratios, with an extra log.
Numerical ranking is intimately intertwined with skew-symmetric matrices.
Kemeny and Snell, Mathematical Models in Social Sciences (1978)
David F. Gleich (Purdue) · KDD 2011
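The claim that Y with Y_ij = s_i − s_j (that is, Y = s·eᵀ − e·sᵀ) is skew-symmetric with rank 2 is easy to verify numerically. A small self-contained sketch with a hypothetical score vector; the row-reduction rank routine is only meant for tiny dense examples.

```python
def pairwise_matrix(s):
    """Y[i][j] = s_i - s_j, i.e. Y = s e^T - e s^T: skew-symmetric, rank 2."""
    return [[si - sj for sj in s] for si in s]

def matrix_rank(M, eps=1e-9):
    """Gaussian-elimination rank of a small dense matrix."""
    M = [row[:] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        if r == rows:
            break
        piv = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[piv][c]) < eps:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            for j in range(c, cols):
                M[i][j] -= f * M[r][j]
        r += 1
    return r

Y = pairwise_matrix([3.0, 1.0, 2.0, 5.0])   # hypothetical scores
```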
28. Using ratings as comparisons
Ratings induce various skew-symmetric matrices, e.g., the arithmetic mean of rating differences and the log-odds.
From David, 1988 – The Method of Paired Comparisons.
29. Extracting the scores
Given Y with all entries, then s = (1/n) Y e is the Borda count, the least-squares solution to fitting Y by s eᵀ − e sᵀ.
How many comparisons do we have? Most.
Do we trust all of them? Not really.
Netflix data: 17k movies, 500k users, 100M ratings – 99.17% filled.
[Histogram: number of comparisons per movie pair (10^1 to 10^5) against the count of movie pairs (up to 10^7).]
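When Y is complete, the Borda count s = (1/n) Y e is a one-pass computation. A minimal sketch (names hypothetical) showing that for a centered score vector it recovers the scores exactly, since (1/n) Σ_j (s_i − s_j) = s_i − mean(s).

```python
def borda_scores(Y):
    """Least-squares scores for a complete comparison matrix: s = (1/n) Y e."""
    n = len(Y)
    return [sum(row) / n for row in Y]

# If Y_ij = s_i - s_j for centered s (sum s = 0), Borda recovers s exactly.
s_true = [2.0, -1.0, -1.0]
Y = [[si - sj for sj in s_true] for si in s_true]
s_hat = borda_scores(Y)
```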
30. Only partial info? Complete it!
Let Y_ij be known for (i, j) in Ω. We trust these scores.
Goal: find the simplest skew-symmetric matrix that matches the data,
    noiseless: minimize rank(X) subject to X_ij = Y_ij for (i, j) in Ω;
    noisy:     minimize rank(X) subject to Σ_{(i,j) in Ω} (X_ij − Y_ij)² ≤ ε.
Both of these are NP-hard too.
31. Solution: GO NUCLEAR!
Image from a French nuclear test in 1970, from http://picdit.wordpress.com/2008/07/21/8-insane-nuclear-explosions/
32. The ranking algorithm
0. INPUT: ratings data and c (for trust on comparisons)
1. Compute the comparison matrix Y from the ratings
2. Discard entries with fewer than c comparisons
3. Set Ω to be the indices and values of what's left
4. X = SVP(Ω)
5. OUTPUT: the scores recovered from X
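The steps above can be sketched end-to-end. This is an illustrative pipeline under assumptions, not the paper's code: the arithmetic-mean comparisons and the trust cutoff c follow the slide, but the SVP completion step is replaced by a plain least-squares fit of Y_ij ≈ s_i − s_j on the kept entries (gradient descent), which targets the same rank-2 skew-symmetric model.

```python
from collections import defaultdict

def rank_aggregate(ratings, c=2, steps=500, lr=0.1):
    """ratings: list of per-user dicts item -> stars.
    1.   Build arithmetic-mean comparisons Y[i,j] = avg(stars_i - stars_j).
    2-3. Keep (Omega) only pairs with at least c comparisons.
    4.   Fit scores s with Y_ij ~ s_i - s_j on Omega (least squares via
         gradient descent; a stand-in for the SVP completion step)."""
    tot, cnt, items = defaultdict(float), defaultdict(int), set()
    for user in ratings:
        items.update(user)
        for i in user:
            for j in user:
                if i != j:
                    tot[(i, j)] += user[i] - user[j]
                    cnt[(i, j)] += 1
    omega = {p: tot[p] / cnt[p] for p in tot if cnt[p] >= c}
    s = {i: 0.0 for i in items}
    for _ in range(steps):
        g = {i: 0.0 for i in items}
        for (i, j), y in omega.items():
            d = (s[i] - s[j]) - y
            g[i] += d
            g[j] -= d
        for i in items:
            s[i] -= lr * g[i] / max(1, len(items))
    return s

# Two hypothetical users who agree: a > b > c.
scores = rank_aggregate([{"a": 5, "b": 3, "c": 1}, {"a": 5, "b": 3, "c": 1}])
```

Because the gradient preserves the sum of s, the fitted scores come out centered, matching the centered-score convention of the recovery theory.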
33. Exact recovery results
David Gross showed how to recover Hermitian matrices, i.e., the conditions under which we get exact recovery. Instead, we view the following theorem as providing intuition for the noisy problem.
Consider the operator basis for Hermitian matrices: H = S ∪ K ∪ D, where
    S = { 1/√2 (e_i e_jᵀ + e_j e_iᵀ) : 1 ≤ i < j ≤ n };
    K = { ı/√2 (e_i e_jᵀ − e_j e_iᵀ) : 1 ≤ i < j ≤ n };
    D = { e_i e_iᵀ : 1 ≤ i ≤ n }.
Note that ıY is Hermitian. Thus our new result:
Theorem 5. Let s be centered, i.e., sᵀe = 0. Let Y = s eᵀ − e sᵀ, where θ = max_i s_i² / (sᵀs) and ρ = ((max_i s_i) − (min_i s_i)) / ‖s‖. Also, let Ω ⊂ H be a random set of elements with size |Ω| = O(2nν(1 + β)(log n)²), where ν = max((nθ + 1)/4, nρ²). Then the solution of
    minimize ‖X‖_*
    subject to trace(X W_i) = trace((ıY) W_i),  W_i ∈ Ω
is equal to ıY with probability at least 1 − n^{−β}.
The proof of this theorem follows directly from Theorem 4 with Y = s eᵀ − e sᵀ.
[Plot: fraction of trials recovered; see §6.1 for the recovery experiments on the noiseless and noisy problems.]
34. Recovery discussion and experiments
Confession: If … , then just look at differences from a connected set. Constants? Not very good. Intuition for the truth.
35. Recovery experiments
Confession: If … , then just look at differences from a connected set. Constants? Not very good. Intuition for the truth.
36. Evaluation
Nuclear norm ranking vs. the mean rating.
[Figure 3: the performance of our algorithm (left) and the mean rating (right); median Kendall's tau (0.5–1) against error (0–1), with curves labeled 1.5, 2, 5, 10, 20.]
37. Tie in with PageRank
Another way to compute the scores is through a close relative of PageRank and the link-prediction methods.
Massey or Colley methods:
    (2I + D − A) s = "differences"
    (L + 2D⁻¹) x = "scaled differences"
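The first system above is easy to set up and solve. A hedged sketch, not the talk's code: the (i, j, margin) game format and the Jacobi solver are illustrative choices, and since 2I + D − A is strictly diagonally dominant, Jacobi iteration is guaranteed to converge.

```python
from collections import defaultdict

def colley_like_scores(games, iters=200):
    """Solve (2I + D - A) s = "differences" by Jacobi iteration.
    games: list of (i, j, margin) meaning i beat j by margin.
    D counts each item's comparisons, A counts comparisons per pair,
    and the right-hand side is each item's net margin."""
    deg = defaultdict(int)
    pair = defaultdict(int)
    diff = defaultdict(float)
    items = set()
    for i, j, m in games:
        items.update((i, j))
        deg[i] += 1
        deg[j] += 1
        pair[(i, j)] += 1
        pair[(j, i)] += 1
        diff[i] += m
        diff[j] -= m
    s = {u: 0.0 for u in items}
    for _ in range(iters):
        s = {u: (diff[u] + sum(pair[(u, v)] * s[v] for v in items if v != u))
                / (2 + deg[u])
             for u in items}
    return s

# Hypothetical round-robin: a beats b by 2, b beats c by 2, a beats c by 4.
s = colley_like_scores([("a", "b", 2), ("b", "c", 2), ("a", "c", 4)])
```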
38. Ongoing work
Finding communities in large networks: we have the best community finder (as of CIKM 2013), Whang, Gleich, Dhillon (CIKM).
Fast clique detection: we have the fastest solver for max-clique problems, useful for computing temporal strong components (Rossi, Gleich, et al., arXiv).
& Scalable network alignment
& Low-rank clustering with features + links
& Evolving network analysis
& Scalable, distributed implementations of fast graph kernels
[Figure: network alignment between graphs A and B via overlap matrix L, with vertices r, s, t, u, v, w and edge weight w_tu.]
39. References
Papers:
Gleich & Lim, KDD 2011 – Nuclear Norm Ranking
Esfandiar, Gleich, Bonchi, et al. – WAW 2010, J. Internet Math. 2013
Kloster & Gleich, WAW 2013, arXiv:1310.3423
Code:
www.cs.purdue.edu/homes/dgleich/codes
bit.ly/dgleich-code
Supported by NSF CAREER 1149756-CCF
www.cs.purdue.edu/homes/dgleich