Fast matrix primitives for ranking, communities, and more.



David F. Gleich
Computer Science
Purdue University

David Gleich · Purdue · Netflix

Models and algorithms for high performance matrix and network computations

Massive matrix computations
$Ax = b$, $\min \|Ax - b\|$, $Ax = \lambda x$
on multi-threaded and distributed architectures

Big data methods
SIMAX ‘09, SISC ‘11, MapReduce ‘11, ICASSP ’12

Fast & Scalable Network centrality
SC ‘05, WAW ‘07, SISC ‘10, WWW ’10, …

Network alignment
ICDM ‘09, SC ‘11, TKDE ‘13
Previous work from the PI tackled network alignment with matrix methods for edge overlap. This proposal is for matching triangles using tensor methods.

Tensor eigenvalues and a power method
maximize $\sum_{ijk} T_{ijk} x_i x_j x_k$ subject to $\|x\|_2 = 1$
$[x^{(\text{next})}]_i = \rho \cdot \big( \sum_{jk} T_{ijk} x_j x_k + \gamma x_i \big)$, where $\rho$ ensures the 2-norm and $\gamma$ is the shift
SSHOPM method due to Kolda and Mayo

Data clustering
WSDM ‘12, KDD ‘12, CIKM ’13 …

The talk ends, you believe -- whatever you want to.

Image from rockysprings, deviantart, CC share-alike

Everything in the world can be
explained by a matrix, and we see
how deep the rabbit hole goes
Matrix computations in a red-pill


Solve a problem better by
exploiting its structure!
Problem 1 – (Faster)
Recommendation as link prediction

Top-k predicted “links” are movies to watch!
Pairwise scores give user similarity.

Problem 2 – (Better)
Best movies

Matrix computations in a red-pill

Solve a problem better by
exploiting its structure!
Matrix structure

Problem 1: Netflix graph (movies “liked”, >3 stars?)
  Adjacency matrix
  Normalized Laplacian matrix
  Random walk matrix

Problem 2: Netflix matrix (ratings from 1 to 5)
  Pairwise comparison matrix
Problem 1 – (Faster)
Recommendation as link prediction
The Katz score (edge-based) is

\[ \text{pred. on movie} = \sum_{\ell=1}^{\infty} \alpha^{\ell} \, (\text{num. paths of length } \ell \text{ from user to movie}) \]

In matrix form, with $k$ the movie prediction vector and $e_i$ the user indicator vector:

\[ k = \sum_{\ell=1}^{\infty} (\alpha A)^{\ell} e_i \]
Matrix based link predictors

\[ k = \sum_{\ell=1}^{\infty} (\alpha A)^{\ell} e_i \]

Carl Neumann (Neumann series):

\[ (I - tA)^{-1} = \sum_{k=0}^{\infty} (tA)^k \]

So, up to the $\ell = 0$ term $e_i$, the Katz vector solves

\[ (I - \alpha A) k = e_i \]
Matrix based link predictors

Katz: $(I - \alpha A) k = e_i$
PageRank: $(I - \alpha P) x = e_i$
Semi-supervised learning: $(I - \alpha L) x = e_i$
Heat kernel: $x = \exp\{\alpha P\} e_i$

They all look at sums of damped paths, but change the details, slightly.
Matrix based link predictors are localized!
PageRank scores for one node
Crawl of flickr from 2006, ~800k nodes, 6M edges, alpha=1/2
[Figure: plot(x) of the PageRank vector, and the error $\|x_{\text{true}} - x_{\text{nnz}}\|_1$ versus the number of nonzeros kept]
Matrix based link predictors are localized!
Katz scores are localized
Up to 50 neighbors is 99.65% of the total mass
Matrix computations in a red-pill


Solve a problem better by
exploiting its structure!
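How do we compute them fast?

PageRank
\[ x_j = \alpha \sum_{i \text{ neigh. of } j} \frac{x_i}{\deg(i)} + 1 \text{ if } j \text{ is the target user} \]

PageRankPull (w/ access to in-links & degs.)
Let j = blue node, with in-neighbors a, b, c of degrees 6, 2, 3. Solve for $x_j^{(k+1)}$:
\[ x_j^{(k+1)} - \alpha x_a^{(k)}/6 - \alpha x_b^{(k)}/2 - \alpha x_c^{(k)}/3 = f_j \]
\[ x_j^{(k+1)} - \alpha \sum_{i \to j} x_i^{(k)} / \deg_i = f_j \]

PageRankPush (w/ access to out-links)
Let j = blue node, with out-neighbors a, b, c.
Let $x_j^{(k+1)} = x_j^{(k)} + r_j^{(k)}$, then $r_j^{(k+1)} = 0$.
Update $r^{(k+1)}$:
\[ r_a^{(k+1)} = r_a^{(k)} + \alpha r_j^{(k)}/3 \]
\[ r_b^{(k+1)} = r_b^{(k)} + \alpha r_j^{(k)}/3 \]
\[ r_c^{(k+1)} = r_c^{(k)} + \alpha r_j^{(k)}/3 \]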
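The push update can be sketched in a few lines. This is a minimal illustration, assuming an undirected graph stored as adjacency lists with no isolated nodes, a FIFO queue to choose the next node, and a stopping tolerance `tol`; the name `pagerank_push`, the queue policy, and the toy graph are assumptions rather than the speaker's implementation.

```python
from collections import deque

def pagerank_push(adj, seed, alpha=0.5, tol=1e-8):
    """Approximately solve (I - alpha*P) x = e_seed by pushing residuals.

    adj maps each node to a list of neighbors (every node needs degree >= 1).
    Returns the sparse solution x and the residual r, with every r[v] <= tol.
    """
    x = {}
    r = {seed: 1.0}                    # residual starts as the right-hand side
    queue = deque([seed])
    while queue:
        j = queue.popleft()
        rj = r.get(j, 0.0)
        if rj <= tol:
            continue
        x[j] = x.get(j, 0.0) + rj      # x_j <- x_j + r_j
        r[j] = 0.0                     # r_j <- 0
        share = alpha * rj / len(adj[j])
        for u in adj[j]:               # each neighbor gets alpha * r_j / deg(j)
            old = r.get(u, 0.0)
            r[u] = old + share
            if old <= tol < r[u]:      # enqueue only when crossing the threshold
                queue.append(u)
    return x, r

# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
x, r = pagerank_push(adj, seed=0)
print(sorted(x.items()))
```

Only nodes that ever receive residual above `tol` are touched, which is why the work can stay local to the seed on a large graph.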
We have good theory for this algorithm …
… and even better empirical performance.
Theory
Andersen, Chung, Lang (2006)
For PageRank, “fast runtimes” and “localization”

Bonchi, Esfandiar, Gleich, et al. (2010/2013)
For Katz, “fast runtimes”

Kloster, Gleich (2013)
For Katz, Heat Kernel, “fast runtimes” and “localization” (assuming power-law degrees)
Accuracy vs. work (Heat kernel)
[Figure: precision (@10, @25, @100, @1000) versus effective matrix-vector products on dblp-cc, the dblp collaboration graph with 225k vertices, with markers at tol=10^-4 and tol=10^-5]

For the dblp collaboration graph, we study the precision in finding the 100 largest nodes as we vary the work. This set of 100 does not include the node's immediate neighbors. (One column, but representative.)

Empirical runtime (Katz)
[Figure: timing]

Never got to try it …
I collaborate with the company behind He
HelloMovies.com analytics
Ran out of money once we had the algorithms … promising initial results though!
Need to test on Netflix matrix now

Problem 2 – (Better)
Best movies
Which is a better list of good DVDs?

Standard rank aggregation (the mean rating)  |  Nuclear Norm based rank aggregation (not matrix completion on the netflix rating matrix)
Lord of the Rings 3: The Return of …         |  Lord of the Rings 3: The Return of …
Lord of the Rings 1: The Fellowship          |  Lord of the Rings 1: The Fellowship
Lord of the Rings 2: The Two Towers          |  Lord of the Rings 2: The Two Towers
Lost: Season 1                               |  Star Wars V: Empire Strikes Back
Battlestar Galactica: Season 1               |  Raiders of the Lost Ark
Fullmetal Alchemist                          |  Star Wars IV: A New Hope
Trailer Park Boys: Season 4                  |  Shawshank Redemption
Trailer Park Boys: Season 3                  |  Star Wars VI: Return of the Jedi
Tenchi Muyo!                                 |  Lord of the Rings 3: Bonus DVD
Shawshank Redemption                         |  The Godfather
Rank Aggregation

Given partial orders on subsets of items, rank aggregation
is the problem of finding an overall ordering.

Voting: Find the winning candidate
Program committees: Find the best papers given reviews
Dining: Find the best restaurant in Chicago
Ranking is really hard

Ken Arrow: All rank aggregations involve some measure of compromise.
John Kemeny: A good ranking is the “average” ranking under a permutation distance.
Dwork, Kumar, Naor, Sivakumar: NP hard to compute Kemeny’s ranking.
Suppose we had scores
Let $s_i$ be the score of the ith movie/song/paper/team to rank.
Suppose we can compare the ith to jth: $Y_{ij} = s_i - s_j$.
Then $Y = s e^T - e s^T$ is skew-symmetric, rank 2.
Also works for $Y_{ij} = s_i / s_j$ with an extra log.

Numerical ranking is intimately intertwined with skew-symmetric matrices.
Kemeny and Snell, Mathematical Models in Social Sciences (1978)
Using ratings as comparisons
Ratings induce various skew-symmetric matrices:
Arithmetic Mean
Log-odds

From David 1988 – The Method of Paired Comparisons
Extracting the scores
Given $Y$ with all entries, then $s = \frac{1}{n} Y e$ is the Borda count, the least-squares solution to $Y \approx s e^T - e s^T$.

How many $Y_{ij}$ do we have? Most.
Do we trust all $Y_{ij}$? Not really.

Netflix data: 17k movies, 500k users, 100M ratings; 99.17% filled.
[Figure: number of movie pairs versus number of comparisons]
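A small numerical check of the skew-symmetric, rank-2 structure and of score extraction by row averaging. This is a sketch under the assumption that the comparisons are exact score differences, $Y = s e^T - e s^T$ (the form that also appears in the recovery theorem in this deck), with centered scores; the score vector below is made up for illustration.

```python
import numpy as np

# Hypothetical centered scores (they sum to zero).
s = np.array([3.0, 1.0, -1.0, -3.0])
n = len(s)
e = np.ones(n)

# Pairwise comparison matrix: Y[i, j] = s[i] - s[j].
Y = np.outer(s, e) - np.outer(e, s)

print(np.allclose(Y, -Y.T))          # skew-symmetric
print(np.linalg.matrix_rank(Y))      # rank 2 when s is not a multiple of e

# Row averages recover the centered scores: Y e = n s - e (s^T e) = n s.
s_rec = Y @ e / n
print(s_rec)
```

When only some entries of `Y` are known or trusted, the row average is taken over incomplete rows, which is the situation the completion approach is meant to address.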
Only partial info? Complete it!
Let $Y_{ij}$ be known for $(i, j) \in \Omega$. We trust these scores.

Goal: Find the simplest skew-symmetric matrix that matches the data, in a noiseless and in a noisy formulation.

Both of these are NP-hard too.
Solution: GO NUCLEAR!

From a French nuclear test in 1970, image from http://picdit.wordpress.com/2008/07/21/8-insane-nuclear-explosions/
The Ranking Algorithm
0. INPUT $R$ (ratings data) and $c$ (for trust on comparisons)
1. Compute $Y$ from $R$
2. Discard entries with fewer than $c$ comparisons
3. Set $\Omega$, $b$ to be indices and values of what’s left
4. $U, S, V$ = SVP($\Omega$, $b$, 2)
5. OUTPUT $s = \frac{1}{n} U S V^T e$
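The steps above can be sketched end to end on a toy ratings matrix. This is only an illustration under several assumptions: comparisons are arithmetic-mean rating differences over users who rated both items, SVP is written as projected gradient descent with a fixed step of 1 and a rank-2 truncated SVD as the projection, and the scores are read off as row averages. The function names and the toy data are hypothetical and not the authors' code.

```python
import numpy as np

def pairwise_mean_differences(R):
    """Y[i, j] = mean over co-raters of (R[u, i] - R[u, j]); counts[i, j] = co-raters.

    R is users x items with NaN marking a missing rating.
    """
    rated = ~np.isnan(R)
    Rz = np.where(rated, R, 0.0)
    M = rated.astype(float)
    counts = M.T @ M
    sums = Rz.T @ M - M.T @ Rz         # only users who rated both items contribute
    Y = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return Y, counts

def svp(Y, mask, rank=2, iters=100, step=1.0):
    """Singular value projection: gradient step on observed entries, then rank truncation."""
    X = np.zeros_like(Y)
    for _ in range(iters):
        G = np.where(mask, X - Y, 0.0)             # gradient of the observed-entry misfit
        U, sig, Vt = np.linalg.svd(X - step * G, full_matrices=False)
        X = (U[:, :rank] * sig[:rank]) @ Vt[:rank]  # best rank-`rank` approximation
    return X

def nuclear_norm_style_ranking(R, c):
    Y, counts = pairwise_mean_differences(R)       # step 1
    mask = counts >= c                             # steps 2 and 3: keep trusted entries
    X = svp(Y, mask)                               # step 4
    return X @ np.ones(X.shape[0]) / X.shape[0]    # step 5: scores as row averages

# Hypothetical data: 3 users who all rate 5 items identically.
R = np.tile([5.0, 4.0, 3.0, 2.0, 1.0], (3, 1))
s_hat = nuclear_norm_style_ranking(R, c=3)
print(s_hat)
```

With a step of 1 the misfit on the observed entries cannot increase from one iteration to the next, which makes this variant easy to reason about; larger steps tied to the sampling rate are common but need more care.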
Exact recovery results
David Gross showed how to recover Hermitian matrices, i.e. the conditions under which we get the exact matrix.
Note that $\imath Y$ is Hermitian. Thus our new result!

[Figure: fraction of trials recovered]

Instead we view the following theorem as providing intuition for the noisy problem.
Consider the operator basis for Hermitian matrices:
\[ H = S \cup K \cup D \quad \text{where} \]
\[ S = \{ \tfrac{1}{\sqrt{2}} (e_i e_j^T + e_j e_i^T) : 1 \le i < j \le n \}; \]
\[ K = \{ \tfrac{\imath}{\sqrt{2}} (e_i e_j^T - e_j e_i^T) : 1 \le i < j \le n \}; \]
\[ D = \{ e_i e_i^T : 1 \le i \le n \}. \]

Theorem 5. Let $s$ be centered, i.e., $s^T e = 0$. Let $Y = s e^T - e s^T$ where $\theta = \max_i s_i^2 / (s^T s)$ and $\rho = ((\max_i s_i) - (\min_i s_i)) / \|s\|$. Also, let $\Omega \subset H$ be a random set of elements with size $|\Omega| \ge O(2 n \nu (1 + \beta)(\log n)^2)$ where $\nu = \max((n\theta + 1)/4, n\rho^2)$. Then the solution of
\[ \text{minimize } \|X\|_* \quad \text{subject to } \operatorname{trace}(X^* W_i) = \operatorname{trace}((\imath Y)^* W_i), \; W_i \in \Omega \]
is equal to $\imath Y$ with probability at least $1 - n^{-\beta}$.

The proof of this theorem follows directly by Theorem 4 if …
Recovery Discussion and Experiments
Confession: If   , then just look at differences from a connected set. Constants? Not very good.

Intuition for the truth.
Evaluation
[Figure 3: The performance of our algorithm (left) … Panels: Nuclear norm ranking and Mean rating. Axes: Median Kendall’s Tau versus Error.]
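Kendall's tau, the metric on the vertical axis, compares two rankings by counting item pairs that are ordered the same way versus the opposite way. A minimal sketch of the plain (tau-a) definition follows; the function name and the sample score lists are illustrative, and ties are simply counted as neither concordant nor discordant.

```python
from itertools import combinations

def kendall_tau(a, b):
    """(concordant - discordant) / (number of pairs) for two score lists of equal length."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1            # both lists order items i and j the same way
        elif sign < 0:
            discordant += 1            # the lists disagree on this pair
    pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / pairs

true_scores = [3, 2, 1]
print(kendall_tau(true_scores, [3, 2, 1]))   # identical order
print(kendall_tau(true_scores, [1, 2, 3]))   # reversed order
```

A value of 1 means the recovered ranking matches the reference exactly, and -1 means it is fully reversed.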
Tie in with PageRank
Another way to compute the scores is through a close relative of PageRank and the link-prediction methods.

Massey or Colley methods
$(2I + D - A) s$ = “differences”
$(L + 2D^{-1}) x$ = “scaled differences”
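For a concrete instance of the $(2I + D - A)$ system, here is a minimal sketch of the standard Colley formulation, where $D$ holds the number of games per team, $A$ counts games between each pair, and the right-hand side is $1 + (\text{wins} - \text{losses})/2$. That right-hand side is the textbook Colley choice and is an assumption about what “differences” stands for on the slide; the function name and the three-team schedule are made up.

```python
import numpy as np

def colley_ratings(n_teams, games):
    """Solve (2I + D - A) r = b for a list of (winner, loser) index pairs."""
    C = 2.0 * np.eye(n_teams)          # starts as 2I
    b = np.ones(n_teams)
    for w, l in games:
        C[w, w] += 1                   # D: one more game for each team
        C[l, l] += 1
        C[w, l] -= 1                   # -A: one more game between the pair
        C[l, w] -= 1
        b[w] += 0.5                    # b = 1 + (wins - losses) / 2
        b[l] -= 0.5
    return np.linalg.solve(C, b)

# Hypothetical schedule: team 0 beats 1 and 2, team 1 beats 2.
ratings = colley_ratings(3, [(0, 1), (0, 2), (1, 2)])
print(ratings)
```

The matrix is symmetric and strictly diagonally dominant, so the solve is well posed; on large sparse comparison graphs the same system could be handled by iterative or localized methods in the spirit of the PageRank solvers earlier in the talk.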
Ongoing Work
Finding communities in large networks
We have the best community finder (as of CIKM2013)
Whang, Gleich, Dhillon (CIKM)

Fast clique detection
We have the fastest solver for max-clique problems, useful for computing temporal strong components (Rossi, Gleich, et al. arXiv)

Scalable network alignment
& Low-rank clustering with features + links
& Evolving network analysis
& Scalable, distributed implementations of fast graph kernels
References

Papers
Gleich & Lim, KDD 2011 – Nuclear Norm Ranking
Esfandiar, Gleich, Bonchi et al. – WAW2010, J. Internet. Math. 2013
Kloster & Gleich, WAW2013, arXiv 1310.3423

Code
www.cs.purdue.edu/homes/dgleich/codes
bit.ly/dgleich-code

Supported by NSF CAREER 1149756-CCF

www.cs.purdue.edu/homes/dgleich
David Gleich · Purdue
Netflix

39

!

!

Contenu connexe

Tendances

IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEYIMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEYijcsit
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGANNAVER Engineering
 
GAN in medical imaging
GAN in medical imagingGAN in medical imaging
GAN in medical imagingCheng-Bin Jin
 
Usage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in HealthcareUsage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in HealthcareGlobalLogic Ukraine
 
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...宏毅 李
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and ApplicationsEmanuele Ghelfi
 
Photo-realistic Single Image Super-resolution using a Generative Adversarial ...
Photo-realistic Single Image Super-resolution using a Generative Adversarial ...Photo-realistic Single Image Super-resolution using a Generative Adversarial ...
Photo-realistic Single Image Super-resolution using a Generative Adversarial ...Hansol Kang
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AIFlorian Wilhelm
 
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GANNAVER Engineering
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical ImagingSanghoon Hong
 
Deep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent spaceDeep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent spaceHansol Kang
 
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)Thomas da Silva Paula
 

Tendances (12)

IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEYIMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
 
GAN in medical imaging
GAN in medical imagingGAN in medical imaging
GAN in medical imaging
 
Usage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in HealthcareUsage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in Healthcare
 
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ...
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
Photo-realistic Single Image Super-resolution using a Generative Adversarial ...
Photo-realistic Single Image Super-resolution using a Generative Adversarial ...Photo-realistic Single Image Super-resolution using a Generative Adversarial ...
Photo-realistic Single Image Super-resolution using a Generative Adversarial ...
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AI
 
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
 
Deep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent spaceDeep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent space
 
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 

En vedette

PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresDavid Gleich
 
Overlapping clusters for distributed computation
Overlapping clusters for distributed computationOverlapping clusters for distributed computation
Overlapping clusters for distributed computationDavid Gleich
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph miningDavid Gleich
 
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcDavid Gleich
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansDavid Gleich
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningDavid Gleich
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsDavid Gleich
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structuresDavid Gleich
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...David Gleich
 
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveDavid Gleich
 
A multithreaded method for network alignment
A multithreaded method for network alignmentA multithreaded method for network alignment
A multithreaded method for network alignmentDavid Gleich
 
Direct tall-and-skinny QR factorizations in MapReduce architectures
Direct tall-and-skinny QR factorizations in MapReduce architecturesDirect tall-and-skinny QR factorizations in MapReduce architectures
Direct tall-and-skinny QR factorizations in MapReduce architecturesDavid Gleich
 
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...David Gleich
 
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceDavid Gleich
 
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...David Gleich
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsDavid Gleich
 
Iterative methods for network alignment
Iterative methods for network alignmentIterative methods for network alignment
Iterative methods for network alignmentDavid Gleich
 
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutDavid Gleich
 
The power and Arnoldi methods in an algebra of circulants
The power and Arnoldi methods in an algebra of circulantsThe power and Arnoldi methods in an algebra of circulants
The power and Arnoldi methods in an algebra of circulantsDavid Gleich
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksDavid Gleich
 

En vedette (20)

PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structures
 
Overlapping clusters for distributed computation
Overlapping clusters for distributed computationOverlapping clusters for distributed computation
Overlapping clusters for distributed computation
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph mining
 
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimc
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-means
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based Learning
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chains
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structures
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...
 
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspective
 
A multithreaded method for network alignment
A multithreaded method for network alignmentA multithreaded method for network alignment
A multithreaded method for network alignment
 
Direct tall-and-skinny QR factorizations in MapReduce architectures
Direct tall-and-skinny QR factorizations in MapReduce architecturesDirect tall-and-skinny QR factorizations in MapReduce architectures
Direct tall-and-skinny QR factorizations in MapReduce architectures
 
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
 
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduce
 
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applications
 
Iterative methods for network alignment
Iterative methods for network alignmentIterative methods for network alignment
Iterative methods for network alignment
 
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCut
 
The power and Arnoldi methods in an algebra of circulants
The power and Arnoldi methods in an algebra of circulantsThe power and Arnoldi methods in an algebra of circulants
The power and Arnoldi methods in an algebra of circulants
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networks
 

Similaire à Fast matrix primitives for ranking, link-prediction and more

Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Yahoo Developer Network
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLDavid Gleich
 
SVD and the Netflix Dataset
SVD and the Netflix DatasetSVD and the Netflix Dataset
SVD and the Netflix DatasetBen Mabey
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
 
Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationDavid Gleich
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphsDavid Gleich
 
Simulation Informatics; Analyzing Large Scientific Datasets
Simulation Informatics; Analyzing Large Scientific DatasetsSimulation Informatics; Analyzing Large Scientific Datasets
Simulation Informatics; Analyzing Large Scientific DatasetsDavid Gleich
 
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationDavid Gleich
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksMLReview
 
Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...
Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...
Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...Intel® Software
 
Dagstuhl seminar talk on querying big graphs
Dagstuhl seminar talk on querying big graphsDagstuhl seminar talk on querying big graphs
Dagstuhl seminar talk on querying big graphsArijit Khan
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic rankingFELIX75
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraJason Riedy
 
Towards controlling evolutionary dynamics through network geometry: some very...
Towards controlling evolutionary dynamics through network geometry: some very...Towards controlling evolutionary dynamics through network geometry: some very...
Towards controlling evolutionary dynamics through network geometry: some very...Kolja Kleineberg
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Intel® Software
 
pptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacespptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacesbutest
 
pptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacespptx - Psuedo Random Generator for Halfspaces
pptx - Psuedo Random Generator for Halfspacesbutest
 


Fast matrix primitives for ranking, link-prediction and more

  • 1. Fast matrix primitives for ranking, communities, and more. David F. Gleich, Computer Science, Purdue University. (David Gleich · Purdue · Netflix)
  • 3. Models and algorithms for high performance matrix and network computations. Topics: massive matrix computations ($Ax = b$, $\min \|Ax - b\|$, $Ax = \lambda x$) on multi-threaded and distributed architectures; fast & scalable network centrality; network alignment (ICDM ‘09, SC ‘11, TKDE ‘13); data clustering (WSDM ‘12, KDD ‘12, CIKM ’13, …); big data methods (SIMAX ‘09, SISC ‘11, MapReduce ‘11, ICASSP ’12); tensor eigenvalues and a power method. Other venues listed: SC ‘05, WAW ‘07, SISC ‘10, WWW ’10, …
       Tensor eigenvalues: maximize $\sum_{ijk} T_{ijk} x_i x_j x_k$ subject to $\|x\|_2 = 1$, using the iteration $[x^{(\text{next})}]_i = \rho \cdot \big(\sum_{jk} T_{ijk} x_j x_k + \gamma x_i\big)$, where $\rho$ ensures the 2-norm (SSHOPM method due to Kolda and Mayo).
       Network alignment: previous work from the PI tackled network alignment with matrix methods (edge overlap); this proposal is for matching triangles using tensor methods.
  • 4. The talk ends, you believe whatever you want to. Everything in the world can be explained by a matrix, and we see how deep the rabbit hole goes. (Image from rockysprings, deviantart, CC share-alike.)
  • 5. Matrix computations in a red-pill. Solve a problem better by exploiting its structure!
  • 6. Problem 1 (Faster): Recommendation as link prediction. Top-k predicted “links” are movies to watch! Pairwise scores give user similarity. WHY NO PREPROCESSING? (David F. Gleich (Purdue), Emory Math/CS Seminar.)
  • 7. Problem 2 (Better): Best movies.
  • 8. Matrix computations in a red-pill. Solve a problem better by exploiting its structure!
  • 9. Matrix structure. Problem 1: the Netflix graph of movies “liked” (>3 stars?), represented by the adjacency matrix, the normalized Laplacian matrix, or the random walk matrix. Problem 2: the Netflix matrix of ratings (entries such as 1, 4, 5), turned into a pairwise comparison matrix.
  • 10. Problem 1 (Faster): Recommendation as link prediction. Top-k predicted “links” are movies to watch! Pairwise scores give user similarity. WHY NO PREPROCESSING?
  • 11. Matrix based link predictors. The Katz score (edge-based) is $(\text{pred. on movie}) = \sum_{\ell=1}^{\infty} \alpha^{\ell} \cdot (\text{num. paths of length } \ell \text{ from user to movie})$. Movie prediction vector: $k = \sum_{\ell=1}^{\infty} (\alpha^{\ell} A^{\ell})\, e_i$, where $e_i$ is the user indicator vector.
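  • 12. Matrix based link predictors. $k = \sum_{\ell=1}^{\infty} (\alpha^{\ell} A^{\ell})\, e_i$, where $e_i$ is the user indicator vector. Neumann series (Carl Neumann): $(I - tA)^{-1} = \sum_{k=0}^{\infty} (tA)^k$, so the Katz scores come from the linear system $(I - \alpha A)\, k = e_i$.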
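The Neumann-series step on slide 12 can be checked numerically. The sketch below is illustrative only: the 4-node graph, the value of `alpha`, and the function names are assumptions, not from the talk. It starts the sum at $\ell = 0$ so that it matches $(I - \alpha A)k = e_i$ exactly; the extra $\ell = 0$ term only adds $e_i$, so scores on the other nodes are unchanged.

```python
import numpy as np

def katz_series(A, i, alpha, terms=200):
    """Sum alpha^l A^l e_i for l = 0..terms (truncated Neumann series)."""
    term = np.zeros(A.shape[0])
    term[i] = 1.0                      # e_i, the user indicator vector
    k = term.copy()
    for _ in range(terms):
        term = alpha * (A @ term)      # next damped path-count term
        k += term
    return k

def katz_solve(A, i, alpha):
    """Solve (I - alpha A) k = e_i directly."""
    n = A.shape[0]
    e_i = np.zeros(n)
    e_i[i] = 1.0
    return np.linalg.solve(np.eye(n) - alpha * A, e_i)

# Hypothetical small undirected graph; alpha is below 1 / (spectral radius).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
alpha = 0.2
k_series = katz_series(A, 0, alpha)
k_solve = katz_solve(A, 0, alpha)
print(np.allclose(k_series, k_solve))
```

The series only converges when `alpha` times the spectral radius of `A` is below 1, which is why the damping parameter has to be small on dense graphs.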
  • 13. Matrix based link predictors. Katz: $(I - \alpha A)\, k = e_i$. PageRank: $(I - \alpha P)\, x = e_i$. Semi-supervised learning: $(I - \alpha L)\, x = e_i$. Heat kernel: $\exp\{-\alpha P\}\, x = e_i$. They all look at sums of damped paths, but change the details, slightly.
  • 14. Matrix based link predictors are localized! PageRank scores for one node. Crawl of flickr from 2006: ~800k nodes, 6M edges, alpha = 1/2. [Figures: plot(x) of the PageRank vector; error $\|x_{\text{true}} - x_{\text{nnz}}\|_1$ versus number of nonzeros.]
  • 15. Matrix based link predictors are localized! Katz scores are localized: up to 50 neighbors is 99.65% of the total mass.
  • 16. Matrix computations in a red-pill. Solve a problem better by exploiting its structure!
  • 17. How do we compute them fast? PageRank: $x_j = \alpha \sum_{i \text{ neigh. of } j} x_i / \deg(i) + 1$ if $j$ is the target user.
       PageRankPull (w/ access to in-links & degs.): with $j$ = blue node, solve for $x_j^{(k+1)}$ in $x_j^{(k+1)} - \alpha \sum_{i \to j} x_i^{(k)} / \deg_i = f_j$; in the pictured example, $x_j^{(k+1)} - \alpha x_a^{(k)}/6 - \alpha x_b^{(k)}/2 - \alpha x_c^{(k)}/3 = f_j$.
       PageRankPush (w/ access to out-links): with $j$ = blue node, let $x_j^{(k+1)} = x_j^{(k)} + r_j^{(k)}$, then update $r^{(k+1)}$: $r_j^{(k+1)} = 0$, $r_a^{(k+1)} = r_a^{(k)} + \alpha r_j^{(k)}/3$, $r_b^{(k+1)} = r_b^{(k)} + \alpha r_j^{(k)}/3$, $r_c^{(k+1)} = r_c^{(k)} + \alpha r_j^{(k)}/3$.
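The PageRankPush update on slide 17 can be sketched as a short queue-driven loop. This is a minimal sketch under assumptions, not the talk's implementation: the dictionary-of-out-links graph format, the queue order, the tolerance `tol`, and the tiny example graph are all hypothetical. It solves $(I - \alpha P)x = e_{\text{seed}}$, where column $j$ of $P$ spreads weight $1/\deg(j)$ over the out-links of $j$.

```python
from collections import deque

def pagerank_push(out_links, seed, alpha=0.5, tol=1e-8):
    """Push-style solver: move residual r_j into x_j, spread alpha*r_j/deg(j) to out-neighbors."""
    x = {}                      # sparse solution, only touched nodes are stored
    r = {seed: 1.0}             # residual starts as the seed indicator vector
    queue = deque([seed])
    queued = {seed}
    while queue:
        j = queue.popleft()
        queued.discard(j)
        rj = r.get(j, 0.0)
        if rj <= tol:
            continue
        x[j] = x.get(j, 0.0) + rj          # x_j <- x_j + r_j
        r[j] = 0.0                         # r_j <- 0
        nbrs = out_links.get(j, [])
        for a in nbrs:                     # r_a <- r_a + alpha * r_j / deg(j)
            r[a] = r.get(a, 0.0) + alpha * rj / len(nbrs)
            if r[a] > tol and a not in queued:
                queue.append(a)
                queued.add(a)
    return x, r

# Hypothetical directed graph: node -> list of out-neighbors.
out_links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
x, r = pagerank_push(out_links, seed=0, alpha=0.5, tol=1e-8)
print(sorted(x.items()))
```

Only nodes reachable from the seed ever enter `x` or `r` (node 3 is never touched here), which is the localization the earlier slides point to: the work depends on where the mass goes, not on the size of the graph.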
  • 18. We have good theory for this algorithm … and even better empirical performance.
  • 19. Theory. Andersen, Chung, Lang (2006): for PageRank, “fast runtimes” and “localization”. Bonchi, Esfandiar, Gleich, et al. (2010/2013): for Katz, “fast runtimes”. Kloster, Gleich (2013): for Katz and heat kernel, “fast runtimes” and “localization” (assuming power-law degrees).
  • 20. Accuracy vs. work (Heat kernel). dblp collaboration graph, 225k vertices. [Figure: precision @10, @25, @100, @1000 versus effective matrix-vector products, for tol = 10^-4 and tol = 10^-5.] For the dblp collaboration graph, we study the precision in finding the 100 largest nodes as we vary the work. This set of 100 does not include the node's immediate neighbors. (One column, but representative.)
  • 21. Empirical runtime (Katz). [Figure: timing.]
  • 22. Note. Never got to try it … analytics test on HelloMovies.com. Ran out of money once we had the algorithms … promising initial results though! I collaborate with the company behind He
  • 23. Problem 2 (Better): Best movies.
  • 24. Which is a better list of good DVDs?
       Standard rank aggregation (the mean rating): Lord of the Rings 3: The Return of …; Lord of the Rings 1: The Fellowship; Lord of the Rings 2: The Two Towers; Lost: Season 1; Battlestar Galactica: Season 1; Fullmetal Alchemist; Trailer Park Boys: Season 4; Trailer Park Boys: Season 3; Tenchi Muyo!; Shawshank Redemption.
       Nuclear norm based rank aggregation (not matrix completion on the netflix rating matrix): Lord of the Rings 3: The Return of …; Lord of the Rings 1: The Fellowship; Lord of the Rings 2: The Two Towers; Star Wars V: Empire Strikes Back; Raiders of the Lost Ark; Star Wars IV: A New Hope; Shawshank Redemption; Star Wars VI: Return of the Jedi; Lord of the Rings 3: Bonus DVD; The Godfather.
  • 25. Rank Aggregation. Given partial orders on subsets of items, rank aggregation is the problem of finding an overall ordering. Voting: find the winning candidate. Program committees: find the best papers given reviews. Dining: find the best restaurant in Chicago.
  • 26. Ranking is really hard. Ken Arrow: all rank aggregations involve some measure of compromise. John Kemeny: a good ranking is the “average” ranking under a permutation distance. Dwork, Kumar, Naor, Sivakumar: NP hard to compute Kemeny’s ranking.
  • 27. Suppose we had scores. Let $s_i$ be the score of the ith movie/song/paper/team to rank. Suppose we can compare the ith to jth: $Y_{ij} = s_i - s_j$. Then $Y = s e^T - e s^T$ is skew-symmetric, rank 2. Also works for $s_i / s_j$ with an extra log. Numerical ranking is intimately intertwined with skew-symmetric matrices. Kemeny and Snell, Mathematical Models in Social Sciences (1978). (David F. Gleich (Purdue), KDD 2011.)
  • 28. Using ratings as comparisons. Ratings induce various skew-symmetric matrices, for example from the arithmetic mean or the log-odds. From David 1988, The Method of Paired Comparisons.
  • 29. Extracting the scores. Given $Y$ with all entries, then $s = \frac{1}{n} Y e$ is the Borda count, the least-squares solution to $Y \approx s e^T - e s^T$. How many $Y_{ij}$ do we have? Most. Do we trust all $Y_{ij}$? Not really. Netflix data: 17k movies, 500k users, 100M ratings; 99.17% filled. [Figure: number of movie pairs versus number of comparisons, log-log axes.]
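The score extraction on slide 29 follows from slide 27: if $Y = s e^T - e s^T$ and $s$ is centered ($s^T e = 0$), then $Y e = n s$. A small numerical check of that identity, with made-up scores (the vector `s_true` and the function name are assumptions for illustration):

```python
import numpy as np

def borda_scores(Y):
    """Row averages of a skew-symmetric comparison matrix: s = (1/n) Y e."""
    n = Y.shape[0]
    return Y @ np.ones(n) / n

# Hypothetical centered scores (they sum to zero).
s_true = np.array([2.0, 0.5, -1.0, -1.5])
e = np.ones(s_true.size)

# Pairwise comparison matrix Y_ij = s_i - s_j: skew-symmetric and rank 2.
Y = np.outer(s_true, e) - np.outer(e, s_true)

s = borda_scores(Y)
print(s)
```

With every entry of `Y` known, a single matrix-vector product recovers the scores, which is why the interesting case is the one where only some trusted entries are available.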
  • 30. Only partial info? Complete it! Let $Y_{ij}$ be known for $(i, j) \in \Omega$. We trust these scores. Goal: find the simplest skew-symmetric matrix that matches the data, in a noiseless and a noisy formulation. Both of these are NP-hard too.
  • 31. Solution: GO NUCLEAR! (From a French nuclear test in 1970, image from http://picdit.wordpress.com/2008/07/21/8-insane-nuclear-explosions/)
  • 32. The ranking algorithm. 0. INPUT ratings data and c (for trust on comparisons). 1. Compute $Y$ from the ratings. 2. Discard entries with fewer than c comparisons. 3. Set $\Omega$ and the known values to be the indices and values of what’s left. 4. Run SVP on them. 5. OUTPUT the scores $s$.
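Steps 4 and 5 of the ranking algorithm can be sketched as follows, assuming that SVP refers to singular value projection, i.e. a gradient step on the known entries followed by truncation to rank 2. The function names, the step size (one over the sampling fraction), the iteration count, and the toy data are assumptions for illustration; this is not the authors' code, and it makes no convergence claim for sparse samples.

```python
import numpy as np

def rank2_projection(X):
    """Best rank-2 approximation via a truncated SVD."""
    U, S, Vt = np.linalg.svd(X)
    return (U[:, :2] * S[:2]) @ Vt[:2, :]

def svp_rank2(Y, mask, iters=100, step=None):
    """Singular value projection: fit the known entries of Y with a rank-2 matrix."""
    if step is None:
        step = mask.size / mask.sum()          # assumed step: 1 / sampling fraction
    X = np.zeros_like(Y)
    for _ in range(iters):
        grad = np.where(mask, X - Y, 0.0)      # error on the known entries only
        X = rank2_projection(X - step * grad)
    return X

def scores_from_completion(X):
    """Read the scores off the completed matrix as s = (1/n) X e."""
    n = X.shape[0]
    return X @ np.ones(n) / n

# Toy check with every entry known (hypothetical scores).
s_true = np.array([1.5, 0.5, -0.5, -1.5])
e = np.ones(s_true.size)
Y = np.outer(s_true, e) - np.outer(e, s_true)
mask = np.ones_like(Y, dtype=bool)
X = svp_rank2(Y, mask)
s = scores_from_completion(X)
print(s)
```

With a full mask the first projection already returns `Y`, so the check only exercises the plumbing; with a partial, symmetric mask the same loop applies, but the step size and iteration count would need tuning.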
  • 33. Exact recovery results. David Gross showed how to recover Hermitian matrices, i.e. the conditions under which we get the exact $Y$. Note that $\imath Y$ is Hermitian. Thus our new result! [Figure: fraction of trials recovered.]
       Instead we view the following theorem as providing intuition for the noisy problem. Consider the operator basis for Hermitian matrices: $H = S \cup K \cup D$ where $S = \{ \tfrac{1}{\sqrt{2}} (e_i e_j^T + e_j e_i^T) : 1 \le i < j \le n \}$; $K = \{ \tfrac{\imath}{\sqrt{2}} (e_i e_j^T - e_j e_i^T) : 1 \le i < j \le n \}$; $D = \{ e_i e_i^T : 1 \le i \le n \}$.
       Theorem 5. Let $s$ be centered, i.e., $s^T e = 0$. Let $Y = s e^T - e s^T$ where $\theta = \max_i s_i^2 / (s^T s)$ and $\rho = ((\max_i s_i) - (\min_i s_i)) / \|s\|$. Also, let $\Omega \subset H$ be a random set of elements with size $|\Omega| \ge O(2 n \nu (1 + \beta) (\log n)^2)$ where $\nu = \max((n\theta + 1)/4,\ n\rho^2)$. Then the solution of: minimize $\|X\|_*$ subject to $\operatorname{trace}(X^* W_i) = \operatorname{trace}((\imath Y)^* W_i)$, $W_i \in \Omega$, is equal to $\imath Y$ with probability at least $1 - n^{-\beta}$.
  • 34. Recovery Discussion and Experiments. Confession: if [...], then just look at differences from a connected set. Constants? Not very good. Intuition for the truth.
  • 35. Recovery Discussion and Experiments: Recovery Experiments.
  • 36. Evaluation. Nuclear norm ranking versus mean rating. [Figure 3: median Kendall’s tau versus error for the nuclear norm ranking and for the mean rating; the caption begins “The performance of our algorithm (left)”.]
  • 37. Tie in with PageRank. Another way to compute the scores is through a close relative of PageRank and the link-prediction methods. Massey or Colley methods: $(2I + D - A)\, s = $ “differences”; $(L + 2D^{-1})\, x = $ “scaled differences”.
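The matrix $2I + D - A$ on slide 37 is the Colley matrix. As an illustration, the sketch below uses Colley's classical win/loss form, where $D$ counts games played, $A$ counts games between each pair, and the right-hand side is $1 + (\text{wins} - \text{losses})/2$. That right-hand side stands in for the slide's “differences”, and the game list is made up, so treat this as an assumed instance rather than the talk's exact setup.

```python
import numpy as np

def colley_ratings(games, n):
    """Solve (2I + D - A) r = 1 + (wins - losses) / 2 for n teams."""
    C = 2.0 * np.eye(n)
    b = np.ones(n)
    for winner, loser in games:
        C[winner, winner] += 1.0       # D: games played by each team
        C[loser, loser] += 1.0
        C[winner, loser] -= 1.0        # -A: games between the pair
        C[loser, winner] -= 1.0
        b[winner] += 0.5
        b[loser] -= 0.5
    return np.linalg.solve(C, b)

# Hypothetical results as (winner, loser) pairs among 3 teams.
games = [(0, 1), (0, 2), (1, 2), (0, 1)]
ratings = colley_ratings(games, 3)
print(ratings)
```

The system has the same shape as the link-prediction systems earlier in the talk, a shifted graph Laplacian applied to a score vector, so the same localized solvers are natural candidates here.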
  • 38. Ongoing Work. Finding communities in large networks: we have the best community finder (as of CIKM2013), Whang, Gleich, Dhillon (CIKM). Fast clique detection: we have the fastest solver for max-clique problems, useful for computing temporal strong components (Rossi, Gleich, et al., arXiv). Also: scalable network alignment; low-rank clustering with features + links; evolving network analysis; scalable, distributed implementations of fast graph kernels.
  • 39. References. Papers: Gleich & Lim, KDD 2011, Nuclear Norm Ranking; Esfandiar, Gleich, Bonchi et al., WAW2010, J. Internet. Math. 2013; Kloster & Gleich, WAW2013, arXiv 1310.3423. Code: www.cs.purdue.edu/homes/dgleich/codes, bit.ly/dgleich-code. Supported by NSF CAREER 1149756-CCF. www.cs.purdue.edu/homes/dgleich