Anti-differentiating approximation algorithms
& new relationships between
PageRank, spectral, and localized flow
David F. Gleich
Purdue University
Joint work with Michael Mahoney.
Supported by NSF CAREER 1149756-CCF, Simons Inst.
ICERM
David Gleich · Purdue
1
Anti-differentiating approximation algorithms
& new relationships between
PageRank, spectral, and localized flow
A new derivation of the PageRank vector for an undirected graph based on Laplacians, cuts, or flows.
A new understanding of the “push” methods to compute Personalized PageRank.
An empirical improvement to methods for semi-supervised learning.
1st
2nd
The PageRank problem

The PageRank random surfer
1. With probability beta, follow a random-walk step
2. With probability (1-beta), jump randomly ~ dist. v.
Goal: find the stationary dist. x
Alg: solve the linear system
$(I - \beta A D^{-1}) x = (1 - \beta) v$, equivalently $x = \beta A D^{-1} x + (1 - \beta) v$,
where $A$ is the symmetric adjacency matrix, $D$ is the diagonal degree matrix, $x$ is the solution, and $v$ is the jump vector.
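The fixed-point form of the linear system can be iterated directly. The following is a minimal sketch, not the author's code: the function name `pagerank`, the 4-node example graph, and the tolerance are assumptions chosen for illustration.

```python
def pagerank(adj, v, beta=0.85, tol=1e-12, max_iter=10_000):
    """Iterate x <- beta * A D^{-1} x + (1 - beta) * v until it stops changing."""
    n = len(adj)
    deg = [sum(row) for row in adj]          # diagonal of D
    x = list(v)
    for _ in range(max_iter):
        x_new = [(1.0 - beta) * v[i] for i in range(n)]
        for j in range(n):
            share = beta * x[j] / deg[j]     # mass node j sends along each edge
            for i in range(n):
                x_new[i] += adj[i][j] * share
        change = sum(abs(a - b) for a, b in zip(x, x_new))
        x = x_new
        if change < tol:
            break
    return x

# Hypothetical example: triangle 0-1-2 with a pendant node 3 attached to 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
v = [1.0, 0.0, 0.0, 0.0]                     # jump only to node 0
beta = 0.85
x = pagerank(A, v, beta)

# Residual of (I - beta A D^{-1}) x = (1 - beta) v, one entry per node.
deg = [sum(row) for row in A]
residual = [
    (1.0 - beta) * v[i]
    + beta * sum(A[i][j] * x[j] / deg[j] for j in range(4))
    - x[i]
    for i in range(4)
]
print(x, max(abs(r) for r in residual))
```

Because $A D^{-1}$ is column stochastic when every degree is positive, each iteration keeps the entries of x summing to one, and the error shrinks by a factor of beta per step.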
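The PageRank problem & the Laplacian
1. $(I - \beta A D^{-1}) x = (1 - \beta) v$;
2. $(I - \beta \mathcal{A}) y = (1 - \beta) D^{-1/2} v$, where $\mathcal{A} = D^{-1/2} A D^{-1/2}$ and $x = D^{1/2} y$; and
3. $[\alpha D + L] z = \alpha v$ where $\beta = 1/(1 + \alpha)$ and $x = D z$.
Here $L$ is the combinatorial Laplacian.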
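A quick numerical check that the three formulations give the same x. This is a sketch under assumptions: the small dense solver `solve`, the 4-node graph, and the value of alpha are illustrative choices, and $L = D - A$ is taken as the combinatorial Laplacian.

```python
import math

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def eye(i, j):
    return 1.0 if i == j else 0.0

# Hypothetical example graph: triangle 0-1-2 plus pendant node 3.
A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
n = len(A)
d = [sum(row) for row in A]
alpha = 0.25
beta = 1.0 / (1.0 + alpha)
v = [1.0, 0.0, 0.0, 0.0]

# Form 1: (I - beta A D^{-1}) x = (1 - beta) v
M1 = [[eye(i, j) - beta * A[i][j] / d[j] for j in range(n)] for i in range(n)]
x1 = solve(M1, [(1 - beta) * vi for vi in v])

# Form 2: (I - beta D^{-1/2} A D^{-1/2}) y = (1 - beta) D^{-1/2} v, x = D^{1/2} y
M2 = [[eye(i, j) - beta * A[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
      for i in range(n)]
y = solve(M2, [(1 - beta) * v[i] / math.sqrt(d[i]) for i in range(n)])
x2 = [math.sqrt(d[i]) * y[i] for i in range(n)]

# Form 3: (alpha D + L) z = alpha v with L = D - A, x = D z
M3 = [[(1 + alpha) * d[i] * eye(i, j) - A[i][j] for j in range(n)] for i in range(n)]
z = solve(M3, [alpha * vi for vi in v])
x3 = [d[i] * z[i] for i in range(n)]

max_gap = max(max(abs(x1[i] - x2[i]), abs(x1[i] - x3[i])) for i in range(n))
print(max_gap)
```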
The Push Algorithm for PageRank
Proposed (in closest form) in Andersen, Chung, Lang (also by McSherry, Jeh & Widom) for personalized PageRank
Strongly related to Gauss-Seidel (see my talk at Simons for this)
Derived to show improved runtime for balanced solvers

The Push Method (parameters $\tau$, $\rho$)
1. $x^{(1)} = 0$, $r^{(1)} = (1 - \beta) e_i$, $k = 1$
2. while any $r_j > \tau d_j$ ($d_j$ is the degree of node $j$)
3. $x^{(k+1)} = x^{(k)} + (r_j - \tau d_j \rho) e_j$
4. $r_i^{(k+1)} = \begin{cases} \tau d_j \rho & i = j \\ r_i^{(k)} + \beta (r_j - \tau d_j \rho)/d_j & i \sim j \\ r_i^{(k)} & \text{otherwise} \end{cases}$
5. $k \leftarrow k + 1$
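The five steps can be sketched in a few lines. This is an illustrative implementation, not the author's code: the slide only says "any" node over the threshold may be pushed, so the FIFO queue, the function name `push`, the example graph, and the default `rho=0.5` are assumptions.

```python
from collections import deque

def push(nbrs, seed, beta, tau, rho=0.5):
    """Push method for personalized PageRank seeded at one node.

    nbrs maps each node to its neighbor list (undirected, unweighted).
    Returns sparse dicts x (approximate solution) and r (residual).
    """
    x = {}
    r = {seed: 1.0 - beta}                   # r = (1 - beta) e_seed
    queue = deque([seed])                    # candidates with r_j > tau * d_j
    while queue:
        j = queue.popleft()
        dj = len(nbrs[j])
        rj = r.get(j, 0.0)
        if rj <= tau * dj:                   # stale queue entry, nothing to push
            continue
        delta = rj - tau * dj * rho
        x[j] = x.get(j, 0.0) + delta         # step 3
        r[j] = tau * dj * rho                # step 4, case i = j
        for i in nbrs[j]:                    # step 4, case i ~ j
            r[i] = r.get(i, 0.0) + beta * delta / dj
            if r[i] > tau * len(nbrs[i]):
                queue.append(i)
    return x, r

# Hypothetical example: two triangles joined by a short path, plus a tail node.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6), (6, 7)]
nbrs = {}
for a, b in edges:
    nbrs.setdefault(a, []).append(b)
    nbrs.setdefault(b, []).append(a)

seed, beta, tau = 0, 0.85, 0.01
x, r = push(nbrs, seed, beta, tau)
print(sorted(x.items()))
```

Only nodes that were pushed ever appear in x, which is where the sparsity of the approximation comes from. With rho < 1 every push removes at least $(1 - \rho)\tau d_j$ from the residual at node j, so the loop stops after finitely many pushes.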
… demo of push …
Why do we care about push?

1. Used for empirical studies of “communities”
2. Used for “fast PageRank” approximation
It produces sparse approximations to PageRank!
Figure: Newman’s netscience graph (379 vertices, 1828 nnz). v has a single one at the marked node; the push solution is “zero” on most of the nodes.
Our question
Why does the “push method” have such incredible empirical utility?
The O(correct) answer

1. PageRank related to Laplacian
2. Laplacian related to cuts
3. Andersen, Chung, Lang provides the “right” bounds and “localization”

This talk: the θ(correct) answer? A deeper insight into the relationship
Intellectually indebted to …
Chin, Mądry, Miller & Peng [2013]
Orecchia & Zhu [2014]
The s-t min-cut problem
minimize $\|Bx\|_{C,1} = \sum_{ij \in E} C_{i,j} |x_i - x_j|$
subject to $x_s = 1$, $x_t = 0$, $x \ge 0$,
where $B$ is the unweighted incidence matrix and $C$ is the diagonal capacity matrix.
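For a 0/1 vector x, the objective is exactly the capacity of the cut separating the nodes with $x_i = 1$ from those with $x_i = 0$. The sketch below enumerates those indicator vectors on a tiny graph; the graph, capacities, and function names are assumptions for illustration, and brute force stands in for a real min-cut solver (it relies on the standard fact that this linear program has an optimal 0/1 solution).

```python
from itertools import product

def cut_objective(edges, x):
    """||Bx||_{C,1}: sum of C_ij * |x_i - x_j| over the edges."""
    return sum(c * abs(x[i] - x[j]) for i, j, c in edges)

def brute_force_min_cut(n, edges, s, t):
    """Try every 0/1 labeling with x_s = 1 and x_t = 0; keep the cheapest."""
    best_value, best_x = float("inf"), None
    free = [u for u in range(n) if u not in (s, t)]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        x[s] = 1
        for u, b in zip(free, bits):
            x[u] = b
        value = cut_objective(edges, x)
        if value < best_value:
            best_value, best_x = value, x
    return best_value, best_x

# Hypothetical capacitated graph: (i, j, capacity), with s = 0 and t = 3.
edges = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
value, x = brute_force_min_cut(4, edges, s=0, t=3)
print(value, x)   # minimum cut capacity is 5 for this graph
```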
The localized cut graph

$A_S = \begin{bmatrix} 0 & \alpha d_S^T & 0 \\ \alpha d_S & A & \alpha d_{\bar S} \\ 0 & \alpha d_{\bar S}^T & 0 \end{bmatrix}$
Connect s to vertices in $S$ with weight $\alpha \cdot$ degree
Connect t to vertices in $\bar S$ with weight $\alpha \cdot$ degree
Related to a construction used in “FlowImprove”, Andersen & Lang (2007); and Orecchia & Zhu (2014)
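The construction is mechanical enough to sketch. The function name `localized_cut_graph`, the node ordering (s first, t last, matching the block matrix), and the example graph are assumptions for illustration.

```python
def localized_cut_graph(A, S, alpha):
    """Return A_S with node 0 = s, nodes 1..n = original graph, node n+1 = t."""
    n = len(A)
    d = [sum(row) for row in A]
    AS = [[0.0] * (n + 2) for _ in range(n + 2)]
    for u in range(n):
        for w in range(n):
            AS[u + 1][w + 1] = float(A[u][w])      # original edges
        terminal = 0 if u in S else n + 1          # s for S, t for the complement
        AS[terminal][u + 1] = AS[u + 1][terminal] = alpha * d[u]
    return AS

# Hypothetical example: triangle 0-1-2 plus pendant node 3, seed set S = {0, 1}.
A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
S = {0, 1}
alpha = 0.5
AS = localized_cut_graph(A, S, alpha)
for row in AS:
    print(row)
```

Every original node u ends up with weighted degree $(1 + \alpha) d_u$ in the augmented graph, and s has total weight $\alpha \cdot \mathrm{vol}(S)$.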
The localized cut graph
Connect s to vertices in $S$ with weight $\alpha \cdot$ degree
Connect t to vertices in $\bar S$ with weight $\alpha \cdot$ degree
$B_S = \begin{bmatrix} e & -I_S & 0 \\ 0 & B & 0 \\ 0 & -I_{\bar S} & e \end{bmatrix}$
Solve the s-t min-cut
minimize $\|B_S x\|_{C(\alpha),1}$
subject to $x_s = 1$, $x_t = 0$, $x \ge 0$.
The localized cut graph
Connect s to vertices in $S$ with weight $\alpha \cdot$ degree
Connect t to vertices in $\bar S$ with weight $\alpha \cdot$ degree
$B_S = \begin{bmatrix} e & -I_S & 0 \\ 0 & B & 0 \\ 0 & -I_{\bar S} & e \end{bmatrix}$
Solve the “electrical flow” s-t min-cut
minimize $\|B_S x\|_{C(\alpha),2}$
subject to $x_s = 1$, $x_t = 0$
s-t min-cut à PageRank 
The PageRank vector z that solves
(↵D + L)z = ↵v
with v = dS/vol(S) is a renormalized
solution of the electrical cut computation:
minimize kBSxkC(↵),2
subject to xs = 1, xt = 0.
Specifically, if x is the solution, then
x =
2
4
1
vol(S)z
0
3
5
Proof
Square and expand
the objective into
a Laplacian, then
apply constraints.
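The statement can be checked numerically on a small graph. In this sketch the squared objective is written as $x^T L_S x$, with $L_S$ the weighted Laplacian of the localized cut graph, and PageRank is computed separately from $(I - \beta A D^{-1}) x = (1 - \beta) v$. The solver, the example graph, and the placement of s and t as the last two indices are assumptions made for convenience.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

# Hypothetical example: triangle 0-1-2 plus pendant node 3, seed set S = {0, 1}.
A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
S = {0, 1}
alpha = 0.5
n = len(A)
d = [sum(row) for row in A]
vol_S = sum(d[u] for u in S)
s, t = n, n + 1                       # terminals stored after the graph nodes

# Weighted adjacency of the localized cut graph.
W = [[0.0] * (n + 2) for _ in range(n + 2)]
for u in range(n):
    for w in range(n):
        W[u][w] = float(A[u][w])
    term = s if u in S else t
    W[u][term] = W[term][u] = alpha * d[u]

# Its Laplacian; the squared 2-norm objective equals x^T Lap x.
Lap = [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n + 2)]
       for i in range(n + 2)]

# With x_s = 1 and x_t = 0 fixed, stationarity on the graph rows reads
# Lap[G, G] x_G = -Lap[G, s].
xG = solve([row[:n] for row in Lap[:n]], [-Lap[u][s] for u in range(n)])

# PageRank side: solve for x, then z = D^{-1} x.
beta = 1.0 / (1.0 + alpha)
v = [d[u] / vol_S if u in S else 0.0 for u in range(n)]
M = [[(1.0 if i == j else 0.0) - beta * A[i][j] / d[j] for j in range(n)]
     for i in range(n)]
x_pr = solve(M, [(1 - beta) * vi for vi in v])
z = [x_pr[u] / d[u] for u in range(n)]

gap = max(abs(xG[u] - vol_S * z[u]) for u in range(n))
print(gap)
```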
PageRank à s-t min-cut
That equivalence works if v is degree-weighted.
What if v is the uniform vector? 
A(s) =
2
4
0 ↵sT
0
↵s A ↵(d s)
0 ↵(d s)T
0
3
5 .
And beyond …

Easy to cook up interesting diffusion-like problems and adapt them to this framework. In particular, Zhou et al. (2004) gave a semi-supervised learning diffusion we study soon.
$\begin{bmatrix} 0 & e_S^T & 0 \\ e_S & \theta A & e_{\bar S} \\ 0 & e_{\bar S}^T & 0 \end{bmatrix}$, $(I + \theta L) x = e_S$
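Back to the push method
Let $x$ be the output from the push method with $0 < \beta < 1$, $v = d_S/\mathrm{vol}(S)$, $\rho = 1$, and $\tau > 0$.
Set $\alpha = \frac{1 - \beta}{\beta}$, $\kappa = \tau\,\mathrm{vol}(S)/\beta$, and let $z_G$ solve:
minimize $\frac{1}{2} \|B_S z\|_{C(\alpha),2}^2 + \kappa \|D z\|_1$
subject to $z_s = 1$, $z_t = 0$, $z \ge 0$,
where $z = \begin{bmatrix} 1 \\ z_G \\ 0 \end{bmatrix}$.
Then $x = D z_G/\mathrm{vol}(S)$.
The term $\kappa \|D z\|_1$ is regularization for sparsity.
Proof: Write out KKT conditions. Show that the push method solves them. Slackness was “tricky”.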
Need for
normalization
… demo of equivalence …
This is a case of 
Algorithmic Anti-differentiation!
The ideal world
Given Problem P
Derive solution characterization C
Show algorithm A finds a solution where C holds
Profit?
Example:
Given “min-cut”
Derive “max-flow is equivalent to min-cut”
Show push-relabel solves max-flow
Profit!
(The ideal world)’
Given Problem P
Derive solution approx. characterization C’
Show algorithm A’ quickly finds a solution where C’ holds
Profit?
Example:
Given “sparsest-cut”
Derive Rayleigh-quotient approximation
Show power-method finds a good Rayleigh-quotient
Profit?
The real world?
Given Task P
Hack around until you find something useful
Write paper presenting “novel heuristic” H for P and …
Profit!
Example:
Given “find-communities”
Hack around ??? (hidden) ???
Write paper presenting “three matvecs finds real-world communities”
Profit!
The real world
Given “find-communities”
Hack around
Write paper presenting “three matvecs finds real-world communities”
Profit!
Understand why H works!
Show heuristic H solves P’
Guess and check until you find something H solves
Derive characterization of heuristic H
Algorithmic Anti-differentiation
Given heuristic H, is there a problem P’ such that H is an algorithm for P’?
The real world
Algorithmic Anti-differentiation
Given heuristic H, is there a problem P’ such that H is an algorithm for P’?
If your algorithm is related to optimization, this is: given a procedure X, what objective does it optimize? (e.g. Mahoney & Orecchia)
In an unconstrained case, this is just “anti-differentiation!”
Algorithmic Anti-differentiation in the literature
Dhillon et al. (2007): Spectral clustering, trace minimization & kernel k-means
Saunders (1995): LSQR & Craig iterative methods
Why does it matter?
These details matter in many empirical studies, and can dramatically impact performance (speed or quality)
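Semi-supervised Learning on Graphs
$A_{i,j} = \exp\left( -\frac{\|d_i - d_j\|_2^2}{2\sigma^2} \right)$
Figure: data points $d_i$ and $d_j$; graphs shown for $\sigma = 2.5$ and $\sigma = 1.25$.
Zhou et al. NIPS (2003)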
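The edge weights follow directly from the formula. A minimal sketch, assuming each data point is a tuple of floats; the function name and the three sample points are illustrative, and the diagonal is left at $\exp(0) = 1$ exactly as the formula gives it.

```python
import math

def gaussian_similarity(points, sigma):
    """A[i][j] = exp(-||d_i - d_j||_2^2 / (2 sigma^2)) for every pair of points."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            A[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return A

# Hypothetical data: two nearby points and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
A_wide = gaussian_similarity(points, sigma=2.5)
A_narrow = gaussian_similarity(points, sigma=1.25)
print(A_wide[0][2], A_narrow[0][2])
```

A smaller sigma makes the weights fall off faster with distance, so the same data yields a sparser-looking graph at $\sigma = 1.25$ than at $\sigma = 2.5$.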
Semi-supervised Learning on Graphs
Figure: graphs built with $\sigma = 2.5$ and $\sigma = 1.25$.
Experiment: predict unlabeled images from the labeled ones
Semi-supervised
Learning on Graphs
K2 = (D A) 1
K1 = (I A) 1
K3 = (Diag(Ae) A) 1
Y = Ki L
Our new “kernel”
L: indicators on the revealed labels
Y: predictions
Experiment: vary number of labeled images and track perf.
$y = \operatorname{argmax}_j Y$
Figure: error rate vs. number of labels for K1, K2, K3, and RK3 (regularized K3), $\sigma = 1.25$.
Zhou et al. NIPS (2004)
Figure: error rate vs. number of labels for K1, K2, K3, and RK3 (regularized K3), $\sigma = 2.5$ (our new value), with the random-guessing level marked.
What’s happening?
Figure: two ROC panels (true pos. vs. false pos.) for K1, K2, K3, and RK3 on the task “2 vs. 1,2,3,4”, one with σ=2.50 and one with σ=1.25.
Much better performance!
The results of our regularized estimate
Figure: the regularized estimate.
Why does it matter?!
Theory has the answer! 
We “sweep” over cuts from
approximate eigenvectors!
It’s the order not the values.
Improved performance
Figure: error rate vs. number of labels for K1, K2, K3, and RK3 (regularized K3).
Y = K_i L
$y = \operatorname{argmin}_j \text{SortedRank}(Y)$
We have spent no time tuning the reg. parameter.
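The rank-based rule can differ from the value-based rule even on tiny inputs. The sketch below assumes one reading of SortedRank that the slides do not spell out: each column of Y is replaced by the rank of each entry within that column (0 for the largest score), and a node is assigned to the class in which it ranks best. The function names and the toy matrix Y are hypothetical.

```python
def sorted_rank(Y):
    """Replace each column of Y by within-column ranks (0 = largest score)."""
    n, k = len(Y), len(Y[0])
    R = [[0] * k for _ in range(n)]
    for j in range(k):
        order = sorted(range(n), key=lambda i: -Y[i][j])
        for rank, i in enumerate(order):
            R[i][j] = rank
    return R

def predict_by_value(Y):
    """y = argmax_j Y: pick the class with the largest score in each row."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in Y]

def predict_by_rank(Y):
    """y = argmin_j SortedRank(Y): pick the class where the node ranks best."""
    return [min(range(len(row)), key=lambda j: row[j]) for row in sorted_rank(Y)]

# Toy scores for 3 nodes and 2 classes; class 0 has uniformly larger values.
Y = [[0.9, 0.5],
     [0.8, 0.1],
     [0.7, 0.6]]
by_value = predict_by_value(Y)   # [0, 0, 0]
by_rank = predict_by_rank(Y)     # [0, 0, 1]
print(by_value, by_rank)
```

Node 2 has the top score within class 1 but the lowest within class 0, so the rank rule moves it to class 1 even though its raw class-0 score is larger. This is one way to act on "the order not the values".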
K2 = (D A) 1
K1 = (I A) 1
K3 = (Diag(Ae) A) 1
σ = 2.5 (our new value)
Anti-differentiating Approximation Algorithms
Recap & Conclusions
Figure: four panels labeled 16 nonzeros, 15 nonzeros, 284 nonzeros, and 24 nonzeros.

New relationships between localized cuts & PageRank
New understanding of PPR push procedure
Improvements to semi-supervised learning on graphs!

Open issues
Better treatment of directed graphs?
Algorithm for rho < 1? (rho set to ½ in most “uses”; need new analysis)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 

Recently uploaded (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

Anti-differentiating Approximation Algorithms: PageRank and MinCut

  • 1. Anti-differentiating approximation algorithms & new relationships between PageRank, spectral, and localized flow. David F. Gleich, Purdue University. Joint work with Michael Mahoney. Supported by NSF CAREER 1149756-CCF, Simons Inst. ICERM · David Gleich · Purdue
  • 2. Anti-differentiating approximation algorithms & new relationships between PageRank, spectral, and localized flow. A new derivation of the PageRank vector for an undirected graph based on Laplacians, cuts, or flows. A new understanding of the “push” methods to compute Personalized PageRank. An empirical improvement to methods for semi-supervised learning. 1st 2nd
  • 3. The PageRank problem. The PageRank random surfer: 1. With probability $\beta$, follow a random-walk step. 2. With probability $(1-\beta)$, jump randomly according to the distribution $v$. Goal: find the stationary distribution $x$. Algorithm: solve the linear system $(I - \beta A D^{-1})x = (1-\beta)v$, equivalently $x = \beta A D^{-1} x + (1-\beta)v$, where $A$ is the symmetric adjacency matrix, $D$ is the diagonal degree matrix, $x$ is the solution, and $v$ is the jump vector.
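The fixed-point form of the system on this slide suggests a simple iteration. The following is a minimal sketch in plain Python, not the speaker's code: the function name `pagerank`, the adjacency-list representation, the toy graph, and the value `beta = 0.85` are all assumptions chosen for illustration.

```python
def pagerank(adj, v, beta=0.85, tol=1e-12, max_iter=10_000):
    """Iterate x <- beta * A D^{-1} x + (1 - beta) * v until it stops changing.

    adj: dict node -> list of neighbors (undirected, every node has degree > 0).
    v:   dict node -> jump probability (entries sum to 1).
    """
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    x = dict(v)
    for _ in range(max_iter):
        # Start from the jump term, then add the random-walk term.
        new = {u: (1 - beta) * v[u] for u in adj}
        for u, nbrs in adj.items():
            share = beta * x[u] / deg[u]  # column u of A D^{-1} spreads x[u] evenly
            for w in nbrs:
                new[w] += share
        delta = sum(abs(new[u] - x[u]) for u in adj)
        x = new
        if delta < tol:
            break
    return x


# Hypothetical 4-node graph with the jump vector concentrated on node 0.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
v = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
x = pagerank(adj, v)
```

Because $\beta < 1$, the iteration is a contraction in the 1-norm, so it converges to the unique solution of the linear system.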
  • 4. The PageRank problem & the Laplacian. 1. $(I - \beta A D^{-1})x = (1-\beta)v$; 2. $(I - \beta \mathcal{A})y = (1-\beta)D^{-1/2}v$, where $\mathcal{A} = D^{-1/2} A D^{-1/2}$ and $x = D^{1/2}y$; and 3. $[\alpha D + L]z = \alpha v$, where $\beta = 1/(1+\alpha)$ and $x = Dz$. Here $L$ is the combinatorial Laplacian.
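The step from the first system to the third is a short change of variables. The lines below are a sketch of that algebra, assuming $L = D - A$ and that $D$ is invertible (no isolated vertices):

```latex
\begin{align*}
(I - \beta A D^{-1})\,x &= (1-\beta)\,v, \qquad x = Dz \\
\Rightarrow\quad (D - \beta A)\,z &= (1-\beta)\,v \\
\Rightarrow\quad \bigl(\tfrac{1}{\beta} D - A\bigr)\,z &= \tfrac{1-\beta}{\beta}\,v \\
\Rightarrow\quad \bigl((1+\alpha) D - A\bigr)\,z &= \alpha\,v
    \qquad \text{using } \beta = \tfrac{1}{1+\alpha} \\
\Rightarrow\quad (\alpha D + L)\,z &= \alpha\,v .
\end{align*}
```

The second system follows in the same way: substitute $x = D^{1/2}y$ into the first and multiply on the left by $D^{-1/2}$.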
  • 5. The Push Algorithm for PageRank. Proposed (in closest form) in Andersen, Chung, Lang (also by McSherry, Jeh & Widom) for personalized PageRank. Strongly related to Gauss-Seidel (see my talk at Simons for this). Derived to show improved runtime for balanced solvers. The Push Method with parameters $\tau, \rho$: 1. $x^{(1)} = 0$, $r^{(1)} = (1-\beta)e_i$, $k = 1$. 2. While any $r_j > \tau d_j$ ($d_j$ is the degree of node $j$): 3. $x^{(k+1)} = x^{(k)} + (r_j - \tau d_j \rho)e_j$; 4. $r_i^{(k+1)} = \tau d_j \rho$ if $i = j$, $r_i^{(k)} + \beta(r_j - \tau d_j \rho)/d_j$ if $i \sim j$, and $r_i^{(k)}$ otherwise; 5. $k \leftarrow k + 1$.
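The five steps of the push method can be sketched with dictionaries, so that only touched nodes are ever stored. This is an illustrative sketch under assumptions, not the implementation used in the talk: the name `push_ppr`, the FIFO queue used to pick the next node $j$, and the toy graph are choices made here.

```python
from collections import deque


def push_ppr(adj, seed, beta=0.85, tau=1e-4, rho=1.0):
    """Push method for personalized PageRank seeded at one node.

    adj: dict node -> list of neighbors (undirected, no self-loops).
    Returns sparse dicts (x, r): the approximation and the residual.
    """
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    x = {}
    r = {seed: 1.0 - beta}                      # r = (1 - beta) * e_seed
    queue = deque([seed]) if r[seed] > tau * deg[seed] else deque()
    while queue:                                # some r_j > tau * d_j remains
        j = queue.popleft()
        rj = r.get(j, 0.0)
        if rj <= tau * deg[j]:
            continue
        keep = tau * deg[j] * rho               # residual left behind at j
        amount = rj - keep
        x[j] = x.get(j, 0.0) + amount           # step 3
        r[j] = keep                             # step 4, case i = j
        for i in adj[j]:                        # step 4, case i ~ j
            before = r.get(i, 0.0)
            r[i] = before + beta * amount / deg[j]
            if before <= tau * deg[i] < r[i]:   # i just crossed the threshold
                queue.append(i)
    return x, r


# Hypothetical 4-node graph, seeded at node 0.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
x, r = push_ppr(adj, seed=0, beta=0.85, tau=1e-4)
```

Each push preserves the invariant $r = (1-\beta)e_i - (I - \beta A D^{-1})x$, so when the loop ends, $x$ solves the PageRank system up to a residual with $r_j \le \tau d_j$ at every node.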
  • 6. … demo of push …
  • 7. Why do we care about push? 1. Used for empirical studies of “communities”. 2. Used for “fast PageRank” approximation. It produces sparse approximations to PageRank! [Figure: Newman’s netscience graph, 379 vertices, 1828 nnz; v has a single one at the marked node; the result is “zero” on most of the nodes.]
  • 8. Our question: Why does the “push method” have such incredible empirical utility?
  • 9. The O(correct) answer: 1. PageRank related to Laplacian. 2. Laplacian related to cuts. 3. Andersen, Chung, Lang provides the “right” bounds and “localization”. This talk: the θ(correct) answer? A deeper insight into the relationship.
  • 10. Intellectually indebted to … Chin, Mądry, Miller & Peng [2013]; Orecchia & Zhu [2014]
  • 11. The s-t min-cut problem: minimize $\|Bx\|_{C,1} = \sum_{ij \in E} C_{i,j}\,|x_i - x_j|$ subject to $x_s = 1$, $x_t = 0$, $x \ge 0$, where $B$ is the unweighted incidence matrix and $C$ is the diagonal capacity matrix.
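For a 0/1 vector $x$, the objective on this slide is just the total capacity of the edges that cross the cut. A small sketch, assuming a hypothetical edge-capacity dictionary and node names that do not come from the talk:

```python
def cut_objective(capacity, x):
    """Return the sum over edges (i, j) of C_ij * |x_i - x_j|."""
    return sum(c * abs(x[i] - x[j]) for (i, j), c in capacity.items())


# Hypothetical capacities on a 4-node graph with source "s" and sink "t".
capacity = {("s", "a"): 3.0, ("s", "b"): 1.0, ("a", "b"): 1.0, ("b", "t"): 2.0}

# Indicator of the source side {s, a}: edges s-b and a-b cross the cut.
x_cut = {"s": 1.0, "a": 1.0, "b": 0.0, "t": 0.0}
value = cut_objective(capacity, x_cut)  # → 2.0
```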
  • 12. The localized cut graph. Related to a construction used in “FlowImprove”, Andersen & Lang (2007); and Orecchia & Zhu (2014). $A_S = \begin{bmatrix} 0 & \alpha d_S^T & 0 \\ \alpha d_S & A & \alpha d_{\bar S} \\ 0 & \alpha d_{\bar S}^T & 0 \end{bmatrix}$. Connect $s$ to vertices in $S$ with weight $\alpha \cdot$ degree. Connect $t$ to vertices in $\bar S$ with weight $\alpha \cdot$ degree.
  • 13. The localized cut graph. Connect $s$ to vertices in $S$ with weight $\alpha \cdot$ degree. Connect $t$ to vertices in $\bar S$ with weight $\alpha \cdot$ degree. $B_S = \begin{bmatrix} e & -I_S & 0 \\ 0 & B & 0 \\ 0 & -I_{\bar S} & e \end{bmatrix}$. Solve the s-t min-cut: minimize $\|B_S x\|_{C(\alpha),1}$ subject to $x_s = 1$, $x_t = 0$, $x \ge 0$.
  • 14. The localized cut graph. Connect $s$ to vertices in $S$ with weight $\alpha \cdot$ degree. Connect $t$ to vertices in $\bar S$ with weight $\alpha \cdot$ degree. $B_S = \begin{bmatrix} e & -I_S & 0 \\ 0 & B & 0 \\ 0 & -I_{\bar S} & e \end{bmatrix}$. Solve the “electrical flow” s-t min-cut: minimize $\|B_S x\|_{C(\alpha),2}$ subject to $x_s = 1$, $x_t = 0$.
  • 15. s-t min-cut → PageRank. The PageRank vector $z$ that solves $(\alpha D + L)z = \alpha v$ with $v = d_S/\mathrm{vol}(S)$ is a renormalized solution of the electrical cut computation: minimize $\|B_S x\|_{C(\alpha),2}$ subject to $x_s = 1$, $x_t = 0$. Specifically, if $x$ is the solution, then $x = \begin{bmatrix} 1 \\ \mathrm{vol}(S)\,z \\ 0 \end{bmatrix}$. Proof: square and expand the objective into a Laplacian, then apply constraints.
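The proof outline can be written out in a few lines. This is a sketch under two assumptions: $d_S$ denotes the degree vector restricted to $S$ (zero elsewhere), and $C(\alpha)$ carries the edge weights of the localized cut graph, so that the squared objective is that graph's Laplacian quadratic form.

```latex
\begin{align*}
\|B_S x\|_{C(\alpha),2}^2
  &= \sum_{i \in S} \alpha d_i\,(1 - y_i)^2
   + \sum_{i \in \bar S} \alpha d_i\, y_i^2
   + y^T L y,
  \qquad x = \begin{bmatrix} 1 \\ y \\ 0 \end{bmatrix} \\
\text{stationarity in } y:\quad
  0 &= \alpha D y - \alpha d_S + L y \\
\Rightarrow\quad (\alpha D + L)\,y &= \alpha\, d_S
  = \alpha\,\mathrm{vol}(S)\,v,
  \qquad v = d_S/\mathrm{vol}(S) \\
\Rightarrow\quad y &= \mathrm{vol}(S)\,z,
  \qquad \text{where } (\alpha D + L)\,z = \alpha v .
\end{align*}
```

Minimizing the norm and minimizing its square give the same $y$, which is why the constraints $x_s = 1$, $x_t = 0$ can be substituted before differentiating.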
  • 16. PageRank → s-t min-cut. That equivalence works if $v$ is degree-weighted. What if $v$ is the uniform vector? $A(s) = \begin{bmatrix} 0 & \alpha s^T & 0 \\ \alpha s & A & \alpha(d - s) \\ 0 & \alpha(d - s)^T & 0 \end{bmatrix}$.
  • 17. And beyond … Easy to cook up interesting diffusion-like problems and adapt them to this framework. In particular, Zhou et al. (2004) gave a semi-supervised learning diffusion we study soon: $\begin{bmatrix} 0 & e_S^T & 0 \\ e_S & \theta A & e_{\bar S} \\ 0 & e_{\bar S}^T & 0 \end{bmatrix}$, $(I + \theta L)x = e_S$.
  • 18. Back to the push method. Let $x$ be the output from the push method with $0 < \beta < 1$, $v = d_S/\mathrm{vol}(S)$, $\rho = 1$, and $\tau > 0$. Set $\alpha = \frac{1-\beta}{\beta}$, $\kappa = \tau\,\mathrm{vol}(S)/\beta$, and let $z_G$ solve: minimize $\frac{1}{2}\|B_S z\|_{C(\alpha),2}^2 + \kappa\|Dz\|_1$ subject to $z_s = 1$, $z_t = 0$, $z \ge 0$, where $z = \begin{bmatrix} 1 \\ z_G \\ 0 \end{bmatrix}$. Then $x = D z_G/\mathrm{vol}(S)$. Proof: write out the KKT conditions and show that the push method solves them; slackness was “tricky”. Slide annotations: regularization for sparsity; need for normalization.
  • 19. … demo of equivalence …
  • 20. This is a case of Algorithmic Anti-differentiation!
  • 21. The ideal world. Given Problem P; derive solution characterization C; show algorithm A finds a solution where C holds; profit? Example: given “min-cut”; derive “max-flow is equivalent to min-cut”; show push-relabel solves max-flow; profit!
  • 22. (The ideal world)’. Given Problem P; derive approximate solution characterization C’; show algorithm A’ quickly finds a solution where C’ holds; profit? Example: given “sparsest-cut”; derive Rayleigh-quotient approximation; show power-method finds a good Rayleigh-quotient; profit?
  • 23. The real world? Given Task P; hack around until you find something useful; write paper presenting “novel heuristic” H for P and … profit! Example: given “find-communities”; hack around ??? (hidden) ???; write paper presenting “three matvecs finds real-world communities”; profit!
  • 24. The real world. Given “find-communities”; hack around; write paper presenting “three matvecs finds real-world communities”; profit! Understand why H works! Show heuristic H solves P’. Guess and check until you find something H solves. Derive characterization of heuristic H. Algorithmic Anti-differentiation: given heuristic H, is there a problem P’ such that H is an algorithm for P’? (e.g. Mahoney & Orecchia)
  • 25. The real world. Algorithmic Anti-differentiation: given heuristic H, is there a problem P’ such that H is an algorithm for P’? If your algorithm is related to optimization, this is: given a procedure X, what objective does it optimize? In an unconstrained case, this is just “anti-differentiation!”
  • 26. Algorithmic Anti-differentiation in the literature. Dhillon et al. (2007): spectral clustering, trace minimization & kernel k-means. Saunders (1995): LSQR & Craig iterative methods.
  • 27. Why does it matter? These details matter in many empirical studies, and can dramatically impact performance (speed or quality).
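  • 28. Semi-supervised Learning on Graphs. $A_{i,j} = \exp\left(-\frac{\|d_i - d_j\|_2^2}{2\sigma^2}\right)$, where $d_i$ and $d_j$ are the data points (images); shown for $\sigma = 2.5$ and $\sigma = 1.25$. Zhou et al. NIPS (2003)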
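The edge-weight formula on this slide is a Gaussian kernel on pairwise distances. A minimal sketch in plain Python follows; the function name `gaussian_weights` and the two sample points are made up for illustration, and the formula is applied literally, so diagonal entries equal 1.

```python
import math


def gaussian_weights(points, sigma):
    """Return the dense matrix A with A[i][j] = exp(-||d_i - d_j||^2 / (2 sigma^2))."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            A[i][j] = math.exp(-sq / (2.0 * sigma ** 2))
    return A


# Two hypothetical feature vectors at Euclidean distance 5.
points = [(0.0, 0.0), (3.0, 4.0)]
A_wide = gaussian_weights(points, sigma=2.5)     # off-diagonal weight exp(-2)
A_narrow = gaussian_weights(points, sigma=1.25)  # off-diagonal weight exp(-8)
```

A smaller $\sigma$ makes weights fall off faster with distance, which is why the two values of $\sigma$ on the slide produce noticeably different graphs.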
  • 29. Semi-supervised Learning on Graphs. [Figure: graphs for $\sigma = 2.5$ and $\sigma = 1.25$.] Experiment: predict unlabeled images from the labeled ones.
  • 30. Semi-supervised Learning on Graphs K2 = (D A) 1 K1 = (I A) 1 K3 = (Diag(Ae) A) 1 Y = Ki L Our new “kernel” Indicators on the revealed labels Predictions Experiment vary number of labeled images and track perf. y = argmaxj Y 30
  • 31. Semi-supervised Learning on Graphs K2 = (D A) 1 K1 = (I A) 1 K3 = (Diag(Ae) A) 1 Y = Ki L Experiment: vary number of labeled images and track perf. y = argmaxj Y [Plot: error rate vs. num. labels for K1, K2, K3, RK3 (Regularized K3), $\sigma = 1.25$.] Zhou et al. NIPS (2004)
  • 32. Semi-supervised Learning on Graphs K2 = (D A) 1 K1 = (I A) 1 K3 = (Diag(Ae) A) 1 Y = Ki L Experiment vary number of labeled images and track perf. y = argmaxj Y Regularized K3 = 2.5 Our new value Random guessing 32
  • 33. Semi-supervised Learning on Graphs K2 = (D A) 1 K1 = (I A) 1 K3 = (Diag(Ae) A) 1 Y = Ki L Experiment: vary number of labeled images and track perf. y = argmaxj Y [Plot: error rate vs. num. labels for K1, K2, K3, RK3 (Regularized K3), $\sigma = 2.5$; annotations: our new value, random guessing.]
  • 34. What’s happening? [Figure: two panels, “2 vs. 1,2,3,4, σ=2.50” and “2 vs. 1,2,3,4, σ=1.25”; axes: false pos. and true pos.; curves: K1, K2, K3, RK3.] Much better performance!
  • 35. The results of our regularized estimate. [Figure: plot of the regularized estimate.]
  • 36. Why does it matter? Theory has the answer! We “sweep” over cuts from approximate eigenvectors! It’s the order, not the values.
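A sweep cut makes the point about order concrete: nodes are sorted by score, and every prefix of that ordering is evaluated as a candidate cut. The sketch below uses the standard conductance sweep with degree-normalized scores; the name `sweep_cut`, the normalization by degree, and the toy graph are assumptions made here, not details stated in the talk.

```python
def sweep_cut(adj, x):
    """Return (best_set, conductance) over prefixes of nodes sorted by x[u] / deg[u]."""
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    total_vol = sum(deg.values())
    order = sorted((u for u in x if x[u] > 0),
                   key=lambda u: x[u] / deg[u], reverse=True)
    in_set, vol, cut = set(), 0, 0
    best_phi, best_k = float("inf"), 0
    for k, u in enumerate(order, start=1):
        in_set.add(u)
        vol += deg[u]
        inside = sum(1 for w in adj[u] if w in in_set)
        cut += deg[u] - 2 * inside        # new boundary edges minus absorbed ones
        denom = min(vol, total_vol - vol)
        if denom == 0:                    # prefix covers the whole graph
            break
        phi = cut / denom
        if phi < best_phi:
            best_phi, best_k = phi, k
    return set(order[:best_k]), best_phi


# Hypothetical graph: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
x = {0: 0.30, 1: 0.30, 2: 0.35, 3: 0.03, 4: 0.01, 5: 0.01}
S, phi = sweep_cut(adj, x)
```

Any other score vector that induces the same ordering returns the same set, which is one way to read “it’s the order, not the values.”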
  • 37. Improved performance. [Plot: error rate vs. num. labels for K1, K2, K3, RK3 (Regularized K3), $\sigma = 2.5$, our new value.] Y = Ki L y = argminj SortedRank(Y) We have spent no time tuning the reg. parameter. K2 = (D A) 1 K1 = (I A) 1 K3 = (Diag(Ae) A) 1
  • 38. Anti-differentiating Approximation Algorithms. Recap & Conclusions: New relationships between localized cuts & PageRank. New understanding of PPR push procedure. Improvements to semi-supervised learning on graphs! Open issues: Better treatment of directed graphs? Algorithm for rho < 1? (rho set to ½ in most “uses”; need new analysis.) [Figure: four panels labeled 16 nonzeros, 15 nonzeros, 284 nonzeros, 24 nonzeros.]