Using Local Spectral
Methods to Robustify
Graph-Based Learning
David F. Gleich
Purdue University
Joint work with Michael Mahoney @ Berkeley
Supported by NSF CAREER CCF-1149756
Code: www.cs.purdue.edu/homes/dgleich/codes/robust-diffusions
KDD2015
The graph-based data analysis pipeline
[Adjacency matrix illustration]
Raw data
•  Relationships
•  Images
•  Text records
•  Etc.
Convert to a graph
•  Nearest neighs
•  Kernels
•  2-mode to 1-mode
•  Etc.
Algorithm/Learning
•  Important nodes
•  Infer features
•  Clustering
•  Etc.
KDD2015
David Gleich · Purdue
2
“Noise” in the initial data and modeling decisions
Explicit graphs are those that are given to a data analyst.
“A social network”
•  Known spam accounts included?
•  Users not logged in for a year?
•  Etc.
A type of noise
Constructed graphs are built based on some other primary data.
“nearest neighbor graphs”
•  K-NN or ε-NN
•  Thresholding correlations to zero
Often made for computational convenience! (Graph too big.)
A different type of noise
Labeled graphs occur in information diffusion/propagation.
“function prediction”
•  Labeled nodes
•  Labeled edges
•  Some are wrong
A direct type of noise
Do these decisions matter?
Our experience: Yes! Dramatically so!
KDD2015
David Gleich · Purdue
3
The graph-based data analysis pipeline
[Adjacency matrix illustration]
Raw data
•  Relationships
•  Images
•  Text records
•  Etc.
Convert to a graph
•  Nearest neighs
•  Kernels
•  2-mode to 1-mode
•  Etc.
Algorithm/Learning
•  Important nodes
•  Infer features
•  Clustering
•  Etc.
Most algorithmic and
statistical research
happens here
The potential
downstream signal is
determined by this step
KDD2015
David Gleich · Purdue
4
Our goal: towards an integrative analysis
[Adjacency matrix illustration]
•  How does the graph creation process affect
the outcomes of graph-based learning?
•  Is there anything we can do to make this
process more robust?
KDD2015
David Gleich · Purdue
5
Graph-based learning is usually only
one component of a big pipeline
KDD2015
David Gleich · Purdue
6
[Eight adjacency-matrix illustrations, one per database]
Many databases over genes with survival rates for various cancers
List of possible genes responsible for survival
Cluster analysis
Reintegration of data
THIS STEP SHOULD BE ROBUST TO VARIATIONS ABOVE
Scalable graph analytics
Local methods are one of the most successful
classes of scalable graph analytics
They don’t even look at the entire graph.
•  Andersen Chung Lang (ACL) Push method
Conjecture"
Local methods regularize some variant of the
original algorithm or problem.
Justification"
For ACL and a few relatives this is exact!
Impact?!
Improved robustness to noise?
KDD2015
David Gleich · Purdue
7
For instance, to answer “what function” is shared by the starting node, we'd only look at the circled region.
Our contributions
We study these issues in the case of "
semi-supervised learning (SSL) on graphs
1.  We illustrate a common mincut framework for a variety of SSL
methods
2.  Show how to “localize” one (and make it scalable!)
3.  Provide a more robust SSL labeling method
4.  Identify a weakness in SSL methods: they cannot use extra
edges! We find one useful way to do so.
KDD2015
David Gleich · Purdue
8
Semi-supervised "
graph-based learning 
KDD2015
David Gleich · Purdue
9
Given a graph, and a few labeled nodes,
predict the labels on the rest of the graph.
Algorithm

1.  Run a diffusion for
each label (possibly
with neg. info from
other classes)
2.  Assign new labels
based on the value of
each diffusion
Semi-supervised "
graph-based learning 
KDD2015
David Gleich · Purdue
10
Given a graph, and a few labeled nodes,
predict the labels on the rest of the graph.
Algorithm

1.  Run a diffusion for
each label (possibly
with neg. info from
other classes)
2.  Assign new labels
based on the value of
each diffusion
Semi-supervised "
graph-based learning 
KDD2015
David Gleich · Purdue
11
Given a graph, and a few labeled nodes,
predict the labels on the rest of the graph.
Algorithm

1.  Run a diffusion for
each label (possibly
with neg. info from
other classes)
2.  Assign new labels
based on the value of
each diffusion
The diffusions proposed for semi-supervised learning are s,t-cut minorants
KDD2015
David Gleich · Purdue
12
[Figure: an example graph on nodes 1-10 with an added source s and sink t.]
In the unweighted case, solve via max-flow.
In the weighted case, solve via network simplex or industrial LP.
MINCUT LP:
  minimize \sum_{ij \in E} C_{i,j} |x_i - x_j|
  subject to x_s = 1, x_t = 0.

Spectral minorant (solved via a linear system):
  minimize \sqrt{\sum_{ij \in E} C_{i,j} |x_i - x_j|^2}
  subject to x_s = 1, x_t = 0.
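To make the max-flow route concrete, here is a minimal Python sketch (ours, not the paper's code) that wires the positively labeled nodes to a super-source and the negatively labeled nodes to a super-sink and calls networkx's min-cut routine; the function name ssl_mincut and the large finite stand-in for the infinite capacities are our own assumptions.

```python
import networkx as nx

def ssl_mincut(edges, pos_labeled, neg_labeled, inf_cap=1e9):
    """Solve the s,t-mincut behind the unweighted SSL objective via max-flow.
    edges: iterable of (i, j, C_ij) for the (undirected) graph edges.
    Positively labeled nodes are wired to a super-source 's' and negatively
    labeled nodes to a super-sink 't' with an effectively infinite capacity."""
    G = nx.DiGraph()
    for i, j, c in edges:               # each undirected edge becomes two arcs
        G.add_edge(i, j, capacity=c)
        G.add_edge(j, i, capacity=c)
    for v in pos_labeled:
        G.add_edge("s", v, capacity=inf_cap)
    for v in neg_labeled:
        G.add_edge(v, "t", capacity=inf_cap)
    cut_value, (S, T) = nx.minimum_cut(G, "s", "t")
    x = {v: (1.0 if v in S else 0.0) for v in G if v not in ("s", "t")}
    return cut_value, x
```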
Representative cut problems
KDD2015
David Gleich · Purdue
13
[Figure: the ZGL cut graph attaches labeled nodes to s and t with infinite-weight edges; the Zhou et al. cut graph instead uses α-scaled edge weights. Legend: positive label, negative label, unlabeled.]
The Andersen-Lang weighting is a variation too; Joachims has a variation too.
Zhou et al. NIPS 2003; Zhu et al., ICML 2003;
Andersen Lang, SODA 2008; Joachims, ICML 2003
These help our intuition about the solutions
All spectral minorants are linear systems.
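To make the “linear systems” remark concrete, here is a minimal sketch of the standard Zhou et al. system (I - αS)x = (1 - α)s with S = D^{-1/2} A D^{-1/2}; the seed scaling and the handling of negative label mass are simplifying assumptions of ours, not the exact edge weighting drawn in the cut figures above.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def zhou_diffusion(A, seed, alpha=0.9):
    """Solve (I - alpha * S) x = (1 - alpha) * seed with S = D^{-1/2} A D^{-1/2}.
    A: sparse adjacency matrix (CSR); seed: dense vector of label mass
    (e.g. +1 on positively labeled nodes, -1 on negative ones, 0 elsewhere)."""
    d = np.asarray(A.sum(axis=1)).ravel()
    dinv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = dinv_sqrt @ A @ dinv_sqrt
    M = sp.eye(A.shape[0], format="csr") - alpha * S
    return spla.spsolve(M.tocsc(), (1.0 - alpha) * np.asarray(seed, dtype=float))
```

Running one such solve per class and then rounding gives the basic SSL pipeline on the earlier slides.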
Implicit regularization views
on the Zhou et al. diffusion
KDD2015
David Gleich · Purdue
14
[Figure: the Zhou et al. cut graph with α-scaled edge weights, source s, and sink t.]
RESULT!
The spectral minorant of Zhou is equivalent to
the weakly-local MOV solution.
PROOF!
The two linear systems are the same (after
working out a few equivalences).
IMPORTANCE!
We’d expect Zhou to be “more robust” 
minimize \sqrt{\sum_{ij \in E} C_{i,j} |x_i - x_j|^2}
subject to x_s = 1, x_t = 0.
The Mahoney-Orecchia-Vishnoi (MOV) vector is a
localized variation on the Fiedler vector to find a small
conductance set nearby a seed set.
A scalable, localized algorithm for Zhou et al.'s diffusion.
KDD2015
David Gleich · Purdue
15
RESULT!
We can use a variation on coordinate descent methods related to
the Andersen-Chung-Lang PUSH procedure to solve Zhou’s
diffusion in a scalable manner. 
PROOF. See Gleich-Mahoney ICML ‘14
IMPORTANCE (1)!
We should be able to make Zhou et al. scale.
IMPORTANCE (2)!
Using this algorithm adds another implicit regularization term that
should further improve robustness!
Zhou's diffusion (spectral minorant):
  minimize \sqrt{\sum_{ij \in E} C_{i,j} |x_i - x_j|^2}
  subject to x_s = 1, x_t = 0.

Push-based solver (implicitly regularized):
  minimize \sum_{ij \in E} C_{i,j} |x_i - x_j|^2 + \tau \sum_{i \in V} d_i x_i
  subject to x_s = 1, x_t = 0, x_i \ge 0.
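Below is a minimal push-style sketch of the locality idea (ours; the exact residual updates and stopping rule of the Gleich-Mahoney procedure differ), written for the generic seeded system p = αs + (1 - α)AD^{-1}p. Only nodes whose residual exceeds a degree-scaled threshold ε are ever touched, which is what makes the solver local and induces the implicit sparsity.

```python
from collections import deque

def local_push(adj, seeds, alpha=0.85, eps=1e-6):
    """Approximate the seeded diffusion p = alpha*s + (1-alpha) * A D^{-1} p
    with a local push / coordinate-descent loop. adj: dict node -> neighbor
    list (seeds are assumed to have at least one neighbor)."""
    deg = {u: len(vs) for u, vs in adj.items()}
    p, r = {}, {u: 1.0 / len(seeds) for u in seeds}
    queue = deque(u for u in seeds if r[u] >= eps * deg[u])
    while queue:
        u = queue.popleft()
        ru = r[u]
        if ru < eps * deg[u]:           # stale queue entry, already pushed
            continue
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = 0.0
        spread = (1.0 - alpha) * ru / deg[u]
        for v in adj[u]:
            old = r.get(v, 0.0)
            r[v] = old + spread
            if old < eps * deg[v] <= r[v]:
                queue.append(v)
    return p                             # sparse: only touched nodes appear
```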
Semi-supervised "
graph-based learning 
KDD2015
David Gleich · Purdue
16
Given a graph, and a few labeled nodes,
predict the labels on the rest of the graph.
Algorithm

1.  Run a diffusion for
each label (possibly
with neg. info from
other classes)
2.  Assign new labels
based on the value of
each diffusion
Traditional rounding methods
for SSL are value-based
KDD2015
David Gleich · Purdue
17
[Figure: Zhou diffusion values plotted per node for Class 1, Class 2, and Class 3; the largest value at each node assigns CLASS 1, CLASS 2, or CLASS 3.]
VALUE-BASED
Use the largest value of the diffusion to pick the label.
Zhou’s diffusion
But value-based rounding doesn't work for all diffusions
KDD2015
David Gleich · Purdue
18
Class 1 Class 2Class 3
Class 1
Class 2
Class 3
Class 1 Class 2Class 3 Class 1 Class 2Class 3 Class 1 Class 2Class 3
(b) Zhou et al., l = 3 (c) Andersen-Lang, l = 3 (d) Joachims, l = 3 (e) ZGL, l = 3
Class 1 Class 2Class 3 Class 1 Class 2Class 3 Class 1 Class 2Class 3 Class 1 Class 2Class 3
(f) Zhou et al., l = 15 (g) Andersen-Lang, l = 15 (h) Joachims, l = 15 (i) ZGL, l = 15
CLASS 1
CLASS 3
CLASS 2
VALUE-BASED rounding fails
for most of these diffusions
BUT
 There is still a
signal there!
Adding more labels doesn’t help either, see the
paper for those details
Rank-based rounding is far
more robust. 
KDD2015
David Gleich · Purdue
19
[Figure: diffusion plot with Class 1, Class 2, and Class 3.]
NEW IDEA!
Look at the RANK of the
item in each diffusion
instead of its VALUE.

JUSTIFICATION!
Based on the idea of
sweep-cut rounding in
spectral methods (use the
order induced by the
eigenvector, not its values)

IMPACT!
Much more robust
rounding to labels
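A minimal numpy sketch of rank-based rounding (our own illustration; ties and the paper's exact tie-breaking are ignored): rank the nodes inside each class's diffusion and give every node the class in which its rank is best, rather than taking an argmax over raw values.

```python
import numpy as np

def rank_based_labels(F):
    """F is an (n_nodes, n_classes) array; column c holds the diffusion values
    for class c. Rank nodes within each column (1 = largest diffusion value)
    and assign each node the class where its rank is best."""
    n, k = F.shape
    ranks = np.empty_like(F, dtype=float)
    for c in range(k):
        order = np.argsort(-F[:, c])      # descending by diffusion value
        ranks[order, c] = np.arange(1, n + 1)
    return np.argmin(ranks, axis=1)        # smallest rank wins

def value_based_labels(F):
    """The traditional rule: largest diffusion value wins."""
    return np.argmax(F, axis=1)
```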
Rank-based rounding has a big impact on a real study.
KDD2015
David Gleich · Purdue
20
[Figure: error rate vs. average training samples per class (2-10) for Zhou and Zhou+Push; left panel uses VALUE-BASED rounding, right panel uses RANK-BASED rounding.]
We used the digit prediction task out of Zhou's paper and added just a bit of noise as label errors and switched parameters.
Main empirical results
1.  Zhou's diffusion seems to work best for sparse graphs, whereas the ZGL diffusion works best for dense graphs.
2.  On the digits dataset, dense graph constructions yield higher error rates.
3.  Densifying a super-sparse graph construction on the digits dataset yields lower error.
4.  A similar fact holds on an Amazon co-purchasing network.
KDD2015
David Gleich · Purdue
21
An illustrative synthetic problem shows the differences.
Two-class block model, 150 nodes per class; between-class edge probability = 0.02; within-class probability = 0.35 (dense) or 0.06 (sparse).
Reveal labels for k nodes (varied) with different label error rates (sparse: 0%/10% for low/high; dense: 20%/60% for low/high).
[Figure: number of mistakes vs. number of labels for Joachims, Zhou, and ZGL in four settings: sparse graph with high error (the real-world scenario), sparse graph with low error, dense graph with low error rate, and dense graph with high error rate.]
KDD2015
David Gleich · Purdue
22
Varying density in an SSL
construction.
KDD2015
David Gleich · Purdue
23
A_{i,j} = \exp\!\left( -\frac{\| d_i - d_j \|_2^2}{2 \sigma^2} \right)
[Figure: the Gaussian kernel between two digit images d_i and d_j, illustrated for σ = 2.5 and σ = 1.25.]
We use the digits
experiment from
Zhou et al. 2003.
10 digits and a
few label errors. 

We vary density
either by the
num. of nearest
neighbors or by
the kernel width.
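A minimal dense-matrix sketch (ours, not the paper's construction code) of the weighted nearest-neighbor graph used in this experiment, exposing the two density knobs mentioned above: the number of nearest neighbors and the kernel width σ.

```python
import numpy as np

def gaussian_knn_graph(X, n_neighbors=10, sigma=2.5):
    """Build a symmetric nearest-neighbor graph with Gaussian edge weights
    A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), keeping an edge only when j is
    among i's n_neighbors closest points (or vice versa)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    A = np.zeros_like(W)
    for i in range(n):
        nn = np.argsort(d2[i])[1:n_neighbors + 1]          # skip self (index 0)
        A[i, nn] = W[i, nn]
    return np.maximum(A, A.T)                              # symmetrize
```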
As density increases, the
results just get worse
KDD2015
David Gleich · Purdue
24
[Figure: error rate for Zhou and Zhou+Push when varying the kernel width σ (0.8 to 2.5) and when varying the number of nearest neighbors (5 to 250).]
•  Adding “more” edges seems to only hurt. (Unless there is no signal). 
•  Zhou+Push seems to be slightly more robust (Maybe).
Some observations and a
question.
Adding “more data” yields “worse results” for
this procedure (in a simple setting).

Suppose I have a real-world system that can
work with up to E edges on some graph. 
Is there a way I can create new edges?
KDD2015
David Gleich · Purdue
25
Densifying the graph with
path expansions
KDD2015
David Gleich · Purdue
26
A_k = \sum_{\ell=1}^{k} A^{\ell}
If A is the adjacency matrix, then this counts the total weight on all paths up to length k.
We now repeat the nearest neighbor computation, but with paired
parameters such that we have the same average degree. 
Avg. Deg | Zhou (k = 1) | Zhou (k > 1) | Zhou w. Push (k = 1) | Zhou w. Push (k > 1)
      19 |        0.163 |        0.114 |                0.156 |                0.117
      41 |        0.156 |        0.132 |                0.158 |                0.113
      53 |        0.183 |        0.142 |                0.179 |                0.136
     104 |        0.193 |        0.145 |                0.178 |                0.144
     138 |        0.216 |        0.102 |                0.204 |                0.101
k = 4, nn = 3
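A short sketch (ours) of the A_k construction with sparse matrix powers; whether the diagonal (self-paths) is dropped or the result is re-thresholded is a choice the paper may make differently.

```python
import scipy.sparse as sp

def path_expansion(A, k):
    """Densify a sparse adjacency matrix by summing powers,
    A_k = sum_{l=1..k} A^l, which weights each pair of nodes by the total
    weight on paths of length at most k between them."""
    A = sp.csr_matrix(A)
    Ak = A.copy()
    P = A.copy()
    for _ in range(2, k + 1):
        P = P @ A                 # next power A^l
        Ak = Ak + P
    Ak.setdiag(0)                 # optionally drop self-paths
    Ak.eliminate_zeros()
    return Ak
```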
The same result holds for Amazon’s
co-purchasing network
KDD2015
David Gleich · Purdue
27
k | Zhou (mean F1) | Zhou w. Push (mean F1) | Zhou (CI)    | Zhou w. Push (CI)
1 |          0.173 |                  0.229 | [0.15, 0.19] | [0.21, 0.25]
2 |          0.197 |                  0.231 | [0.18, 0.22] | [0.21, 0.25]
3 |          0.221 |                  0.238 | [0.17, 0.27] | [0.19, 0.28]
Amazon’s co-purchasing network (on Snap) is effectively
a highly sparse nearest-neighbor network from their
(denser) co-purchasing graph. 

We attempt to predict the items in a product category
based on a small sample and study the F1 score for the
predictions. 
Some small details missing – see the full paper.
A_k = \sum_{\ell=1}^{k} A^{\ell}
[Figure 2, panels (a) K2 sparse, (b) K2 dense, (c) RK2: the graph artificially densified to A_k, comparing sparse and dense diffusions and regularization.]
Towards some theory, i.e. why are
densified sparse graphs better?
KDD2015
David Gleich · Purdue
28
How do sparsity, density,
and regularization of a
diffusion play into the results
in a controlled setting?
THE ERROR
Labels
[Figure 2, panels (a) K2 sparse, (b) K2 dense, (c) RK2: the graph artificially densified to A_k, comparing sparse and dense diffusions and regularization.]
Towards some theory, i.e. why are
densified sparse graphs better?
KDD2015
David Gleich · Purdue
29
[Figure annotations: the labels, the error, and two dense diffusions on \sum_{\ell=1}^{5} A^{\ell}, one with regularization via the Push algorithm.]
Recap, discussion, future work
Contributions
1.  Flow-setup for SSL
diffusions
2.  New robust rounding rule for
class selection
3.  Localized Zhou’s diffusion
4.  Empirical insights on density
of graph constructions
Observations
•  Many of these insights
translate to directed,
weighted graphs with fuzzy
labels and/or some parallel
architectures.
•  Weakness: mainly empirical results on the density.
•  We need a theoretical basis
for the densification theory!
KDD2015
David Gleich · Purdue
30
Supported by NSF, ARO, DARPA
 CODE www.cs.purdue.edu/homes/dgleich/codes/robust-diffusions
