Higher-order organization of complex networks
David F. Gleich
Purdue University
Joint work with Austin Benson and Jure Leskovec, Stanford
Supported by NSF CAREER CCF-1149756, IIS-1422918, DARPA SIMPLEX
PCMI2016
David Gleich · Purdue
Code & Data: snap.stanford.edu/higher-order
github.com/arbenson/higher-order-organization-julia
Network analysis has made two important observations about real-world networks
Real-world networks have modular organization.
Edge-based clustering and community detection sometimes expose this structure.
(Figure: co-author network)
Control widgets are over-expressed in complex networks.
We can expose this with motif or graphlet analysis.
Milo et al., Science, 2002.
Nodes and edges are not the fundamental
units of these networks. 

Why should we look for structure in terms of them?
Idea: Find clusters of motifs
In practice, motifs organize real-world networks amazingly well and recover aquatic layers in food webs
(Figure labels: Micronutrient sources; Benthic fishes; Benthic macroinvertebrates; Pelagic fishes and benthic prey)
http://marinebio.org/oceans/marine-zones/
We don’t know how to find this structure based on edge partitioning.
Aside: How did we get to this idea and to looking at this problem?
•  Research is a journey.
We can do motif-based clustering by
generalizing spectral clustering
Spectral clustering is a classic technique to partition
graphs by looking at eigenvectors.
M. Fiedler, 1973, Algebraic connectivity of graphs
(Figure panels: Graph, Laplacian, Eigenvector)
Spectral clustering works based on
conductance
There are many ways to measure the quality of a set of
nodes of a graph to gauge how they partition the graph. 
$\mathrm{cut}(S) = 7$, $\mathrm{cut}(\bar S) = 7$
$|S| = 15$, $|\bar S| = 20$
$\mathrm{vol}(S) = 85$, $\mathrm{vol}(\bar S) = 151$
$\mathrm{ncut}(S) = 7/85 + 7/151 = 0.1287$
$\text{cut sparsity}(S) = 7/15 = 0.4667$
$\phi(S) = \mathrm{cond}(S) = 7/85 = 0.0824$
$\phi(S) = \mathrm{cut}(S) / \min(\mathrm{vol}(S), \mathrm{vol}(\bar S))$
Conductance sets in graphs 
Conductance is one of the most important quality scores [Schaeffer07], used in Markov chain theory, bioinformatics, vision, etc.
PCMI: Nelson showed how you can use this to get heavy hitters in turnstile algorithms.
The conductance of a set of vertices is the ratio of edges leaving to total edges:
$\phi(S) = \dfrac{\mathrm{cut}(S)}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar S))}$ = (edges leaving the set) / (total edges in the set)
Equivalently, it’s the probability that a random edge leaves the set.
Small conductance ⇔ Good set
Example: $\mathrm{cut}(S) = 7$, $\mathrm{vol}(S) = 33$, $\mathrm{vol}(\bar S) = 11$, so $\phi(S) = 7/11$
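A minimal sketch of the conductance definition above, in plain Python. The function name `conductance`, the edge-list input, and the toy graph (two triangles joined by one bridge edge) are illustrative assumptions, not part of the talk's code.

```python
def conductance(edges, S):
    """Return (cut, vol(S), vol(complement), phi) for node set S.

    `edges` is a list of undirected edges (u, v); S must be a nonempty
    proper subset of the nodes, otherwise the minimum volume is zero.
    """
    S = set(S)
    cut = 0
    vol_S = 0
    vol_rest = 0
    for u, v in edges:
        # Every edge contributes one end point to the volume of each side it touches.
        for node in (u, v):
            if node in S:
                vol_S += 1
            else:
                vol_rest += 1
        # An edge is cut when exactly one of its end points lies in S.
        if (u in S) != (v in S):
            cut += 1
    return cut, vol_S, vol_rest, cut / min(vol_S, vol_rest)


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
cut, vol_S, vol_rest, phi = conductance(edges, {0, 1, 2})
print(cut, vol_S, vol_rest, phi)
```

Only the bridge edge leaves the set, and each side holds seven edge end points, so the set {0, 1, 2} has conductance 1/7.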
Spectral clustering has theoretical guarantees
Cheeger Inequality
Finding the best conductance set is NP-hard.
•  Cheeger realized the eigenvalues of the Laplacian provided a bound in manifolds
•  Alon and Milman independently realized the same thing for a graph
J. Cheeger, 1970, A lower bound on the smallest eigenvalue of the Laplacian
N. Alon, V. Milman, 1985, λ1, isoperimetric inequalities for graphs, and superconcentrators
$\phi_*^2 / 2 \le \lambda_2 \le 2 \phi_*$
Eigenvalues of the Laplacian: $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2$
$\phi_*$ = conductance of the set of smallest conductance
The sweep cut algorithm realizes the
guarantee
We can find a set S that achieves
the Cheeger bound. 
1.  Compute the eigenvector
associated with λ2.
2.  Sort the vertices by their values
in the eigenvector: σ1, σ2, … σn
3.  Let Sk = {σ1, …, σk} and
compute the conductance of
each Sk: φk = φ(Sk)
4.  Pick the minimum φm of φk . 
$\phi_m \le 4 \sqrt{\phi_*}$
M. Mihail, 1989, Conductance and convergence of Markov chains
F. C. Graham, 1992, Spectral Graph Theory
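Steps 2 to 4 of the sweep cut can be sketched as follows. This is an illustrative plain-Python version, not the talk's code: the name `sweep_cut` is an assumption, the graph is assumed to have no isolated nodes or self-loops, and the scores are hand-picked stand-ins for eigenvector entries.

```python
def sweep_cut(edges, scores):
    """Sweep over prefixes of the nodes sorted by score.

    Returns (best_set, best_phi, phis), where phis[k] is the conductance
    of the first k + 1 nodes in sorted order.
    """
    nodes = sorted(scores, key=scores.get)
    degree = {u: 0 for u in nodes}
    neighbors = {u: [] for u in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbors[u].append(v)
        neighbors[v].append(u)
    total_vol = sum(degree.values())
    in_S = set()
    cut = 0
    vol = 0
    phis = []
    for u in nodes[:-1]:  # the full node set has an empty complement, so stop one short
        # Edges from u into S stop being cut; all other edges of u become cut.
        inside = sum(1 for w in neighbors[u] if w in in_S)
        cut += degree[u] - 2 * inside
        vol += degree[u]
        in_S.add(u)
        phis.append(cut / min(vol, total_vol - vol))
    k = min(range(len(phis)), key=phis.__getitem__)
    return set(nodes[:k + 1]), phis[k], phis


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
# Hand-picked scores standing in for the eigenvector entries.
scores = {0: -0.5, 1: -0.5, 2: -0.3, 3: 0.3, 4: 0.5, 5: 0.5}
best, best_phi, phis = sweep_cut(edges, scores)
print(best, best_phi)
```

The cut and volume are updated incrementally as each node joins the set, so the whole sweep costs time proportional to the number of edges after the sort.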
The sweep cut visualized
(Figure: conductance $\phi_i$ of the sweep set $S_i$ plotted against $i$, for $i$ from 0 to 40 and $\phi_i$ from 0 to 1)
$\phi(S) = \mathrm{cut}(S) / \min(\mathrm{vol}(S), \mathrm{vol}(\bar S))$
Demo…
That’s spectral clustering
40+ years of ideas and successful applications
•  Fast algorithms that avoid eigenvectors (Graclus from Dhillon et al. 2007)
•  Local algorithms for seeded detection (Spielman & Teng 2004; Andersen, Chung, Lang 2006)
   PCMI: Kimon gave a talk about this yesterday!
•  Overlapping algorithms
•  Embeddings
•  And more!
But current problems are much richer than when spectral clustering was designed
Spectral clustering is theoretically justified for undirected, simple graphs

Many datasets are directed, weighted, signed, colored, layered, …
R. Milo, 2002, Science
(Figure: nodes X and Y, where X causes Y to be expressed; nodes X, Z, and Y with signed (+/–) edges, where Z represses Y)
Our contributions
1.  A generalized conductance metric for motifs
2.  A new spectral clustering algorithm to minimize the generalized
conductance.
3.  AND an associated Cheeger inequality.

4.  Aquatic layers in food webs
5.  Control structures in neural networks
6.  Hub structure in transportation networks
7.  Anomaly detection in Twitter
Benson, Gleich, Leskovec, Science 2016.
Motif-based conductance generalizes edge-based conductance
Need notions of cut and volume
Edges cut:
$\phi(S) = \dfrac{\#(\text{edges cut})}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar S))}$, where $\mathrm{vol}(S) = \#(\text{edge end points in } S)$
Triangles cut:
$\phi_M(S) = \dfrac{\#(\text{triangles cut})}{\min(\mathrm{vol}_M(S), \mathrm{vol}_M(\bar S))}$, where $\mathrm{vol}_M(S) = \#(\text{triangle end points in } S)$
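A small sketch of the triangle version of conductance, assuming an undirected simple graph given as an edge list. The function names and the toy graph are hypothetical, and the brute-force triangle enumeration is only meant for small examples.

```python
from itertools import combinations


def triangles(edges):
    """Enumerate the triangles of an undirected simple graph as sorted triples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                found.add(tuple(sorted((u, v, w))))
    return sorted(found)


def motif_conductance(edges, S):
    """Return (triangles cut, vol_M(S), vol_M(complement), phi_M(S))."""
    S = set(S)
    cut = vol_S = vol_rest = 0
    for tri in triangles(edges):
        inside = sum(1 for node in tri if node in S)
        vol_S += inside          # triangle end points in S
        vol_rest += 3 - inside   # triangle end points in the complement
        if 0 < inside < 3:
            cut += 1             # the triangle has nodes on both sides
    return cut, vol_S, vol_rest, cut / min(vol_S, vol_rest)


# Hypothetical toy graph: a 4-clique on {0, 1, 2, 3} plus triangles {3, 4, 5} and {4, 5, 6}.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (3, 5), (4, 5), (4, 6), (5, 6)]
print(motif_conductance(edges, {0, 1, 2, 3}))
```

Here the set {0, 1, 2, 3} cuts two edges but only one triangle, which is the distinction the two definitions above draw.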
An example of motif-conductance
(Figure: a 12-node example graph with nodes 0 to 11, the motif, and a partition into $S$ and $\bar S$)
$\phi_M(S) = \dfrac{\text{motifs cut}}{\text{motif volume}} = \dfrac{1}{10}$
Going from motifs back to a matrix for
spectral clustering
(Figure: the example graph with adjacency matrix $A$, and the weighted graph $W^{(M)}$ with edge weights 1, 2, and 3)
$W^{(M)}_{ij}$ = counts co-occurrences of motif pattern between $i$, $j$
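For the undirected triangle motif, the co-occurrence count between i and j is the number of triangles containing both, which is the number of common neighbors of i and j whenever the edge (i, j) exists. The sketch below uses that shortcut; the function name, the dictionary representation of W, and the toy graph are illustrative assumptions, and other motifs need their own counting rule.

```python
def triangle_motif_weights(edges):
    """Return {(i, j): weight} with i < j for the undirected triangle motif.

    The weight of a pair is the number of triangles that contain both nodes.
    Pairs that share no triangle are left out (weight zero).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    W = {}
    for u, v in edges:
        a, b = min(u, v), max(u, v)
        common = adj[a] & adj[b]  # each common neighbor closes one triangle on (a, b)
        if common:
            W[(a, b)] = len(common)
    return W


# Hypothetical toy graph: a 4-clique, two more triangles, and a pendant edge (6, 7).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (3, 5), (4, 5), (4, 6), (5, 6), (6, 7)]
W = triangle_motif_weights(edges)
print(W)
```

The pendant edge (6, 7) lies in no triangle, so it disappears from the weighted graph.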
KEY INSIGHT
Spectral clustering on $W^{(M)}$ yields results on the new motif notion of conductance
$\phi_M(S) = \dfrac{\text{motifs cut}}{\text{motif volume}} = \dfrac{1}{10}$
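A short worked argument for why this holds when the motif is a triangle (a sketch for 3-node motifs only). The subscript $W$, meaning the ordinary weighted cut, volume, and conductance measured in $W^{(M)}$, is notation introduced here. A cut triangle has one node on one side and two on the other, so exactly two of its three node pairs cross the partition, and every triangle containing node $i$ adds 2 to the weighted degree of $i$.

```latex
\begin{align*}
\mathrm{cut}_{W}(S) &= \sum_{i \in S,\ j \in \bar S} W^{(M)}_{ij} = 2\,\mathrm{cut}_M(S), \\
\mathrm{vol}_{W}(S) &= \sum_{i \in S} \sum_{j} W^{(M)}_{ij} = 2\,\mathrm{vol}_M(S), \\
\phi_{W}(S) &= \frac{\mathrm{cut}_W(S)}{\min(\mathrm{vol}_W(S), \mathrm{vol}_W(\bar S))}
             = \frac{2\,\mathrm{cut}_M(S)}{2\,\min(\mathrm{vol}_M(S), \mathrm{vol}_M(\bar S))}
             = \phi_M(S).
\end{align*}
```

The factors of 2 cancel, so minimizing ordinary conductance on the weighted graph is the same as minimizing motif conductance on the original graph.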
A motif-based clustering algorithm
1.  Form weighted graph $W^{(M)}$
2.  Compute the Fiedler vector $f^{(M)}$ associated with $\lambda_2$ of the motif-normalized Laplacian:
    $D = \mathrm{diag}(W^{(M)} e)$
    $L^{(M)} = D^{-1/2} (D - W^{(M)}) D^{-1/2}$
    $L^{(M)} z = \lambda_2 z$
    $f^{(M)} = D^{-1/2} z$
3.  Run a (motif-cond) sweep cut on $f^{(M)}$
(Figure: the weighted graph $W^{(M)}$ of the example)
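The three steps can be sketched in dependency-free Python. This is a toy under stated assumptions, not the talk's implementation: it swaps a sparse eigensolver for plain power iteration on $2I - L^{(M)}$ with the trivial eigenvector projected out, it assumes every node has positive weighted degree and a zero diagonal in W, and the function names and the 7-node weight matrix are hypothetical.

```python
import math


def fiedler_vector(W, iters=500):
    """Approximate f = D^{-1/2} z, with z the eigenvector for lambda_2 of
    L = I - D^{-1/2} W D^{-1/2}, by power iteration on 2I - L."""
    n = len(W)
    s = [math.sqrt(sum(row)) for row in W]              # D^{1/2} e, eigenvector for lambda_1 = 0
    s_norm = math.sqrt(sum(v * v for v in s))
    s_unit = [v / s_norm for v in s]
    z = [float(i + 1) for i in range(n)]                # deterministic start vector
    for _ in range(iters):
        proj = sum(a * b for a, b in zip(z, s_unit))
        z = [a - proj * b for a, b in zip(z, s_unit)]   # remove the trivial direction
        y = [z[i] / s[i] for i in range(n)]
        Wy = [sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]
        z = [z[i] + Wy[i] / s[i] for i in range(n)]     # apply 2I - L = I + D^{-1/2} W D^{-1/2}
        norm = math.sqrt(sum(a * a for a in z))
        z = [a / norm for a in z]
    return [z[i] / s[i] for i in range(n)]


def sweep_cut(W, f):
    """Return (best_set, best_phi) over prefixes of the nodes sorted by f."""
    n = len(W)
    d = [sum(row) for row in W]
    total = sum(d)
    order = sorted(range(n), key=lambda i: f[i])
    in_S = [False] * n
    cut = vol = 0.0
    best_phi, best_k = float("inf"), 0
    for k, u in enumerate(order[:-1]):
        inside = sum(W[u][v] for v in range(n) if in_S[v])
        cut += d[u] - 2 * inside                        # weight into S is no longer cut
        vol += d[u]
        in_S[u] = True
        phi = cut / min(vol, total - vol)
        if phi < best_phi:
            best_phi, best_k = phi, k
    return sorted(order[:best_k + 1]), best_phi


# Hypothetical triangle-motif weights: a 4-clique on {0, 1, 2, 3} plus triangles {3, 4, 5} and {4, 5, 6}.
pairs = {(0, 1): 2, (0, 2): 2, (0, 3): 2, (1, 2): 2, (1, 3): 2, (2, 3): 2,
         (3, 4): 1, (3, 5): 1, (4, 5): 2, (4, 6): 1, (5, 6): 1}
n = 7
W = [[0] * n for _ in range(n)]
for (i, j), w in pairs.items():
    W[i][j] = W[j][i] = w

f = fiedler_vector(W)
cluster, phi = sweep_cut(W, f)
print(cluster, phi)
```

The sweep separates the 4-clique from the remaining three nodes. Which side is reported depends on the sign of the computed vector, since an eigenvector is only defined up to sign.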
The sweep cut results
(Figure: sweep profile of motif conductance, from 0 to 1, over the nodes in the order from the Fiedler vector, with the best higher-order cluster and the 2nd best higher-order cluster marked on the example graph)
The motif-based Cheeger inequality
THEOREM
If the motif has three nodes, then the sweep procedure on the weighted graph finds a set $S$ of nodes for which
$\phi_M(S) \le 4 \sqrt{\phi_M^*}$
THEOREM
For more than 4 nodes, we use a slightly altered conductance.
Key Proof Step
$\mathrm{cut}_M(S, G) = \sum_{\{i,j,k\} \in M(G)} \mathrm{Indicator}[x_i, x_j, x_k \text{ not the same}]$ = quadratic in $x$
$M(G) = \{\text{instances of } M \text{ in } G\}$
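One way to see that the indicator is quadratic in $x$, assuming the encoding $x_i = +1$ for $i \in S$ and $x_i = -1$ otherwise (the encoding is an assumption here; the slide does not state it). If all three entries agree, each product below equals 1 and the expression is 0; if one entry differs, two products equal $-1$ and the expression is 1.

```latex
\begin{align*}
\mathrm{Indicator}[x_i, x_j, x_k \text{ not the same}]
  &= \tfrac{1}{4}\,(3 - x_i x_j - x_j x_k - x_k x_i) \\
  &= \tfrac{1}{8}\,\bigl[(x_i - x_j)^2 + (x_j - x_k)^2 + (x_k - x_i)^2\bigr], \\
\mathrm{cut}_M(S, G)
  &= \tfrac{1}{8} \sum_{i < j} W^{(M)}_{ij}\,(x_i - x_j)^2
   = \tfrac{1}{8}\, x^T \bigl(D - W^{(M)}\bigr)\, x .
\end{align*}
```

The last line uses the fact that $W^{(M)}_{ij}$ counts the motif instances containing both $i$ and $j$, so summing pairwise terms over all instances is a weighted sum over pairs.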
Awesome advantages
We inherit 40+ years of research!
•  Fast algorithms (ARPACK, etc.)
•  Local methods
•  Overlapping
•  Easy to implement (20 lines of Matlab/Julia)
•  Scalable (1.4B-edge graphs are not a problem)
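12/13/2015 motif_example
function [S, conductances] = MotifClusterM36(A)
% Motif-based spectral clustering for motif M_3^6.
% A is assumed to be a sparse 0/1 adjacency matrix of a directed graph.
n = size(A, 1);                      % number of nodes (n was undefined in the listing)
B = spones(A & A');                  % bidirectional links
U = A - B;                           % unidirectional links
W = (B * U') .* U' + (U * B) .* U + (U' * U) .* B;  % Motif M_3^6
D = diag(sum(W));                    % weighted degrees; assumes no node has degree 0
Ln = speye(size(W, 1)) - sqrt(D)^(-1) * W * sqrt(D)^(-1);  % normalized Laplacian
[Z, E] = eigs(Ln, 2, 'sm');          % two smallest eigenpairs
[~, idx] = sort(diag(E));            % make sure Z(:, 2) belongs to lambda_2
Z = Z(:, idx);
[~, order] = sort(sqrt(D)^(-1) * Z(:, 2));  % sweep order from the Fiedler vector
conductances = zeros(n, 1);
x = zeros(n, 1);
for i = 1:n
    x(order(i)) = 1;                 % grow the sweep set by one node
    xn = ~x + 0;                     % indicator of the complement
    conductances(i) = x' * (D - W) * x / min(x' * D * x, xn' * D * xn);
end
[~, split] = min(conductances);
S = order(1:split);
Published with MATLAB® R2015a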
Case studies
An intro note!

1.  Aquatic layers in food webs. Signed patterns in regulatory networks
2.  Control structures in neural networks
3.  Hub structure in transportation networks.
4.  Scaling and large data
NOTE
The partition depends on the motif
(Figure: two copies of a 12-node example graph, nodes 1 to 12)
Case study 1
Motifs partition the food webs
Food webs model energy exchange in species of an ecosystem.
i -> j means i’s energy goes to j (or j eats i).
Via Cheeger, motif conductance is better than edge conductance.
Demo
Case study 1
Motifs partition the food webs
(Figure labels: Micronutrient sources; Benthic fishes; Benthic macroinvertebrates; Pelagic fishes and benthic prey)
Motif M6 reveals aquatic layers.
84% accuracy vs. 69% for other methods
Case study 2
Nictation control in neural network
Nictation: standing on a tail and waving
(Figure (d) from "Nictation, a dispersal behavior of the nematode Caenorhabditis elegans, is regulated by IL2 neurons", Lee et al., Nature Neuroscience.)
We find the control mechanism that explains this based on the bi-fan motif (Milo et al. found it over-expressed)
Case study 3
Rich structure beyond clusters
North American air transport network

Nodes are airports
Edges reflect reachability, and are unweighted.
(Based on Frey et al., 2007)
We can use complex motifs with non-anchored nodes
(Figure: motif on nodes A, B, C, D)
Counts length-two walks
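A hedged illustration of what counting length-two walks gives as a weighting. It assumes the weight between i and j is the off-diagonal entry of the squared adjacency matrix of an undirected, unweighted graph; the exact motif definition used in the talk may differ, and the function name and toy graph are hypothetical.

```python
def length_two_walk_counts(edges, n):
    """Return the n x n matrix whose (i, j) entry, for i != j, is the number
    of length-two walks i -> k -> j in an undirected, unweighted graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    W = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = sum(A[i][k] * A[k][j] for k in range(n))
    return W


# Hypothetical toy network: hubs 0 and 1 both serve the spokes 2, 3, and 4.
edges = [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)]
W = length_two_walk_counts(edges, 5)
print(W[0][1], W[2][3], W[0][2])
```

In this toy network the two hubs share no edge, yet they get the largest weight because they are joined by three length-two walks, one through each spoke.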
The weighting alone reveals hub-like
structure
The motif embedding shows this structure
and splits into east-west
(Figure: MOTIF SPECTRAL EMBEDDING and EDGE SPECTRAL EMBEDDING, each plotted against the primary spectral coordinate. Marked groups: Top 10 U.S. hubs, East coast non-hubs, West coast non-hubs.)
Atlanta, the top hub, is next to Salina, a non-hub.
Case study 4
Large scale stuff
The up-linked triangle finds an anomalous cluster in Twitter.
Anomalous cluster in the 1.4B edge Twitter graph. All nodes are holding accounts for a company, and the orange nodes have incomplete profiles.
Related work.
•  The Laplacian we propose was originally proposed by Rodríguez [2004] and again by Zhou et al. [2006]. Our new theory (motif Cheeger inequality) explains why these were good ideas.
•  This falls under the general strategy of encoding a hypergraph partitioning problem as a graph clustering problem [Agarwal+ 06]
•  Serrour, Arenas, and Gómez, Detecting communities of triangles in complex networks using spectral optimization, 2011.
•  Arenas et al., Motif-based communities in complex networks, 2008.
Paper
Benson, Gleich, Leskovec, Science, 2016

1.  A generalized conductance metric for motifs
2.  A new spectral clustering algorithm to minimize the generalized conductance.
3.  AND an associated Cheeger inequality.
4.  Aquatic layers in food webs
5.  Control structures in neural networks
6.  Hub structure in transportation networks
7.  Anomaly detection in Twitter
8.  Lots of cool stuff on signed networks.
Thank you!
Joint work with Austin Benson and Jure Leskovec, Stanford
Supported by NSF CAREER CCF-1149756, IIS-1422918, DARPA SIMPLEX

Minimizing cost in distributed multiquery processing applicationsMinimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applications
 
Representing Simplicial Complexes with Mangroves
Representing Simplicial Complexes with MangrovesRepresenting Simplicial Complexes with Mangroves
Representing Simplicial Complexes with Mangroves
 
Nelly Litvak – Asymptotic behaviour of ranking algorithms in directed random ...
Nelly Litvak – Asymptotic behaviour of ranking algorithms in directed random ...Nelly Litvak – Asymptotic behaviour of ranking algorithms in directed random ...
Nelly Litvak – Asymptotic behaviour of ranking algorithms in directed random ...
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
 
Tensor Spectral Clustering
Tensor Spectral ClusteringTensor Spectral Clustering
Tensor Spectral Clustering
 
08 Exponential Random Graph Models (2016)
08 Exponential Random Graph Models (2016)08 Exponential Random Graph Models (2016)
08 Exponential Random Graph Models (2016)
 
08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)
 
Formulas for Surface Weighted Numbers on Graph
Formulas for Surface Weighted Numbers on GraphFormulas for Surface Weighted Numbers on Graph
Formulas for Surface Weighted Numbers on Graph
 
A Diffusion Wavelet Approach For 3 D Model Matching
A Diffusion Wavelet Approach For 3 D Model MatchingA Diffusion Wavelet Approach For 3 D Model Matching
A Diffusion Wavelet Approach For 3 D Model Matching
 
Numerical Simulations of Optical Soliton Propagation under External Forcing
Numerical Simulations of Optical Soliton Propagation under External ForcingNumerical Simulations of Optical Soliton Propagation under External Forcing
Numerical Simulations of Optical Soliton Propagation under External Forcing
 
Eighth Asian-European Workshop on Information Theory: Fundamental Concepts in...
Eighth Asian-European Workshop on Information Theory: Fundamental Concepts in...Eighth Asian-European Workshop on Information Theory: Fundamental Concepts in...
Eighth Asian-European Workshop on Information Theory: Fundamental Concepts in...
 
Computational Information Geometry on Matrix Manifolds (ICTP 2013)
Computational Information Geometry on Matrix Manifolds (ICTP 2013)Computational Information Geometry on Matrix Manifolds (ICTP 2013)
Computational Information Geometry on Matrix Manifolds (ICTP 2013)
 
Friedlander et al. Evolution of Bow-Tie Architectures in Biology (2015)
Friedlander et al. Evolution of Bow-Tie Architectures in Biology (2015)Friedlander et al. Evolution of Bow-Tie Architectures in Biology (2015)
Friedlander et al. Evolution of Bow-Tie Architectures in Biology (2015)
 

Dernier

Krishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्रKrishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्रKrashi Coaching
 
biosynthesis of the cell wall and antibiotics
biosynthesis of the cell wall and antibioticsbiosynthesis of the cell wall and antibiotics
biosynthesis of the cell wall and antibioticsSafaFallah
 
RCPE terms and cycles scenarios as of March 2024
RCPE terms and cycles scenarios as of March 2024RCPE terms and cycles scenarios as of March 2024
RCPE terms and cycles scenarios as of March 2024suelcarter1
 
World Water Day 22 March 2024 - kiyorndlab
World Water Day 22 March 2024 - kiyorndlabWorld Water Day 22 March 2024 - kiyorndlab
World Water Day 22 March 2024 - kiyorndlabkiyorndlab
 
Application of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptxApplication of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptxRahulVishwakarma71547
 
Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...Sérgio Sacani
 
THE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptx
THE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptxTHE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptx
THE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptxAkinrotimiOluwadunsi
 
Q3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptx
Q3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptxQ3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptx
Q3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptxArdeniel
 
Shiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky Way
Shiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky WayShiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky Way
Shiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky WaySérgio Sacani
 
Substances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening TestSubstances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening TestAkashDTejwani
 
geometric quantization on coadjoint orbits
geometric quantization on coadjoint orbitsgeometric quantization on coadjoint orbits
geometric quantization on coadjoint orbitsHassan Jolany
 
PSP3 employability assessment form .docx
PSP3 employability assessment form .docxPSP3 employability assessment form .docx
PSP3 employability assessment form .docxmarwaahmad357
 
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENTMARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENTjipexe1248
 
Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)GRAPE
 
soft skills question paper set for bba ca
soft skills question paper set for bba casoft skills question paper set for bba ca
soft skills question paper set for bba caohsadfeeling
 
Lehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.pptLehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.pptSachin Teotia
 
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...PirithiRaju
 
Gene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdfGene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdfNetHelix
 
SUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdf
SUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdfSUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdf
SUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdfsantiagojoderickdoma
 
TORSION IN GASTROPODS- Anatomical event (Zoology)
TORSION IN GASTROPODS- Anatomical event (Zoology)TORSION IN GASTROPODS- Anatomical event (Zoology)
TORSION IN GASTROPODS- Anatomical event (Zoology)chatterjeesoumili50
 

Dernier (20)

Krishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्रKrishi Vigyan Kendras - कृषि विज्ञान केंद्र
Krishi Vigyan Kendras - कृषि विज्ञान केंद्र
 
biosynthesis of the cell wall and antibiotics
biosynthesis of the cell wall and antibioticsbiosynthesis of the cell wall and antibiotics
biosynthesis of the cell wall and antibiotics
 
RCPE terms and cycles scenarios as of March 2024
RCPE terms and cycles scenarios as of March 2024RCPE terms and cycles scenarios as of March 2024
RCPE terms and cycles scenarios as of March 2024
 
World Water Day 22 March 2024 - kiyorndlab
World Water Day 22 March 2024 - kiyorndlabWorld Water Day 22 March 2024 - kiyorndlab
World Water Day 22 March 2024 - kiyorndlab
 
Application of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptxApplication of Foraminiferal Ecology- Rahul.pptx
Application of Foraminiferal Ecology- Rahul.pptx
 
Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...Identification of Superclusters and Their Properties in the Sloan Digital Sky...
Identification of Superclusters and Their Properties in the Sloan Digital Sky...
 
THE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptx
THE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptxTHE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptx
THE HISTOLOGY OF THE CARDIOVASCULAR SYSTEM 2024.pptx
 
Q3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptx
Q3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptxQ3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptx
Q3W4part1-SSSSSSSSSSSSSSSSSSSSSSSSCI.pptx
 
Shiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky Way
Shiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky WayShiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky Way
Shiva and Shakti: Presumed Proto-Galactic Fragments in the Inner Milky Way
 
Substances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening TestSubstances in Common Use for Shahu College Screening Test
Substances in Common Use for Shahu College Screening Test
 
geometric quantization on coadjoint orbits
geometric quantization on coadjoint orbitsgeometric quantization on coadjoint orbits
geometric quantization on coadjoint orbits
 
PSP3 employability assessment form .docx
PSP3 employability assessment form .docxPSP3 employability assessment form .docx
PSP3 employability assessment form .docx
 
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENTMARKER ASSISTED SELECTION IN CROP IMPROVEMENT
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
 
Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)Contracts with Interdependent Preferences (2)
Contracts with Interdependent Preferences (2)
 
soft skills question paper set for bba ca
soft skills question paper set for bba casoft skills question paper set for bba ca
soft skills question paper set for bba ca
 
Lehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.pptLehninger_Chapter 17_Fatty acid Oxid.ppt
Lehninger_Chapter 17_Fatty acid Oxid.ppt
 
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
3.2 Pests of Sorghum_Identification, Symptoms and nature of damage, Binomics,...
 
Gene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdfGene transfer in plants agrobacterium.pdf
Gene transfer in plants agrobacterium.pdf
 
SUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdf
SUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdfSUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdf
SUKDANAN DIAGNOSTIC TEST IN PHYSICAL SCIENCE ANSWER KEYY.pdf
 
TORSION IN GASTROPODS- Anatomical event (Zoology)
TORSION IN GASTROPODS- Anatomical event (Zoology)TORSION IN GASTROPODS- Anatomical event (Zoology)
TORSION IN GASTROPODS- Anatomical event (Zoology)
 

Higher-order organization of complex networks

  • 1. Higher-order organization of complex networks. David F. Gleich, Purdue University. Joint work with Austin Benson and Jure Leskovec, Stanford. Supported by NSF CAREER CCF-1149756, IIS-1422918, DARPA SIMPLEX. PCMI2016 · David Gleich · Purdue. Code & Data: snap.stanford.edu/higher-order, github.com/arbenson/higher-order-organization-julia. [Figures: a network whose nodes carry the neuron labels CEPDR, CEPVR, IL2R, OLLR, RIAL, RIAR, RIVL, RIVR, RMDDR, RMDL, RMDR, RMDVL, RMFL, SMDDL, SMDDR, SMDVR, URBR; a 12-node example graph with nodes 0 to 11.]
  • 2. Network analysis has two important observations about real-world networks. (1) Real-world networks have modular organization. Edge-based clustering and community detection sometimes expose this structure. [Figure: co-author network.] (2) Control widgets are over-expressed in complex networks. We can expose this with motif or graphlet analysis (Milo et al., Science, 2002).
  • 3. Nodes and edges are not the fundamental units of these networks. Why should we look for structure in terms of them?
  • 4. Idea: Find clusters
  • 5. Idea: Find clusters of motifs
  • 6. In practice, motifs organize real-world networks amazingly well and recover aquatic layers in food webs. Layers: micronutrient sources; benthic fishes; benthic macroinvertebrates; pelagic fishes and benthic prey. (Image: http://marinebio.org/oceans/marine-zones/) We don’t know how to find this structure based on edge partitioning.
  • 7. Aside: How did we get to this idea, and to looking at this problem? •  Research is a journey.
  • 8. We can do motif-based clustering by generalizing spectral clustering. Spectral clustering is a classic technique to partition graphs by looking at eigenvectors (M. Fiedler, 1973, Algebraic connectivity of graphs). [Figure panels: Graph, Laplacian, Eigenvector.]
  • 9. Spectral clustering works based on conductance. There are many ways to measure the quality of a set of nodes of a graph to gauge how they partition the graph. Example: $\mathrm{cut}(S) = \mathrm{cut}(\bar S) = 7$, $|S| = 15$, $|\bar S| = 20$, $\mathrm{vol}(S) = 85$, $\mathrm{vol}(\bar S) = 151$. Scores for this set: $\mathrm{cut}(S) = 7$; $\mathrm{ncut}(S) = 7/85 + 7/151 = 0.1287$; cut sparsity$(S) = 7/15 = 0.4667$; $\phi(S) = \mathrm{cond}(S) = 7/85 = 0.0824$, where $\phi(S) = \mathrm{cut}(S) / \min(\mathrm{vol}(S), \mathrm{vol}(\bar S))$.
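The three scores on this slide can be reproduced from the summary counts alone. The sketch below is only an illustration: the function name `cut_scores` is hypothetical, the first score is read as the normalized cut (cut divided by each side's volume, summed), and cut sparsity is assumed to divide by the smaller side's node count, which matches the slide's 7/15.

```python
def cut_scores(cut, size_s, size_rest, vol_s, vol_rest):
    """Return (normalized cut, cut sparsity, conductance) from summary counts."""
    ncut = cut / vol_s + cut / vol_rest          # cut relative to both volumes
    sparsity = cut / min(size_s, size_rest)      # assumption: smaller side's size
    conductance = cut / min(vol_s, vol_rest)     # cut relative to smaller volume
    return ncut, sparsity, conductance


# Counts taken from the slide's example set S.
ncut, sparsity, cond = cut_scores(cut=7, size_s=15, size_rest=20,
                                  vol_s=85, vol_rest=151)
print(round(ncut, 4), round(sparsity, 4), round(cond, 4))
```

Only the denominators differ between the three scores, which is why they can rank the same set quite differently.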
  • 10. Conductance sets in graphs. Conductance is one of the most important quality scores [Schaeffer07], used in Markov chain theory, bioinformatics, vision, etc. (PCMI: Nelson showed how you can use this to get heavy hitters in turnstile algorithms!) The conductance of a set of vertices is the ratio of edges leaving to total edges: $\phi(S) = \mathrm{cut}(S) / \min(\mathrm{vol}(S), \mathrm{vol}(\bar S))$, that is, (edges leaving the set) / (total edges in the set). Equivalently, it’s the probability that a random edge leaves the set. Small conductance ⇔ good set. Example: $\mathrm{cut}(S) = 7$, $\mathrm{vol}(S) = 33$, $\mathrm{vol}(\bar S) = 11$, so $\phi(S) = 7/11$.
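To make the definition concrete, here is a minimal sketch that computes conductance directly from an edge list. It assumes an undirected, unweighted, simple graph and a set S that leaves edge endpoints on both sides; the function name and the toy graph are illustrative and do not come from the talk.

```python
def conductance(edges, S):
    """Conductance of node set S in an undirected, unweighted graph."""
    S = set(S)
    cut = vol_s = vol_rest = 0
    for u, v in edges:
        in_u, in_v = u in S, v in S
        if in_u != in_v:                 # edge leaves the set
            cut += 1
        vol_s += in_u + in_v             # edge end points inside S
        vol_rest += (not in_u) + (not in_v)
    return cut / min(vol_s, vol_rest)


# Toy graph: two triangles joined by the single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
phi = conductance(edges, {0, 1, 2})      # 1 cut edge, volume 7 on each side
```

Taking one triangle as S cuts only the bridge, so the conductance is 1/7; a single node such as {0} has conductance 1 because every edge touching it leaves the set.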
  • 11. Spectral clustering has theoretical guarantees: the Cheeger inequality. Finding the best conductance set is NP-hard. :( •  Cheeger realized the eigenvalues of the Laplacian provided a bound in manifolds. •  Alon and Milman independently realized the same thing for a graph! (J. Cheeger, 1970, A lower bound for the smallest eigenvalue of the Laplacian. N. Alon, V. Milman, 1985, λ1, isoperimetric inequalities for graphs, and superconcentrators.) Cheeger inequality: $\phi_*^2 / 2 \le \lambda_2 \le 2 \phi_*$, where $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2$ are the eigenvalues of the Laplacian and $\phi_*$ is the conductance of the set of smallest conductance.
  • 12. The sweep cut algorithm realizes the guarantee. We can find a set S that achieves the Cheeger bound. 1.  Compute the eigenvector associated with $\lambda_2$. 2.  Sort the vertices by their values in the eigenvector: $\sigma_1, \sigma_2, \dots, \sigma_n$. 3.  Let $S_k = \{\sigma_1, \dots, \sigma_k\}$ and compute the conductance of each $S_k$: $\phi_k = \phi(S_k)$. 4.  Pick the minimum $\phi_m$ of the $\phi_k$. Guarantee: $\phi_m \le 4 \sqrt{\phi_*}$. (M. Mihail, 1989, Conductance and convergence of Markov chains. F. C. Graham, 1992, Spectral Graph Theory.)
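Steps 2 to 4 of the sweep can be sketched as follows. This is not the authors' implementation. It assumes an undirected simple graph with nodes 0..n-1 and no isolated nodes, and it takes the eigenvector values as an input; the `values` list below is a hand-made stand-in for a Fiedler vector, not a computed one.

```python
def sweep_cut(edges, values):
    """Return (best_set, best_phi) over prefixes of vertices sorted by `values`."""
    n = len(values)
    order = sorted(range(n), key=lambda v: values[v])
    neighbors = {v: [] for v in range(n)}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    total_vol = 2 * len(edges)
    in_set = [False] * n
    cut = vol = 0
    best_phi, best_k = float("inf"), 0
    # Stop before the last vertex: the full vertex set has no complement.
    for k, v in enumerate(order[:-1], start=1):
        in_set[v] = True
        vol += len(neighbors[v])
        for w in neighbors[v]:
            # An edge to a vertex already in the set is no longer cut;
            # an edge to a vertex outside the set becomes cut.
            cut += -1 if in_set[w] else 1
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_phi, best_k = phi, k
    return set(order[:best_k]), best_phi


# Two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
values = [-0.5, -0.5, -0.3, 0.3, 0.5, 0.5]   # illustrative stand-in vector
best_set, best_phi = sweep_cut(edges, values)
```

Updating the cut incrementally keeps the whole sweep linear in the number of edges after the sort, instead of recomputing each prefix's conductance from scratch.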
  • 13. The sweep cut visualized. [Plot: conductance $\phi_i$ of the sweep set $S_i$ versus $i$.] $\phi(S) = \mathrm{cut}(S) / \min(\mathrm{vol}(S), \mathrm{vol}(\bar S))$
  • 15. That’s spectral clustering: 40+ years of ideas and successful applications. •  Fast algorithms that avoid eigenvectors (Graclus from Dhillon et al. 2007) •  Local algorithms for seeded detection (Spielman & Teng 2004; Andersen, Chung, Lang 2006). PCMI: Kimon gave a talk about this yesterday! •  Overlapping algorithms •  Embeddings •  And more!
  • 16. But current problems are much richer than when spectral clustering was designed. Spectral clustering is theoretically justified for undirected, simple graphs. Many datasets are directed, weighted, signed, colored, layered, ... [Figure (R. Milo, 2002, Science): signed motif on X, Y, Z. X → Y (+): X causes Y to be expressed. Z → Y (–): Z represses Y.]
  • 17. Our contributions. 1.  A generalized conductance metric for motifs 2.  A new spectral clustering algorithm to minimize the generalized conductance. 3.  AND an associated Cheeger inequality. 4.  Aquatic layers in food webs 5.  Control structures in neural networks 6.  Hub structure in transportation networks 7.  Anomaly detection in Twitter. (Benson, Gleich, Leskovec, Science 2016.)
  • 18. Motif-based conductance generalizes edge-based conductance. Need notions of cut and volume! Edges cut: $\phi(S) = \#(\text{edges cut}) / \min(\mathrm{vol}(S), \mathrm{vol}(\bar S))$, with $\mathrm{vol}(S) = \#(\text{edge end points in } S)$. Triangles cut: $\phi_M(S) = \#(\text{triangles cut}) / \min(\mathrm{vol}_M(S), \mathrm{vol}_M(\bar S))$, with $\mathrm{vol}_M(S) = \#(\text{triangle end points in } S)$.
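The triangle version of these definitions can be sketched directly by enumerating motif instances. The code below assumes an undirected simple graph and the undirected triangle as the motif; the function names and the toy graph are illustrative only.

```python
from itertools import combinations


def triangles(edges):
    """List every triangle (i, j, k) with i < j < k in an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = []
    for u in sorted(adj):
        higher = sorted(w for w in adj[u] if w > u)
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                found.append((u, v, w))
    return found


def motif_conductance(edges, S):
    """#(triangles cut) / min(triangle end points in S, triangle end points outside S)."""
    S = set(S)
    cut = vol_s = vol_rest = 0
    for tri in triangles(edges):
        inside = sum(node in S for node in tri)
        cut += 0 < inside < 3            # triangle has nodes on both sides
        vol_s += inside
        vol_rest += 3 - inside
    return cut / min(vol_s, vol_rest)


# Triangles (0,1,2), (1,2,3), (3,4,5), plus a pendant edge (2, 6) in no triangle.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (3, 4), (3, 5), (4, 5), (2, 6)]
phi_m = motif_conductance(edges, {0, 1, 2})
```

For S = {0, 1, 2} one triangle is cut and the triangle volumes are 5 and 4, so the motif conductance is 1/4. The pendant edge (2, 6) does not affect it, because that edge takes part in no triangle.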
  • 19. An example of motif conductance. [Figure: the 12-node example graph (nodes 0 to 11) split into $S$ and $\bar S$, with the motif shown.] $\phi_M(S) = \text{motifs cut} / \text{motif volume} = 1/10$.
  • 20. Going from motifs back to a matrix for spectral clustering. [Figure: the 12-node example graph with adjacency matrix $A$, and the weighted graph $W^{(M)}$ whose edge weights are 1, 2, and 3.] $W^{(M)}_{ij}$ counts co-occurrences of the motif pattern between $i$ and $j$.
  • 21. Going from motifs back to a matrix for spectral clustering. [Figure: the weighted graph $W^{(M)}$ with edge weights 1, 2, and 3.] $W^{(M)}_{ij}$ counts co-occurrences of the motif pattern between $i$ and $j$. KEY INSIGHT! Spectral clustering on $W^{(M)}$ yields results on the new motif notion of conductance: $\phi_M(S) = \text{motifs cut} / \text{motif volume} = 1/10$.
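A small sketch of the weighting for the simplest case, the undirected triangle, may help here (the slide's own example uses a directed motif). It assumes a simple graph with each edge listed once; the names `triangle_weights` and `weighted_conductance` are hypothetical.

```python
def triangle_weights(edges):
    """W[(i, j)] = number of triangles containing both i and j, for i < j."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    W = {}
    for u, v in edges:
        i, j = min(u, v), max(u, v)
        common = len(adj[i] & adj[j])    # each common neighbor closes a triangle
        if common:
            W[(i, j)] = common
    return W


def weighted_conductance(W, S):
    """Ordinary conductance of S in the weighted graph W."""
    S = set(S)
    cut = vol_s = vol_rest = 0
    for (i, j), w in W.items():
        if (i in S) != (j in S):
            cut += w
        vol_s += w * ((i in S) + (j in S))
        vol_rest += w * ((i not in S) + (j not in S))
    return cut / min(vol_s, vol_rest)


# Triangles (0,1,2), (1,2,3), (3,4,5), plus a pendant edge (2, 6) in no triangle.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (3, 4), (3, 5), (4, 5), (2, 6)]
W = triangle_weights(edges)
phi_w = weighted_conductance(W, {0, 1, 2})
```

Here the pair (1, 2) gets weight 2 because it lies in two triangles, and the pendant edge (2, 6) drops out. For S = {0, 1, 2} the weighted cut is 2 and the weighted volumes are 10 and 8, each exactly twice the triangle cut (1) and the triangle volumes (5 and 4), so the weighted conductance 1/4 equals the triangle conductance of the same set.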
  • 22. A motif-based clustering algorithm. 1.  Form the weighted graph $W^{(M)}$. 2.  Compute the Fiedler vector $f^{(M)}$ associated with $\lambda_2$ of the motif-normalized Laplacian. 3.  Run a (motif-conductance) sweep cut on $f^{(M)}$. Definitions: $D = \mathrm{diag}(W^{(M)} e)$, $L^{(M)} = D^{-1/2} (D - W^{(M)}) D^{-1/2}$, $L^{(M)} z = \lambda_2 z$, $f^{(M)} = D^{-1/2} z$. [Figure: the 12-node example graph and its weighted graph $W^{(M)}$.]
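Step 2 can be sketched without a linear algebra library. The code below swaps in plain power iteration on $2I - L^{(M)}$ (with the trivial eigenvector $D^{1/2} e$ projected out) for a sparse eigensolver such as ARPACK. It is a minimal illustration that assumes a connected weighted graph with no isolated nodes and a clear gap between $\lambda_2$ and $\lambda_3$; the function name, iteration count, and start vector are arbitrary choices.

```python
import math


def fiedler_order(W, n, iters=500):
    """Order nodes 0..n-1 by f = D^{-1/2} z, where z is the eigenvector for
    lambda_2 of L = I - D^{-1/2} W D^{-1/2}. W maps (i, j), i < j, to a weight."""
    deg = [0.0] * n
    for (i, j), w in W.items():
        deg[i] += w
        deg[j] += w
    s = [math.sqrt(d) for d in deg]                  # diagonal of D^{1/2}
    s_norm = math.sqrt(sum(x * x for x in s))
    u = [x / s_norm for x in s]                      # eigenvector of L for eigenvalue 0

    def apply_shifted(z):
        # y = (2I - L) z = z + D^{-1/2} W D^{-1/2} z
        y = list(z)
        for (i, j), w in W.items():
            c = w / (s[i] * s[j])
            y[i] += c * z[j]
            y[j] += c * z[i]
        return y

    z = [float(i + 1) for i in range(n)]             # arbitrary start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(z, u))
        z = [a - dot * b for a, b in zip(z, u)]      # remove the trivial direction
        z = apply_shifted(z)
        norm = math.sqrt(sum(a * a for a in z))
        z = [a / norm for a in z]
    f = [z[i] / s[i] for i in range(n)]
    return sorted(range(n), key=lambda i: f[i])


# Unit-weight graph: two triangles joined by the bridge (2, 3).
W = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1, (3, 4): 1, (3, 5): 1, (4, 5): 1}
order = fiedler_order(W, 6)
```

On this toy graph the ordering places one triangle before the other, which is what the sweep cut in step 3 needs. For real data a sparse eigensolver is the better choice, since power iteration slows down when $\lambda_2$ and $\lambda_3$ are close.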
  • 23. The sweep cut results. [Plot: motif conductance along the sweep, with vertices in the order from the Fiedler vector. Highlighted node groups {0, 1, 2, 3, 4} and {6, 9, 10} mark the best and 2nd best higher-order clusters.]
  • 24. The motif-based Cheeger inequality. THEOREM: If the motif has three nodes, then the sweep procedure on the weighted graph finds a set S of nodes for which $\phi_M(S) \le 4 \sqrt{\phi_M^*}$. THEOREM: For more than 4 nodes, we use a slightly altered conductance. Key proof step: $\mathrm{cut}_M(S, G) = \sum_{\{i,j,k\} \in M(G)} \mathrm{Indicator}[x_i, x_j, x_k \text{ not the same}]$, which is quadratic in $x$, where $M(G) = \{\text{instances of } M \text{ in } G\}$.
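The "quadratic in x" claim can be written out for a three-node motif. This is a sketch under an assumption the slide does not state: $x_i = +1$ for $i \in S$ and $x_i = -1$ otherwise. If the three values are not all the same, exactly two of the three pairs differ, and each differing pair contributes $(\pm 2)^2 = 4$.

```latex
\begin{align*}
\mathrm{Indicator}[x_i, x_j, x_k \text{ not the same}]
  &= \tfrac{1}{8}\left[(x_i - x_j)^2 + (x_j - x_k)^2 + (x_i - x_k)^2\right], \\
\mathrm{cut}_M(S, G)
  &= \tfrac{1}{8} \sum_{\{i,j,k\} \in M(G)}
     \left[(x_i - x_j)^2 + (x_j - x_k)^2 + (x_i - x_k)^2\right] \\
  &= \tfrac{1}{8} \sum_{i < j} W^{(M)}_{ij} (x_i - x_j)^2
   = \tfrac{1}{8}\, x^T \left(D - W^{(M)}\right) x .
\end{align*}
```

The regrouping in the last line uses the fact that a pair $(i, j)$ appears once for every motif instance containing both nodes, which is exactly $W^{(M)}_{ij}$. Under the same encoding the ordinary cut of the weighted graph is $\tfrac{1}{4} x^T (D - W^{(M)}) x$, so the motif cut is half of the weighted cut.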
  • 25. Awesome advantages. We inherit 40+ years of research! •  Fast algorithms (ARPACK, etc.) •  Local methods •  Overlapping •  Easy to implement (20 lines of Matlab/Julia) •  Scalable (1.4B-edge graphs are not a problem). Example listing (motif_example, published with MATLAB R2015a):
      function [S, conductances] = MotifClusterM36(A)
      B = spones(A & A');   % bidirectional links
      U = A - B;            % unidirectional links
      W = (B * U') .* U' + (U * B) .* U + (U' * U) .* B;   % Motif M_3^6
      D = diag(sum(W));     % diagonal matrix of weighted degrees
      Ln = speye(size(W, 1)) - sqrt(D)^(-1) * W * sqrt(D)^(-1);   % normalized Laplacian
      [Z, ~] = eigs(Ln, 2, 'sm');                   % two smallest-magnitude eigenpairs
      [~, order] = sort(sqrt(D)^(-1) * Z(:, 2));    % sweep order from D^(-1/2) z
      n = size(W, 1);       % number of nodes
      conductances = zeros(n, 1);
      x = zeros(n, 1);      % indicator vector of the current sweep set
      for i = 1:n
          x(order(i)) = 1;
          xn = ~x + 0;      % indicator vector of the complement
          conductances(i) = x' * (D - W) * x / min(x' * D * x, xn' * D * xn);
      end
      [~, split] = min(conductances);
      S = order(1:split);
  • 26. Case studies. An intro note! 1.  Aquatic layers in food webs; signed patterns in regulatory networks 2.  Control structures in neural networks 3.  Hub structure in transportation networks 4.  Scaling and large data
  • 27. NOTE: The partition depends on the motif. [Figure: the same 12-node graph (nodes 1 to 12) shown twice, partitioned under two different motifs.]
  • 28. Case study 1: Motifs partition the food webs. Food webs model energy exchange in species of an ecosystem. i -> j means i’s energy goes to j (or j eats i). Via Cheeger, motif conductance is better than edge conductance.
  • 30. Case study 1: Motifs partition the food webs. Layers: micronutrient sources; benthic fishes; benthic macroinvertebrates; pelagic fishes and benthic prey. Motif M6 reveals aquatic layers: 84% accuracy vs. 69% for other methods.
  • 31. Case study 2: Nictation control in a neural network. Nictation: standing on a tail and waving. [Figure from "Nictation, a dispersal behavior of the nematode Caenorhabditis elegans, is regulated by IL2 neurons", Lee et al., Nature Neuroscience; panels A, B, C.] We find the control mechanism that explains this based on the bi-fan motif (Milo et al. found it over-expressed).
  • 32. Case study 3: Rich structure beyond clusters. North American air transport network. Nodes are airports. Edges reflect reachability, and are unweighted. (Based on Frey et al., 2007.)
  • 33. We can use complex motifs with non-anchored nodes. [Figure: motif on nodes A, B, C, D.] Counts length-two walks.
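As a rough illustration of "counts length-two walks", the sketch below weights each pair of distinct nodes by the number of length-two walks between them, with the middle node acting as the non-anchored node. It assumes an undirected simple graph on nodes 0..n-1, which is a simplification and may not match the exact motif used in the talk; the function name is hypothetical.

```python
def two_walk_counts(edges, n):
    """counts[i][j] = number of length-two walks i - k - j with i != j."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    counts = [[0] * n for _ in range(n)]
    for k in range(n):                   # k is the middle (non-anchored) node
        for i in adj[k]:
            for j in adj[k]:
                if i != j:
                    counts[i][j] += 1
    return counts


# Two "hub" nodes 0 and 1, each connected to both "spoke" nodes 2 and 3.
edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
counts = two_walk_counts(edges, 4)
```

The two spokes are not adjacent, yet they receive weight 2 because they share both hubs; directly connected pairs such as (0, 2) receive weight 0 here.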
  • 34. The weighting alone reveals hub-like structure.
  • 35. The motif embedding shows this structure and splits into east-west. [Figure panels: MOTIF SPECTRAL EMBEDDING; EDGE SPECTRAL EMBEDDING. Legend: top 10 U.S. hubs, East coast non-hubs, West coast non-hubs. Axis: primary spectral coordinate.] Atlanta, the top hub, is next to Salina, a non-hub.
  • 36. Case study 4: Large scale stuff. The up-linked triangle finds an anomalous cluster in Twitter. [Figure: anomalous cluster in the 1.4B edge Twitter graph.] All nodes are holding accounts for a company, and the orange nodes have incomplete profiles.
  • 37. Related work. •  The Laplacian we propose was originally proposed by Rodríguez [2004] and again by Zhou et al. [2006]. Our new theory (motif Cheeger inequality) explains why these were good ideas. •  Falls under the general strategy of encoding a hypergraph partitioning problem as a graph clustering problem [Agarwal+ 06]. •  Serrour, Arenas, and Gómez, Detecting communities of triangles in complex networks using spectral optimization, 2011. •  Arenas et al., Motif-based communities in complex networks, 2008.
  • 38. Paper: Benson, Gleich, Leskovec, Science, 2016. 1.  A generalized conductance metric for motifs 2.  A new spectral clustering algorithm to minimize the generalized conductance. 3.  AND an associated Cheeger inequality. 4.  Aquatic layers in food webs 5.  Control structures in neural networks 6.  Hub structure in transportation networks 7.  Anomaly detection in Twitter 8.  Lots of cool stuff on signed networks. Thank you! Joint work with Austin Benson and Jure Leskovec, Stanford. Supported by NSF CAREER CCF-1149756, IIS-1422918, IIS-, DARPA SIMPLEX. [Figure: the 12-node example graph.]