Distance Measures for Dynamic Citation Networks

                        M. Bommarito                 D. Katz          J. Zelner           J. Fowler


                                                     May 21, 2010




M. Bommarito, D. Katz, J. Zelner, J. Fowler Distance Measures for Dynamic Citation Networks
                                            ()                                                    May 21, 2010   1 / 21
Outline

    1   Goals
          Supreme Court Citation Network

    2   Citation Dynamics and Sinks

    3   Distance Measures for Dynamic Citation Networks

    4   How does the “sink” method perform?
          Simulation Results
          United States Supreme Court

    5   Conclusion and Future Directions


Goals    Supreme Court Citation Network


   Goals & Data

   Goal: Can we uncover various mesoscopic patterns within the
   jurisprudence of the United States Supreme Court?
       1   Network size: |V| ≈ 36k vertices, |E| ≈ 280k citation edges

       2   Period covered: 1791-2005






   Standard Solution

   Standard Solution: Obtain vertex community membership by
   applying an out-of-the-box community detection method.
   Methods:
       1   Edge-Betweenness (Girvan & Newman 2002)
       2   Fast-Greedy (Clauset et al. 2004)
       3   Leading (or more) Eigenvector (Newman 2006, Richardson et al.
           2009)
       4   Walktrap (Pons & Latapy 2006)






   Expectations

   Expectation: Dyadic relationships should be fairly stable.

   If two vertices are in the same community m at t, they should be in the
   same community n (not necessarily identical to m) at t + 1.

   Formally, this can be written as “pairwise stability” σ:

   $$\sigma = P\left(C_i^{t+1} = C_j^{t+1} \mid C_i^{t} = C_j^{t}\right)$$

   $C_i^{t}$: community membership of vertex $i$ at time $t$

   This conception of stability avoids many issues with community tracking.
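
   A minimal sketch of how σ could be estimated between two consecutive
   time steps. The function name and the dict-based membership format are
   assumptions made for illustration, not the authors' implementation. It
   counts the vertex pairs that share a community at t and reports the
   fraction that still share one at t + 1, without ever matching community
   labels across time steps.

```python
from itertools import combinations

def pairwise_stability(membership_t, membership_t1):
    """Estimate sigma: among vertex pairs that share a community at t,
    the fraction that also share a community at t+1.
    Each argument maps vertex -> community label (hypothetical format)."""
    together_t = 0
    together_both = 0
    # Only vertices present at both time steps can be compared.
    common = sorted(set(membership_t) & set(membership_t1))
    for i, j in combinations(common, 2):
        if membership_t[i] == membership_t[j]:
            together_t += 1
            if membership_t1[i] == membership_t1[j]:
                together_both += 1
    # sigma is undefined when no pair shares a community at t.
    return together_both / together_t if together_t else float("nan")

# Labels differ across time steps (0/1 versus 5/7); only co-membership matters.
membership_t = {"a": 0, "b": 0, "c": 0, "d": 1}
membership_t1 = {"a": 5, "b": 5, "c": 7, "d": 7, "e": 7}
sigma = pairwise_stability(membership_t, membership_t1)
```

   In this toy example three pairs share a community at t and only (a, b)
   stays together, so the estimate is 1/3.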






   Results

   [Figure: community detection results, panels Fast-Greedy and Eigenvector]




   The results of these approaches do not match our expectation.






   Research Source

   Title: On the Stability of Community Detection Algorithms on
   Longitudinal Citation Data.
   Michael J. Bommarito II, Daniel M. Katz, Jonathan L. Zelner.
   Forthcoming in Proceedings of ASNA 2009 (ETH-Zurich).


   Goal: Compare out-of-the-box community detection methods under
   different parameters of a citation model w.r.t.:
       1   Average number of resulting communities across all time steps
       2   Average pairwise stability of all vertex pairs across all time steps






   Results






   Implications

   Citation networks are different.

       1   Patterns within citation networks are not well-revealed by these
           methods.
       2   Qualitative conclusions may vary dramatically based on the chosen
           method.
       3   The “appropriateness” of each method may depend on parameters of
           the generating process.




Citation Dynamics and Sinks


   Citation Dynamics


   What are the basic growth rules of a citation network?
       1   Documents and their citations are introduced into the network in
           sequence.
       2   Documents cannot create new outbound citations after introduction.

   These rules guarantee that any resulting network is an acyclic digraph:
   every citation arc points from a later document to an earlier one.
   The simplest topological ordering is just the order of vertex introduction.
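
   The two growth rules can be sketched as follows. The function names and
   the dict representation (document -> documents it cites) are assumptions
   made for illustration. The check relies on the fact that a Python dict
   preserves insertion order, so the dict itself records the order of
   introduction.

```python
def add_document(citations, doc_id, cited_ids):
    """Introduce a document together with its outbound citations."""
    # Rule 2: a document cannot create new outbound citations later.
    if doc_id in citations:
        raise ValueError("outbound citations are fixed after introduction")
    # Rule 1: documents arrive in sequence, so only earlier ones can be cited.
    unknown = [c for c in cited_ids if c not in citations]
    if unknown:
        raise ValueError(f"cannot cite documents not yet introduced: {unknown}")
    citations[doc_id] = tuple(cited_ids)

def arcs_point_backward(citations):
    """True if every arc runs from a later document to an earlier one."""
    position = {doc: k for k, doc in enumerate(citations)}
    return all(position[c] < position[d]
               for d, cited in citations.items() for c in cited)

citations = {}
add_document(citations, "A", [])
add_document(citations, "B", ["A"])
add_document(citations, "C", ["A", "B"])
is_acyclic_by_order = arcs_point_backward(citations)
```

   Because every arc points backward in the introduction order, no directed
   cycle can exist, which is the acyclicity guarantee stated above.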






   Dynamic Acyclic Digraphs


   What properties do we have?

       1   Each component has at least one “sink” and one “source.”
       2   Sinks are vertices with zero out-degree. The first vertex in the
           ordering by introduction must be a sink, since it has nothing to cite.
       3   Sources are vertices with zero in-degree. The last vertex in the
           ordering by introduction must be a source, since nothing can cite it yet.
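
   As a small illustration of these definitions, sinks and sources can be
   read directly off an adjacency mapping. The mapping format (citing
   document -> cited documents) and the toy network are assumptions.

```python
def sinks_and_sources(citations):
    """citations maps each document to the documents it cites."""
    cited = {c for targets in citations.values() for c in targets}
    # Sinks cite nothing (zero out-degree).
    sinks = [d for d, targets in citations.items() if not targets]
    # Sources are cited by nothing (zero in-degree).
    sources = [d for d in citations if d not in cited]
    return sinks, sources

# Documents listed in order of introduction.
citations = {"A": [], "B": ["A"], "C": [], "D": ["B", "C"], "E": ["C"]}
sinks, sources = sinks_and_sources(citations)
```

   Here the first document introduced (A) is a sink and the last one (E) is
   a source, as the properties above require.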






   Sinks

   If sinks have zero out-degree, they must represent the point at
   which at least one idea is introduced into the network.

   Either the document “invents” the idea or the head of the citation arc was
   not sampled in the dataset.

   Weak vs. Strong - Dimensional Data can help identify Weak Sinks






   Six Degrees of Marbury v. Madison




Distance Measures for Dynamic Citation Networks


   Basic Idea of the Distance Measure


   If two vertices share more “ideas,” they should be more similar.

   Alternative Example: Articles in Political Science
       1   American Politics
       2   Congress
       3   Committee Assignments
       4   Formal Theory

   We want to be able to use clustering methods, so we then construct a
   distance measure from this basic premise.






   A Simple Distance Measure


   Simplest Distance Measure: Proportion of Possibly Shared Ideas

   $$D_{i,j} = 1 - \frac{|S_i \cap S_j|}{|S_i \cup S_j|}$$

   $S_i$: the set of sink vertex IDs for vertex $i$

   Note that this is only one way to translate from similarity to distance.

   Also note that the distance between vertices i and j doesn’t change over
   time: a document cannot add outbound citations after introduction, so its
   sink set is fixed.
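
   A minimal sketch of this measure, assuming that the sink set of a vertex
   is the set of sinks reachable from it by following citations, and that a
   sink's own sink set contains only itself. Both conventions, and the
   function names, are assumptions made for illustration.

```python
def sink_sets(citations):
    """Map each document to the frozenset of sinks reachable from it.
    Assumes `citations` lists documents in order of introduction, so every
    cited document is processed before the documents that cite it."""
    sinks = {}
    for doc, cited in citations.items():
        if not cited:
            sinks[doc] = frozenset([doc])  # assumption: a sink is its own sink set
        else:
            sinks[doc] = frozenset().union(*(sinks[c] for c in cited))
    return sinks

def sink_distance(s_i, s_j):
    """D_ij = 1 - |S_i & S_j| / |S_i | S_j| (one minus the Jaccard index)."""
    return 1 - len(s_i & s_j) / len(s_i | s_j)

citations = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["B"]}
S = sink_sets(citations)
d_cd = sink_distance(S["C"], S["D"])  # share sink A out of {A, B}
d_ce = sink_distance(S["C"], S["E"])  # no shared sinks
```

   Since each sink set is computed once, when the document is introduced,
   the resulting distances stay constant as the network grows.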






   Flexible Framework for More Detailed Specifications

   What if the story is more complicated?

       1   Minimum path length to a sink
       2   Number of paths to a sink
       3   Total number of shared ancestors
       4   Total elapsed time along path

   Example with arbitrary f for path length and number of shared
   ancestors:

   $$D_{i,j} = 1 - \frac{\sum_{s \in S_i \cap S_j} f(A_{i,s}, P_{i,s}, A_{j,s}, P_{j,s})}{\sum_{s \in S_i \cup S_j} f(A_{i,s}, P_{i,s}, A_{j,s}, P_{j,s})}$$
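
   A sketch of how this weighted form might be computed, under several
   assumptions: P is taken to be the minimum path length to a sink, the A
   terms are passed as None and ignored, a sink that is unreachable from a
   vertex gets path length infinity, and the weight function example_f is
   hypothetical. With a constant f, the expression reduces to the simple
   measure based on set sizes.

```python
import math

def min_path_lengths(citations):
    """Map each document to {sink: shortest citation-path length to it}.
    Assumes documents are listed in order of introduction."""
    dist = {}
    for doc, cited in citations.items():
        if not cited:
            dist[doc] = {doc: 0}
            continue
        best = {}
        for c in cited:
            for s, d in dist[c].items():
                best[s] = min(best.get(s, math.inf), d + 1)
        dist[doc] = best
    return dist

def example_f(a_i, p_i, a_j, p_j):
    """Hypothetical weight: sinks close to either document count more.
    The A terms are ignored in this sketch."""
    return 1.0 / (1.0 + min(p_i, p_j))

def weighted_sink_distance(dist, i, j, f=example_f):
    shared = dist[i].keys() & dist[j].keys()
    union = dist[i].keys() | dist[j].keys()

    def term(s):
        # Assumption: an unreachable sink has path length infinity.
        return f(None, dist[i].get(s, math.inf), None, dist[j].get(s, math.inf))

    return 1 - sum(term(s) for s in shared) / sum(term(s) for s in union)

citations = {"A": [], "B": [], "C": ["A"], "D": ["C", "B"], "E": ["B"]}
dist = min_path_lengths(citations)
d_weighted = weighted_sink_distance(dist, "D", "E")
d_constant = weighted_sink_distance(dist, "D", "E", f=lambda a_i, p_i, a_j, p_j: 1.0)
```

   Any non-negative f keeps the result between 0 and 1, because the sum over
   shared sinks can never exceed the sum over all sinks of either vertex.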




How does the “sink” method perform?     Simulation Results


   Simulation
       1   Directed
       2   Two vertex types
       3   Asymmetric vertex connection probabilities
       4   Preferential attachment mechanism (Two-Dimensional)
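
   One way a generator with these four properties could look is sketched
   below. This is not the authors' simulation: the function name, the
   parameters, the affinity matrix, and the reading of “two-dimensional”
   preferential attachment as popularity (in-degree + 1) multiplied by a
   type-to-type affinity are all assumptions.

```python
import random

def simulate_citations(n, cites_per_doc, affinity, seed=0):
    """Grow a directed citation network with two vertex types.
    affinity[a][b] is the hypothetical relative propensity of a new type-a
    document to cite an existing type-b document; it need not be symmetric."""
    rng = random.Random(seed)
    types, in_degree, citations = [], [], {}
    for v in range(n):
        t = rng.randrange(2)
        # Attachment weight: popularity times type affinity (assumed form).
        weights = [(in_degree[u] + 1) * affinity[t][types[u]] for u in range(v)]
        targets = set()
        if v > 0 and sum(weights) > 0:
            # Never ask for more distinct targets than are eligible.
            k = min(cites_per_doc, sum(w > 0 for w in weights))
            while len(targets) < k:
                targets.add(rng.choices(range(v), weights=weights)[0])
        for u in targets:
            in_degree[u] += 1
        types.append(t)
        in_degree.append(0)
        citations[v] = sorted(targets)
    return types, citations

# Asymmetric affinities: type 0 occasionally cites type 1, never the reverse.
types, citations = simulate_citations(
    n=50, cites_per_doc=3, affinity=[[1.0, 0.2], [0.0, 1.0]], seed=1
)
```

   Every new vertex cites only vertices that already exist, so the output
   obeys the growth rules of a citation network and is acyclic.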






   Simulation Results




How does the “sink” method perform?     United States Supreme Court


   United States Supreme Court




   Movie available @ computationallegalstudies.com




   The Early Years of the United States Supreme Court


   Supreme Court Results Using the Sink Method




Conclusion and Future Directions


   Conclusion

       1   There are issues with existing community detection methods in
           dynamic citation networks.

       2   Our sink-based method provides more reasonable qualitative results
           than other methods we’ve tried.

       3   Future work: apply the method to a larger segment of the SCOTUS
           data, together with a qualitative strategy designed to evaluate
           the outputs.





Sinks Method Paper Presentation @ Duke Political Networks Conference 2010
 

Sinks Method Paper Presentation @ Duke Political Networks Conference 2010

  • 1. Distance Measures for Dynamic Citation Networks M. Bommarito D. Katz J. Zelner J. Fowler May 21, 2010 M. Bommarito, D. Katz, J. Zelner, J. Fowler Distance Measures for Dynamic Citation Networks () May 21, 2010 1 / 21
  • 2. Outline: (1) Goals: Supreme Court Citation Network; (2) Citation Dynamics and Sinks; (3) Distance Measures for Dynamic Citation Networks; (4) How does the “sink” method perform? Simulation Results; United States Supreme Court; (5) Conclusion and Future Directions
  • 3. Goals & Data. Goal: Can we uncover various mesoscopic patterns within the jurisprudence of the United States Supreme Court? Data: (1) |V| ≈ 36k, |E| ≈ 280k; (2) 1791-2005
  • 4. Standard Solution: Obtain vertex community membership by applying an out-of-the-box community detection method. Methods: (1) Edge-Betweenness (Girvan & Newman 2002); (2) Fast-Greedy (Clauset et al. 2004); (3) Leading (or more) Eigenvector (Newman 2006, Richardson et al. 2009); (4) Walktrap (Pons & Latapy 2006)
  • 5. Expectations. Expectation: Dyadic relationships should be fairly stable. If two vertices are in the same community m at t, they should be in the same community n (not necessarily identical to m) at t + 1. Formally, this can be written as “pairwise stability” σ: $\sigma = P(C_i^{t+1} = C_j^{t+1} \mid C_i^{t} = C_j^{t})$, where $C_i^{t}$ is the community membership of vertex i at time t. This conception of stability avoids many issues with community tracking.
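The pairwise stability σ on slide 5 can be estimated from two consecutive community assignments. The following is a minimal sketch, not the authors' code: the function name `pairwise_stability`, the dict-based input format, and the toy labels are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_stability(prev, curr):
    """Estimate sigma = P(same community at t+1 | same community at t).

    prev, curr: dicts mapping vertex -> community label at t and t+1
    (hypothetical input format). Labels need not match across time steps;
    only co-membership of each pair is compared.
    Returns None when no pair shares a community at time t.
    """
    shared = sorted(set(prev) & set(curr))  # vertices present at both steps
    together_before = 0
    still_together = 0
    for i, j in combinations(shared, 2):
        if prev[i] == prev[j]:
            together_before += 1
            if curr[i] == curr[j]:
                still_together += 1
    if together_before == 0:
        return None
    return still_together / together_before


# Toy assignments (made up): vertex "e" enters the network at t+1.
membership_t = {"a": 0, "b": 0, "c": 0, "d": 1}
membership_t1 = {"a": 5, "b": 5, "c": 7, "d": 7, "e": 7}
sigma = pairwise_stability(membership_t, membership_t1)
```

Because only co-membership is compared, the community labels can be renumbered freely between time steps, which is how this notion of stability sidesteps community tracking.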
  • 6. Results [figure panels: Fast-Greedy, Eigenvector]. The results of these approaches do not match our expectation.
  • 7. Research Source. Title: On the Stability of Community Detection Algorithms on Longitudinal Citation Data. Michael J. Bommarito II, Daniel M. Katz, Jonathan L. Zelner. Forthcoming in Proceedings of ASNA 2009 (ETH-Zurich). Goal: Compare out-of-the-box community detection methods under different parameters of a citation model w.r.t.: (1) average number of resulting communities across all time steps; (2) average pairwise stability of all vertex pairs across all time steps.
  • 8. Results [figure only]
  • 9. Implications: Citation networks are different. (1) Patterns within citation networks are not well-revealed by these methods. (2) Qualitative conclusions may vary dramatically based on the chosen method. (3) The “appropriateness” of each method may depend on parameters of the generating process.
  • 10. Citation Dynamics: What are the basic growth rules of a citation network? (1) Documents and their citations are introduced into the network in sequence. (2) Documents cannot create new outbound citations after introduction. These rules guarantee that any resulting network is an acyclic digraph. The simplest topological ordering is just the order of vertex introduction.
  • 11. Dynamic Acyclic Digraphs: What properties do we have? (1) Each component has at least one “sink” and one “source.” (2) Sinks are vertices with zero out-degree. The first vertex in a topological ordering must be a sink. (3) Sources are vertices with zero in-degree. The last vertex in a topological ordering must be a source.
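The growth rules and the sink/source definitions on slides 10 and 11 can be illustrated on a toy network. This is only a sketch under stated assumptions: the adjacency-dict representation, the function name `sinks_and_sources`, and the five-document network are hypothetical and do not come from the talk.

```python
def sinks_and_sources(citations):
    """Return (sinks, sources) for a citation digraph.

    citations: dict mapping each document to the list of documents it cites
    (hypothetical representation; arcs point from citing to cited document).
    Sinks have zero out-degree; sources have zero in-degree.
    """
    cited = {target for targets in citations.values() for target in targets}
    vertices = set(citations) | cited
    sinks = {v for v in vertices if not citations.get(v)}
    sources = vertices - cited
    return sinks, sources


# Toy network (made up). Keys are listed in order of introduction and each
# document cites only earlier documents, so the digraph is acyclic.
toy = {
    "A": [],
    "B": [],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C", "D"],
}
sinks, sources = sinks_and_sources(toy)
order = list(toy)  # introduction order doubles as the ordering from slide 10
```

In this toy network the first document introduced cites nothing and the last one is cited by nothing, matching properties (2) and (3).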
  • 12. Sinks: If sinks have zero out-degree, they must represent the point at which at least one idea is introduced into the network. Either the document “invents” the idea or the head of the citation arc was not sampled in the dataset. Weak vs. Strong: Dimensional Data can help identify Weak Sinks.
  • 13. Six Degrees of Marbury v. Madison [figure only]
  • 14. Basic Idea of the Distance Measure: If two vertices share more “ideas,” they should be more similar. Alternative Example: Articles in Political Science: (1) American Politics; (2) Congress; (3) Committee Assignments; (4) Formal Theory. We want to be able to use clustering methods, so we then construct a distance measure from this basic premise.
  • 15. A Simple Distance Measure. Simplest Distance Measure: Proportion of Possibly Shared Ideas: $D_{i,j} = 1 - \frac{|S_i \cap S_j|}{|S_i \cup S_j|}$, where $S_i$ is the set of sink vertex IDs for vertex i. Note that this is only one way to translate from similarity to distance. Also note that the distance between vertices i and j doesn’t change over time.
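The simple distance on slide 15 is one minus the Jaccard similarity of two sink sets. Below is a minimal sketch of how it could be computed. It assumes that the sink set of a vertex is the set of sinks reachable from it along citation arcs and that a sink's own sink set contains only itself; neither detail is spelled out on the slide. The function names and the toy network are hypothetical.

```python
def sink_sets(citations):
    """Map each document to the frozenset of sinks it can reach.

    citations: dict mapping document -> list of cited documents.
    Assumption: keys appear in order of introduction and every cited
    document is an earlier key, so one forward pass is enough.
    Assumption: a sink's sink set is just {itself}.
    """
    sinks_of = {}
    for doc, cited in citations.items():
        if not cited:
            sinks_of[doc] = frozenset([doc])
        else:
            # Union of the sink sets of everything this document cites.
            sinks_of[doc] = frozenset().union(*(sinks_of[c] for c in cited))
    return sinks_of


def sink_distance(sinks_i, sinks_j):
    """D_ij = 1 - |S_i & S_j| / |S_i | S_j| (one minus Jaccard similarity)."""
    return 1 - len(sinks_i & sinks_j) / len(sinks_i | sinks_j)


# Toy network (made up), keys in order of introduction.
toy = {
    "A": [],
    "B": [],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C", "D"],
}
S = sink_sets(toy)
d_cd = sink_distance(S["C"], S["D"])  # C reaches {A}, D reaches {A, B}
d_de = sink_distance(S["D"], S["E"])  # both reach {A, B}
d_ab = sink_distance(S["A"], S["B"])  # two distinct sinks share nothing
```

Since a document cannot add outbound citations after it is introduced, its sink set is fixed at introduction, so appending later documents to `toy` leaves all existing pairwise distances unchanged.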
  • 16. Flexible Framework for More Detailed Specifications: What if the story is more complicated? (1) Minimum path length to a sink; (2) Number of paths to a sink; (3) Total number of shared ancestors; (4) Total elapsed time along path. Example with arbitrary f for path length and number of shared ancestors: $D_{i,j} = 1 - \frac{\sum_{s \in S_i \cap S_j} f(A_{i,s}, P_{i,s}, A_{j,s}, P_{j,s})}{\sum_{s \in S_i \cup S_j} f(A_{i,s}, P_{i,s}, A_{j,s}, P_{j,s})}$
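Slide 16 leaves f arbitrary, so the following is only one hypothetical instance of the framework, not the authors' specification. It uses minimum path length alone (the ancestor terms are omitted), and the weight 1 / (1 + P), with weight 0 for an unreachable sink, is an assumption chosen purely for illustration. All names are invented for this sketch.

```python
def min_path_lengths(citations):
    """Map each document to {sink: minimum path length to that sink}.

    Assumption: keys are in order of introduction and every cited document
    is an earlier key. A sink is at distance 0 from itself (assumption).
    """
    lengths = {}
    for doc, cited in citations.items():
        if not cited:
            lengths[doc] = {doc: 0}
            continue
        best = {}
        for c in cited:
            for sink, d in lengths[c].items():
                if sink not in best or d + 1 < best[sink]:
                    best[sink] = d + 1
        lengths[doc] = best
    return lengths


def weight(path_length):
    """Hypothetical weight: closer sinks count more; unreachable counts 0."""
    return 0.0 if path_length is None else 1.0 / (1.0 + path_length)


def weighted_sink_distance(p_i, p_j):
    """1 - (sum of f over shared sinks) / (sum of f over all sinks of i or j),
    with f = weight(P_i) + weight(P_j) as an illustrative choice of f."""
    def f(sink):
        return weight(p_i.get(sink)) + weight(p_j.get(sink))

    shared = set(p_i) & set(p_j)
    union = set(p_i) | set(p_j)
    return 1 - sum(f(s) for s in shared) / sum(f(s) for s in union)


# Toy network (made up), keys in order of introduction.
toy = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["C", "D"]}
P = min_path_lengths(toy)
d_cd = weighted_sink_distance(P["C"], P["D"])
d_de = weighted_sink_distance(P["D"], P["E"])
```

With this choice of f the distance still lies between 0 and 1, and it reduces to 1 for two documents with no sink in common.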
  • 17. Simulation: (1) Directed; (2) Two vertex types; (3) Asymmetric vertex connection probabilities; (4) Preferential attachment mechanism (Two-Dimensional)
  • 18. Simulation Results [figure only]
  • 19. United States Supreme Court: Movie Available @ computationallegalstudies.com: The Early Years of the United States Supreme Court
  • 20. Supreme Court Results Using the Sink Method [figure only]
  • 21. Conclusion: (1) There are issues with existing community detection methods in dynamic citation networks. (2) Our sink-based method provides more reasonable qualitative results than other methods we’ve tried. (3) Future directions: application to a larger segment of the SCOTUS data, together with a qualitative strategy designed to evaluate the outputs.