This document provides an overview of legal analytics and empirical legal studies, comparing their methods and goals. Legal analytics uses machine learning and predictive modeling to predict outcomes, while empirical legal studies uses social science methods like regression analysis to determine the causal impact of policies. Both aim to improve legal rules and institutions but use different tools - prediction vs. causal inference. Prediction is important for tasks like litigation strategies, while causal inference is best for evaluating policy interventions. Recent work has shown growing interest in both rigorous predictive modeling of legal data as well as combining predictive and causal techniques.
Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Prediction Redux
1. legal analytics vs.
empirical legal studies
-or- causal inference vs prediction redux
daniel martin katz
blog | ComputationalLegalStudies.com
corp | LexPredict.com
page | DanielMartinKatz.com
edu | chicago-kent college of law
lab | theLawLab.com
2. ELS and Legal Analytics
Never the Twain
Shall Meet?
-OR-
Partners in the
Same Pursuit
3. I thought I might offer
a quick landscape
orientation regarding
terminology, methods, etc.
9. Tools:
Use Traditional
Social Science Methods
(typically econometric tools)
Instrumental Variables, Propensity Score Matching,
Rubin Causal Model, Regression Discontinuity,
Difference in Differences, etc.
22. Some would make the
epistemological /
methodological
case that prediction >
causal inference
23. Part of that case comes from
finance, trading, etc.
where causal inference tools
are generally not used
24. Andrew D. Martin, Kevin M. Quinn, Theodore W. Ruger & Pauline T. Kim,
Competing Approaches to Predicting Supreme Court Decision Making,
2 Perspectives on Politics 761 (2004).
“the best test of an explanatory theory is its
ability to predict future events. To the extent
that scholars in both disciplines (social
science and law) seek to explain court
behavior, they ought to test their theories
not only against cases already decided, but
against future outcomes as well.”
56. “…I study choice of law by
analyzing the nearly 1,000,000
contracts that have been disclosed
to the Securities and Exchange
Commission between 1996–2012.”
57. In this paper, we analyze over 4.5 million references to
U.S. Federal Acts and Agencies contained within these
10-K reports to build a mean-field measurement of
temperature and diversity in this regulatory ecosystem
58. There has also been
a significant amount
of commercial interest
linked to legal analytics
59. For example,
here are just a few
predictions
that lawyers are trying to
accomplish on a daily basis
64. #Predict Relevant Documents
Data Driven EDiscovery/Due Diligence
(Predictive Coding)
#Predict Case Outcomes
Data Driven Legal Underwriting
#Predict Rogue Behavior
Data Driven Compliance
#Predict Contract Terms/Outcomes
Data Driven Transactional Work
#Predict Regulatory Outcomes
Data Driven Lobbying, etc.
66. Not only law firms but also
the large enterprise clients …
67. “From settlement information and
contracts to sensitive client data
and beyond, Liberty Mutual
creates and stores ever-growing
volumes of unorganized data
across its worldwide offices and
databases.”
“I've seen a real transformation in
the legal department just having
that information visually
available.”
“The legal department is now
working predictive and
prescriptive analytics," i.e. ways
to analyze data that enable
forecasting for legal issues.”
70. “Now we have program managers,
data analysts, business analysts,
data scientists, operations
managers, I mean, we have a ton of
stuff. That's the key for me, is
thinking about the right people
doing the right tasks. That's the
people part. And then how they do
them, is the process, and then,
automating parts, is kind of that
next, final step.

And all of that is underpinned by
data. You can't do any
improvements unless you have
data. You can't automate unless
you have good data.”
71. Insofar as prediction is
the task in question …
#LegalTech #FinTech
#Fin(Legal)Tech
72. “The real roll-up of all this isn’t robot lawyers,
it’s financialization, with law becoming an
applied branch of finance and insurance.”
Daniel Martin Katz, professor, Illinois Tech’s Chicago Kent College of Law
http://www.ozy.com/fast-forward/why-artificial-intelligence-might-replace-your-lawyer/75435
84. Theodore W. Ruger, Pauline T. Kim,
Andrew D. Martin, Kevin M. Quinn
The Supreme Court Forecasting Project:
Legal and Political Science Approaches to
Predicting Supreme Court Decision Making
Columbia Law Review
October, 2004
112. “Software developers were asked on
two separate days to estimate the
completion time for a given task, the
hours they projected differed by 71%,
on average.
When pathologists made two
assessments of the severity of biopsy
results, the correlation between their
ratings was only .61 (out of a perfect
1.0), indicating that they made
inconsistent diagnoses quite frequently.
Judgments made by different people
are even more likely to diverge.”
114. Neil Gorsuch was #1
on our Fantasy
Platform 12 days after
Donald Trump was
elected President
(i.e. Nov 20)
(most pundits did not
identify him as a serious
candidate until
mid-January 2017)
118. Theodore W. Ruger, Pauline T. Kim,
Andrew D. Martin, Kevin M. Quinn
The Supreme Court Forecasting Project:
Legal and Political Science Approaches to
Predicting Supreme Court Decision Making
Columbia Law Review
October, 2004
119. Ruger et al. (2004)
relied upon
Breiman (1984)
(as partially shown below)
125. Random forest is an approach to
aggregate weak learners into
collective strong learners
(using a combo of bagging and random subspaces)
(think of it as crowd sourcing of models)
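The "bagging plus random subspaces" idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model used in the deck: the weak learners here are one-split decision stumps, whereas real random forests grow full decision trees and resample candidate features at every split. All function names (`train_stump`, `train_forest`, `predict`) and the "affirm"/"reverse" labels are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature_ids):
    """Weak learner: the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            left_vote = Counter(left).most_common(1)[0][0]
            right_vote = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_vote for y in left)
                      + sum(y != right_vote for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_vote, right_vote)
    if best is None:
        # No usable split in this sample: always predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]

def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Aggregate many weak learners, each trained on a different view of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bagging: resample the training cases with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        # Random subspace: each learner sees only a random subset of features.
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """The 'crowd' of models votes; the most common vote wins."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]
```

Any single stump may be poor because it saw a skewed resample or an uninformative feature; the majority vote is what turns the collection into a stronger learner.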
137. Our Model Against the Null Models
Some commentators had suggested using a heuristic rule of
‘always guess reverse’ as a baseline
(Null Model 1 ) the always guess Reverse model
Turns out it is a lousy
model prior to ~1950
Because the reversal rate
is not stable over time
138. Our Model Against the Null Models
What about a memory window that selects the most frequent
historical outcome?
(Null Model 2 ) memory window = inf
This is our model against Null Model 2
(Green = our model outperforms)
139. Our Model Against the Null Models
(Null Model 3 ) finite memory window = 10
We in-sample optimize using future information to select a
null model that is among the best performing of all null models.
As it is using in-sample info, this is a deeply unfair null.
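The three null models can be sketched as simple baselines and scored with a walk-forward back test. This is an illustrative sketch only: the function names are hypothetical, the window is counted in prior observations (the slides do not state the unit), and the default guess when no history exists is an assumption.

```python
from collections import Counter

def always_reverse(history):
    """Null Model 1: ignore history and always guess 'reverse'."""
    return "reverse"

def memory_window_guess(history, window=None):
    """Null Models 2 and 3: guess the most frequent outcome among the last
    `window` observations. window=None means unbounded memory (window = inf);
    otherwise window must be a positive integer."""
    recent = history if window is None else history[-window:]
    if not recent:
        return "reverse"  # assumption: default guess before any history exists
    return Counter(recent).most_common(1)[0][0]

def backtest(outcomes, guess):
    """Walk forward in time, predicting each outcome from earlier outcomes only."""
    hits = sum(guess(outcomes[:i]) == actual for i, actual in enumerate(outcomes))
    return hits / len(outcomes)
```

For example, `backtest(outcomes, lambda h: memory_window_guess(h, window=10))` scores a finite-window baseline. Picking the window by comparing such scores over the whole record is exactly the in-sample optimization described above: the choice of window uses information from the future, which is why that null is an unfairly strong one.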
140. Over the past century, we outperform
M=10 by nearly 5% and have
significant temporal stability at
the justice, case, and term levels
149. expert forecast + crowd forecast + algorithm forecast
→ ensemble method → ensemble model
learning problem is to discover how to blend streams of intelligence
via back testing we can learn the weights
to apply for particular problems
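One way to picture "learning the weights via back testing" is a small grid search over blending weights, scored against known outcomes. The deck does not say how the weights are actually learned, so everything here is an assumption made for illustration: the Brier score as the loss, the coarse grid, and the names `brier`, `blend`, and `learn_weights`.

```python
from itertools import product

def brier(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(outcomes)

def blend(streams, weights):
    """Weighted average of several forecast streams, case by case."""
    return [sum(w * p for w, p in zip(weights, case)) for case in zip(*streams)]

def learn_weights(streams, outcomes, steps=10):
    """Search non-negative weights that sum to 1 and keep the set with the
    lowest Brier score on the back-test data."""
    best_weights, best_score = None, float("inf")
    for combo in product(range(steps + 1), repeat=len(streams)):
        if sum(combo) != steps:
            continue  # keep only weight vectors on the simplex
        weights = [c / steps for c in combo]
        score = brier(blend(streams, weights), outcomes)
        if score < best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score
```

Here `streams` would be a list such as `[expert, crowd, algorithm]`, each a list of probabilities for the same historical cases. Because the grid includes putting all weight on one stream, the blended back-test score is never worse than the best individual stream; whether that advantage holds on future cases has to be checked out of sample.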
150. By the way, you
might ask why does
one care about
marginal improvements
in prediction ?
#Fin(Legal)Tech
151. Given our ability to offer
forecasts of judicial
outcomes, we wondered
if this information could
inform an event driven
trading strategy ?
152. Revise + Resubmit @
http://arxiv.org/abs/1508.05751
available at
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2649726
155. The Three Forms
of (Legal) Prediction
The ‘Empirical Turn’
in Legal Scholarship
The Diversity of Tasks
that Lawyers Undertake
Some Epistemological
Issues / Questions
The Other Type
of Work
That Lawyers Do
Quantitative
Legal
Prediction
160. Daniel Martin Katz
@computational
computationallegalstudies.com
lexpredict.com
danielmartinkatz.com
illinois tech - chicago kent college of law
thelawlab.com