SlideShare une entreprise Scribd logo
1  sur  83
Short URLs, Big Fun:
Understanding the World in Realtime


            Hilary Mason
         Chief Scientist, bitly

              @hmason
              h@bit.ly
http://www.pcworld.com/article/223409/move_over_dr_soong_
girls_can_build_android_apps_too.html




                        http://bit.ly/hOnbWg
[fireplace]
How do we change the world?
Can we understand the world, first?
Big Data
Data
10s of millions of URLs per day
100s of millions of clicks per day



10s of billions of URLs
encodes
{"g": "zalAU0",
 "i": "173.213.X.X",
"h": "zalAU0",
"l": "bitly",
"u": "http://www.amazon.com/Country-Life-Cal-Mag-
Potassium-Target-
Tablets/dp/B0001VUZ3A?SubscriptionId=AKIAJGA7
AAB6QE7WENSQ&tag=mycellrevi-
20&linkCode=sp1&camp=2025&creative=165953&cr
eativeASIN=B0001VUZ3A", "t": 1328266799,
"_id": "4f2bbe2f-0035d-063a1-3d1cf10a"}
decode
{"a": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
"c": "US",
"nk": 1,
"tz": "America/New_York",
"gr": "NY",
"g": "xNaZ9h",
"i": "98.118.X.X",
"h": "wXxuKW",
"k": "4eefe4be-003e4-X-X",
"l": "moma",
"al": "en-US", "hh":
"bit.ly",
"r":
"http://www.facebook.com/l.php?u=http%3A%2F%2Fbit.ly%2FwXxuKW&h=rAQG_ZZ2G
AQGQth1IOej-9_KHmbpEvh6FlllZjsAqg6A7Rw",
"u": "http://www.brainpickings.org/index.php/2012/02/02/jackson-pollock-father-letter/",
"t": 1328272481,
"hc": 1328232072,
"cy": "East Amherst",
"ll": [43.044101715087891, -78.694900512695312]}
a link
•   URL
•   Content
•   Ref distribution
•   Geo distribution
•   Language
•   Key phrases
•   Topic
Data Science?




Analytics   Science
Data Science?




Things you can                   Things you
just count.                      can’t.
Data scientists?

     engineering
                                math

                     nerds


           nerds               nerds



                     nerds
comp sci
                             hacking




                   awesome nerds
bitly science team!
What can we learn from a lot of
 people talking to each other?
A few things that we can count...
How do people use different
        devices?
What happens on the internet when
       society isn’t stable?
Revolution.
(Silly Things on the Internet)
the cutest kitten
A few things that we can count...

            cleverly.
What spoken languages are in a
           page?
raw data
"es"
"en-us,en;q=0.5"
"pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4"
"en-gb,en;q=0.5"
"en-US,en;q=0.5"
"es-es,es;q=0.8,en-us;q=0.5,en;q=0.3”
"de, en-gb;q=0.9, en;q=0.8"
entropy calculation
def ghash2lang(g, Ri, min_count=3, max_entropy=0.2):
  ""”
  returns the majority vote of a langauge for a given hash
  ""”
  lang = R.zrevrange(g,0,0)[0]
  # let's calculate the entropy!
  # possible languages
  x = R.zrange(g,0,-1)
  # distribution over those languages
  p = np.array([R.zscore(g,langi) for langi in x])
  p /= p.sum()
  # info content
  I = [pi*np.log(pi) for pi in p]
  # entropy: smaller the more certain we are! - i.e. the lower our surprise
  H = -sum(I)/len(I) #in nats!
  # note that this will give a perfect zero for a single count in one language
  # or for 5K counts in one language. So we also need the count..
               count = R.zscore(g,lang)
  if count < min_count and H > max_entropy:
      return lang, count
  else:
      return None, 1
http://4sq.com/96kc1O
What’s the context
 around a link?
[demo]
Things we have to think about.

           (science)
What’s a human?
normal click distributions
abnormal click distributions
Organic vs Inorganic?
AT SCALE
1. Research offline

2. Do fancy math – find the shortcuts

3. Design infrastructure

4. Re-design to run at scale and speed
Realtime Search
Realtime Search
Attributes calculated either at index time or
query time.

Rankings can vary by second.
[demo]
What are people paying
attention to right now?
actual rate of clicks on phrases
vs
expected rate of clicks on phrases
Dragoneye
We calculate clickrate with a sort of moving
average:




          where
Dragoneye
We represent as a sum of delta spikes.

This simplifies to:
Dragoneye
Choosing    is important.

It must be interpretable, and smooth (but not
too smooth).

We use a distribution for that is a function
that sums to 1. The function is 0 at the
origin.
[demo]
philosophy
simple math > fancy math
How do we know
when we’ve won?
How do we communicate what
  we’ve learned effectively?
Ask the crazy questions.
Thank you!




h@bit.ly
@hmason

Contenu connexe

Similaire à Short URLs, Big Fun

Data visualization for development
Data visualization for developmentData visualization for development
Data visualization for developmentSara-Jayne Terp
 
OpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internetOpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internettkisason
 
Just the basics_strata_2013
Just the basics_strata_2013Just the basics_strata_2013
Just the basics_strata_2013Ken Mwai
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionLucidworks
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?Frank van Harmelen
 
Understanding Artificial Intelligence
Understanding Artificial Intelligence Understanding Artificial Intelligence
Understanding Artificial Intelligence St. Petersburg College
 
October hug
October hugOctober hug
October hughuguk
 
Python 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute BeginnersPython 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute BeginnersSai Linn Thu
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Artificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher CurrinArtificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher CurrinChristopher Currin
 
Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011Eli White
 
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...PyData
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data ScienceTJ Stalcup
 
Web technology: Web search
Web technology: Web searchWeb technology: Web search
Web technology: Web searchVictor de Boer
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data ScienceTJ Stalcup
 
Get connected with python
Get connected with pythonGet connected with python
Get connected with pythonJan Kroon
 
Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data ScienceKrishna Sankar
 

Similaire à Short URLs, Big Fun (20)

Data visualization for development
Data visualization for developmentData visualization for development
Data visualization for development
 
OpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internetOpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internet
 
Just the basics_strata_2013
Just the basics_strata_2013Just the basics_strata_2013
Just the basics_strata_2013
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with Fusion
 
Progressing and enhancing
Progressing and enhancingProgressing and enhancing
Progressing and enhancing
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Understanding Artificial Intelligence
Understanding Artificial Intelligence Understanding Artificial Intelligence
Understanding Artificial Intelligence
 
October hug
October hugOctober hug
October hug
 
Python 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute BeginnersPython 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute Beginners
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Artificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher CurrinArtificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher Currin
 
Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011
 
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data Science
 
Web technology: Web search
Web technology: Web searchWeb technology: Web search
Web technology: Web search
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data Science
 
Get connected with python
Get connected with pythonGet connected with python
Get connected with python
 
Machine Learning for dummies!
Machine Learning for dummies!Machine Learning for dummies!
Machine Learning for dummies!
 
Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data Science
 

Plus de Hilary Mason

PyCon 2011 Keynote
PyCon 2011 KeynotePyCon 2011 Keynote
PyCon 2011 KeynoteHilary Mason
 
Machine Learning for Web Data
Machine Learning for Web DataMachine Learning for Web Data
Machine Learning for Web DataHilary Mason
 
A Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime WebA Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime WebHilary Mason
 
IgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell ScriptIgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell ScriptHilary Mason
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in PythonHilary Mason
 
Have data? What now?!
Have data? What now?!Have data? What now?!
Have data? What now?!Hilary Mason
 
JWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAXJWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAXHilary Mason
 
Analytics for Virtual Worlds
Analytics for Virtual WorldsAnalytics for Virtual Worlds
Analytics for Virtual WorldsHilary Mason
 
Experiential Learning in Second Life
Experiential Learning in Second LifeExperiential Learning in Second Life
Experiential Learning in Second LifeHilary Mason
 
Virtual Worlds in Education
Virtual Worlds in EducationVirtual Worlds in Education
Virtual Worlds in EducationHilary Mason
 

Plus de Hilary Mason (10)

PyCon 2011 Keynote
PyCon 2011 KeynotePyCon 2011 Keynote
PyCon 2011 Keynote
 
Machine Learning for Web Data
Machine Learning for Web DataMachine Learning for Web Data
Machine Learning for Web Data
 
A Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime WebA Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime Web
 
IgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell ScriptIgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell Script
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in Python
 
Have data? What now?!
Have data? What now?!Have data? What now?!
Have data? What now?!
 
JWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAXJWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAX
 
Analytics for Virtual Worlds
Analytics for Virtual WorldsAnalytics for Virtual Worlds
Analytics for Virtual Worlds
 
Experiential Learning in Second Life
Experiential Learning in Second LifeExperiential Learning in Second Life
Experiential Learning in Second Life
 
Virtual Worlds in Education
Virtual Worlds in EducationVirtual Worlds in Education
Virtual Worlds in Education
 

Dernier

Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimaginedpanagenda
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandIES VE
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireExakis Nelite
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfFIDO Alliance
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!Memoori
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...FIDO Alliance
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxFIDO Alliance
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe中 央社
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessUXDXConf
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch TuesdayIvanti
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctBrainSell Technologies
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfSrushith Repakula
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jNeo4j
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxFIDO Alliance
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxFIDO Alliance
 

Dernier (20)

Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 

Short URLs, Big Fun

Notes de l'éditeur

  1. This is going to be a talk for people who love the internet.
  2. The true story of bitly, engineering, data science, loveHow to do data science at scaleBuilding teams and keeping people happyClever tricks
  3. Thomas Karaganis at MS Research 1% of new URLs per day
  4. Shortened links get shared on different platforms and methods
  5. Messages move across platforms in complicated ways
  6. We have a lot of growing up to do
  7. http://www.flickr.com/photos/wanderingnome/73328967/sizes/l/in/photostream/Philosphy, science, engineering, cool tricks
  8. Pow! Surprise! Here we are!
  9. …first, we understand it
  10. …first, we understand it
  11. Asking questions.
  12. http://www.flickr.com/photos/32443746@N07/4753829490/
  13. Egypt.
  14. Tunisia.
  15. …and we can do the same thing for geo data
  16. Studied offline, using hadoopBuild a supervised classifier over timeseriesBuild a random forest ensemble decision tree classifier
  17. The data system fortune cookie gameCreative commons: http://www.flickr.com/photos/mzn37/308048794/sizes/o/in/photostream/
  18. The simplification is important for three reasons:1) A continuous function of time that simplifies to \\phi2) it’s linear, so the sum of the click rates on each page with a phrase is the click rate per phrase3) IT’S FAST
  19. The 0 at the origin insures that we have seen sustained click rates on a phrase before we think it’s anything useful.