Real-Time Geo #rtgeo
Who am I?
Giving a real-time geo talk at @where20. How do you build stuff? #rtgeo.
19 Apr via Twitter for iPhone
from Santa Clara Convention Center
5001 Great America Parkway
Santa Clara, CA 95054
Background
 [] raffi@~/: cat /etc/services | grep wherehoo
 wherehoo        5859/udp        # WHEREHOO
 wherehoo        5859/tcp        # WHEREHOO
Wherehoo (2000)
⇢ “The Stuff Around You”
⇢ “Wherehoo Server: An interactive location service for software agents and
   intelligent systems” - J.Youll, R.Krikorian
⇢ In your /etc/services file!
BusRadio (2004)
⇢ Designed mobile computers to play media while also transmitting telemetry
⇢ Looked and sounded like a radio - but really a Linux computer
OneHop (2007)
⇢ Bluetooth proximity-based social networking
Background
Twitter
⇢Originally tech lead of API / Platform team
⇢Built the first geo-based infrastructure before acquisition of
  Mixer Labs in December of 2009
⇢Now lead of the Application Services group
⇢Runs five teams focused on scalable infrastructure around
  “core” data objects
 ⇢Tweets, users, timelines, places, etc.
 ⇢Delivery, authentication, APIs, etc.
Table of contents
Background
⇢ Why are we interested in this?
Twitter’s geo APIs
⇢ How do we allow people to talk about place?
⇢ Context around “place”
Problem statement
⇢ What do we want our system to do?
Infrastructure
⇢ How is Twitter solving this problem?
People want to talk about places
What’s happening here?
Twitter’s Geo APIs
Original attempts
Adding it to the tweet
⇢ Use myloc.me, et al. to add text to the tweet
⇢ Puts location “in band”
⇢ Takes from the 140 characters
Setting profile level locations
⇢ Set the user/location of a Twitter user
⇢ There’s an API for that!
⇢ Not a per-tweet basis
⇢ Not intended for high frequency alterations
Profile level changes
 [] raffi@~/: twurl -d location="San Francisco, California" 
 http://twitter.com/account/update_location.xml

 <user>
   <id>8285392</id>
   <name>raffi</name>
   <screen_name>raffi</screen_name>
   <location>San Francisco, California</location>
   ...
 </user>
Geotagging API
Adding it to the tweet
⇢ Per-tweet basis
⇢ Out of band and pure metadata
⇢ Does not take from the 140 characters
Native Twitter support
⇢ Simple way to update status with location data
⇢ Ability to remove geotags from your tweets en masse
⇢ Using GeoRSS and GeoJSON as the encoding format
⇢ Across all Twitter APIs (REST, Search, and Streaming)
status/update
 [] raffi@~/: twurl -d "status=hey-ho&lat=37.3&long=-121.9" 
 http://api.twitter.com/1/status/update.xml
 <status>
   <text>hey-ho</text>
   ...
   <geo xmlns:georss="http://www.georss.org/georss">
     <georss:point>37.3 -121.9</georss:point>
   </geo>
   ...
 </status>
geocode
Search
geocode parameter takes “latitude,longitude,radius” where radius has units of mi or km
 [] raffi@~/: curl "http://search.twitter.com/search.atom?
 geocode=40.757929%2C-73.985506%2C25km&source=foursquare"
 ...
 <title>On the way to ace now, so whenever you can make it I'll be
 there. (@ Port Imperial Ferry in Weehawken) http://4sq.com/
 2rq0vO</title>
 ...
 <twitter:geo>
    <georss:point>40.7759 -74.0129</georss:point>
 </twitter:geo>
 ...
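The query string in the curl call above can also be assembled in code. This is a minimal sketch using Python's standard `urllib`; the helper name `build_geocode_search_url` is hypothetical, and the endpoint is simply the one shown on the slide, which may no longer be live.

```python
from urllib.parse import urlencode

# Endpoint copied from the slide; treat it as illustrative only.
SEARCH_ENDPOINT = "http://search.twitter.com/search.atom"


def build_geocode_search_url(lat, lon, radius, unit="km", **extra):
    """Build a search URL whose geocode parameter is "latitude,longitude,radius"."""
    if unit not in ("mi", "km"):
        raise ValueError("radius unit must be 'mi' or 'km'")
    params = {"geocode": f"{lat},{lon},{radius}{unit}"}
    params.update(extra)
    # urlencode percent-encodes the commas as %2C, as in the slide's URL.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_geocode_search_url(40.757929, -73.985506, 25, "km", source="foursquare")
print(url)
```

Passing `source="foursquare"` as an extra keyword reproduces the second query parameter from the slide.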
geohose
location filtering
 [] raffi@~/: curl "http://stream.twitter.com/1/statuses/filter.xml?
 locations=-74.5129,40.2759,-73.5019,41.2759"




 locations is a bounding box specified by “long1,lat1,long2,lat2” and can track up
 to 10 locations that are at most 1 degree square (~60 miles square and enough
 to cover most metropolitan areas)
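To make the “long1,lat1,long2,lat2” ordering concrete, here is a small sketch that parses a `locations` value and tests whether a point falls inside any box. The names `BoundingBox` and `parse_locations` are hypothetical, and the sketch assumes boxes do not cross the antimeridian.

```python
from typing import List, NamedTuple


class BoundingBox(NamedTuple):
    west: float   # long1
    south: float  # lat1
    east: float   # long2
    north: float  # lat2

    def contains(self, lat: float, lon: float) -> bool:
        return self.west <= lon <= self.east and self.south <= lat <= self.north


def parse_locations(value: str) -> List[BoundingBox]:
    """Parse "long1,lat1,long2,lat2[,long1,lat1,...]" into bounding boxes."""
    numbers = [float(part) for part in value.split(",")]
    if len(numbers) % 4 != 0:
        raise ValueError("each bounding box needs exactly four numbers")
    return [BoundingBox(*numbers[i:i + 4]) for i in range(0, len(numbers), 4)]


boxes = parse_locations("-74.5129,40.2759,-73.5019,41.2759")
# The georss:point from the search example (latitude first, then longitude).
print(any(box.contains(40.7759, -74.0129) for box in boxes))
```

Note that the box lists longitude first, while a `georss:point` lists latitude first, so the arguments have to be swapped when comparing the two.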
Trends API
Global Trends
⇢Analysis of “hot conversations”
⇢Does not take from the 140 characters
Location specific trends
⇢Tweets being localized through a variety of means internally
⇢Locations exposed over the API as WOEIDs and Twitter IDs
⇢Can ask for available trends sorted by distance
available locations
 [] raffi@~/: curl "http://api.twitter.com/1/trends/available.xml"
 <locations type="array">
    <location>
        <woeid>2487956</woeid>
        <name>San Francisco</name>
        <placeTypeName code="7">Town</placeTypeName>
        <country type="Country" code="US">United States</country>
        <url>http://where.yahooapis.com/v1/place/2487956</url>
    </location>
    ...
 </locations>
 Can optionally take a lat and long parameter to have trends locations returned, sorted, as distance from you.
a Local trend
Look up a trend at a given WOEID


 [] raffi@~/: curl "http://api.twitter.com/1/trends/2487956.xml"
 <matching_trends type="array">
   <trends as_of="2009-12-15T20:19:09Z">
     ...
        <trend url="http://search.twitter.com/search?q=Golden+Globe+nominations" query="Golden+Globe+nominations">Golden Globe nominations</trend>
        <trend url="http://search.twitter.com/search?q=%23somethingaintright" query="%23somethingaintright">#somethingaintright</trend>
     ...
   </trends>
 </matching_trends>
What’s in a name?
A place is a name
5001 Great America Parkway, Santa Clara, CA 95054
Great America Parkway and Tasman Drive
The Bay Area
Santa Clara convention center
Twitter ID 3b7dd0d93e661e18
how do users want to share “where”?
Sharing coordinates
More aptly named “geotagging”
Good for sharing photos
Possibly good for talking about a specific place
(e.g. store, restaurant)
People don’t understand numbers and without
a map, there is a lack of context
Huge privacy implications
Sharing polygons
Privacy implications are
potentially better
If you thought sharing one pair
of numbers was bad...
Questions around polygon
definition
Still unable to visualize unless
on a map
Sharing names

Has the potential to make a connection with users
Distinguishes a “named place” from simply a “place”
Inverse relationship between granularity and connection
Rather large internationalization / context implications
Geo-place API
Support for “names”
⇢Not just coordinates
⇢More contextually relevant
⇢Positive privacy benefits
Increased complexity
⇢Need to be able to look up a list of places
⇢Requires a “reverse geocoder”
⇢Human driven tagging and not possible to be fully automatic
Search
  [] raffi@~/: curl "http://api.twitter.com/1/geo/search.json?lat=37.3&long=-121.9"
    ...
    "place_type":"neighborhood",
    "country_code":"US",
    "contained_within": [...],
    "full_name":"Willow Glen",
    "bounding_box": {
      "type":"Polygon",
      "coordinates": [[
        [-121.92481908, 37.275903], [-121.88083608, 37.275903],
        [-121.88083608, 37.31548203], [-121.92481908, 37.31548203]
      ]]
    },
    "name":"Willow Glen",
    "id":"46bc64ecd1da2a46",
    ...
Tweeting with a place
     [] raffi@~/: twurl -d "status=hey-ho&place_id=46bc64ecd1da2a46" 
     http://api.twitter.com/1/status/update.xml
     <status>
       <text>hey-ho</text>
       ...
       <place xmlns:georss="http://www.georss.org/georss">
         <id>46bc64ecd1da2a46</id>
         <name>Willow Glen</name>
         <full_name>Willow Glen</full_name>
         <place_type>neighborhood</place_type>
         <url>http://api.twitter.com/1/geo/id/46bc64ecd1da2a46.json</url>
         <country code="US">United States</country>
       </place>
       ...
     </status>
Problem statement
What do we want our system to do?
what do we need to build?
Database of places
⇢Given a real-world location, find places
⇢Spatial search
Method to store places with content
⇢Per user basis
⇢Per tweet basis
spatial lookup and index
as background... MySQL + GIS
Ability to index points and do a spatial query
⇢For example, get points within a bounding rectangle
⇢SELECT MBRContains(GeomFromText('Polygon((0 0, 0 3, 3 3, 3 0, 0 0))'), coord) FROM geometry

Hard to cache the spatial query
Possibly requires a DB hit on every query
options
Grid / quad-tree
⇢ Create a grid (possibly nested) of the entire Earth
Geohash
⇢ Arbitrarily precise and hierarchical spatial data reference
Space filling curves
⇢ Mapping 2D space into 1D while preserving locality
R-Tree
⇢ Spatial access data structure
Grid / Quad-Tree
Recursively subdivide regions
Trie Structure to store
“prefixes”
Spatially oriented data
structure
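As a rough illustration of recursive subdivision producing “prefixes”, the sketch below assigns one quadrant digit per level, so a cell's key is a prefix of every key nested inside it. The function name `quadtree_key` and the digit numbering are assumptions made for this example, not a scheme named in the talk.

```python
def quadtree_key(lat: float, lon: float, depth: int) -> str:
    """Return a string of quadrant digits (0-3), one per level of subdivision."""
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    digits = []
    for _ in range(depth):
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        quadrant = 0
        if lon >= mid_lon:   # eastern half
            quadrant += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat >= mid_lat:   # northern half
            quadrant += 2
            south = mid_lat
        else:
            north = mid_lat
        digits.append(str(quadrant))
    return "".join(digits)


print(quadtree_key(37.3, -121.9, 3))
```

Because each digit only refines the previous cell, the keys can be stored in a trie and a lookup for a coarse cell returns everything beneath it.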
Geohash
geohash
37°18’N 121°54’W = 9q9k4


Hierarchical spatial data structure
Precision encoded
Distance captured
⇢Nearby places (usually) share the same prefix
⇢The longer the string match, the closer the places are
Geohash
9q9k4 = 01001 / 10110 / 01001 / 10010 / 00100
Longitude bits = 0010100101010
⇢ -90.0 (0), -135.0 (0), -112.5 (1), -123.75 (0), -118.125 (1), -120.9375 (0),
   -122.34375 (0), -121.640625 (1), -121.9921875 (0), -121.81640625 (1),
   -121.904296875 (0), -121.8603515625 (1), -121.88232421875 (0) =
   121°53’W


Latitude bits = 101101010000
⇢ 45.0 (1), 22.5 (0), 33.75 (1), 39.375 (1), 36.5625 (0), 37.96875 (1),
   37.265625 (0), 37.6171875 (1), 37.44140625 (0), 37.353515625 (0),
   37.3095703125 (0), 37.28759765625 (0) = 37°17’N
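The bit-by-bit halving above can be written out as a short encoder and decoder. This is a minimal sketch of the standard geohash scheme (base-32 alphabet, bits alternating longitude then latitude); the function names are chosen for this example, and the decoder returns the centre of the cell rather than an error box.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, length: int) -> str:
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    use_lon = True  # bits alternate, starting with longitude
    while len(bits) < length * 5:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
    chars = []
    for i in range(0, len(bits), 5):
        index = int("".join(map(str, bits[i:i + 5])), 2)
        chars.append(BASE32[index])
    return "".join(chars)


def geohash_decode(geohash: str):
    """Return the (lat, lon) centre of the cell named by the geohash."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    use_lon = True
    for char in geohash:
        index = BASE32.index(char)
        for shift in range(4, -1, -1):
            bit = (index >> shift) & 1
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return (sum(lat_range) / 2, sum(lon_range) / 2)


print(geohash_encode(37.3, -121.9, 5))
print(geohash_decode("9q9k4"))
```

Five characters carry 25 bits, so longitude gets 13 bits and latitude 12, which is why the decoded centre is only accurate to about a minute of arc.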
Geohash

Possible to do range query in database
⇢Matching based on prefix will return all the points that fit in
 the “grid”
⇢Able to store 2D data in a 1D space
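Prefix matching turns into a plain range scan over sorted keys, which is what an ordinary B-tree index provides. The sketch below imitates that with a sorted Python list and `bisect`; the geohash values are illustrative and not real place data.

```python
from bisect import bisect_left


def prefix_range(sorted_hashes, prefix):
    """Return all geohashes in a sorted list that start with prefix."""
    lo = bisect_left(sorted_hashes, prefix)
    # "\uffff" sorts after every geohash character, so it closes the range.
    hi = bisect_left(sorted_hashes, prefix + "\uffff")
    return sorted_hashes[lo:hi]


# Illustrative geohashes only.
index = sorted(["9q9k4", "9q9k5", "9q9hv", "9q8yy", "dr5ru"])
print(prefix_range(index, "9q9k"))  # → ['9q9k4', '9q9k5']
```

In SQL the same idea would typically be a `LIKE '9q9k%'` predicate on an indexed string column, which the database can serve as a range query.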
Space filling curve
Generalization of geohash
⇢2D to 1D mapping
⇢Nearness is captured
Can recursively fill up space
depending on resolution required
Fractal-like pattern can be used
to take up as much room as
possible
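One common space-filling curve is the Z-order (Morton) curve, which is the same bit interleaving that geohash uses. The slides do not say which curve is meant, so this sketch is only one possible instance; `interleave_bits` is a name invented for the example and works on non-negative integer grid coordinates.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-order (Morton) index."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x bits land on even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y bits land on odd positions
    return z


# Walk a 2x2 grid: the four cells map to consecutive 1D indexes.
for y in range(2):
    for x in range(2):
        print((x, y), interleave_bits(x, y))
```

Cells that share high-order bits in both coordinates share the high-order bits of the index, which is how nearness is mostly preserved in the 1D ordering.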
R-Tree
Height-balanced tree data
structure for spatial data
Uses hierarchically nested
bounding boxes
Nearby elements are placed in the
same node
Representations
GeoRSS / GeoJSON
http://www.georss.org/ & http://geojson.org/
<georss:point>37.3 -121.9</georss:point>

{
    "type":"Point",
    "coordinates":[-121.9, 37.3]
}
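The two encodings order the coordinates differently: a GeoRSS point is “latitude longitude”, while a GeoJSON position is `[longitude, latitude]`. A small conversion sketch, with hypothetical helper names and handling only Point geometries:

```python
import json


def georss_point_to_geojson(text: str) -> dict:
    """Convert the text of a georss:point ("lat lon") to a GeoJSON Point."""
    lat, lon = (float(part) for part in text.split())
    return {"type": "Point", "coordinates": [lon, lat]}


def geojson_to_georss_point(geometry: dict) -> str:
    """Convert a GeoJSON Point back to georss:point text."""
    if geometry.get("type") != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    lon, lat = geometry["coordinates"][:2]
    return f"{lat} {lon}"


point = georss_point_to_geojson("37.3 -121.9")
print(json.dumps(point))
print(geojson_to_georss_point(point))
```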
How do you store precision?
“Precision” is a hard thing to encode
Accuracy can be encoded with an error radius
Twitter opts for tracking the number of decimals passed
⇢140.0 != 140.00
⇢DecimalTrackingFloat
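The slides name `DecimalTrackingFloat` but do not show it, so the class below is a hypothetical stand-in written for illustration only. It assumes the coordinate arrives as a plain decimal string and records how many digits followed the decimal point.

```python
from decimal import Decimal


class DecimalTrackingFloat:
    """Hypothetical sketch: a float that remembers how many decimals were passed."""

    def __init__(self, text: str):
        self.value = float(text)
        # Decimal keeps trailing zeros, so "140.00" has exponent -2.
        exponent = Decimal(text).as_tuple().exponent
        self.decimals = max(0, -exponent)

    def __eq__(self, other):
        if not isinstance(other, DecimalTrackingFloat):
            return NotImplemented
        return (self.value, self.decimals) == (other.value, other.decimals)

    def __hash__(self):
        return hash((self.value, self.decimals))

    def __str__(self):
        return f"{self.value:.{self.decimals}f}"


a = DecimalTrackingFloat("140.0")
b = DecimalTrackingFloat("140.00")
print(a.value == b.value, a == b, str(b))
```

The numeric values compare equal while the tracked objects do not, which is the distinction the “140.0 != 140.00” bullet points at.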
Twitter infrastructure

Ruby on Rails-ish frontend
Scala-based services backend
MySQL and soon to be Cassandra as the store
RPC to back-end or put items into queues
Simplified architecture

R-Tree for spatial lookup
⇢Data provider for front-end lookups
⇢Store place object with envelope of place in R-Tree
Mapping from ID to place object
Java Topology Suite (JTS)
http://www.vividsolutions.com/jts/jtshome.htm
Open source
Good for representing and manipulating “geometries”
Has support for fundamental geometric operations
⇢ contains
⇢ envelope
Has an R-Tree implementation
pointInside in polygon? true
pointOutside in polygon? false
at (0.0, 0.0)
  -- region 1
at (1.0, 1.0)
  -- region 1
  -- region 2
at (2.0, 2.0)
  -- region 1
  -- region 2
at (3.0, 3.0)
  -- region 2
at (4.0, 4.0)
  -- empty
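The region listing above is what one would expect from two overlapping square regions queried along a diagonal. The sketch below reproduces that shape of output in plain Python; the region extents are assumptions picked to match, not values from the slides, and a linear scan over envelopes stands in for the JTS R-Tree query.

```python
from typing import Dict, List, Tuple

Envelope = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

# Assumed extents, chosen only so the output resembles the listing above.
REGIONS: Dict[str, Envelope] = {
    "region 1": (0.0, 0.0, 2.0, 2.0),
    "region 2": (1.0, 1.0, 3.0, 3.0),
}


def regions_at(x: float, y: float) -> List[str]:
    """Linear scan over envelopes; an R-Tree would prune most candidates."""
    return [
        name
        for name, (min_x, min_y, max_x, max_y) in REGIONS.items()
        if min_x <= x <= max_x and min_y <= y <= max_y
    ]


for i in range(5):
    point = (float(i), float(i))
    print(f"at {point}")
    for name in regions_at(*point) or ["empty"]:
        print(f"  -- {name}")
```

An envelope match is only a candidate: for non-rectangular places, a follow-up exact test such as a polygon `contains` check is still needed.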
Java Topology Suite (JTS)

Serializers and deserializers
⇢Well-known text (WKT)
⇢Well-known binary (WKB)
⇢No GeoRSS or GeoJSON support
interface / RPC
RockDove is a backend service
⇢Data provider for front-end lookups
⇢Uses some form of RPC (Thrift, Avro, etc.) for communication
⇢Data could be cached on frontend to prevent lookups
Simple RPC interface
⇢get(id)
⇢containedWithin(lat, long)
Interface / RPC
Watch those RPC queues!
Fail fast and potentially throw “over capacity” messages
⇢get(id) throws OverCapacity
⇢containedWithin(lat, long) throws OverCapacity
Distinguish between write path and read path
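A fail-fast queue can be sketched with a bounded queue that rejects work instead of blocking. `OverCapacity` echoes the name on the slide, but the class `BoundedRpcQueue` and its methods are hypothetical and are not the service's actual interface.

```python
import queue


class OverCapacity(Exception):
    """Raised instead of letting callers wait on a full request queue."""


class BoundedRpcQueue:
    def __init__(self, max_pending: int):
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request):
        try:
            self._pending.put_nowait(request)  # never blocks the caller
        except queue.Full:
            raise OverCapacity(
                f"{self._pending.qsize()} requests already pending"
            ) from None

    def next_request(self):
        return self._pending.get_nowait()


rpc_queue = BoundedRpcQueue(max_pending=2)
rpc_queue.submit(("get", "46bc64ecd1da2a46"))
rpc_queue.submit(("containedWithin", 37.3, -121.9))
try:
    rpc_queue.submit(("get", "3b7dd0d93e661e18"))
except OverCapacity as error:
    print("over capacity:", error)
```

Rejecting the third request immediately lets the front end degrade or retry, rather than piling up callers behind a slow back end.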
georuby
http://georuby.rubyforge.org/
Open source
OpenGIS Simple Features Interface Standard
Only good for representing geometric entities
GeoRuby::SimpleFeatures::Geometry::from_ewkb

No GeoJSON serializers
“front-end”
where do you actually get location from?
Triangulation: Cellular
200m to 1km accuracy
Measuring signal strength to cell towers with known locations
If only one cellular tower is visible, then fall back to cellular tower
identification - better than nothing, but really inaccurate
Requires cellular modem, software, and lookups
Triangulation: Wifi

Sub 20m accuracy
Works indoors and in urban areas
Doesn’t need dedicated hardware, just an 802.11 radio
Relatively quick time to get a position
Triangulation: GPS
Sub 1m accuracy
Need dedicated GPS hardware
Prone to multi-path confusion especially in cities
Needs line of sight to the sky
Doesn’t work well indoors
Potentially takes a few minutes to get a lock
Association
IP address to geographical mapping
All done on the server side
Maybe “good” for city level
⇢ Maxmind has 83% at 40km
⇢ Very error prone
⇢ Gets wonky when dealing with cellular
   connections or rather large ISPs

Database needs to be refreshed fairly
frequently
Extraction
Read the text and understand intent
Hard to understand whether talking
from a place, or about a place
Running text through a geocoder
(Google, Yahoo, Geocoder.us)
Parsing structured URLs and then
crawling “place pages”
location in browser
Geolocation API Specification for JavaScript
navigator.geolocation.getCurrentPosition
Does a callback with a position object
position.coords has
⇢ latitude and longitude
⇢ accuracy
⇢ other stuff
Support in Firefox 3.5, Chrome 5, Opera 10.6, and others with Google Gears
Questions?
Follow me at twitter.com/raffi

Presentation on how to chat with PDF using ChatGPT code interpreter
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

#rtgeo (Where 2.0 2011)

  • 3. Giving a real-time geo talk at @where20. How do you build stuff? #rtgeo. 19 Apr via Twitter for iPhone from Santa Clara Convention Center 5001 Great America Parkway Santa Clara, CA 95054 View Tweets at this place
  • 4. Background [] raffi@~/: cat /etc/services | grep wherehoo wherehoo 5859/udp # WHEREHOO wherehoo 5859/tcp # WHEREHOO Wherehoo (2000) ⇢ “The Stuff Around You” ⇢ “Wherehoo Server: An interactive location service for software agents and intelligent systems” - J.Youll, R.Krikorian ⇢ In your /etc/services file! BusRadio (2004) ⇢ Designed mobile computers to play media while also transmitting telemetry ⇢ Looked and sounded like a radio - but really a Linux computer OneHop (2007) ⇢ Bluetooth proximity-based social networking
  • 5. Background Twitter ⇢Originally tech lead of API / Platform team ⇢Built the first geo-based infrastructure before acquisition of Mixer Labs in December of 2009 ⇢Now lead of the Application Services group ⇢Runs five teams focused on scalable infrastructure around “core” data objects ⇢Tweets, users, timelines, places, etc. ⇢Delivery, authentication, APIs, etc.
  • 6.
  • 7. Table of contents Background ⇢ Why are we interested in this? Twitter’s geo APIs ⇢ How do we allow people to talk about place? ⇢ Context around “place” Problem statement ⇢ What do we want our system to do? Infrastructure ⇢ How is Twitter solving this problem?
  • 8. People want to talk about places
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 15. Original attempts Adding it to the tweet ⇢ Use myloc.me, et al. to add text to the tweet ⇢ Puts location “in band” ⇢ Takes from the 140 characters Setting profile level locations ⇢ Set the user/location of a Twitter user ⇢ There’s an API for that! ⇢ Not a per-tweet basis ⇢ Not intended for high frequency alterations
  • 16.
  • 17. Profile level changes [] raffi@~/: twurl -d location="San Francisco, California" http://twitter.com/account/update_location.xml <user> <id>8285392</id> <name>raffi</name> <screen_name>raffi</screen_name> <location>San Francisco, California</location> ... </user>
  • 19. Geotagging API Adding it to the tweet ⇢ Per-tweet basis ⇢ Out of band and pure metadata ⇢ Does not take from the 140 characters Native Twitter support ⇢ Simple way to update status with location data ⇢ Ability to remove geotags from your tweets en masse ⇢ Using GeoRSS and GeoJSON as the encoding format ⇢ Across all Twitter APIs (REST, Search, and Streaming)
  • 20. status/update [] raffi@~/: twurl -d "status=hey-ho&lat=37.3&long=-121.9" http://api.twitter.com/1/status/update.xml <status> <text>hey-ho</text> ... <geo xmlns:georss="http://www.georss.org/georss"> <georss:point>37.3 -121.9</georss:point> </geo> ... </status>
  • 21. Search [] raffi@~/: curl "http://search.twitter.com/search.atom?geocode=40.757929%2C-73.985506%2C25km&source=foursquare" ... <title>On the way to ace now, so whenever you can make it I'll be there. (@ Port Imperial Ferry in Weehawken) http://4sq.com/2rq0vO</title> ... <twitter:geo> <georss:point>40.7759 -74.0129</georss:point> </twitter:geo> ... The geocode parameter takes “latitude,longitude,radius”, where radius has units of mi or km
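To make the `geocode` radius concrete, here is a minimal Python sketch of a client-side check that a returned point falls inside the requested circle. The slides do not say how Twitter matches tweets to the radius; the haversine formula, the mean Earth radius, and the helper names `haversine_km` and `within_geocode` are all assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_geocode(geocode, lat, lon):
    """Check a point against a 'latitude,longitude,radius' string (mi or km)."""
    lat_s, lon_s, radius_s = geocode.split(",")
    unit = radius_s[-2:]
    radius = float(radius_s[:-2])
    if unit == "mi":
        radius *= 1.609344  # convert miles to kilometres
    elif unit != "km":
        raise ValueError("radius must end in mi or km")
    return haversine_km(float(lat_s), float(lon_s), lat, lon) <= radius
```

With the values from the slide, `within_geocode("40.757929,-73.985506,25km", 40.7759, -74.0129)` holds, since the tweet's point is only a few kilometres from the search centre.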
  • 22.
  • 23.
  • 24.
  • 25.
  • 27. location filtering [] raffi@~/: curl "http://stream.twitter.com/1/statuses/filter.xml?locations=-74.5129,40.2759,-73.5019,41.2759" locations is a bounding box specified by “long1,lat1,long2,lat2” and can track up to 10 locations that are at most 1 degree square (~60 miles square and enough to cover most metropolitan areas)
  • 28.
  • 30.
  • 31. Trends API Global Trends ⇢Analysis of “hot conversations” ⇢Does not take from the 140 characters Location specific trends ⇢Tweets being localized through a variety of means internally ⇢Locations exposed over the API as WOEIDs and Twitter IDs ⇢Can ask for available trends sorted by distance
  • 32. available locations [] raffi@~/: curl "http://api.twitter.com/1/trends/available.xml" <locations type="array"> <location> <woeid>2487956</woeid> <name>San Francisco</name> <placeTypeName code="7">Town</placeTypeName> <country type="Country" code="US">United States</country> <url>http://where.yahooapis.com/v1/place/2487956</url> </location> ... </locations> Can optionally take a lat and long parameter to have trends locations returned, sorted, as distance from you.
  • 33. Local trend: look up a trend at a given WOEID [] raffi@~/: curl "http://api.twitter.com/1/trends/2487956.xml" <matching_trends type="array"> <trends as_of="2009-12-15T20:19:09Z"> ... <trend url="http://search.twitter.com/search?q=Golden+Globe+nominations" query="Golden+Globe+nominations">Golden Globe nominations</trend> <trend url="http://search.twitter.com/search?q=%23somethingaintright" query="%23somethingaintright">#somethingaintright</trend> ... </trends> </matching_trends>
  • 34. What’s in a name?
  • 35. A place is a name 5001 Great America Parkway, Santa Clara, CA 95054 Great America Parkway and Tasman Drive The Bay Area Santa Clara convention center Twitter ID 3b7dd0d93e661e18
  • 36. how do users want to share “where”?
  • 37. Sharing coordinates More aptly named “geotagging” Good for sharing photos Possibly good for talking about a specific place (e.g. store, restaurant) People don’t understand numbers and without a map, there is a lack of context Huge privacy implications
  • 38. Sharing polygons Privacy implications are potentially better If you thought sharing one pair of numbers was bad... Questions around polygon definition Still unable to visualize unless on a map
  • 39. Sharing names Has the potential to make a connection with users Distinguishes a “named place” from simply a “place” Inverse relationship between granularity and connection Rather large internationalization / context implications
  • 41. Geo-place API Support for “names” ⇢Not just coordinates ⇢More contextually relevant ⇢Positive privacy benefits Increased complexity ⇢Need to be able to look up a list of places ⇢Requires a “reverse geocoder” ⇢Human driven tagging and not possible to be fully automatic
  • 42. Search [] raffi@~/: curl "http://api.twitter.com/1/geo/search.json?lat=37.3&long=-121.9" ... "place_type":"neighborhood", "country_code":"US", "contained_within": [...] "full_name":"Willow Glen", "bounding_box": { "type":"Polygon", "coordinates": [[ [-121.92481908, 37.275903], [-121.88083608, 37.275903], [-121.88083608, 37.31548203], [-121.92481908, 37.31548203] ]] }, "name":"Willow Glen", "id":"46bc64ecd1da2a46", ...
  • 43. Tweeting with a place [] raffi@~/: twurl -d "status=hey-ho&place_id=46bc64ecd1da2a46" http://api.twitter.com/1/status/update.xml <status> <text>hey-ho</text> ... <place xmlns:georss="http://www.georss.org/georss"> <id>46bc64ecd1da2a46</id> <name>Willow Glen</name> <full_name>Willow Glen</full_name> <place_type>neighborhood</place_type> <url>http://api.twitter.com/1/geo/id/46bc64ecd1da2a46.json</url> <country code="US">United States</country> </place> ... </status>
  • 44.
  • 45. Problem statement What do we want our system to do?
  • 46. what do we need to build? Database of places ⇢Given a real-world location, find places ⇢Spatial search Method to store places with content ⇢Per user basis ⇢Per tweet basis
  • 48. as background... MySQL + GIS Ability to index points and do a spatial query ⇢For example, get points within a bounding rectangle ⇢SELECT MBRContains(GeomFromText('Polygon((0 0, 0 3, 3 3, 3 0, 0 0))'), coord) FROM geometry Hard to cache the spatial query Possibly requires a DB hit on every query
  • 49. options Grid / quad-tree ⇢ Create a grid (possibly nested) of the entire Earth Geohash ⇢ Arbitrarily precise and hierarchical spatial data reference Space filling curves ⇢ Mapping 2D space into 1D while preserving locality R-Tree ⇢ Spatial access data structure
  • 51. Grid / Quad-Tree Recursively subdivide regions Trie Structure to store “prefixes” Spatially oriented data structure
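The recursive subdivision can be sketched in a few lines of Python. This is an illustration only, not the structure Twitter used: the function name `quad_key`, the digit assignment (bit 0 for the east half, bit 1 for the north half), and the lat/long world box as the root cell are all assumptions.

```python
def quad_key(lat, lon, depth):
    """Return a string of quadrant digits (0-3), one per subdivision level."""
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    digits = []
    for _ in range(depth):
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        digit = 0
        if lon >= mid_lon:   # east half of the current cell
            digit |= 1
            west = mid_lon
        else:                # west half
            east = mid_lon
        if lat >= mid_lat:   # north half of the current cell
            digit |= 2
            south = mid_lat
        else:                # south half
            north = mid_lat
        digits.append(str(digit))
    return "".join(digits)
```

Because each level only appends a digit, the key of a coarse cell is a prefix of the key of every cell nested inside it, which is what makes a trie a natural store for these keys.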
  • 53. geohash 37°18’N 121°54’W = 9q9k4 Hierarchical spatial data structure Precision encoded Distance captured ⇢Nearby places (usually) share the same prefix ⇢The longer the string match, the closer the places are
  • 54. Geohash 9q9k4 = 01001 / 10110 / 01001 / 10010 / 00100 Longitude bits = 0010100101010 ⇢ -90.0 (0), -135.0 (0), -112.5 (1), -123.75 (0), -118.125 (1), -120.9375 (0), -122.34375 (0), -121.640625 (1), -121.9921875 (0), -121.81640625 (1), -121.904296875 (0), -121.8603515625 (1), -121.88232421875 (0) = 121°53’W Latitude bits = 101101010000 ⇢ 45.0 (1), 22.5 (0), 33.75 (1), 39.375 (1), 36.5625 (0), 37.96875 (1), 37.265625 (0), 37.6171875 (1), 37.44140625 (0), 37.353515625 (0), 37.3095703125 (0), 37.28759765625 (0) = 37°17’N
  • 55. Geohash Possible to do range query in database ⇢Matching based on prefix will return all the points that fit in the “grid” ⇢Able to store 2D data in a 1D space
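The bit-by-bit halving above can be written out as a short Python sketch. It follows the standard public geohash scheme (base-32 alphabet, longitude bit first); the function names `geohash_encode` and `geohash_decode` are chosen for illustration and are not from the talk.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, length=5):
    """Encode a lat/long pair by alternately halving longitude and latitude."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, value, use_lon = [], 0, True
    for i in range(length * 5):
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = (value << 1) | bit
        use_lon = not use_lon
        if i % 5 == 4:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


def geohash_decode(code):
    """Return the (lat, lon) centre of the cell named by a geohash."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True
    for ch in code:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2
```

Since a longer hash only appends characters, a database prefix match such as `LIKE '9q9k4%'` (a hypothetical query, assuming the hash is stored in an indexed string column) selects every point in that cell. Decoding returns the centre of the cell, not the original point.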
  • 57. Space filling curve Generalization of geohash ⇢2D to 1D mapping ⇢Nearness is captured Recursively can fill up space depending on resolution required Fractal-like pattern can be used to take up as much room as possible
  • 59. R-Tree Height-balanced tree data structure for spatial data Uses hierarchically nested bounding boxes; nearby elements are placed in the same node
  • 61. GeoRSS / GeoJSON http://www.georss.org/ & http://geojson.org/ <georss:point>37.3 -121.9</georss:point> { "type":"Point", "coordinates":[-121.9, 37.3] }
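One concrete space filling curve is the Z-order (Morton) curve, which is also the bit interleaving that geohash relies on. The slide does not name a specific curve, so the choice of Z-order here, the function name `morton_index`, and the 16-bit default are assumptions made for illustration.

```python
def morton_index(x, y, bits=16):
    """Interleave the bits of integer grid coordinates x and y into one index.

    Bit i of x lands at position 2*i and bit i of y at position 2*i + 1,
    so cells that share high-order bits in both axes get nearby indices.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code
```

Locality is preserved only approximately: two adjacent cells that straddle a high-order boundary can end up far apart on the curve, which is the same reason nearby places only "usually" share a geohash prefix.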
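Note that the two encodings order the coordinates differently: a GeoRSS point is "latitude longitude", while a GeoJSON position is [longitude, latitude]. A small Python sketch of the conversion, with helper names that are purely illustrative:

```python
def georss_point_to_geojson(text):
    """Convert the body of a <georss:point> ('lat lon') to a GeoJSON Point."""
    lat_str, lon_str = text.split()
    # GeoJSON positions are [longitude, latitude], the reverse of GeoRSS.
    return {"type": "Point", "coordinates": [float(lon_str), float(lat_str)]}


def geojson_to_georss_point(geometry):
    """Convert a GeoJSON Point dict back to the 'lat lon' GeoRSS text."""
    lon, lat = geometry["coordinates"]
    return f"{lat} {lon}"
```

Swapping the two by accident is an easy mistake to make, so it is worth keeping the conversion in one place.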
  • 62. How do you store precision? “Precision” is a hard thing to encode Accuracy can be encoded with an error radius Twitter opts for tracking the number of decimals passed ⇢140.0 != 140.00 ⇢DecimalTrackingFloat
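The slide names `DecimalTrackingFloat` but does not show it, so the following Python sketch is a guess at the idea rather than Twitter's implementation: keep the numeric value together with the number of decimal places the client sent. The field names and the `parse` helper are hypothetical, and only plain decimal notation (no exponents) is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecimalTrackingFloat:
    value: float
    decimals: int  # number of digits the caller supplied after the point

    @classmethod
    def parse(cls, text):
        """Parse plain decimal text such as '140.00', remembering its decimals."""
        text = text.strip()
        _, _, fraction = text.partition(".")
        return cls(float(text), len(fraction))

    def __str__(self):
        # Render with exactly the precision that was originally passed in.
        return f"{self.value:.{self.decimals}f}"
```

Under this sketch `DecimalTrackingFloat.parse("140.0")` and `DecimalTrackingFloat.parse("140.00")` compare unequal, because the generated equality check includes `decimals` as well as `value`.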
  • 63.
  • 64.
  • 65.
  • 66. Twitter infrastructure Ruby on Rails-ish frontend Scala-based services backend MySQL and soon to be Cassandra as the store RPC to back-end or put items into queues
  • 67.
  • 68. Simplified architecture R-Tree for spatial lookup ⇢Data provider for front-end lookups ⇢Store place object with envelope of place in R-Tree Mapping from ID to place object
  • 69. Java Topology Suite (JTS) http://www.vividsolutions.com/jts/jtshome.htm Open source Good for representing and manipulating “geometries” Has support for fundamental geometric operations ⇢ contains ⇢ envelope Has an R-Tree implementation
  • 70. pointInside in polygon? true pointOutside in polygon? false
  • 71. at (0.0, 0.0) -- region 1 at (1.0, 1.0) -- region 1 -- region 2 at (2.0, 2.0) -- region 1 -- region 2 at (3.0, 3.0) -- region 2 at (4.0, 4.0) -- empty
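The lookup output above can be reproduced with a small Python sketch. It swaps the JTS R-Tree for a plain linear scan over bounding envelopes, so it shows the query semantics, not the index. The two region rectangles are assumptions inferred from the output, and the bounds are treated as inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x, y):
        """Inclusive envelope test: is the point inside the bounding box?"""
        return (self.min_x <= x <= self.max_x
                and self.min_y <= y <= self.max_y)


# Hypothetical overlapping regions chosen to match the output shown above.
REGIONS = [
    Place("region 1", 0.0, 0.0, 2.0, 2.0),
    Place("region 2", 1.0, 1.0, 3.0, 3.0),
]


def contained_within(places, x, y):
    """Return the names of every place whose envelope contains the point."""
    return [p.name for p in places if p.contains(x, y)]
```

A real R-Tree answers the same question without scanning every place: it descends only into nodes whose bounding box contains the query point.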
  • 72. Java Topology Suite (JTS) Serializers and deserializers ⇢Well-known text (WKT) ⇢Well-known binary (WKB) ⇢No GeoRSS or GeoJSON support
  • 73. interface / RPC RockDove is a backend service ⇢Data provider for front-end lookups ⇢Uses some form of RPC (Thrift, Avro, etc.) to communicate with ⇢Data could be cached on frontend to prevent lookups Simple RPC interface ⇢get(id) ⇢containedWithin(lat, long)
  • 74.
  • 75. Interface / RPC Watch those RPC queues! Fail fast and potentially throw “over capacity” messages ⇢get(id) throws OverCapacity ⇢containedWithin(lat, long) throws OverCapacity Distinguish between write path and read path
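A minimal Python sketch of the fail-fast idea: bound the pending-request queue and reject new work immediately when it is full, instead of letting callers wait. The class name `FailFastQueue` and the queue size are assumptions; only the `OverCapacity` name comes from the slide.

```python
import queue


class OverCapacity(Exception):
    """Raised when the service refuses work rather than queueing it."""


class FailFastQueue:
    def __init__(self, max_pending):
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request):
        """Enqueue a request, or fail immediately if the queue is full."""
        try:
            self._pending.put_nowait(request)
        except queue.Full:
            raise OverCapacity("request queue is full") from None

    def take(self):
        """Remove and return the oldest pending request."""
        return self._pending.get_nowait()
```

Rejecting early keeps latency bounded for the requests that are accepted, and gives the frontend a clear signal to back off or serve cached data.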
  • 76. georuby http://georuby.rubyforge.org/ Open source OpenGIS Simple Features Interface Standard Only good for representing geometric entities GeoRuby::SimpleFeatures::Geometry::from_ewkb No GeoJSON serializers
  • 77.
  • 79. where do you actually get location from?
  • 80. Triangulation: Cellular 200m to 1km accuracy Measuring signal strength to cell towers with known locations If only one cellular tower is visible, fall back to cellular tower identification - better than nothing, but really inaccurate Requires cellular modem, software, and lookups
  • 81. Triangulation: Wifi Sub 20m accuracy Works indoors and in urban areas Doesn’t need dedicated hardware, just an 802.11 radio Relatively quick time to get a position
  • 82. Triangulation: GPS Sub 1m accuracy Need dedicated GPS hardware Prone to multi-path confusion especially in cities Needs line of sight to the sky Doesn’t work well indoors Potentially takes a few minutes to get a lock
  • 83. Association IP address to geographical mapping All done on the server side Maybe “good” for city level ⇢ Maxmind has 83% at 40km ⇢ Very error prone ⇢ Gets wonky when dealing with cellular connections or rather large ISPs Database needs to be refreshed fairly frequently
  • 84. Extraction Read the text and understand intent Hard to understand whether talking from a place, or about a place Running text through a geocoder (Google, Yahoo, Geocoder.us) Parsing structured URLs and then crawling “place pages”
  • 85. location in browser Geolocation API Specification for JavaScript navigator.geolocation.getCurrentPosition Does a callback with a position object position.coords has ⇢ latitude and longitude ⇢ accuracy ⇢ other stuff Support in Firefox 3.5, Chrome 5, Opera 10.6, and others with Google Gears
  • 86.
  • 87. Questions? Follow me at twitter.com/raffi