Glue Conference


  1. Integrating Disparate Data
     May 27, 2010
     Steve Newman, CTO, Gist.com
  2. The WHY? What we believe in…
     • All your important people already reside in email, calendar, contact lists, social sites
     • The web is a rich source of information about the people you care about
     • One tool should exist that can pull all this together in a single, rich, integrated experience
  3. Pain Points (External)
     • Disparate data/API sources and protocols (e.g. Gnip)
     • Change notification: when something changed and what changed (e.g. Linked Open Data Dataset Dynamics, PubSubHubbub)
     • Standard entity data structures (e.g. Portable Contacts, vCard, hCard)
  4. The Problem (Internal)
     • Need a single, disambiguated set of entities, where each entity itself contains accurate, disambiguated attributes
     • Entity attributes can be sourced from one or more endpoints:
       • Email
       • Twitter/Facebook
       • Calendar
       • Google Contacts, Outlook Contacts, Plaxo
       • Google Social Graph API
       • Rapleaf API
  5. The Problem (Internal)
     • Now that we have this data, we need to process it and make sense of it
     • Need to support recurring updates
     • Need merge and unmerge support
     • Recursive derivation is a huge win if done correctly
     • Historical tracking is necessary both to drive operations and for debugging (and it’s a cool user feature)
  6. How we did it
     • Enhancers
       • Execute the request and create the attribute data
       • Can be called synchronously or asynchronously
       • Cached, logged, rate limited
     • Metadata about attributes
       • Source, source type, when created, derived?, derived source, score
     • Rules for ‘enhancement’
     • Rules for recursion
     • Scoring methodology (accuracy and relative prioritization)
  7. Example – Email Enhancer
     • “Brad Feld” vs “Brad”
     • Date/Time
     • Score
     • State
     • Value
  8. Key Takeaways
     • Worry about integration both external and internal to your application
     • Lots of good work on the external issues… take advantage of it!
     • Create a strong object model for internal data representation (workers, metadata, engines) so you can perform concise, discrete operations
  9. Additional Info
     • GIST API coming out this summer
     • Direct interface to Fragments
     • Standard and third-party Enhancer support
     • @stevepnewman, @gist
  10. “We know now that the source of wealth is something specifically human: knowledge. Applied to tasks that we already know how to do, it becomes ‘productivity’. Applied to tasks that are new and different, we call it ‘innovation’. Only knowledge allows us to achieve these two goals.”
      Peter Drucker, Management Challenges for the 21st Century, 1999
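
The Enhancer idea from slides 6 and 7 can be sketched roughly as follows. This is a minimal illustration, not Gist's implementation: the `Attribute`, `Enhancer`, `RateLimited`, `best_value`, and `email_name_fetch` names, the field layout, and the numeric scores are all assumptions made for the example. It shows an attribute carrying the metadata listed on slide 6, an enhancer that caches results and enforces a simple rate limit, and a score-based choice between “Brad Feld” and “Brad”.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Attribute:
    """One fact about an entity, plus the metadata named on slide 6."""
    name: str                      # e.g. "full_name"
    value: str
    source: str                    # e.g. "email"
    source_type: str               # hypothetical label, e.g. "display-name"
    created: datetime
    score: float                   # assumed convention: higher is more trusted
    derived: bool = False
    derived_source: Optional[str] = None


class RateLimited(Exception):
    """Raised when an enhancer is called again too soon."""


class Enhancer:
    """Wraps a fetch function with a per-key cache and a minimum call interval."""

    def __init__(self, name: str,
                 fetch: Callable[[str], List[Attribute]],
                 min_interval: float = 0.0,
                 clock: Callable[[], float] = time.monotonic):
        self.name = name
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._cache: Dict[str, List[Attribute]] = {}
        self._last_call: Optional[float] = None

    def enhance(self, key: str) -> List[Attribute]:
        # Cached results are returned without touching the rate limit.
        if key in self._cache:
            return self._cache[key]
        now = self._clock()
        if self._last_call is not None and now - self._last_call < self._min_interval:
            raise RateLimited(f"{self.name}: called again too soon")
        self._last_call = now
        attributes = self._fetch(key)
        self._cache[key] = attributes
        return attributes


def best_value(attributes: List[Attribute], name: str) -> Optional[Attribute]:
    """Pick the highest-scoring attribute with this name; newest wins ties."""
    candidates = [a for a in attributes if a.name == name]
    if not candidates:
        return None
    return max(candidates, key=lambda a: (a.score, a.created))


def email_name_fetch(address: str) -> List[Attribute]:
    """Hypothetical stand-in for reading display names from email headers."""
    now = datetime.now(timezone.utc)
    return [
        # Scores are invented for illustration only.
        Attribute("full_name", "Brad", "email", "display-name", now, score=0.4),
        Attribute("full_name", "Brad Feld", "email", "display-name", now, score=0.9),
    ]
```

Calling `Enhancer("email", email_name_fetch).enhance("brad@example.com")` and passing the result to `best_value(..., "full_name")` selects the “Brad Feld” attribute, because under these assumed scores the fuller name is trusted more than the bare first name.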

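Slide 5 asks for merge and unmerge support backed by historical tracking. One way to picture how the two fit together is the sketch below, which assumes a hypothetical `EntityStore` that maps entity ids to the source records they own and appends every operation to a log. The class, its method names, and the record id strings are illustrative assumptions, not part of the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# (operation, target entity id, detail) where detail[0] is the absorbed
# entity id for merge/unmerge events, followed by the records that moved.
Event = Tuple[str, str, Tuple[str, ...]]


@dataclass
class EntityStore:
    """Tracks which source records belong to which entity, with a history log."""
    members: Dict[str, Set[str]] = field(default_factory=dict)
    history: List[Event] = field(default_factory=list)

    def add(self, entity_id: str, record_id: str) -> None:
        self.members.setdefault(entity_id, set()).add(record_id)
        self.history.append(("add", entity_id, (record_id,)))

    def merge(self, keep_id: str, absorb_id: str) -> None:
        """Fold absorb_id into keep_id and log exactly which records moved."""
        if keep_id == absorb_id:
            raise ValueError("cannot merge an entity into itself")
        if keep_id not in self.members or absorb_id not in self.members:
            raise KeyError("both entities must exist before merging")
        moved = self.members.pop(absorb_id)
        self.members[keep_id] |= moved
        self.history.append(("merge", keep_id, (absorb_id, *sorted(moved))))

    def unmerge(self, keep_id: str, absorb_id: str) -> None:
        """Undo the latest merge of absorb_id into keep_id by replaying the log."""
        for op, target, detail in reversed(self.history):
            if target != keep_id or detail[0] != absorb_id:
                continue
            if op == "unmerge":
                break  # the latest merge was already undone
            if op == "merge":
                moved = set(detail[1:])
                # Simplification: assumes the moved records were not also
                # owned by keep_id before the merge.
                self.members[keep_id] -= moved
                self.members[absorb_id] = moved
                self.history.append(("unmerge", keep_id, detail))
                return
        raise KeyError(f"no merge of {absorb_id} into {keep_id} to undo")
```

Because the log records which records moved, unmerge does not need to guess how to split an entity, and the same log doubles as the debugging trail the slide mentions.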