The document discusses integrating disparate data from various sources like email, calendars, contacts, and social media into a single experience. It identifies challenges like different data formats, change notifications, and standard entity structures. The solution involves enhancing entities by sourcing attributes from multiple endpoints, processing and making sense of the data over time, and tracking changes historically. Key aspects are enhancers that execute requests, caching results, and using rules and scoring for recursion and accuracy.
2. The WHY? What we believe in: All your important people already reside in email, calendars, contact lists, and social sites. The web is a rich source of information about the people you care about. One tool should exist that can pull all this together in a single, rich, integrated experience.
3. Pain Points (External): Disparate data/API sources and protocols (e.g. GNIP). Change notification, both when and what changed (e.g. Linked Open Data Dataset Dynamics, PubSubHubbub). Standard entity data structures (e.g. Portable Contacts, vCard, hCard).
4. The Problem (Internal): Need a single, disambiguated set of entities, where each entity contains accurate, disambiguated attributes. Entity attributes can be sourced from one or more endpoints: email; Twitter/Facebook; calendar; Google Contacts, Outlook Contacts, Plaxo; Google Social Graph API; Rapleaf API.
5. The Problem (Internal): Now that we have this data, we need to process and make sense of it. Need to support recurring updates. Need merge and unmerge support. Recursive derivation is a huge win if done correctly. Historical tracking is necessary both to drive operations and for debugging (and it’s a cool user feature).
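The merge/unmerge and historical-tracking requirements can be sketched as an append-only history log that records enough detail to reverse a merge later. This is a minimal illustration, not the presenters' implementation: `EntityStore`, its methods, and the record-id strings are all hypothetical, and it assumes each source record belongs to exactly one entity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class EntityStore:
    """Hypothetical store: entity id -> source record ids, plus a history log."""
    members: Dict[str, Set[str]] = field(default_factory=dict)
    history: List[Tuple[str, str, Tuple[str, ...]]] = field(default_factory=list)

    def add(self, entity_id: str, record_id: str) -> None:
        self.members.setdefault(entity_id, set()).add(record_id)
        self.history.append(("add", entity_id, (record_id,)))

    def merge(self, keep: str, absorb: str) -> None:
        # Move every record of `absorb` into `keep` and log exactly what moved.
        moved = tuple(sorted(self.members.pop(absorb)))
        self.members[keep].update(moved)
        self.history.append(("merge", keep, (absorb,) + moved))

    def unmerge(self, keep: str, absorb: str) -> None:
        # Replay the log backwards to find the most recent matching merge.
        for op, target, payload in reversed(self.history):
            if op == "merge" and target == keep and payload[0] == absorb:
                moved = set(payload[1:])
                self.members[keep] -= moved
                self.members[absorb] = moved
                self.history.append(("unmerge", keep, payload))
                return
        raise KeyError(f"no merge of {absorb} into {keep} found")


store = EntityStore()
store.add("e1", "email:brad@example.com")
store.add("e2", "twitter:example_handle")
store.merge("e1", "e2")
merged_view = {k: set(v) for k, v in store.members.items()}
store.unmerge("e1", "e2")
```

Because nothing is ever deleted from `history`, the same log can serve debugging and a user-facing timeline.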
6. How we did it: Enhancers execute the request and create the attribute data; they can be called synchronously or asynchronously; they are cached, logged, and rate limited. Metadata about attributes: source, source type, when created, derived?, derived source, score. Rules for ‘enhancement’. Rules for recursion. Scoring methodology (accuracy and relative prioritization).
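To make the enhancer idea concrete, here is a rough sketch of an attribute carrying the metadata fields listed on the slide, and an enhancer that caches, logs, and rate limits its lookups. The class names, the `fetch` callback, and the simple call budget standing in for rate limiting are assumptions made for illustration; the slides do not show the real interfaces.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Attribute:
    """One attribute value plus the metadata fields named on the slide."""
    name: str
    value: str
    source: str                      # e.g. "email"
    source_type: str                 # e.g. "header"
    score: float                     # higher means more trusted (assumed scale)
    derived: bool = False
    derived_source: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class Enhancer:
    """Hypothetical enhancer: wraps a fetch function with cache, log, call budget."""

    def __init__(self, name: str, fetch: Callable[[str], List[Attribute]], max_calls: int):
        self.name = name
        self._fetch = fetch
        self._max_calls = max_calls   # crude stand-in for real rate limiting
        self.calls = 0
        self._cache: Dict[str, List[Attribute]] = {}
        self.log: List[str] = []

    def enhance(self, key: str) -> List[Attribute]:
        if key in self._cache:
            self.log.append(f"{self.name}: cache hit for {key}")
            return self._cache[key]
        if self.calls >= self._max_calls:
            self.log.append(f"{self.name}: rate limited for {key}")
            return []
        self.calls += 1
        self.log.append(f"{self.name}: fetched {key}")
        self._cache[key] = self._fetch(key)
        return self._cache[key]


def fake_email_lookup(address: str) -> List[Attribute]:
    # Stand-in for a real endpoint call; returns canned data.
    return [Attribute("name", "Brad Feld", source="email", source_type="header", score=0.9)]


enhancer = Enhancer("email", fake_email_lookup, max_calls=1)
first = enhancer.enhance("brad@example.com")
second = enhancer.enhance("brad@example.com")   # served from the cache
third = enhancer.enhance("other@example.com")   # over budget, returns []
```

An asynchronous variant could run the same `enhance` call from a worker queue; that part is left out to keep the sketch short.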
7. Example: Email Enhancer. “Brad Feld” vs “Brad”. Columns: Date/Time, Score, State, Value.
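The slide's table rows are not preserved in this transcript, so the following is only a guess at how scoring might choose between “Brad Feld” and “Brad”. The token-count heuristic, the state names `active` and `inactive`, and the timestamps are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Candidate:
    """One observed value, mirroring the slide's columns."""
    seen_at: datetime
    score: float
    value: str
    state: str = "candidate"


def score_display_name(value: str) -> float:
    """Hypothetical heuristic: more name tokens suggests a more complete name."""
    return min(len(value.split()), 3) / 3


def select(candidates: List[Candidate]) -> Candidate:
    """Mark the best candidate 'active', the rest 'inactive'; ties go to the newest."""
    best = max(candidates, key=lambda c: (c.score, c.seen_at))
    for c in candidates:
        c.state = "active" if c is best else "inactive"
    return best


observed = [
    Candidate(datetime(2010, 1, 5), score_display_name("Brad"), "Brad"),
    Candidate(datetime(2010, 1, 2), score_display_name("Brad Feld"), "Brad Feld"),
]
best = select(observed)
```

Here the fuller name wins on score even though the shorter one was seen more recently, which is the kind of relative prioritization the scoring methodology is meant to provide.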
8. Key Takeaways: Worry about integration both external and internal to your application. Lots of good work has been done on the external issues; take advantage of it! Create a strong object model for internal data representation (workers, metadata, engines) so you can perform concise, discrete operations.
9. Additional Info: Gist API coming out this summer. Direct interface to Fragments. Standard and third-party Enhancer support. @stevepnewman, @gist
10. « We know now that the source of wealth is something specifically human: knowledge. Applied to tasks that we already know how to do, it becomes 'productivity'. Applied to tasks that are new and different, we call it 'innovation'. Only knowledge allows us to achieve these two goals. » Peter Drucker, Management Challenges for the 21st Century, 1999