Lecture3: What is the DATA on the Social Web (VU Amsterdam Social Web Course)
1. Lecture III: What DATA looks like on the Social Web?
Davide Ceolin and Lora Aroyo
The Network Institute
VU University Amsterdam
Social Web
2015
Social Web 2015, Lora Aroyo and Davide Ceolin
2. Assignment 1: Q&A
• Your own vision, based on your analysis, of the prime privacy-related issues & initiatives on the (Social) Web.
• Summarise all the legal contexts for privacy & ownership.
• Compare initiatives according to their advantages & disadvantages. Include also your own advice to policy makers (position).
• Use all the mind maps from lectures 1 and 2. You can merge everything into one mind map for this assignment.
• Write for people who didn’t attend the course
• All visuals, e.g. screenshots, diagrams, should be included in an appendix
• Submit only 1 file in PDF
3. History of Blogs
• evolved from online diaries in the 1980s
• ‘weblog’ coined by Jorn Barger (1997) & ‘blog’ by Peter Merholz (1999)
• one of the first ways to contribute (unstructured user-generated) content on the Web
• Justin Hall recognized as pioneer blogger (1994)
• Nature: political, technical, art, journalistic, cultural, personal
• Software: WordPress, Blogger, LiveJournal
4. Types of Blogs
• single- or multi-authored
• photo-blog, video-blog, audio-blog
• life (b)log, now - microlifeblog (Twitter)
• lifecasting: in 2007 by Justin Kan: webcam on a cap, http://www.justin.tv/
• Gordon Bell MyLifeBits: Microsoft SenseCam, http://research.microsoft.com/en-us/projects/mylifebits/
5. Wikis
• Wiki in Hawaiian meaning fast/quick
• "the simplest online database that could possibly work" (Ward Cunningham, 1995)
• first wiki software: WikiWikiWeb (the QuickWeb), http://c2.com/cgi/wiki?WikiWikiWeb
• first example of large-scale collaborative editing = software + process
• commonly implemented software package is MediaWiki (known from Wikipedia)
• page structure & formatting: simplified markup language (wikitext), HTML tags, or WYSIWYG editing
image source: http://www.flickr.com/photos/kables/1220574200/
7. Exploiting the crowd
• in wiki applications the crowd contributes with collective intelligence (primarily textual)
• later also other media & resources emerged, e.g., photo, video, music
• crowdsourcing
8. Mechanical Turk
• 1770 Wolfgang von Kempelen: The Turk
• 2005 Amazon: Amazon Mechanical Turk
• marketplace for work; people perform tasks computers are lousy at, e.g. identifying items in a photo/video, writing product descriptions, transcribing podcasts
• HITs = human intelligence tasks
• require little time & offer little compensation
• workers & requesters
12. Was the $1 million Netflix Prize a victory for crowdsourcing?
13. Folksonomy
• On the social web the user-generated content is organized in
light-weight ontologies, i.e., folksonomies
• Community-based semantics = a relationship between Users, Tags & Resources
• user-created, bottom-up classification/categorization of
(domain) terms / user-labels, e.g., tags
• tagging = the social process where lay users attach labels to
resources (as opposed to annotation by professional experts)
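The Users, Tags & Resources relationship described above can be sketched as a list of (user, tag, resource) triples. This is only an illustration: the user names, tags and file names are made up, and real folksonomy services store far richer data.

```python
from collections import Counter, defaultdict

# Hypothetical tag assignments, one (user, tag, resource) triple per tagging act.
TAGGINGS = [
    ("alice", "amsterdam", "photo1.jpg"),
    ("alice", "canal", "photo1.jpg"),
    ("bob", "amsterdam", "photo1.jpg"),
    ("bob", "amsterdam", "photo2.jpg"),
    ("carol", "bike", "photo2.jpg"),
]


def tags_by_resource(taggings):
    """Count how often each tag was attached to each resource."""
    index = defaultdict(Counter)
    for _user, tag, resource in taggings:
        index[resource][tag] += 1
    return index


def resources_for_tag(taggings, tag):
    """Return the sorted list of resources that any user labelled with `tag`."""
    return sorted({resource for _user, t, resource in taggings if t == tag})


index = tags_by_resource(TAGGINGS)
print(index["photo1.jpg"].most_common())
print(resources_for_tag(TAGGINGS, "amsterdam"))
```

The classification is bottom-up: no tag is declared in advance, and the "vocabulary" is simply whatever labels users happened to attach.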
16. • cleaning messy data
• transforming data from one format to another
• fetching missing data
17. Question?
How critical is the quality of the data on the Web?
Does structured mark-up help?
How do we measure the quality?
19. Vocabularies on the (Social) Web
• to create interfaces or exchange data between applications, the software needs to know the terms used in the data
• vocabularies define a set of terms in a certain domain, e.g., for describing people, relationships, content of different types
21. FOAF
• FOAF = Friend of a Friend, http://www.foaf-project.org/
• a machine-readable ontology describing persons, their
activities & their relations to other people and objects
• an open, decentralized technology for connecting social
Web sites, & the people they describe
• Since mid-2000
• Stable core of classes & properties
• New terms may be added at any time
• FOAF RDF namespace URI is fixed
• http://xmlns.com/foaf/spec/
Linked Data & FOAF
• model for publishing simple factual data via a network of linked RDF documents
• FOAF is an attempt to use the Web to:
  • integrate factual information with information in human-oriented documents (e.g. videos, books, spreadsheets, 3D models)
  • and with info that is still in people's heads
• linking networks of information with networks of people
22. FOAF Example
• there is a foaf:Person
• with a foaf:name property of 'Dan Brickley'
• in foaf:homepage and foaf:openid relationships to a thing called http://danbri.org/
• in foaf:img relationship to a thing referenced by a relative URI of /images/me.jpg
Create your own FOAF file: http://www.ldodds.com/foaf/foaf-a-matic
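The four statements above can be written out as RDF triples. The sketch below prints them in N-Triples syntax using only the Python standard library. It assumes that the person is a blank node and that the relative image URI is resolved against http://danbri.org/ (the slide does not state the base URI).

```python
from urllib.parse import urljoin

FOAF = "http://xmlns.com/foaf/0.1/"  # the fixed FOAF namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://danbri.org/"  # assumption: base URI for resolving /images/me.jpg


def foaf_person(name, homepage, img):
    """Return N-Triples lines describing one foaf:Person as a blank node."""
    subject = "_:person"
    img_uri = urljoin(BASE, img)  # N-Triples needs absolute URIs
    triples = [
        (subject, RDF_TYPE, f"<{FOAF}Person>"),
        (subject, FOAF + "name", f'"{name}"'),
        (subject, FOAF + "homepage", f"<{homepage}>"),
        (subject, FOAF + "openid", f"<{homepage}>"),
        (subject, FOAF + "img", f"<{img_uri}>"),
    ]
    return [f"{s} <{p}> {o} ." for s, p, o in triples]


lines = foaf_person("Dan Brickley", "http://danbri.org/", "/images/me.jpg")
print("\n".join(lines))
```

A dedicated RDF library would normally handle escaping and serialisation. Plain string formatting is used here only to keep the triples visible.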
24. FOAF Auto-Discovery
• If you publish a FOAF self-description (e.g. using
foaf-a-matic) you can make it easier for tools to
find your FOAF by putting markup in the head of
your HTML homepage
• The filename foaf.rdf is a common choice
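As a rough sketch of how a tool might perform this auto-discovery: the FOAF specification suggests a link element with rel="meta" and type="application/rdf+xml" in the page head. The homepage below is a made-up example, and the parser only looks at that one pattern.

```python
from html.parser import HTMLParser

# Hypothetical homepage that advertises a FOAF file in its <head>.
HOMEPAGE = """
<html><head><title>My homepage</title>
<link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
<link rel="stylesheet" type="text/css" href="style.css" />
</head><body><p>Hello</p></body></html>
"""


class FoafLinkFinder(HTMLParser):
    """Collect href values of <link rel="meta" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.foaf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel_values = (a.get("rel") or "").lower().split()
        if "meta" in rel_values and a.get("type") == "application/rdf+xml":
            self.foaf_links.append(a.get("href"))


finder = FoafLinkFinder()
finder.feed(HOMEPAGE)
print(finder.foaf_links)
```

The href is relative here, so a real crawler would still need to resolve it against the page URL before fetching the file.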
25. SIOC
• Semantically-Interlinked Online Communities
• ontology for representing rich data from Social Web in RDF
• a standard way for expressing user-generated content
• methods for interconnecting discussions, e.g., blogs, forums & mailing lists, and for enabling the integration of online community information
• used in conjunction with FOAF vocabulary for expressing personal profile &
social networking information
• http://sioc-project.org/
26.
<sioc:Post rdf:about="http://jbreslin.com/blog/2006/09/07/creating-connections">
  <dc:title>Creating connections between discussion clouds with SIOC</dc:title>
  <dcterms:created>2006-09-07T09:33:30Z</dcterms:created>
  <sioc:has_container rdf:resource="http://jbreslin.com/blog/index.php?sioc_type=site#weblog"/>
  <sioc:has_creator>
    <sioc:UserAccount rdf:about="http://jbreslin.com/blog/author/cloud/" rdfs:label="Cloud">
      <rdfs:seeAlso rdf:resource="http://jbreslin.com/blog/index.php?sioc_type=user&amp;sioc_id=1"/>
    </sioc:UserAccount>
  </sioc:has_creator>
  <foaf:maker rdf:resource="http://jbreslin.com/blog/author/cloud/#foaf"/>
  <sioc:content>SIOC provides a unified vocabulary for content and interaction description: a semantic la
  that can co-exist with existing discussion platforms.
  </sioc:content>
  <sioc:topic rdfs:label="Semantic Web" rdf:resource="http://jbreslin.com/blog/category/semantic-web/"/>
  <sioc:topic rdfs:label="Blogs" rdf:resource="http://jbreslin.com/blog/category/blogs/"/>
  <sioc:has_reply>
    <sioc:Post rdf:about="http://jbreslin.com/blog/2006/09/07/creating-connections/#comment-123928">
      <rdfs:seeAlso rdf:resource="http://johnbreslin.com/blog/index.php?sioc_type=comment&amp;sioc_id=123928"/>
    </sioc:Post>
  </sioc:has_reply>
</sioc:Post>
• A post (1) titled "Creating connections between discussion clouds with SIOC" (2) created at 09:33:30 on 2006-09-07 (3) written by user "Cloud" (4) on topics "Blogs" and "Semantic Web" (5) with contents described in sioc:content.
• (6) More information about its author at http://johnbreslin.com/blog/index.php?sioc_type=user&sioc_id=1
• The post has (7) a reply and (8) detailed SIOC information about this reply can be found at http://johnbreslin.com/blog/index.php?sioc_type=comment&sioc_id=123928
29. Activity Streams
• A list of recent activities performed by someone on a website
• Example: Facebook News Feed
• Activity Streams project aims at an activity stream protocol to syndicate activities across social Web applications
• Major websites with activity stream implementations have
already opened up their activity streams to developers to use, e.g.,
Facebook and MySpace
• http://activitystrea.ms/
30. Activity Streams Specification
• an actor, a verb, an object and a target
• person performing an action on/with an object
• Geraldine posted a photo to her album
• John shared a video
• activity metadata to present to a user in a rich human-friendly format, e.g.
constructing readable sentences about the activity that occurred, visual
representations of the activity, or combining similar activities for display
• Activities are serialized using the JSON format
• There is also an Atom-oriented specification
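The actor / verb / object / target structure above can be sketched as JSON built from Python dictionaries. The field names follow the JSON Activity Streams 1.0 conventions (actor, verb, object, target, objectType, displayName). The helper functions and the past-tense table are hypothetical and exist only for illustration.

```python
import json

# Illustrative mapping from verbs to past tense, used for readable sentences.
PAST_TENSE = {"post": "posted", "share": "shared"}


def make_activity(actor, verb, obj, target=None):
    """Build one activity as a plain dict ready for JSON serialisation."""
    activity = {
        "actor": {"objectType": "person", "displayName": actor},
        "verb": verb,
        "object": obj,
    }
    if target is not None:
        activity["target"] = target
    return activity


def to_sentence(activity):
    """Render the activity as a human-friendly sentence."""
    verb = PAST_TENSE.get(activity["verb"], activity["verb"])
    sentence = (
        f'{activity["actor"]["displayName"]} {verb} '
        f'{activity["object"]["displayName"]}'
    )
    if "target" in activity:
        sentence += f' to {activity["target"]["displayName"]}'
    return sentence


posted = make_activity(
    "Geraldine",
    "post",
    {"objectType": "image", "displayName": "a photo"},
    {"objectType": "collection", "displayName": "her album"},
)
shared = make_activity(
    "John", "share", {"objectType": "video", "displayName": "a video"}
)

print(json.dumps(posted, indent=2))
print(to_sentence(posted))
print(to_sentence(shared))
```

Keeping the activity as structured data and deriving the sentence from it is what lets a consuming site choose its own presentation, e.g. grouping similar activities.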
31. Verbs, Objects, Mapping
Verbs Objects
http://wiki.activitystrea.ms/w/page/1359319/Verb%20Mapping
32. XFN
• XHTML Friends Network
• defines a small set of values that describe personal relationships
• in HTML and XHTML, these are given as values of the rel attribute on a hyperlink
• XFN allows authors to indicate which weblogs belong to friends, whom they've physically met, and other personal relationships
• XFN values help to humanize blogrolls and link pages
• with XFN, authors can easily style all links of a particular type, e.g., friends could be boldfaced, co-workers italicized, etc.
• http://gmpg.org/xfn/
33. XFN Example
• Joe has a set of five links in his blogroll: his girlfriend Jane; his
friends Dave and Darryl; industry expert James, who Joe briefly
met once at a conference; and MetaFilter.
• MetaFilter gets no value since it is not an actual person
http://gmpg.org/xfn/intro
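Joe's blogroll can be sketched in HTML and mined with a few lines of Python. The example.org URLs and the exact rel values chosen for each person are assumptions made for illustration. Only the value names themselves (friend, met, sweetheart, date, ...) come from the XFN profile.

```python
from html.parser import HTMLParser

# The relationship values defined by XFN.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse", "kin",
    "muse", "crush", "date", "sweetheart", "me",
}

# Hypothetical markup for Joe's blogroll (URLs and rel choices are assumed).
BLOGROLL = """
<ul>
  <li><a href="http://jane.example.org/" rel="sweetheart date met">Jane</a></li>
  <li><a href="http://dave.example.org/" rel="friend met">Dave</a></li>
  <li><a href="http://darryl.example.org/" rel="friend met">Darryl</a></li>
  <li><a href="http://www.metafilter.com/">MetaFilter</a></li>
  <li><a href="http://james.example.org/" rel="met">James</a></li>
</ul>
"""


class XfnParser(HTMLParser):
    """Map each linked URL to the XFN relationship values on its rel attribute."""

    def __init__(self):
        super().__init__()
        self.edges = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        a = dict(attrs)
        rels = [v for v in (a.get("rel") or "").split() if v in XFN_VALUES]
        if a.get("href") and rels:
            self.edges[a["href"]] = rels


parser = XfnParser()
parser.feed(BLOGROLL)
for url, rels in parser.edges.items():
    print(url, rels)
```

Each entry in `edges` is one edge of a social graph starting at Joe's page. Following the links and repeating the extraction would grow the graph of people.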
34. Open Graph
• protocol originally developed at Facebook (cf. the “Like” button)
• enables web pages to become a rich object in a social graph, i.e. any web
page to have the same functionality as any other object on Facebook
• prefix="og: http://ogp.me/ns#" specifies the OGP vocabulary
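A minimal sketch of reading Open Graph metadata from a page, assuming the usual pattern of meta elements whose property attribute starts with "og:". The page below is a made-up example. og:title, og:type, og:url and og:image are the basic properties of the protocol.

```python
from html.parser import HTMLParser

# Hypothetical page carrying the four basic Open Graph properties.
PAGE = """
<html prefix="og: http://ogp.me/ns#">
<head>
  <title>Example movie page</title>
  <meta property="og:title" content="Example Movie" />
  <meta property="og:type" content="video.movie" />
  <meta property="og:url" content="http://www.example.org/movies/example" />
  <meta property="og:image" content="http://www.example.org/images/example.jpg" />
  <meta name="description" content="Not an Open Graph property" />
</head>
</html>
"""


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        prop = a.get("property") or ""
        if prop.startswith("og:"):
            self.properties[prop] = a.get("content")


parser = OpenGraphParser()
parser.feed(PAGE)
og = parser.properties
print(og)
```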
35. Microformats
• simple, open data formats built upon existing widely adopted standards
• Designed for humans first & machines second
• Highly correlated with semantic XHTML (aka the real world semantics,
lowercase semantic web, lossless XHTML)
• “An evolutionary revolution”, by Ryan King
36. Your first microformat
• You can put a microformat on your website in less than 5 mins
• Example: putting an hCard (online business card) on your site
http://microformats.org/get-started
1. Find your name somewhere on your website
2. Wrap your name in an fn (formatted name)
<span class="fn">Jamie Jones</span>
3. Wrap it all in a vcard (declares that everything inside is the hCard microformat):
<span class="vcard"><span class="fn">Jamie Jones</span></span>
<address class="vcard"><span class="fn">Jamie Jones</span></address>
The address element indicates that the person in the hCard is the contact for the page
<p class="vcard">My name is <span class="fn">Jamie Jones</span> I dig microformats!</p>
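To show what a consumer of this markup might do, here is a small sketch that pulls the formatted name out of any element with class fn nested inside a vcard. It is not a full hCard parser: it ignores all other hCard properties and assumes every opened element is closed.

```python
from html.parser import HTMLParser

SNIPPET = (
    '<p class="vcard">My name is <span class="fn">Jamie Jones</span> '
    "I dig microformats!</p>"
)


class HCardNameParser(HTMLParser):
    """Collect the text of class="fn" elements found inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.stack = []  # class lists of the currently open elements
        self.names = []

    def _inside_fn(self):
        # True when an open "fn" element is a vcard or has a vcard ancestor.
        seen_vcard = False
        for classes in self.stack:
            seen_vcard = seen_vcard or "vcard" in classes
            if seen_vcard and "fn" in classes:
                return True
        return False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append(classes)
        if "fn" in classes and self._inside_fn():
            self.names.append("")  # start collecting a new name

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.names and self._inside_fn():
            self.names[-1] += data


parser = HCardNameParser()
parser.feed(SNIPPET)
names = [n.strip() for n in parser.names]
print(names)
```

The same markup stays perfectly readable for humans, which is the "humans first, machines second" idea: the class names add meaning without changing what the visitor sees.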
37. HTML Microdata
• allows machine-readable data to be embedded in HTML documents in an easy-to-write manner, with an unambiguous parsing model
• compatible with numerous data formats, including RDF and JSON
• consists of groups of name-value pairs: the groups are called items, and each name-value pair is a property
• itemscope is used to create an item
• itemprop is used to add a property to an item
• Microdata DOM API
• http://www.w3.org/TR/microdata/
38. schema.org
• Google, Yahoo!, Bing
• a common vocabulary for
structured data markup on web
pages
• improve how sites appear in major
search engines
• Google rich snippets of reviews, people, recipes, events in 2009
• superseded Microformats
39. Add schema.org to HTML using Microdata
<!-- plain HTML, without structured data -->
<div>
  <h1>Avatar</h1>
  <span>Director: James Cameron (born August 16, 1954)</span>
  <span>Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html">Trailer</a>
</div>
<!-- the same content marked up with schema.org Microdata -->
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span> (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
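To illustrate what a search engine can recover from such markup, the sketch below walks the Avatar example and rebuilds the nested items. It is a simplification, not the W3C Microdata algorithm: it assumes every element is explicitly closed, keeps a single value per property, and takes the raw href as the value of a link.

```python
from html.parser import HTMLParser

MARKUP = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect items (itemscope) and their properties (itemprop)."""

    def __init__(self):
        super().__init__()
        self.items = []       # top-level items
        self.open_items = []  # items whose itemscope element is still open
        self.elements = []    # one record per currently open element

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        owner = self.open_items[-1] if self.open_items else None
        record = {"item": None, "prop": None, "owner": owner, "text": []}
        if "itemscope" in a:
            item = {"type": a.get("itemtype"), "properties": {}}
            if prop and owner is not None:
                owner["properties"][prop] = item  # nested item as a value
            else:
                self.items.append(item)
            self.open_items.append(item)
            record["item"] = item
        elif prop and owner is not None:
            if tag == "a" and a.get("href"):
                owner["properties"][prop] = a["href"]  # links use their URL
            else:
                record["prop"] = prop  # value is the element's text
        self.elements.append(record)

    def handle_data(self, data):
        for record in self.elements:
            if record["prop"]:
                record["text"].append(data)

    def handle_endtag(self, tag):
        if not self.elements:
            return
        record = self.elements.pop()
        if record["prop"]:
            text = "".join(record["text"]).strip()
            record["owner"]["properties"][record["prop"]] = text
        if record["item"] is not None:
            self.open_items.pop()


parser = MicrodataParser()
parser.feed(MARKUP)
movie = parser.items[0]
print(movie["type"], movie["properties"])
```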
40. RDFa
• another syntax for RDF
• HTML5 extension for marking up People, Places, Events, Recipes, Reviews: specifying that a piece of text is the name of a product, person, or event = “adding semantic markup”
• RDFa 1.1 = specified for XHTML and HTML5 (and for any XML-based language, e.g., SVG)
• RDFa Lite = “a small subset of RDFa consisting of a few attributes that may be applied to most simple to moderate structured data markup tasks.”
• Publish your data as Linked Data through RDFa --> link to other URIs (others can link to your HTML+RDFa)
http://rdfa.info/play/
41. Quick Structured Data for Your Website
43. Knowledge Graph
• graph that understands real-world entities
and their relationships to one another:
things, not strings
• more than 500 million things
• more than 3.5 billion facts about and
relationships between these different
things
• tuned based on what people search for
• http://www.google.com/insidesearch/features/search/knowledge.html
results in 2013
48. Question?
For which things on the social web would more
vocabularies for embedded semantics be needed
(besides what we have already seen)?
49. Hands-on Teaser
• mining data in various social web formats
• see the differences in what each of the formats can contain & what purpose they serve
• start: simple search where we pull in some XFN data and visualise a graph of people that we find on a website
• check: software you will be working with on the website
image source: http://www.flickr.com/photos/bionicteaching/1375254387/