26. step 1
term extraction from event descriptors
generates “high precision” queries
e.g. “andrew bird, opening gala,
celebrate brooklyn, prospect park”
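A minimal sketch of step 1, assuming the event descriptor arrives as a structured record (the field names and the helper below are illustrative, not from the talk): the descriptor's terms are simply concatenated into one conjunctive, high-precision query.

```python
def build_precision_query(event: dict) -> str:
    """Join the event descriptor's terms into one conjunctive query."""
    # Illustrative field names; a real descriptor schema may differ.
    fields = ("performer", "title", "series", "venue")
    terms = [event[f].lower() for f in fields if event.get(f)]
    return ", ".join(terms)

event = {
    "performer": "Andrew Bird",
    "title": "Opening Gala",
    "series": "Celebrate Brooklyn",
    "venue": "Prospect Park",
}
print(build_precision_query(event))
# → andrew bird, opening gala, celebrate brooklyn, prospect park
```

Because every descriptor term must co-occur, such queries rarely match off-topic content, which is what makes the resulting corpus safe to mine in step 2.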
27. step 2
use the “high precision” corpus to generate
more general queries to improve recall
e.g. “andrew bird concert”, “state farm
insurance”
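One way to sketch step 2 (an assumption about the mechanics; the talk does not specify the exact method): rank the terms that occur most often in the high-precision corpus and pair them with a seed term to form shorter, broader queries.

```python
from collections import Counter

def recall_queries(precision_docs, seed_terms, top_k=5):
    """Pair each seed term with the corpus's most frequent terms
    to form shorter, recall-oriented queries."""
    counts = Counter()
    for doc in precision_docs:
        counts.update(doc.lower().split())
    # Tiny illustrative stopword list; a real system would use a fuller one.
    stop = {"the", "a", "at", "in", "is", "to", "and", "of"}
    frequent = [w for w, _ in counts.most_common() if w not in stop][:top_k]
    return [f"{seed} {w}" for seed in seed_terms for w in frequent]

docs = ["concert tonight", "concert in the park", "free concert"]
print(recall_queries(docs, ["andrew bird"], top_k=1))
# → ['andrew bird concert']
```

The broader queries trade precision for coverage, which is exactly the tension the next slide spells out.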
28. recall-oriented queries
Benefits:
- Works cross-site
- Works with short content
Challenges:
- Introduces noise
- Potentially large set of queries
29. post-filtering
use known event model (topics, time,
location)
keep queries whose result set
matches the known model
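The temporal part of this post-filter can be sketched as follows (a toy version under assumed parameters; the window size, threshold, and function name are mine, not the talk's): a query survives only if enough of its results cluster near the known event time.

```python
from datetime import datetime, timedelta

def matches_event_time(result_times, event_time,
                       window_hours=24, min_fraction=0.5):
    """Keep a query if at least min_fraction of its results fall
    within window_hours of the known event time."""
    window = timedelta(hours=window_hours)
    inside = sum(1 for t in result_times if abs(t - event_time) <= window)
    # Short-circuit on an empty result set to avoid division by zero.
    return bool(result_times) and inside / len(result_times) >= min_fraction

event = datetime(2011, 6, 9, 19, 0)
near = [datetime(2011, 6, 9, h) for h in (12, 18, 21)]
far = [datetime(2011, 6, 1), datetime(2011, 6, 20)]
print(matches_event_time(near, event))  # → True
print(matches_event_time(far, event))   # → False
```

Analogous checks against the event's known topics and location would filter on the other two dimensions of the model.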
30. for example...
[chart: daily result counts (0–120), 6/7/11–6/13/11, for the queries “andrew bird concert” vs. “state farm insurance”]
33. overview
Multi-site content
(WSDM 2012)
Vox Civitas
Multiplayer
35. research questions
can Twitter content around broadcast
news events inform journalistic inquiry?
what insights and analyses can we
enable through visual analytic tools?
[with postdoctoral fellow Nick Diakopoulos]
36. supporting analysis
direct attention to relevant information
automatic content analysis for filtering
– relevance
– uniqueness / novelty
– sentiment
– keyword extraction
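Two of the filtering dimensions above can be sketched with simple set overlap (a toy stand-in for the system's actual features, which are more elaborate): relevance as Jaccard similarity to the event's terms, and novelty as dissimilarity to tweets already seen.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two word sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def score_tweet(tweet, event_terms, seen):
    """Toy relevance/novelty scores for filtering a tweet stream."""
    words = set(tweet.lower().split())
    relevance = jaccard(words, event_terms)
    # Novel if unlike every previously seen word set.
    novelty = 1.0 - max((jaccard(words, s) for s in seen), default=0.0)
    return relevance, novelty

rel, nov = score_tweet("Obama wins debate", {"debate", "obama"}, [])
print(rel, nov)
```

Sentiment and keyword extraction would add further per-tweet scores; ranking on these signals is what directs attention to the relevant, non-redundant items.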
38. how to evaluate?
directly evaluate the output of the
algorithms (quantitative)
deep, extensive evaluation of users’
interaction with the system (qualitative)
read more: Olsen (UIST ’07)
Naaman (MTAP ’12)
39. Vox evaluation goals
• How effective is the tool for generating story ideas?
• What kinds of insights and analyses are
supported?
• What are its shortcomings, and how are its
features used?