This talk provides an overview of my work towards enabling Data DJs. That is, enabling users to create, remix, record, and share their data analyses as easily as DJs make and share mixes. The talk touches on a variety of topics including linked data, scientific workflows, provenance, enterprise mashups, and Facebook. It draws these topics into a unified research framework and discusses future research directions.
1. Paul Groth | Vrije Universiteit Amsterdam | pgroth@few.vu.nl Image: http://www.flickr.com/photos/tomk32/2988993409 / All images are under a Creative Commons license
22. Title: BLASTP with simplified results returned Description: This workflow performs a blastp search on a protein sequence, extracts the sequence ids within the blast report, and retrieves the corresponding sequences.
50. The Community http://www.flickr.com/photos/dunechaser/142079357/sizes/o/
Speaker notes
Title: I want to be a Data DJ!
Because I want an audience… not really…
Records
Simple components (effects, fades) chained together: a workflow
Whole albums of DJ creativity (throw on a new record, backtrack): fast to novelty
You can continually improve because it’s easy to revisit and remix
The ability to remix enables combinatorial innovation
Intuitively…
1800: Interchangeable parts
1900: Gasoline engine
1960: Integrated circuits
1995-now: Internet
“Web services” is lowercase because this is not about SOA… Flickr, Google Maps, Twitter,
Not easy enough for the user… or for developers
Records = data and data discovery
Turntables = components and composition
Recording = capturing what’s gone on
Data
Common APIs: SPARQL and RDF
Things like Factual and YQL
Machine-readable data on the web
Common APIs: SPARQL and APIs
I see that there is a technique called “drive across country” and I go ahead and import it.
Also, if we extract information, it is exposed as its own RDF triple (see the references field).
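The triple model behind this can be sketched in a few lines of Python. The names and the pattern-matching helper below are hypothetical, purely to illustrate how extracted facts become queryable (subject, predicate, object) triples in the spirit of a SPARQL basic graph pattern:

```python
# Minimal sketch of the RDF triple model: extracted facts become
# (subject, predicate, object) triples. All names are hypothetical.
triples = {
    ("ex:TripPlan", "ex:usesTechnique", "ex:DriveAcrossCountry"),
    ("ex:TripPlan", "dc:references", "ex:RoadAtlas"),
    ("ex:DriveAcrossCountry", "rdf:type", "ex:Technique"),
}

def match(pattern, triples):
    """Return triples matching an (s, p, o) pattern; None is a
    wildcard, as in a SPARQL basic graph pattern."""
    s, p, o = pattern
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# "Which techniques does the trip plan use?"
print(match(("ex:TripPlan", "ex:usesTechnique", None), triples))
```

A real deployment would of course use an RDF store with a SPARQL endpoint rather than an in-memory set, but the data model is the same.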
RDF Query Answering using Evolutionary Algorithms
Fault-tolerance
Data movement
Provenance tracking
Validation
Component discovery
Reproduction
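The core idea of chaining simple components while tracking provenance can be sketched as follows; this is an illustration under assumed names, not any real workflow system, and the components are stand-ins for the effects a DJ might chain:

```python
import hashlib
import json

def run_workflow(data, components):
    """Run components in sequence, recording a provenance entry
    (component name, input/output hashes) for each step."""
    provenance = []
    for comp in components:
        result = comp(data)
        provenance.append({
            "component": comp.__name__,
            "input_hash": hashlib.sha256(repr(data).encode()).hexdigest()[:8],
            "output_hash": hashlib.sha256(repr(result).encode()).hexdigest()[:8],
        })
        data = result
    return data, provenance

# Hypothetical components, chained like effects on a mix
def uppercase(x):
    return x.upper()

def reverse(x):
    return x[::-1]

result, prov = run_workflow("data dj", [uppercase, reverse])
print(result)                       # JD ATAD
print(json.dumps(prov, indent=2))   # one provenance record per step
```

The hashes make each step's inputs and outputs checkable later, which is the seed of the reproduction and validation concerns listed above.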
A proliferation of box-and-arrow diagrams
Natural instruction…
How do people “naturally” describe workflows? A study with myExperiment workflows
- Workflow for estimating the maximum accuracy of a model for a set of test data
Linked data + mashup (workflow) = a cool new application, but then what? The need for provenance
The iPod has 451 parts provided by 10 suppliers… but Apple trusts all of them. http://pcic.merage.uci.edu/papers/2007/AppleiPod.pdf http://people.ischool.berkeley.edu/~hal/people/hal/NYTimes/2007-06-28.html The problem is not mixing and matching components; the problem is the need for provenance.
Get applications to record process documentation! Log data! But the key here is to structure that data…
Guarantees that documentation will be captured… Attributable, finalizable, process-reflecting. You can also just use log4j.
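A rough sketch of what structured process documentation looks like, as opposed to free-text log lines: each record is attributable (who did it) and process-reflecting (what step, using and generating what). The field names and values below are hypothetical; real provenance models are considerably richer:

```python
import json
import time

def record_step(log, actor, activity, inputs, outputs):
    """Append one structured documentation record: who performed
    which activity, on which inputs, producing which outputs."""
    log.append({
        "actor": actor,        # attributable
        "activity": activity,  # process-reflecting
        "used": inputs,
        "generated": outputs,
        "time": time.time(),
    })

documentation = []
record_step(documentation, "pgroth", "blastp_search",
            inputs=["protein.fasta"], outputs=["blast_report.xml"])
record_step(documentation, "pgroth", "extract_ids",
            inputs=["blast_report.xml"], outputs=["ids.txt"])

print(json.dumps(documentation, indent=2))
```

Because step two's input is step one's output, the records chain together into a queryable account of the process, which plain unstructured logging cannot guarantee.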
Say it’s an
Condor DAG… number of jobs
How many people have cell phones? How many people understand their cell phone contract?
I trust the contract because people I know have told me the
Mechanism design, trust because of enforcement
Trust based on the artifact itself
Availability of support for example
Trust based on experience… what you’ve seen before
Note that this is not to say these can’t work together