Director, Scholarly Communications, Microsoft Research Connections
Corporate VP, Microsoft Research Connections
http://research.microsoft.com/connections/
GEPS2011
Envisioning a New Era of Research Reporting
Imagine…
•   Live research reports that had multiple end-user ‘views’ and which could
    dynamically tailor their presentation to each user
•   An authoring environment that absorbs and encapsulates research workflows
    and outputs from the lab experiments
•   A report that can be dropped into an electronic lab workbench in order to
    reconstitute an entire experiment
•   A researcher working with multiple reports on a Surface and having the
    ability to mash up data and workflows across experiments
•   The ability to apply new analyses and visualizations and to perform new
    in silico experiments
[Diagram labels: Reproducible Research, Interactive Data, Collaboration, Dynamic Documents, Reputation & Influence]
Words & Pictures
• Papers/reports today describe chemical reactions/entities in a variety
  of ways:
    –   common (or brand-name) labels
    –   identifiers and shorthand notations
    –   chemical formulae
    –   two- (and three-) dimensional graphical images of molecular structure.
• Describing chemical data becomes an exercise in typesetting and/or
  graphics, and cross- and re-referencing existing chemical entities is
  labor intensive.
    – The resulting text is usually interpretable by humans but chemical data are
      lost in the process, making it difficult to programmatically extract
      meaningful information from such reports.
• The goals of Chem4Word are to:
    – simplify the task of authoring a chemical document,
    – do so in a way that produces a semantically meaningful document, facilitating
      downstream tasks such as publishers' workflows, entity extraction, and semantic
      applications.
Chemistry Add-in for Word
                          aka Chem4Word
• Chem4Word allows chemists to create, edit and manipulate
  chemistry in the Word environment, by
   –   Providing a built-in dictionary of chemical structures
   –   Enabling online lookup of further structures via web services (e.g. PubChem)
   –   Facilitating linking/embedding of chemical structures inside a Word document
   –   Supporting modification of chemical structures & representations of those structures
• Authoring is backed by semantic data in
  Chemical Markup Language (CML), enabling:
   – novel functionality in data checking during the authoring process
   – chemistry-centric article reading support
   – data-mining applications.


• Open source project (Outercurve Foundation); Apache 2.0 license
• ~500K downloads to date
Word UI Extensibility

•   Ribbon
•   Task Pane
•   Gallery
•   Templates
•   Recognizers
•   Applications
FILE FORMATS:
    OFFICE OPEN XML DOCUMENTS

Thanks to: http://www.slideshare.net/HollowKnight/a-quick-tour-of-open-xml-format
Binary format vs. Office Open XML format
THEY LOOK IDENTICAL, BUT …
• Office Open XML is a ZIP file …
• … that contains XML parts
• Images are stored in native format (JPEG, PNG, GIF, …)
Programmer View of Open XML Files

• ZIP Archive
• Document Parts
   – XML Parts
   – Binary Parts
   – Typed by content type (media types as defined in RFC 2616)
• Relationships
   – Connections between parts
• Content Type Stream
   – A specially-named stream
   – Defines mappings from part names to content types
   – Not itself a part, not URI addressable

• Folder structure for convenience only
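
The package layout described above can be explored with nothing more than a ZIP reader and an XML parser. The following is a minimal sketch, not part of the original slides: it builds a tiny illustrative package in memory (the helper `build_sample_package` and its contents are assumptions for demonstration, not a complete Word document), then resolves each part's content type from the content type stream and lists the package-level relationships.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC_TYPE = ("application/vnd.openxmlformats-officedocument."
            "wordprocessingml.document.main+xml")


def build_sample_package() -> bytes:
    """Build a tiny illustrative package in memory (not a complete Word file)."""
    content_types = (
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="png" ContentType="image/png"/>'
        f'<Override PartName="/word/document.xml" ContentType="{DOC_TYPE}"/>'
        '</Types>'
    )
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        '<Relationship Id="rId1" Target="word/document.xml" '
        'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
        'relationships/officeDocument"/>'
        '</Relationships>'
    )
    document = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body/></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", content_types)
        zf.writestr("_rels/.rels", rels)
        zf.writestr("word/document.xml", document)
        zf.writestr("word/media/image1.png", b"\x89PNG placeholder bytes")
    return buf.getvalue()


def describe_package(data: bytes) -> dict:
    """Map each part name to the content type declared in [Content_Types].xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        types = ET.fromstring(zf.read("[Content_Types].xml"))
        defaults = {d.get("Extension").lower(): d.get("ContentType")
                    for d in types.findall(f"{{{CT_NS}}}Default")}
        overrides = {o.get("PartName"): o.get("ContentType")
                     for o in types.findall(f"{{{CT_NS}}}Override")}
        parts = {}
        for name in zf.namelist():
            if name == "[Content_Types].xml":
                continue  # the content type stream is not itself a part
            part_name = "/" + name
            extension = part_name.rsplit(".", 1)[-1].lower()
            # An Override for the exact part name wins over the extension Default.
            parts[part_name] = overrides.get(part_name, defaults.get(extension))
        return parts


def root_relationships(data: bytes) -> list:
    """Return (type, target) pairs from the package-level relationships part."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        rels = ET.fromstring(zf.read("_rels/.rels"))
        return [(r.get("Type"), r.get("Target"))
                for r in rels.findall(f"{{{REL_NS}}}Relationship")]


if __name__ == "__main__":
    package = build_sample_package()
    for part, content_type in describe_package(package).items():
        print(part, "->", content_type)
    print(root_relationships(package))
```

Because parts are found through content types and relationships rather than by path, a consumer written this way does not depend on the folder layout inside the ZIP archive.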
Multiple ‘views’ backed by
 a single CML data file
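
As a rough illustration of the idea that several presentations can be derived from one CML record, the sketch below parses a small hand-written CML fragment and produces two text views of it, a name and a molecular formula. The fragment, the use of `hydrogenCount` for implicit hydrogens, and the function names `name_view` and `formula_view` are assumptions made for this example; it does not show how Chem4Word itself renders its views.

```python
import xml.etree.ElementTree as ET
from collections import Counter

CML_NS = "http://www.xml-cml.org/schema"

# Hand-written example data (an assumption for illustration): ethanol with
# implicit hydrogens recorded on each heavy atom.
SAMPLE_CML = f"""
<cml xmlns="{CML_NS}">
  <molecule id="m1">
    <name>ethanol</name>
    <atomArray>
      <atom id="a1" elementType="C" hydrogenCount="3"/>
      <atom id="a2" elementType="C" hydrogenCount="2"/>
      <atom id="a3" elementType="O" hydrogenCount="1"/>
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1"/>
      <bond atomRefs2="a2 a3" order="1"/>
    </bondArray>
  </molecule>
</cml>
"""


def name_view(molecule: ET.Element) -> str:
    """View 1: the human-readable name stored with the molecule."""
    return molecule.findtext(f"{{{CML_NS}}}name", default="")


def formula_view(molecule: ET.Element) -> str:
    """View 2: a molecular formula computed from the same atom data."""
    counts = Counter()
    for atom in molecule.iterfind(f"{{{CML_NS}}}atomArray/{{{CML_NS}}}atom"):
        counts[atom.get("elementType")] += 1
        hydrogens = int(atom.get("hydrogenCount", "0"))
        if hydrogens:
            counts["H"] += hydrogens
    elements = list(counts)
    if "C" in elements:
        # Hill order: carbon, then hydrogen, then the rest alphabetically.
        ordered = (["C"] + [e for e in ("H",) if e in elements]
                   + sorted(e for e in elements if e not in ("C", "H")))
    else:
        ordered = sorted(elements)
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in ordered)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_CML)
    mol = root.find(f"{{{CML_NS}}}molecule")
    print(name_view(mol))
    print(formula_view(mol))
```

Both views read the same element tree, so editing the underlying data changes every view consistently, which is the property the slide title points at.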
EXAMPLE OF GETTING CML DATA
BACK OUT OF A DOCUMENT
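
One way such an extraction could look is sketched below. It assumes the CML is stored in the package as a standalone XML part whose root element is in the CML namespace (for instance a custom XML part); that storage detail is an assumption here, not something the slides state. The sample document built by the hypothetical `build_sample_docx` helper is illustrative only and is not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"


def build_sample_docx() -> bytes:
    """Build an illustrative package with one CML part (assumed layout)."""
    cml = (
        f'<cml xmlns="{CML_NS}"><molecule id="m1">'
        '<name>benzene</name></molecule></cml>'
    )
    item_props = (
        '<ds:datastoreItem xmlns:ds="http://schemas.openxmlformats.org/'
        'officeDocument/2006/customXml" ds:itemID="{00000000-0000-0000-'
        '0000-000000000000}"/>'
    )
    document = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body/></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document)
        zf.writestr("customXml/item1.xml", cml)
        zf.writestr("customXml/itemProps1.xml", item_props)
    return buf.getvalue()


def extract_cml_parts(docx_bytes: bytes) -> dict:
    """Return {part name: root element} for every part rooted in the CML namespace."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".xml"):
                continue
            try:
                root = ET.fromstring(zf.read(name))
            except ET.ParseError:
                continue  # skip anything that is not well-formed XML
            # Identify CML by namespace rather than by folder, since the
            # folder structure inside the package is for convenience only.
            if root.tag.startswith(f"{{{CML_NS}}}"):
                found[name] = root
    return found


def molecule_names(cml_root: ET.Element) -> list:
    """Collect the <name> text of every molecule in a CML tree."""
    return [m.findtext(f"{{{CML_NS}}}name", default="")
            for m in cml_root.iter(f"{{{CML_NS}}}molecule")]


if __name__ == "__main__":
    for part, root in extract_cml_parts(build_sample_docx()).items():
        print(part, molecule_names(root))
```

To try this on a real file, read its bytes with `open(path, "rb").read()` and pass them to `extract_cml_parts`; whether any parts are found depends on how the document was authored.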
To conclude…

Current publishing … is broken for data-rich science
• Data publication difficult and unsupported
• Insufficient data to fully support research

With Chem4Word … the cycle is closed
• Data preparation integrated into user workflow
• Open Standards promote Open Semantic Science
Important Details
• Project Site
  – http://research.microsoft.com/chem4word

• Binaries and source code
  – http://chem4word.codeplex.com
• Facebook Page
  – http://www.facebook.com/groups/186300551397797/
• Outercurve Foundation
  – http://www.outercurve.org
Contributors
University of Cambridge   Microsoft Research
• Peter Murray-Rust       • Alex D. Wade
• Jim Downing             • Savas Parastatidis
• Joe Townsend            • Oscar Naim
                          • Pablo Fernicola
                          • Murray Sargent
                          • Geraldine Wade
                          • Tola Chhoeun
                          • Anthony Hanses
                          • Jim McGill

Chem4Word Wade

  • 1. Director, Scholarly Communications, Microsoft Research Connections; Corporate VP, Microsoft Research Connections
  • 3. Envisioning a New Era of Research Reporting Imagine… • Live research reports that had multiple end-user ‘views’ and which could dynamically tailor their presentation to each user • An authoring environment that absorbs and encapsulates research workflows and outputs from the lab experiments • A report that can be dropped into an electronic lab workbench in order to reconstitute an entire experiment • A researcher working with multiple reports on a Surface and having the ability to mash up data and workflows across experiments • The ability to apply new analyses and visualizations and to perform new in silico experiments [Diagram labels: Reproducible Research; Interactive Data; Collaboration; Dynamic Documents; Reputation & Influence]
  • 4. Words & Pictures • Papers/reports today describe chemical reactions/entities in a variety of ways: – common (or brand-name) labels – identifiers and shorthand notations – chemical formulae – two- (and three-) dimensional graphical images of molecular structure. • Describing chemical data becomes an exercise in typesetting and/or graphics, and cross- and re-referencing existing chemical entities is labor intensive. – The resulting text is usually interpretable by humans but chemical data are lost in the process, making it difficult to programmatically extract meaningful information from such reports. • The goals of Chem4Word are to: – simplify the task of authoring a chemical document, – do so in a way that produces a semantically meaningful document, facilitating downstream tasks such as publishers' workflows, entity extraction, and semantic applications.
  • 5. Chemistry Add-in for Word aka Chem4Word • Chem4Word allows chemists to create, edit and manipulate chemistry in the Word environment, by – Providing a built-in dictionary of chemical structures – Enabling online lookup of further structures via web services (e.g. PubChem) – Facilitating linking/embedding chemical structures inside a Word document – Enabling modification of chemical structures & representations of those structures • Authoring is backed by semantic data in Chemical Markup Language (CML), enabling: – novel functionality in data checking during the authoring process – chemistry-centric article reading support – data-mining applications. • Open source project (Outercurve Foundation); Apache 2.0 license • ~500K downloads to date
  • 6. Word UI Extensibility • Ribbon • Task Pane • Gallery • Templates • Recognizers • Applications
  • 7. FILE FORMATS: OFFICE OPEN XML DOCUMENTS Thanks to: http://www.slideshare.net/HollowKnight/a-quick-tour-of-open-xml-format
  • 8. Binary format / Office Open XML format
  • 10. Office Open XML format
  • 13. Office Open XML format
  • 14. Office Open XML is a ZIP file …
  • 16. Images stored in native format (JPEG, PNG, GIF, …)
  • 17. Programmer View of Open XML Files • ZIP Archive • Document Parts – XML Parts – Binary Parts – Typed (RFC 2616) • Relationships – Connections between parts • Content Type Stream – A specially-named stream – Defines mappings from part names to content types – Not itself a part, not URI addressable • Folder structure for convenience only
  • 34. Multiple ‘views’ backed by a single CML data file
  • 35. EXAMPLE OF GETTING CML DATA BACK OUT OF A DOCUMENT
  • 48. To conclude: • Current publishing … is broken for data-rich science – Data publication difficult and unsupported – Insufficient data to fully support research • With Chem4Word … the cycle is closed – Data preparation integrated into user workflow – Open Standards promote Open Semantic Science
  • 49. Important Details • Project Site – http://research.microsoft.com/chem4word • Binaries and source code – http://chem4word.codeplex.com • Facebook Page – http://www.facebook.com/groups/186300551397797/ • Outercurve Foundation – http://www.outercurve.org
  • 50. Contributors (University of Cambridge and Microsoft Research) • Peter Murray-Rust • Alex D. Wade • Jim Downing • Savas Parastatidis • Joe Townsend • Oscar Naim • Pablo Fernicola • Murray Sargent • Geraldine Wade • Tola Chhoeun • Anthony Hanses • Jim McGill
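
Slides 14 to 17 describe an Office Open XML document as a ZIP archive of parts plus a content type stream, and slide 35 refers to getting CML data back out of a document. The following is a minimal Python sketch of both ideas using only the standard library; it is not the code shown in the talk. It assumes that the chemistry is stored as custom XML parts under `customXml/` whose root element is in the CML namespace `http://www.xml-cml.org/schema`; that storage layout is an assumption, not something the slides confirm. The helper `build_demo_package` is hypothetical and exists only to make the example self-contained.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the content type stream defined by the Open Packaging Conventions.
CONTENT_TYPES_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
# CML namespace; assumed here to identify chemistry parts inside the package.
CML_NS = "http://www.xml-cml.org/schema"


def content_types(package):
    """Read [Content_Types].xml and map extensions / part names to content types."""
    root = ET.fromstring(package.read("[Content_Types].xml"))
    mapping = {}
    for el in root:
        if el.tag == "{%s}Default" % CONTENT_TYPES_NS:
            mapping["*." + el.get("Extension")] = el.get("ContentType")
        elif el.tag == "{%s}Override" % CONTENT_TYPES_NS:
            mapping[el.get("PartName")] = el.get("ContentType")
    return mapping


def extract_cml(package):
    """Return {part name: raw bytes} for custom XML parts rooted in the CML namespace."""
    found = {}
    for name in package.namelist():
        # Assumption: chemistry lives in customXml/*.xml parts.
        if not (name.startswith("customXml/") and name.endswith(".xml")):
            continue
        data = package.read(name)
        try:
            root = ET.fromstring(data)
        except ET.ParseError:
            continue  # skip anything that is not well-formed XML
        if root.tag.startswith("{" + CML_NS + "}"):
            found[name] = data
    return found


def build_demo_package():
    """Hypothetical helper: build a tiny in-memory package for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr(
            "[Content_Types].xml",
            '<Types xmlns="%s">'
            '<Default Extension="xml" ContentType="application/xml"/>'
            '<Override PartName="/word/document.xml" ContentType="application/'
            'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
            "</Types>" % CONTENT_TYPES_NS,
        )
        z.writestr(
            "word/document.xml",
            '<w:document xmlns:w="http://schemas.openxmlformats.org/'
            'wordprocessingml/2006/main"/>',
        )
        z.writestr(
            "customXml/item1.xml",
            '<cml xmlns="%s"><molecule id="m1"/></cml>' % CML_NS,
        )
    buf.seek(0)
    return buf
```

To try it on a real file, open the document with `zipfile.ZipFile("report.docx")` (a placeholder file name) and pass the result to `content_types` and `extract_cml`. If a given document stores its chemistry differently, the `customXml/` filter is the part to adjust.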

Editor's Notes

  1. We’ll start by taking a look at two documents. The one on the left is a binary document, representative of the kind that is ubiquitous today. A search on the internet reveals that .DOC is the most widely deployed document format on the web, not counting what exists behind corporate firewalls. There are literally billions of documents stored in binary format. On the right we have the same document after it has been migrated to Ecma Open XML format.
  2. This is what the binary document looks like as rendered by Microsoft Word. It is a biography of William Shakespeare. The contents of this biography come from Wikipedia.
  3. Here we see Word 2007 rendering the Office Open XML version of the same document. As you can see, it looks the same as the one you saw before.
  4. However, what is inside the file is completely different. This is what the binary version of the document looks like. This type of file requires specialized, one-of-a-kind software to read it. It is also complex, and it is quite likely that any programmer trying to read and write these files will make a mistake.
  5. Now let’s take a look inside the Office Open XML version of the document. XML is a universal data interchange format that has proven itself in the enterprise and on the web. As you can see, the contents of the document are in human-readable XML format. It’s still XML, and you have to play by the rules of XML; however, it will work with any of the XML tools that exist on the widest range of platforms.
  6. That is because, as we shall see, Open XML is an industry-standard ZIP file
  7. that is used to store XML parts.
  8. Open XML keeps media in its native format such as JPEG, PNG, GIF, etc.
  9. This is Word, note the added “Chemistry” tab at the top
  10. This is the Chemistry “ribbon”
  11. There are multiple ways to insert chemistry into a document. 1. The built-in Chemistry “gallery” – a handy place to store the structures you use often
  12. 2. Insert from a local (CML) file. 3. Insert from a web service such as OPSIN or PubChem
  13. Perform a string search against OPSIN
  14. And the CML file is sent from OPSIN, and inserted into the document
  15. To change the way that structure looks in the document, you can double-click it to launch the 2D editor, remove the labels on the atoms, flip, rotate, etc.
  16. And that will modify the CML file and the image in the document
  17. You can also change to other “views” of the molecule
  18. The Chemistry Navigator pane shows you all of the chemical objects in the current file (and lets you jump to that section of the document)
  19. The Chemistry Navigator also allows you to create a Linked Chemistry Zone (i.e. copy by reference to create another chemical reference in the document which is backed by the same CML structure), or an Unlinked Chemistry Zone (i.e. copy by value to make a new version of the underlying CML, so that it can be modified independent of the first one)
  20. Linked Chemistry Zones allow you to have as many references as you want in the document that are all backed by a single CML molecule, which is stored in the DOCX (ZIP) file