Keynote presented at the
Phenotype Foundation first annual
meeting.
Amsterdam, January 18, 2016
Prof. Chris Evelo
Department Bioinformatics –
BiGCaT
Maastricht University
@Chris_Evelo
The use and needs of data sharing in biology
Data
• Things we know
• Things we measure
Knowledge is hard to get
And it doesn’t even play it…
But you can gamify collection
Since we structure it, it can be easier to store
Sharing Data
I would like to exploit common genotype-phenotype relations
between Alzheimer’s Disease and Huntington’s Disease…
I need to combine AD and HD data…
I can help with
that!
I can help with
that!
Source: Marcos Roos
Who wants to share data?
• People who want to use data
• Funders
• Publishers
• But the researchers?
You only need MS-Excel
People hide data
• I did all this work, I want to reuse it myself
• They don’t need this part, might be my next…
• I might get a patent on this
• Or… It needs a patent to be valuable
• I can’t even patent because ...
How?
• Don’t add specifics
(ohh those really were knockout cells, but..)
• Leave out important steps
(I did these PCRs, why show the array)
• And “we used an approach slightly modified from…”
• ...
FAIR data
• Findable
• Accessible
• Interoperable
• Reusable
Sharing Data
Source: Marcos Roos
???
Here’s my data,
have fun!
Here’s my data,
have fun!
Sharing Linkable Data
Source: Marcos Roos
I can go straight to answering my questions with data from
multiple data owners!
Patients will be so pleased with this speed-up!
Here’s my
Linked Data,
have fun!
Here’s my
Linked Data,
have fun!
Really?
From terms “liver, hepar, hepatic tissue”
To URIs:
http://identifiers.org/tissueont1/liver
http://identifiers.org/tissueont2/hepar
….
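A minimal sketch of the step from free-text terms to URIs, in Python. The lookup table is hypothetical: the two URIs are the placeholder examples from this slide, and assigning "hepatic tissue" to the first ontology is an assumption made only for illustration.

```python
# Hypothetical lookup table: free-text tissue labels -> ontology URIs.
# The URIs are the placeholder examples from the slide, not verified identifiers.
TERM_TO_URI = {
    "liver": "http://identifiers.org/tissueont1/liver",
    "hepatic tissue": "http://identifiers.org/tissueont1/liver",  # assumed synonym
    "hepar": "http://identifiers.org/tissueont2/hepar",
}


def to_uri(label):
    """Return the URI for a free-text label, or None when the label is unknown."""
    key = " ".join(label.lower().split())  # normalise case and whitespace
    return TERM_TO_URI.get(key)


def annotate(labels):
    """Map each label to its URI; unknown labels map to None."""
    return {label: to_uri(label) for label in labels}
```

Note that "liver" and "hepar" still end up with two different URIs from two ontologies, so replacing terms by URIs does not by itself say that they mean the same thing.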
Just a first step
And we didn’t even get that…
Reality:
Ontology-inspired pull-down menus
Nothing is ever “same-as”
• We may need more meaningful predicates
• Or learn to use them better
• We need lenses, context matters
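One way to read "lenses" is that each mapping carries its own predicate and the user's context decides which predicates count as equivalence. The Python sketch below assumes that reading. The term names, the two lens names and the mapping list are all hypothetical; only the SKOS predicates skos:exactMatch and skos:closeMatch are real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str
    predicate: str  # how strongly source and target are related
    target: str


# Hypothetical mappings between made-up tissue terms.
MAPPINGS = [
    Mapping("tissueont1:liver", "skos:exactMatch", "tissueont2:hepar"),
    Mapping("tissueont1:liver", "skos:closeMatch", "tissueont3:hepatocyte_culture"),
]

# A lens is the set of predicates accepted as "the same" in a given context.
LENSES = {
    "strict": {"skos:exactMatch"},
    "relaxed": {"skos:exactMatch", "skos:closeMatch"},
}


def equivalents(term, lens):
    """Return the terms treated as equivalent to `term` under the named lens."""
    accepted = LENSES[lens]
    found = set()
    for m in MAPPINGS:
        if m.predicate not in accepted:
            continue
        if m.source == term:
            found.add(m.target)
        elif m.target == term:
            found.add(m.source)
    return found
```

Under the "strict" lens the liver term only matches "tissueont2:hepar"; under the "relaxed" lens the cultured hepatocytes are included as well, which may be fine for one question and wrong for another.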
Too many standards
Source XKCD: https://xkcd.com/927/
And ontologies…
But they are there for a reason!
Research fields have different focus/needs
Don’t standardise, map!
We need mapping
• Ontology mapping: NCBO
• Identifier mapping: BridgeDb, IMS
• Identity (text) mapping: Conceptwiki?
• Chemistry mapping: CRS??
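Identifier mapping can be pictured as following cross-references between databases until an identifier from the wanted database is reached. The Python sketch below is a toy illustration of that idea only; it does not use or reproduce the BridgeDb or IMS interfaces. The function names are hypothetical, and the three TP53 identifiers are included as example data.

```python
from collections import defaultdict

# Pairwise cross-references, each identifier written as (database, id).
# Example data: three identifiers for the human TP53 gene and its protein.
XREFS = [
    (("Ensembl", "ENSG00000141510"), ("Entrez Gene", "7157")),
    (("Entrez Gene", "7157"), ("UniProt", "P04637")),
]


def build_index(xrefs):
    """Turn the pairwise cross-references into an undirected graph."""
    graph = defaultdict(set)
    for a, b in xrefs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def map_id(graph, start, target_db):
    """Follow cross-references from `start` and return all ids found in `target_db`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(i for (db, i) in seen if db == target_db and (db, i) != start)
```

Here the Ensembl gene reaches the UniProt entry through Entrez Gene even though no direct Ensembl to UniProt link is listed. Following links transitively like this assumes every cross-reference really means "same thing", which is exactly the assumption the "same-as" slide questions.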
There is a lot out there
Discussed last Friday:
Serum and adipose tissue amino acid homeostasis in
the MHO (Badoud 2014)
– Objective: Integrate metabolite and gene expression profiling to elucidate the
molecular distinctions between Metabolically Healthy Obese (MHO) and
Metabolically Unhealthy Obese (MUO)
• Conclusion: SAT gene expression profiling revealed that genes related to branched-chain amino acid catabolism and the tricarboxylic
acid cycle were less down-regulated in MHO individuals compared to MUO individuals. Together, this integrated analysis revealed
that MHO individuals have an intermediate amino acid homeostasis compared to LH and MUO individuals.
– (Diabetes Risk Assessment study) 3 groups: Lean Healthy (LH), MHO and MUO
• Fasting serum samples from all participants and adipose tissue from the periumbilical region under local anesthesia after an
overnight fast
– Initially 30 participants, 10 in each group (7 women, 3 men), but for the microarray
analysis they analyzed SAT from only 7 LH, 8 MHO and 8 MUO, each group having 2 men.
The reason is not stated very clearly: they selected samples with an RNA integrity
number higher than 8
– Gene expression data only for these 23 participants
– No gender or biological information (e.g. glucose, total triglycerides, etc.)
– No initial serum metabolite concentrations (only means)
– dx.doi.org/10.1021/pr500416v
– Data can be found: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE55200
Adding phenotypic data
Diversity, not size, makes big data hard
SAM module
- small assays
- diverse assays
For now annotation, used after you find it
Repositories are technology driven
• Expression data: ArrayExpress, GEO
• Protein data: PRIDE
• Metabolomics data: MetaboLights
• Genetic variation data: dbSNP
Start with the samples?
Or the studies?
ISA-tab inspired
investigations link to studies
which link to assays
samples
and the actual data
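The hierarchy on this slide can be sketched as nested records. The Python below is a simplified illustration, not the ISA-tab file format itself; the class fields and the example values are assumptions chosen for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    name: str
    technology: str          # e.g. "microarray" or "metabolomics"
    sample_ids: List[str]    # samples measured in this assay
    data_files: List[str]    # pointers to the actual data


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)

    def sample_ids(self):
        """All distinct samples used by any assay of this study."""
        return sorted({s for a in self.assays for s in a.sample_ids})


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)

    def data_files(self):
        """All data files reachable from this investigation."""
        return [f for s in self.studies for a in s.assays for f in a.data_files]


# Hypothetical example: one study with two assays that share a sample.
expression = Assay("expression", "microarray", ["S1", "S2"], ["expr.txt"])
serum = Assay("serum", "metabolomics", ["S2", "S3"], ["serum.csv"])
study = Study("Example obesity study", [expression, serum])
investigation = Investigation("Example investigation", [study])
```

Because samples are listed per assay, the same sample can link an expression profile to a serum measurement, which is the kind of connection that gets lost when each data type goes to a separate technology-driven repository.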
Study capturing…
Capturing needs meta-ontologies
Examples:
EFO (experimental factor ontology),
eNanomapper (nanomaterials)
•Combine
•Map
•Slim
•Extend
•Feed extensions back to source
•Reproduce from (extended) source
If you can find it in a database
Can you find the database?
Discoverable fairports?
What about institute repos?
If study in dbNP
• Large data in repos (e.g. MetaboLights)
• Study descriptions still hidden
Combine with knowledge
• Can you find a study by the results?
• Integrate results
(pathway and ontology profiles)
Challenges needed
Teams answering real questions
• Finds needs and solutions
• Combines across communities
• Fun! And inspiring
• Interesting, publishable results
Starting a database is easy
• What about sustainability:
• Core resources need:
– Long-term funding
– Regular monitoring
• Integration in communities
Use of data

  • 5. Who wants to share data? • People who want to use data • Funders • Publishers • But the researchers?
  • 6. You only need MS-Excel
  • 7. People hide data • I did all this work I want to reuse • They don’t need this part, might be my next… • I might get a patent on this • Or… It needs a patent to be valuable • I can’t even patent because ...
  • 8. How? • Don’t add specifics (ohh those really were knockout cells, but..) • Leave out important steps (I did these PCRs, why show the array) • And “we used an approach slightly modified from…” • ...
  • 9. FAIR data • Findable • Accessible • Interoperable • Reusable
  • 10. Sharing Data I would like to exploit common genotype-phenotype relations between Alzheimer’s Disease and Huntington’s Disease… I need to combine AD and HD data… I can help with that! I can help with that! Source: Marcos Roos
  • 11. Sharing Data Source: Marcos Roos ??? Here’s my data, have fun! Here’s my data, have fun!
  • 12. Sharing Linkable Data Source: Marcos Roos I can go straight to answering my questions with data from multiple data owners! Patients will be so pleased with this speed-up! Here’s my Linked Data, have fun! Here’s my Linked Data, have fun!
  • 13. Really? From terms “liver, hepar, hepatic tissue” To URIs: http://identifiers.org/tissueont1/liver http://identifiers.org/tissueont2/hepar …. Just a first step
  • 14. And we didn’t even get that… Reality: Ontology-inspired pull-down menus
  • 15. Nothing is ever “same-as” • We may need more meaningful predicates • Or learn to use them better • We need lenses, context matters
  • 16. Too many standards Source XKCD: https://xkcd.com/927/
  • 17. Too many standards And ontologies… But they are there for a reason! Research fields have different focus/needs Don’t standardise, map!
  • 18. We need mapping • Ontology mapping • Identifier mapping • Identity (text mapping) • Chemistry mapping
  • 19. We need mapping • Ontology mapping: NCBO • Identifier mapping: BridgeDb, IMS • Identity (text) mapping: Conceptwiki? • Chemistry mapping: CRS??
  • 20. There is a lot out there
  • 21. Discussed last Friday: Serum and adipose tissue amino acid homeostasis in the MHO (Badoud 2014) – Objective: Integrate metabolite and gene expression profiling to elucidate the molecular distinctions between Metabolically Healthy Obese (MHO) and Metabolically Unhealthy Obese (MUO) • Conclusion: SAT gene expression profiling revealed that genes related to branched-chain amino acid catabolism and the tricarboxylic acid cycle were less down-regulated in MHO individuals compared to MUO individuals. Together, this integrated analysis revealed that MHO individuals have an intermediate amino acid homeostasis compared to LH and MUO individuals. – (Diabetes Risk Assessment study) 3 groups: Lean Healthy (LH), MHO and MUO • Fasting serum samples from all participants and adipose tissue from the periumbilical region under local anesthesia after an overnight fast – Initially 30 participants, 10 in each group (7 women, 3 men), but for the microarray analysis they analyzed SAT from 7 LH, 8 MHO and 8 MUO, each group having 2 men. Not very clear why; they selected samples with an RNA integrity number higher than 8 – Gene expression data only for the 23 participants – No gender or biological information (e.g. glucose, total triglycerides, etc.) – No initial serum metabolite concentrations (only means) – dx.doi.org/10.1021/pr500416v – Data can be found: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE55200
  • 23. Adding phenotypic data Diversity, not size, makes big data hard SAM module - small assays - diverse assays For now annotation, used after you find it
  • 24. Repositories are technology driven • Expression data • Protein data • Metabolomics data • Genetic variation data
  • 25. Repositories are technology driven • Expression data: ArrayExpress, GEO • Protein data: PRIDE • Metabolomics data: MetaboLights • Genetic variation data: dbSNP
  • 26. Start with the samples?
  • 27. Or the studies? ISA-Tab inspired: investigations link to studies, which link to assays, samples and the actual data Study capturing…
  • 28. Capturing needs meta-ontologies Examples: EFO (Experimental Factor Ontology), eNanoMapper (nanomaterials) • Combine • Map • Slim • Extend • Feed extensions back to source • Reproduce from (extended) source
  • 29. If you can find it in a database Can you find the database? Discoverable FAIRports? What about institute repos?
  • 30. If study in dbNP • Large data in repos (e.g. MetaboLights) • Study descriptions still hidden
  • 31. Combine with knowledge • Can you find a study by the results? • Integrate results (pathway and ontology profiles)
  • 33. Teams answering real questions • Finds needs and solutions • Combines across communities • Fun! And inspiring • Interesting, publishable results
  • 34. Starting a database is easy • What about sustainability? • Core resources need: – Long-term funding – Regular monitoring • Integration in communities
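
The mapping idea from slides 13 to 19 can be sketched in a few lines of Python: free-text terms are first resolved to URIs, and URIs from different ontologies are then linked by explicit mappings that are only followed under a chosen "lens". This is a rough illustration under stated assumptions, not how BridgeDb, the IMS or NCBO work. The two URIs are the illustrative ones from slide 13; the `skos:closeMatch` link between them, the lens names and all function names are hypothetical and were made up for this example.

```python
from typing import Optional, Set

# Term-to-URI lookup. The two URIs are the illustrative ones from slide 13.
# "hepatic tissue" is deliberately absent: the slide gives no URI for it.
TERM_TO_URI = {
    "liver": "http://identifiers.org/tissueont1/liver",
    "hepar": "http://identifiers.org/tissueont2/hepar",
}

# Hypothetical mapping between the two ontologies: (source, predicate, target).
# The choice of skos:closeMatch is an assumption, not taken from the talk.
MAPPINGS = [
    (TERM_TO_URI["liver"], "skos:closeMatch", TERM_TO_URI["hepar"]),
]

# A lens is the set of predicates accepted as "the same" in a given context.
# Both lens names are hypothetical.
LENSES = {
    "strict": {"skos:exactMatch"},
    "relaxed": {"skos:exactMatch", "skos:closeMatch"},
}


def term_to_uri(term: str) -> Optional[str]:
    """Resolve a free-text term to a URI, or None if no URI is known."""
    return TERM_TO_URI.get(term.strip().lower())


def equivalents(uri: str, lens: str) -> Set[str]:
    """Return the URI plus its direct mapping partners allowed by the lens."""
    accepted = LENSES[lens]
    found = {uri}
    for source, predicate, target in MAPPINGS:
        if predicate not in accepted:
            continue
        # Mappings are read in both directions, one hop only.
        if source == uri:
            found.add(target)
        elif target == uri:
            found.add(source)
    return found


def same_concept(term_a: str, term_b: str, lens: str) -> bool:
    """True if both terms resolve to URIs that the lens treats as equivalent."""
    uri_a, uri_b = term_to_uri(term_a), term_to_uri(term_b)
    if uri_a is None or uri_b is None:
        return False
    return uri_b in equivalents(uri_a, lens)
```

Under the `relaxed` lens, `same_concept("liver", "hepar", "relaxed")` treats the two terms as one concept, while the `strict` lens keeps them apart; "hepatic tissue" stays unresolved because no URI is given for it. Keeping the predicate on every mapping, instead of collapsing everything to "same-as", is what lets the context decide.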