Identification of Insulin Resistance-Related Genes with Biomedical Knowledge Graphs Topology and Embeddings
M. Lisandra Zepeda Mendoza, Tankred Ott, Marc Boubnovski, Viktor Sandberg, Ramneek Gupta
Executive summary
M. Lisandra Zepeda M., Specialist in Biomedical Knowledge Representation, NNRCO

Challenge
• It is difficult to identify the entire set of genes associated with insulin resistance (IR) because the condition is complex and multifactorial.
• Knowledge graphs (KGs) can model the relevant biomedical entities (proteins, diseases, pathways, etc.) in many different ways, and the specific data model can affect the results.
• Many different algorithms are available.

Question
How well can embeddings represent the biology of genes related to the complex pathophysiology of insulin resistance?

Goal
• Understand the complexity of insulin resistance.
• Identify genes related to insulin resistance with knowledge graph embeddings, using a data-driven approach.
Background
What is insulin resistance?
The insulin signaling pathway
Picture from https://www.nature.com/articles/s41392-022-01073-0
No, wait… actually, it’s tissue-specific
A unified concept of insulin resistance in humans
Picture from https://www.nature.com/articles/s41586-019-1797-8
Picture from https://www.nature.com/articles/s41392-022-01073-0
Insulin resistance-related diseases in humans
Developing a framework to explore the IR landscape using biomedical KGs
What to consider?
o KG schema
o Information within the knowledge graph:
    o Quality
    o Amount
    o Relevance for the task
o Methods used to predict IR-related genes/proteins
Methods
Which KGs to use? Enriched benchmarking KGs
OpenBioLink (IR node present as phenotype, 55 Gene-IR links)
Picture from https://doi.org/10.1093/bioinformatics/btaa274
Hetionet (IR node absent)
Picture from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640425/
• We use general-purpose biomedical knowledge graphs and update them with selected information.
• We add a Gene-IR link for each gene predicted to be related to IR in a bioinformatics study by Gao et al. [PMID: 32651353].
• This added 624 Gene-IR links to the KGs (i.e. it improved our training set).
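The enrichment step described above can be sketched as adding one Gene-IR triple per predicted gene while skipping links the graph already contains. This is a minimal illustration only: the triple layout, the relation name `GENE_PHENOTYPE`, the node identifier in `IR_NODE` and the toy gene list are all assumptions, not the identifiers used by OpenBioLink, Hetionet or the Gao et al. study.

```python
# Hypothetical triple format: (head, relation, tail).
# IR_NODE and GENE_IR_RELATION are illustrative names, not real KG identifiers.
IR_NODE = "PHENOTYPE:insulin_resistance"
GENE_IR_RELATION = "GENE_PHENOTYPE"


def enrich_with_gene_ir_links(triples, predicted_genes):
    """Return a new triple list with one Gene-IR link per predicted gene.

    Links that are already present in the graph are not duplicated.
    """
    existing = set(triples)
    enriched = list(triples)
    for gene in predicted_genes:
        link = (gene, GENE_IR_RELATION, IR_NODE)
        if link not in existing:
            enriched.append(link)
            existing.add(link)
    return enriched


# Toy graph: one Gene-IR link already exists for INSR.
kg = [
    ("GENE:INSR", GENE_IR_RELATION, IR_NODE),
    ("GENE:INSR", "GENE_GENE", "GENE:IRS1"),
]
# Toy stand-in for an externally predicted gene list.
predicted = ["GENE:IRS1", "GENE:INSR", "GENE:AKT2"]

enriched_kg = enrich_with_gene_ir_links(kg, predicted)
new_links = len(enriched_kg) - len(kg)
print(f"Added {new_links} Gene-IR links")
```

Returning a new list rather than mutating the input keeps the non-enriched graph available, which matters when both versions are benchmarked side by side.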
Zoomed-in details of the framework
Train:test:val data splits
The developed framework involves graph data modeling and feature engineering
Methods Overview
1. Biomedical KG
• OpenBioLink
• Hetionet
2. Feature Engineering
• Topological features
• Embeddings
3. Models
• Link prediction
• Outlier detection & PU
• RFs ensemble model
4. Biological context
• GSEA
• Euclidean distance for clustering
• MSI CMD drugs’ MoAs
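To make the "topological features" stage concrete, here is a small sketch of three generic graph features for a gene relative to a target node. The slides do not list which features were actually used, so the choice of degree, shortest-path distance and common-neighbour count is an assumption, as are the toy edges and node names.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def topological_features(adj, gene, target):
    """Illustrative feature set for one gene relative to a target node."""
    dist = shortest_path_lengths(adj, target)
    return {
        "degree": len(adj[gene]),
        "distance_to_target": dist.get(gene),  # None if unreachable
        "common_neighbours": len(adj[gene] & adj[target]),
    }


# Toy graph; edges are made up for the example.
edges = [
    ("IR", "INSR"),
    ("INSR", "IRS1"),
    ("IRS1", "AKT2"),
    ("IR", "PPARG"),
    ("PPARG", "IRS1"),
]
adj = build_adjacency(edges)
features = topological_features(adj, "IRS1", "IR")
print(features)
```

Features like these form a fixed-length vector per gene, which is the kind of input a random forest ensemble can be trained on.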
Exploring IR
Diffusion profiles | Potential drugs’ MoAs from the public Multiscale Interactome (MSI) KG
https://doi.org/10.1038/s41467-021-21770-8
• Diffusion profile: the path of most relevance connecting a drug and a disease. It gives insights into the drug’s possible MoA.
• Implement the MSI KG in-house, together with the methodology to calculate diffusion profiles of CMD-related drugs.
• Identify which genes, and which biological functions of those genes, rank significantly high in the diffusion profiles of CMD drugs.
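One common way to compute a diffusion-style relevance profile from a seed node is a random walk with restart. The sketch below uses a plain, unweighted version on a toy graph as a simplified stand-in; it is not the MSI implementation, and the restart probability, node names and edges are assumptions made for illustration.

```python
def diffusion_profile(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart from `seed` on an undirected graph.

    Returns the visitation probability of every node. Assumes every node
    has at least one neighbour.
    """
    nodes = sorted(adj)
    prob = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seed.
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbour.
        for n in nodes:
            share = (1.0 - restart) * prob[n] / len(adj[n])
            for neighbour in adj[n]:
                nxt[neighbour] += share
        delta = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if delta < tol:
            break
    return prob


# Toy graph: a drug with two targets, one of which leads on to a disease.
toy_graph = {
    "drug": {"target_1", "target_2"},
    "target_1": {"drug", "gene_1"},
    "target_2": {"drug"},
    "gene_1": {"target_1", "disease"},
    "disease": {"gene_1"},
}
profile = diffusion_profile(toy_graph, "drug")
ranked_nodes = sorted(profile, key=profile.get, reverse=True)
print(ranked_nodes)
```

Nodes that score high in the profile of a drug are the candidates one would then inspect for shared biological functions.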
Results
OpenBioLink @100 predictions
Top Performers:
• Topology-based approaches trained on large training sets, on both the enriched and non-enriched OpenBioLink datasets, outperformed the other models.
Close Contenders:
• The Elkan-Noto PU model with XGBoost, applied to OpenBioLink with large training sets and using embeddings from RotatE link prediction on the same biomedical knowledge graph (biomedKG), nearly matched the top topology-based models.
Underperformers:
• Models based on Local Outlier Factor (LOF) were among the least effective.
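For readers unfamiliar with the Elkan-Noto approach named among the close contenders, its core idea is a calibration step: a classifier is trained to separate labelled positives from unlabelled examples, and its scores are then divided by the average score it gives to held-out known positives. The sketch below shows only that calibration step on hard-coded scores. In practice the scores would come from a trained classifier such as XGBoost; the numbers and function names here are assumptions for illustration.

```python
def estimate_label_frequency(holdout_positive_scores):
    """Estimate c = P(labelled | positive) as the mean classifier score
    on held-out genes that are known positives."""
    return sum(holdout_positive_scores) / len(holdout_positive_scores)


def calibrate(scores, label_frequency):
    """Convert P(labelled | x) scores into P(positive | x) estimates,
    capped at 1.0."""
    return [min(1.0, s / label_frequency) for s in scores]


# Made-up scores from a classifier trained on labelled-vs-unlabelled genes.
holdout_positive_scores = [0.6, 0.4, 0.5]
unlabelled_scores = [0.1, 0.45, 0.3]

c = estimate_label_frequency(holdout_positive_scores)
calibrated = calibrate(unlabelled_scores, c)
print(c, calibrated)
```

The correction matters because unlabelled genes are not confirmed negatives: some of them are true IR-related genes that simply have no link in the graph yet.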
Models vary in the consistency of the top predictions
• Consistency of the @100 predictions across 10 replicates for each modelling approach
• The topology model is very precise and shows small variance
• All other models show significantly more variance; as expected, the worst-performing model shows the most variance
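A simple way to quantify both ideas, accuracy of the top-k predictions and their consistency across replicates, is precision@k plus the mean pairwise Jaccard overlap of the top-k sets. The slides do not state which consistency measure was used, so the Jaccard choice is an assumption, and the gene labels and k=3 are toy values (the deck uses k=100 and 10 replicates).

```python
from itertools import combinations


def precision_at_k(ranked_genes, positives, k=100):
    """Fraction of the top-k ranked genes that are known positives."""
    top = ranked_genes[:k]
    return sum(1 for gene in top if gene in positives) / k


def mean_pairwise_jaccard(replicate_rankings, k=100):
    """Average Jaccard overlap of the top-k sets over all replicate pairs.

    1.0 means every replicate returns the same top-k set.
    """
    tops = [set(ranking[:k]) for ranking in replicate_rankings]
    pairs = list(combinations(tops, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Toy rankings from three replicates of one model.
replicates = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "E"],
    ["A", "B", "D", "C"],
]
known_positives = {"A", "C"}

p_at_3 = precision_at_k(replicates[0], known_positives, k=3)
consistency = mean_pairwise_jaccard(replicates, k=3)
print(p_at_3, consistency)
```

A model with high precision but low overlap would be accurate on average yet unstable in which genes it nominates, which is the distinction this slide draws.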
Which features are most relevant in the topology modelling approach?
[Figure panels: Small, Large]
Euclidean distances to the IR node: known vs unknown IR-related genes
Embedding Quality:
• TransE produced the lowest-quality embeddings: IR-related genes from the positive training set were furthest from the IR node.
Training Set Impact:
• In most models, enriched training sets decrease the distance to the IR node for both known and unknown IR-related genes.
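The distance comparison above can be sketched as the mean Euclidean distance from a group of gene embeddings to the embedding of the IR node. The two-dimensional vectors and gene groupings below are toy values chosen for readability; real KG embeddings have many more dimensions.

```python
import math


def mean_distance_to_node(embeddings, genes, node="IR"):
    """Mean Euclidean distance from each gene embedding to the node embedding."""
    target = embeddings[node]
    distances = [math.dist(embeddings[gene], target) for gene in genes]
    return sum(distances) / len(distances)


# Toy 2-D embeddings; names and coordinates are illustrative only.
embeddings = {
    "IR": [0.0, 0.0],
    "INSR": [3.0, 4.0],
    "IRS1": [0.0, 5.0],
    "GENE_X": [6.0, 8.0],
}
known_genes = ["INSR", "IRS1"]    # genes in the positive training set
unknown_genes = ["GENE_X"]        # genes with no known IR link

known_mean = mean_distance_to_node(embeddings, known_genes)
unknown_mean = mean_distance_to_node(embeddings, unknown_genes)
print(known_mean, unknown_mean)
```

If an embedding captures IR biology well, one would expect known IR-related genes to sit closer to the IR node than unrelated genes do.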
GSEA: top 100 predicted genes and the positive set
• The best models matched known IR pathways and discovered new aspects.
• The worst models identified broad or organ-specific pathways that are not IR-related.
• The training set contained the known IR pathways, plus unexpected links to the Chagas disease pathway and cancer.
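As a rough illustration of pathway enrichment on a top-gene list, the sketch below runs a hypergeometric over-representation test. Note the swap: this is over-representation analysis, a simpler relative of GSEA, which instead uses a running-sum statistic over a ranked gene list. The gene names, pathway and background size are made up for the example.

```python
from math import comb


def enrichment_p_value(top_genes, pathway_genes, background_size):
    """One-sided hypergeometric p-value for the overlap between a gene list
    and a pathway, i.e. P(overlap >= observed) under random draws.

    Assumes both gene sets are drawn from the same background.
    """
    n = len(top_genes)
    pathway_size = len(pathway_genes)
    observed = len(top_genes & pathway_genes)
    total = comb(background_size, n)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, n - i)
        for i in range(observed, min(pathway_size, n) + 1)
    )
    return tail / total


# Toy example: 4 predicted genes, 3 of which fall in a 5-gene pathway,
# drawn from a background of 20 genes.
top_genes = {"G1", "G2", "G3", "G9"}
pathway_genes = {"G1", "G2", "G3", "G4", "G5"}
p_value = enrichment_p_value(top_genes, pathway_genes, background_size=20)
print(round(p_value, 4))
```

In a real analysis the p-values would be corrected for multiple testing across all pathways before any is called enriched.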
Biological context: MSI
• The best-performing link prediction method (RotatE on the enriched OpenBioLink) found more genes associated with impaired glucose tolerance, suggesting that it generalizes better than the other good-scoring topology-based and PUL-based models.
• Models could also identify obesity-related diseases.
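For context on the RotatE method mentioned above: RotatE represents entities as complex vectors and each relation as an element-wise rotation, so a triple is plausible when rotating the head lands near the tail. The sketch below computes that distance for toy two-dimensional embeddings. The vectors and phases are invented, and training, negative sampling and the margin term are left out.

```python
import cmath
import math


def rotate_distance(head, phases, tail):
    """RotatE-style distance: rotate each head component by the relation
    phase and sum the moduli of the differences to the tail.

    Smaller values mean the triple (head, relation, tail) is more plausible.
    """
    return sum(
        abs(h * cmath.exp(1j * theta) - t)
        for h, theta, t in zip(head, phases, tail)
    )


# Toy embeddings: the relation rotates component 1 by 90 degrees
# and component 2 by 180 degrees.
gene = [1 + 0j, 0 + 1j]
relation_phases = [math.pi / 2, math.pi]
matching_tail = [0 + 1j, 0 - 1j]     # exactly the rotated head
mismatched_tail = [1 + 0j, 0 + 1j]   # the unrotated head

good = rotate_distance(gene, relation_phases, matching_tail)
bad = rotate_distance(gene, relation_phases, mismatched_tail)
print(good < bad)
```

Link prediction then amounts to ranking candidate genes by this distance to the IR node under the Gene-IR relation.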
Perspectives
Tissue-specific KG
What would the embeddings look like if we explored the disease in a tissue-specific manner rather than in a systemic manner (all diseases, all tissues, all genes in a single schema)?
Foundation models
Explore the possibility of using foundation models on KGs to perform few/zero-shot inductive inference.
Complex queries
Use more complex reasoning and KG-querying approaches to identify the relations/connections between each gene found and the IR node, to facilitate interpretability.
Validation of results
In-house in vitro validation of results.
Thank you for your attention

Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

Identification of insulin-resistance genes with Knowledge Graphs topology and embeddings

  • 1. How well can embeddings represent the biology of genes related to the complex pathophysiology of insulin resistance? Identification of Insulin Resistance-Related Genes with Biomedical Knowledge Graphs Topology and Embeddings. Authors: M. Lisandra Zepeda Mendoza, Tankred Ott, Marc Boubnovski, Viktor Sandberg, Ramneek Gupta
  • 2. Executive summary. M. Lisandra Zepeda M., Specialist in Biomedical Knowledge Representation, NNRCO. Challenge: It is difficult to identify the entire set of genes associated with insulin resistance (IR) because of its complexity and multifactorial nature. Knowledge graphs (KGs) model the relevant biomedical entities (proteins, diseases, pathways, etc.) in many different ways, and the specific data model can affect the results. Many different algorithms are available. Question: How well can embeddings represent the biology of genes related to the complex pathophysiology of insulin resistance? Goal: Understand the complexity of insulin resistance and identify genes related to it using knowledge graph embeddings in a data-driven approach.
  • 3. Background. Novo Nordisk company presentation. Photo captions: Neetima Bhardwaj & Veleena Nisha Lobo, Product Supply, US; Mandy Marquardt, Team Novo Nordisk, professional track cyclist.
  • 4. What is insulin resistance? The insulin signaling pathway. Picture from https://www.nature.com/articles/s41392-022-01073-0
  • 5. No, wait… actually, it’s tissue-specific. A unified concept of insulin resistance in humans (picture from https://www.nature.com/articles/s41586-019-1797-8). Insulin resistance-related diseases in humans (picture from https://www.nature.com/articles/s41392-022-01073-0).
  • 6. Developing a framework to explore the IR landscape using biomedical KGs. What to consider? (a) The KG schema; (b) the information within the knowledge graph: its quality, amount, and relevance for the task; (c) the methods used to predict IR-related genes/proteins.
  • 7. Methods. Our heritage enables us to defeat diabetes and other serious chronic diseases. Photo caption: Otávio Domingos da Costa, Brazil; Otávio has type 2 diabetes and obesity.
  • 8. Which KGs to use? Enriched benchmarking KGs: OpenBioLink (IR node present as a phenotype, 55 Gene-IR links) and Hetionet (IR node absent). Pictures from https://doi.org/10.1093/bioinformatics/btaa274 and https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640425/. We use general-purpose biomedical knowledge graphs and update them with selected information: for each gene predicted to be IR-related by the bioinformatics study of Gao et al. [PMID: 32651353], we add a link between that gene and IR. This added 624 Gene-IR links to the KGs, i.e. it improved our training set.
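The enrichment step on slide 8 can be pictured as adding one triple per predicted gene. The sketch below is only an illustration under stated assumptions: the node identifiers, the relation label `GENE_PHENOTYPE`, and the helper name `enrich_with_gene_ir_links` are hypothetical and are not the OpenBioLink or Hetionet vocabulary.

```python
# Minimal sketch: enrich a KG, stored as (head, relation, tail) triples,
# with Gene-IR links. All identifiers below are hypothetical placeholders.

def enrich_with_gene_ir_links(triples, predicted_genes,
                              ir_node="IR", relation="GENE_PHENOTYPE"):
    """Return the enriched triple set and the number of links actually added."""
    enriched = set(triples)
    before = len(enriched)
    for gene in predicted_genes:
        # Adding to a set skips links that are already in the KG.
        enriched.add((gene, relation, ir_node))
    return enriched, len(enriched) - before


toy_kg = {
    ("GENE:A", "GENE_PHENOTYPE", "IR"),   # link already present
    ("GENE:A", "GENE_GENE", "GENE:B"),
}
enriched_kg, n_added = enrich_with_gene_ir_links(
    toy_kg, ["GENE:A", "GENE:B", "GENE:C"]
)
```

Storing the graph as a set of triples means a predicted gene that is already linked to IR does not inflate the count of added links.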
  • 9. Zoomed-in details of the framework: train:test:val data splits.
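As a rough illustration of a train:test:val split over Gene-IR links, the sketch below shuffles the links with a fixed seed and slices them. The 80:10:10 ratio, the seed, and the function name `split_edges` are assumptions made for this example; the slides do not state the ratios that were actually used.

```python
import random

def split_edges(edges, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle edges reproducibly and cut them into train, test and val lists."""
    shuffled = sorted(edges)                 # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_test = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]        # the remainder goes to validation
    return train, test, val


# Ten hypothetical Gene-IR links.
toy_links = [(f"GENE:{i}", "GENE_PHENOTYPE", "IR") for i in range(10)]
train, test, val = split_edges(toy_links)
```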
  • 10. The developed framework involves graph data modeling and feature engineering.
  • 11. Methods overview: exploring IR. (1) Biomedical KG: OpenBioLink, Hetionet. (2) Feature engineering: topological features, embeddings. (3) Models: link prediction, outlier detection & PU, RFs ensemble model. (4) Biological context: GSEA, Euclidean distance for clustering, MSI CMD drugs’ MoAs.
  • 12. Diffusion profiles: potential drug MoAs from the public Multiscale Interactome (MSI) KG, https://doi.org/10.1038/s41467-021-21770-8. Diffusion profile: the most relevant path connecting a drug and a disease, which gives insight into the drug’s possible mechanism of action (MoA). We implement the MSI KG in-house, together with the methodology to calculate diffusion profiles of CMD-related drugs. We then identify which genes, and which biological functions of those genes, rank significantly high in the diffusion profiles of CMD drugs.
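To give a feel for what a diffusion profile is, the sketch below runs a plain random walk with restart from a drug node and returns the visitation probability of every node. This is a simplified stand-in chosen for illustration, not the in-house implementation: it ignores the node-type-specific edge weights of the Multiscale Interactome, and the toy graph, the restart probability of 0.15, and all node names are assumptions.

```python
def random_walk_with_restart(adjacency, start, restart=0.15, iterations=200):
    """Approximate the stationary visitation probabilities by power iteration.

    adjacency maps each node to a list of neighbours; every neighbour must
    also be a key of the mapping.
    """
    visit = {node: 0.0 for node in adjacency}
    visit[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in adjacency}
        for node, prob in visit.items():
            nxt[start] += restart * prob              # jump back to the start node
            neighbours = adjacency[node]
            if not neighbours:                        # dead end: return to the start
                nxt[start] += (1.0 - restart) * prob
                continue
            share = (1.0 - restart) * prob / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        visit = nxt
    return visit


# Hypothetical chain: drug - protein_a - function - protein_b - disease.
toy_graph = {
    "drug": ["protein_a"],
    "protein_a": ["drug", "function"],
    "function": ["protein_a", "protein_b"],
    "protein_b": ["function", "disease"],
    "disease": ["protein_b"],
}
profile = random_walk_with_restart(toy_graph, start="drug")
```

Nodes that the walker visits often from the drug are candidates for explaining its effect; in this toy chain the probability decays with distance from the drug.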
  • 13. Results. Photo caption: David Lozano and Peter Kusztor have type 1 diabetes and are professional Team Novo Nordisk riders. They are racing with 100 on their jersey to celebrate the 100-year anniversary of the discovery of insulin.
  • 14. OpenBioLink @100 predictions. Top performers: topology-based approaches trained on large training sets outperformed the other models on both the enriched and the non-enriched OpenBioLink datasets. Close contenders: the Elkanoto (Elkan and Noto positive-unlabeled learning) model with XGBoost, applied to OpenBioLink with large training sets and using embeddings from RotatE link prediction on the same biomedical knowledge graph (biomedKG), nearly matched the top topology-based models. Underperformers: models based on Local Outlier Factor (LOF) were among the least effective.
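The core idea behind Elkan and Noto's positive-unlabeled (PU) learning can be sketched in a few lines: a classifier is trained to separate labeled genes from unlabeled ones, and its scores are then divided by the label frequency c, estimated as the mean score on held-out known positives. The numbers below are invented for illustration, the function names are hypothetical, and the classifier itself (XGBoost on the slide) is not shown.

```python
def estimate_label_frequency(held_out_positive_scores):
    """Estimate c = p(labeled | positive) as the mean score on held-out positives."""
    return sum(held_out_positive_scores) / len(held_out_positive_scores)


def calibrate(scores, c):
    """Turn p(labeled | x) into an estimate of p(positive | x), capped at 1."""
    return [min(score / c, 1.0) for score in scores]


# Invented classifier outputs on known IR genes that were kept out of training.
held_out_scores = [0.6, 0.4, 0.5]
c = estimate_label_frequency(held_out_scores)

# Invented classifier outputs on unlabeled genes.
unlabeled_scores = [0.05, 0.25, 0.45]
calibrated = calibrate(unlabeled_scores, c)
```

The division rescales scores upward because only a fraction of the true positives is labeled, which is the situation for IR-related genes.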
  • 15. Models vary in the consistency of their top predictions. Consistency of the @100 predictions was measured across 10 replicates for each modelling approach. The topology model is very precise, with small variance. All other models show significantly higher variance; as expected, the worst-performing model has the highest variance.
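One way to quantify how consistent the top predictions are across replicates is the mean pairwise Jaccard overlap of the top-k gene sets. The slide does not say which consistency measure was used, so the Jaccard index here is an assumption, as are the toy scores and the function names.

```python
from itertools import combinations


def top_k(scores, k):
    """Return the set of the k highest-scoring genes."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def mean_pairwise_jaccard(replicate_scores, k=100):
    """Average Jaccard overlap between the top-k sets of every pair of replicates.

    Expects at least two replicates, each with at least one scored gene.
    """
    tops = [top_k(scores, k) for scores in replicate_scores]
    overlaps = [len(a & b) / len(a | b) for a, b in combinations(tops, 2)]
    return sum(overlaps) / len(overlaps)


# Two toy replicates scoring three hypothetical genes.
replicate_1 = {"GENE:A": 0.9, "GENE:B": 0.8, "GENE:C": 0.1}
replicate_2 = {"GENE:A": 0.7, "GENE:B": 0.2, "GENE:C": 0.6}
consistency = mean_pairwise_jaccard([replicate_1, replicate_2], k=2)
```

A value of 1 means every replicate returns the same top-k genes; values near 0 mean the top predictions change from run to run.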
  • 16. Which features are most relevant in the topology modelling approach? (Figure panels: Small, Large.)
  • 17. Euclidean distances to the IR node for known vs. unknown IR-related genes. Embedding quality: TransE produced the lowest-quality embeddings, with the IR-related genes from the positive training set lying furthest from the IR node. Training set impact: in most models, enriched training sets decrease the distance to the IR node for both known and unknown IR-related genes.
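The comparison on slide 17 amounts to measuring how far each gene embedding lies from the embedding of the IR node. A minimal sketch, assuming embeddings are available as plain coordinate tuples; the two-dimensional toy vectors and node names are invented for illustration.

```python
import math


def distances_to_ir(embeddings, genes, ir_node="IR"):
    """Euclidean distance from each gene embedding to the IR node embedding."""
    ir_vector = embeddings[ir_node]
    return {gene: math.dist(embeddings[gene], ir_vector) for gene in genes}


# Invented 2-D embeddings; real embeddings have many more dimensions.
toy_embeddings = {
    "IR": (0.0, 0.0),
    "known_gene": (3.0, 4.0),
    "unknown_gene": (6.0, 8.0),
}
known = distances_to_ir(toy_embeddings, ["known_gene"])
unknown = distances_to_ir(toy_embeddings, ["unknown_gene"])
```

If the embeddings capture IR biology, known IR-related genes should on average sit closer to the IR node than genes with no known link.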
  • 18. GSEA of the top 100 predicted genes and of the positive set. The best models matched known IR pathways and discovered new aspects. The worst models identified broad or organ-specific pathways that are not IR-related. The training set contained the known IR pathways, plus unexpected links to the Chagas disease pathway and to cancer.
  • 19. Biological context (MSI). The best-performing link prediction method (RotatE on the enriched OpenBioLink) found more genes associated with impaired glucose tolerance, suggesting that it generalizes better than the other well-scoring topology-based and PUL-based models. The models could also identify obesity-related diseases.
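For readers unfamiliar with RotatE, the sketch below shows how it scores a candidate link: each relation is a rotation in the complex plane, and a triple is plausible when rotating the head embedding lands close to the tail embedding. The one-dimensional complex embeddings are toy values, and summing the per-dimension moduli is one common choice of distance assumed here; this is not the training code behind the results above.

```python
import cmath
import math


def rotate_score(head, phases, tail):
    """Negative distance between the rotated head and the tail.

    head and tail are sequences of complex numbers; phases holds one rotation
    angle per dimension. Higher (closer to zero) means more plausible.
    """
    distance = 0.0
    for h, theta, t in zip(head, phases, tail):
        rotation = cmath.exp(1j * theta)      # unit-modulus relation element
        distance += abs(h * rotation - t)
    return -distance


# Toy example: a quarter-turn relation maps 1 onto i.
gene = [1 + 0j]
quarter_turn = [math.pi / 2]
plausible = rotate_score(gene, quarter_turn, [1j])     # tail matches the rotated head
implausible = rotate_score(gene, quarter_turn, [-1j])  # tail points the opposite way
```

Ranking all genes by this score against the IR node is what yields the top link predictions.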
  • 20. Perspectives. We transform scientific ideas into life-saving medicines for patients.
  • 21. Perspectives. Tissue-specific KG: what would the embeddings look like if we explored the disease in a tissue-specific manner rather than in a systemic manner (all diseases, all tissues, all genes in a single schema)? Foundational models: explore the possibility of using foundational models on KGs to perform few-shot or zero-shot inductive inference. Complex queries: use more complex reasoning-based KG-querying approaches to identify the relations and connections between each gene found and the IR node, to facilitate interpretability. Validation of results: in-house in vitro validation of the results.
  • 22. Thank you for your attention