Dynamic SA/Reports:
Analyzing Current Project and HTS Data by
Interactive Selection of Frequently-occurring Scaffolds
Deepak Bandyopadhyay
Star cast:
Development help: Chris Louer, Ceara Rea, Jerome Verlin, Alain Deschenes, Nels Thorsteinson, Guido Kirsten, Bernd Wiswedel
Project testing: Ami Lakdawala, Chaya Duraiswami, Guanglei Cui, Kaushik Raha, Kristin Brown, Neysa Nevins, Xuan Hong, Constantine Kreatsoulas
Problem statement
Find viable chemical series from project HTS data or other large/diverse datasets
–Ideally, from single-shot data
–Pragmatically, from full-curve data
Usually: scaffold-agnostic (clustering) analysis
–But clusters do not map 1:1 to chemotypes
Our goal: R-group analysis of HTS data
–Provide SAR in a more user-friendly format
Tool of choice: MOE SA/Report
Outline
SA/Report Background
–Problem with out-of-box analysis of HTS data
 Frequent fragment scaffold selection
– Automated and interactive solutions
 Customizations for project data delivery
– Custom units to visualize arbitrary data types
– KNIME workflows for automated generation
Case studies (project and public datasets)
Conclusion
What is a Structure-Activity Report?
SAR analysis and visualization tool in MOE (chemcomp.com)
Input: MOE database (created from CSV, SD-file, etc.)
– Structure and multiple activity/property columns
– Pick/guess column data types (pIC50, IC50, percent,…)
Scaffolds: Auto-detect or specify; R-groups optional
Output: tabbed web page
– Summary tab: arranges molecules by scaffolds and R-groups, showing details on mouse-over or clicking on R-groups
Clark AM, Labute P. J Med Chem. 2009 52(2):469-83.
Agrafiotis DK et al., J Med Chem. 2007 50(24):5926-37
[Figure: SA/Report on PubChem pyruvate kinase screen, Assay ID 361]
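The column data types mentioned above (IC50 vs. pIC50) are related by the standard definition pIC50 = -log10(IC50 in mol/L). A minimal Python sketch of that conversion follows; the function names are illustrative and are not part of MOE or SA/Report.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nanomolar."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


if __name__ == "__main__":
    # 1 uM corresponds to pIC50 6; 100 nM corresponds to pIC50 7.
    print(round(ic50_to_pic50(1e-6), 3))
    print(round(ic50_nm_to_pic50(100.0), 3))
```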
What is a Structure-Activity Report? (continued)
Output: tabbed web page
– Activity tab: grid, R1 vs. R2 or scaffold vs. R1
– Multiple activities visualized simultaneously as color bars or concentric pie charts (“cartwheels”)
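To make the R1 vs. R2 activity grid concrete, here is a small Python sketch that aggregates compounds into grid cells keyed by their R-group pair. It only illustrates the idea of the Activity tab; it is not how SA/Report builds its grid, and the record layout and sample values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def activity_grid(records):
    """Group (r1, r2, pic50) records into cells and average each cell.

    Returns a dict mapping (r1, r2) to the mean pIC50 of the compounds
    that carry that pair of R-groups.
    """
    cells = defaultdict(list)
    for r1, r2, pic50 in records:
        cells[(r1, r2)].append(pic50)
    return {pair: mean(values) for pair, values in cells.items()}


if __name__ == "__main__":
    # Made-up example data: (R1, R2, pIC50)
    data = [("H", "Me", 6.0), ("H", "Me", 7.0), ("OMe", "Cl", 5.5)]
    for (r1, r2), value in sorted(activity_grid(data).items()):
        print(f"R1={r1:4s} R2={r2:4s} mean pIC50={value:.2f}")
```

Empty cells in such a grid correspond to R-group combinations that have not been made or tested.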
SA/Report: auto-detect on HTS data
Auto-detect does not find all frequently-occurring series in diverse datasets (e.g. HTS hits, >4000 compounds, >10 series)
–E.g. PubChem AssayID 361, 4265 Pyruvate Kinase inhibitor hits
– Two scaffolds found; known series with more exemplars missed
What to do?
–Specify manually OR
–Use automated or interactive method to find scaffolds
Clark AM, Labute P. J Med Chem. 2009 52(2):469-83.
Scaffolds from Fragment Decomposition
Use frequent fragments as scaffolds
–Schuffenhauer hierarchical decomposition
–Compounds sorted by frequency of fragment at each level
A. Schuffenhauer et al., J. Chem. Inf. Model. 47:47-58, 2007
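The frequency-sorting step can be sketched in Python as follows. This assumes the hierarchical decomposition has already been run elsewhere and that each compound comes with a list of fragment strings, level 1 being the smallest fragment; that input layout and the function names are assumptions for illustration, not the MOE implementation.

```python
from collections import Counter


def fragment_frequencies(hierarchies):
    """Count, per level, how many compounds share each fragment.

    hierarchies: {compound_id: [fragment_at_level_1, fragment_at_level_2, ...]}
    Returns {compound_id: [freq_1, freq_2, ...]}.
    """
    level_counts = {}
    for fragments in hierarchies.values():
        for level, fragment in enumerate(fragments, start=1):
            level_counts.setdefault(level, Counter())[fragment] += 1
    return {
        compound: [level_counts[level][fragment]
                   for level, fragment in enumerate(fragments, start=1)]
        for compound, fragments in hierarchies.items()
    }


def sort_by_frequency(hierarchies):
    """Order compound ids by their per-level frequencies, most frequent first."""
    freqs = fragment_frequencies(hierarchies)
    return sorted(hierarchies, key=lambda compound: freqs[compound], reverse=True)


if __name__ == "__main__":
    # Made-up fragment hierarchies for three compounds.
    example = {
        "cpd_a": ["c1ccccc1", "c1ccc2[nH]ccc2c1"],
        "cpd_b": ["c1ccccc1", "c1ccc2[nH]ccc2c1"],
        "cpd_c": ["c1ccccc1"],
    }
    print(fragment_frequencies(example))
    print(sort_by_frequency(example))
```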
Interactive scaffold picking
Users prefer scaffold suggestions, not full automation
– Exclude known nuisance or cross-target-active fragments
– Exclude scaffolds that don’t make chemical sense
– Prefer one among overlapping or multiple scaffolds in a molecule
– Want to analyze a subset of the scaffolds found
Interactive “common fragment selection” GUI
–“Analyze…” button next to “Browse…” on patched version of SA/Report
cmnfrag.svl (A. Clark/A. Deschenes, CCG; *available* on SVL exchange)
Interactive scaffold picking, step 1
Top 12 best frequent fragments presented to the user to choose from
–Rank= frequency heavy atom count (1+ (similarity to existing scaffolds))
–User picks #2
[Figure: PubChem dataset AID 893, HSD17B4, hydroxysteroid (17-beta) dehydrogenase 4]
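The ranking of candidate fragments can be sketched as below. The slide names three ingredients (frequency, heavy atom count, and similarity to existing scaffolds) but not how they are combined, so the formula used here, frequency × heavy atom count ÷ (1 + similarity), is only one plausible reading and should be treated as an assumption. The function and field names are hypothetical.

```python
def fragment_rank(frequency, heavy_atoms, similarity):
    """ASSUMED scoring: favor frequent, large fragments and penalize
    fragments that resemble scaffolds already picked (similarity in 0..1)."""
    return frequency * heavy_atoms / (1.0 + similarity)


def top_fragments(candidates, n=12):
    """Return the n best-scoring candidates.

    candidates: list of dicts with keys 'fragment', 'frequency',
    'heavy_atoms' and 'similarity' (hypothetical layout).
    """
    return sorted(
        candidates,
        key=lambda c: fragment_rank(c["frequency"], c["heavy_atoms"], c["similarity"]),
        reverse=True,
    )[:n]


if __name__ == "__main__":
    # Made-up candidates for illustration.
    pool = [
        {"fragment": "frag_A", "frequency": 120, "heavy_atoms": 9, "similarity": 0.0},
        {"fragment": "frag_B", "frequency": 100, "heavy_atoms": 12, "similarity": 0.8},
        {"fragment": "frag_C", "frequency": 40, "heavy_atoms": 15, "similarity": 0.1},
    ]
    for cand in top_fragments(pool, n=2):
        print(cand["fragment"])
```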
Frequent scaffold picking, iterative step
1. Add picked fragment to scaffold list
2. Remove molecules that map to it from consideration
3. Re-analyze remaining molecules for frequent scaffolds
4. Repeat until satisfied
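The four steps above can be sketched as a simple loop. In this Python sketch, substructure mapping is replaced by set membership (each molecule is given as the set of fragments it contains), and the user's choice is a callback; both are simplifications assumed for illustration and do not reproduce cmnfrag.svl.

```python
from collections import Counter


def pick_scaffolds(mol_fragments, choose, min_count=2):
    """Iteratively pick scaffolds from frequent fragments.

    mol_fragments: {mol_id: set of fragments the molecule contains}
    choose: callback that receives [(fragment, count), ...] sorted by
            count and returns the picked fragment, or None to stop.
    Returns (picked scaffolds, molecules left unassigned).
    """
    remaining = dict(mol_fragments)
    scaffolds = []
    while remaining:
        # Step 3: (re-)analyze the remaining molecules for frequent fragments.
        counts = Counter(f for frags in remaining.values() for f in frags)
        candidates = [(f, c) for f, c in counts.most_common() if c >= min_count]
        if not candidates:
            break
        picked = choose(candidates)
        if picked is None:  # Step 4: the user is satisfied.
            break
        if picked not in counts:
            raise ValueError(f"{picked!r} is not among the current fragments")
        scaffolds.append(picked)  # Step 1
        # Step 2: drop molecules that map to the picked scaffold.
        remaining = {m: fr for m, fr in remaining.items() if picked not in fr}
    return scaffolds, remaining


if __name__ == "__main__":
    mols = {"m1": {"A", "B"}, "m2": {"A"}, "m3": {"B"}, "m4": {"B", "C"}, "m5": {"D"}}
    # Fully automated mode: always take the most frequent candidate.
    picked, left = pick_scaffolds(mols, choose=lambda cands: cands[0][0])
    print(picked, sorted(left))
```

Passing a callback that prompts the user instead of taking the top candidate gives the interactive variant.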
HTS SAR analysis
Run SA/Report with scaffolds picked from frequent fragment hierarchy, automatically or interactively
Customization 1: units for visualization
SA/Report built to visualize activity (pIC50/pKi, IC50/Ki, percent, fractions)
New applications:
–Visualize data where weak actives are significant
–Optimize compound properties, along with activity
Solution:
–Define custom units for all commonly measured/calculated properties in a GUI
–Examples:
–CLogP (5/3/1)
–Permeability: 0/100/300
–Solubility (uM): 0/100/300
SAReport_custom_units.svl (A. Deschenes; *available* from SVL exchange)
[Figure labels: Scaffold, R6, pIC50, cLogP, permeability; 6 pie sectors = 6 cpds with these R-groups]
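One way to think about a custom unit such as CLogP (5/3/1) is as three anchor values that map a raw property onto a common color scale. The Python sketch below assumes the three numbers mean poor / mid / good and interpolates linearly between them; that interpretation, and the function name, are assumptions for illustration and may not match how SAReport_custom_units.svl defines its units.

```python
def desirability(value, poor, mid, good):
    """Map a property value to a 0..1 score through three anchors.

    poor -> 0.0, mid -> 0.5, good -> 1.0, linear in between and clamped
    outside. Works for descending anchors (e.g. 5/3/1, lower is better)
    and ascending anchors (e.g. 0/100/300, higher is better).
    """
    if good < poor:
        # Mirror the axis so the anchors are always ascending.
        value, poor, mid, good = -value, -poor, -mid, -good
    if not poor < mid < good:
        raise ValueError("anchors must be strictly monotonic")
    if value <= poor:
        return 0.0
    if value >= good:
        return 1.0
    if value <= mid:
        return 0.5 * (value - poor) / (mid - poor)
    return 0.5 + 0.5 * (value - mid) / (good - mid)


if __name__ == "__main__":
    print(desirability(4.0, 5, 3, 1))        # a CLogP between poor and mid
    print(desirability(200.0, 0, 100, 300))  # a permeability between mid and good
```

Putting every property on the same 0..1 scale is what lets activity and physicochemical properties share one color legend.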
Customization 2: Dynamic SA/Reports
SA/Reports need to be regenerated in MOE whenever new compounds are synthesized
– In an active project, this happens relatively frequently…
One solution to stay current: automated workflow
– KNIME, an open-source workflow tool, with comp chem nodes available from multiple vendors
Automating SA/Report production
SA/Report KNIME node
–Inputs: data (port 0), scaffolds (optional, port 1)
–Activity fields can be configured
–Custom units can be defined and incorporated
Example KNIME workflow for SA/Report
 Many aspects can be customized
[Workflow diagram nodes: Input molecule data; Input scaffolds; Data manipulation; Filter by scaffold / properties; Generate SA/Report; Save URL (cron job to run this nightly or weekly)]
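A scheduled run of such a workflow could look roughly like the following Python sketch, which builds a KNIME batch-mode command line and a matching crontab entry. The batch application id and the -nosplash, -workflowDir, -reset and -nosave options are KNIME's documented batch options as far as I know, but should be checked against the installed version; the executable path, workflow directory and schedule are placeholders.

```python
import shlex


def knime_batch_command(knime_executable, workflow_dir):
    """Build the argument list for a headless KNIME workflow run."""
    return [
        knime_executable,
        "-nosplash",
        "-application", "org.knime.product.KNIME_BATCH_APPLICATION",
        f"-workflowDir={workflow_dir}",
        "-reset",   # re-execute all nodes so the report uses current data
        "-nosave",  # do not write the executed workflow back to disk
    ]


def crontab_line(command, hour=2, minute=0):
    """Format a crontab entry that runs the command every night."""
    return f"{minute} {hour} * * * {shlex.join(command)}"


if __name__ == "__main__":
    # Placeholder paths; adjust to the local installation.
    cmd = knime_batch_command("/opt/knime/knime", "/data/workflows/sa_report")
    print(crontab_line(cmd))
```

For a weekly rather than nightly refresh, the day-of-week field of the crontab entry would be set instead of `*`.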
GSK project example 1: HTS data analysis
28 scaffolds found in data by interactive scaffold analysis
– Prioritized for follow-up based on aggregate properties and believable SAR trends
Color patterns: spot good R-group combinations
–Example inference for benzothiophene scaffold:
R6=OMe favored over H; R8=NH2 active with >½ of other substituents
Combine to fill SAR holes…
GSK project example 2: Mitigating hERG
Lead series has hERG liability
–Find R-groups that reduce hERG, maintain activity, selectivity
[Figure labels: activity, selectivity, hERG; R3, R10; substituents H, CH3, Cl, NH2]
PubChem example: Pyruvate Kinase screen
Primary assay: AID 361: Pyruvate kinase (PyK, 4265 inhibitors)
Five secondary assays:
–2 orthologs: AID 1631 (human muscle isoform 2 PyK), 1721 (L. mexicana PyK)
–2 assays to eliminate false positive hits (luciferase, cytotoxicity)
–1 selectivity cross-target (MT1-MMP)
 Interactive scaffold selection
– Chose 25, covering >50% cpds 
 Final report:
– 6 pIC50s (listed above)
– several calculated properties with custom units: MolWt, ClogP, LogD, predicted solubility/permeability
PubChem SAR trend elucidation
Biaryl amide scaffold:
–R6=H, Me, OMe, OEt often hit luciferase/cytotoxicity cross-screens and are false positives
–R6=Et, F do not hit these assays
[Figure columns: 361_PyK_pIC50, 411_lucif_pIC50, 924_p53cyTox_pIC50]
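The same kind of inference can be tabulated outside the report. The Python sketch below computes, per R-group, the fraction of compounds that hit any counter-screen; the record layout, the pIC50 cutoff of 5.0 and the sample values are assumptions invented for the example and are not taken from the PubChem data.

```python
from collections import defaultdict


def counter_screen_hit_rate(records, threshold=5.0):
    """Fraction of compounds per R-group that hit any counter-screen.

    records: iterable of (r_group, [counter-screen pIC50 or None, ...]);
    None means the compound was inactive or not measured in that assay.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r_group, counter_pic50s in records:
        totals[r_group] += 1
        if any(p is not None and p >= threshold for p in counter_pic50s):
            hits[r_group] += 1
    return {r_group: hits[r_group] / totals[r_group] for r_group in totals}


if __name__ == "__main__":
    # Made-up data: (R6, [luciferase pIC50, cytotoxicity pIC50])
    data = [
        ("OMe", [6.1, None]),
        ("OMe", [None, None]),
        ("F", [None, None]),
        ("F", [None, 4.2]),
    ]
    for r_group, rate in sorted(counter_screen_hit_rate(data).items()):
        print(f"R6={r_group}: {rate:.0%} of compounds hit a counter-screen")
```

R-groups with a high hit rate are candidates for deprioritization as likely false positives.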
PubChem example: SAR trend elucidation
SAR trends across similar scaffolds:
–Active/selective R-groups on one scaffold (e.g. R10=OMe on benzothiazole)
used to suggest analogs with the same R-group on related scaffolds.
Conclusions
 MOE SA/Reports can be intuitive and valuable for project SAR analysis:
–Extensions to find scaffolds
–Visualize physicochemical properties
–Automated generation using project data
 Interactive scaffold analysis enables:
–Quick identification of interesting series among HTS hits
–Understanding any SAR
–Comparing them to existing series from other hit ID methods, the literature and public datasets
 Automated generation of SA/Reports from current data greatly enhances their appeal as a user-friendly SAR analysis tool
Backup
Semi-automated frequent fragment scaffold picking
Plot scalar fields “freq_1”, “freq_2” etc.
–Pick a compound in each freq plateau above a threshold (e.g. 50 out of 4000)
–Choose largest fragment size i with freq_i > threshold as scaffold
[Plot: freq_1, freq_2, freq_3, freq_4]
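The selection rule on this slide ("largest fragment size i with freq_i > threshold") can be written as a short Python function. The input is assumed to be the list [freq_1, freq_2, ...] for one compound, ordered by increasing fragment size; the function name is illustrative.

```python
def pick_scaffold_level(freqs, threshold):
    """Return the largest 1-based level i with freq_i > threshold, or None."""
    best = None
    for level, freq in enumerate(freqs, start=1):
        if freq > threshold:
            best = level
    return best


if __name__ == "__main__":
    # Made-up frequencies for one compound in a ~4000-compound dataset.
    print(pick_scaffold_level([900, 300, 60, 12], threshold=50))
    print(pick_scaffold_level([10, 5], threshold=50))
```

A result of None means the compound has no fragment frequent enough to serve as a scaffold at that threshold.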

Analyzing Project and HTS Data by Interactive Selection of Frequently-occurring Scaffolds

  • 1. Dynamic SA/Reports: Analyzing Current Project and HTS Data by Interactive Selection of Frequently-occurring Scaffolds Deepak Bandyopadhyay Development help: Chris Louer, Ceara Rea, Jerome Verlin, Alain Deschenes, Nels Thorsteinson, Guido Kirsten, Bernd Wiswedel Project testing: Ami Lakdawala, Chaya Duraiswami, Guanglei Cui, Kaushik Raha, Kristin Brown, Neysa Nevins, Xuan Hong, Constantine Kreatsoulas Star cast:
  • 2. Find viable chemical series from project HTS data or other large/diverse datasets –Ideally, from single-shot data:  –Pragmatically, full-curve data: ∫∫∫∫∫∫∫∫∫∫ …↗ ∫∫∫∫∫∫∫∫∫∫∫∫ Usually: scaffold-agnostic (clustering) analysis –But clusters do not map 1:1 to chemotypes Our goal: R-group analysis of HTS data –Provide SAR in a more user-friendly format Tool of choice: MOE SA/Report Problem statement
  • 3. Outline SA/Report Background –Problem with out-of-box analysis of HTS data  Frequent fragment scaffold selection – Automated and interactive solutions  Customizations for project data delivery – Custom units to visualize arbitrary data types – KNIME workflows for automated generation Case studies (project and public datasets) Conclusion
  • 4. What is a Structure-Activity Report? SAR analysis and visualization tool in MOE (chemcomp.com) Input: MOE database (created from CSV, SD-file, etc.) – Structure and multiple activity/property columns – Pick/guess column data types (pIC50, IC50, percent,…) Scaffolds: Auto-detect or specify; R-groups optional Output: tabbed web page – Summary tab: arranges molecules by scaffolds and R-groups, showing details on mouse-over or clicking on R-groups Clark AM, Labute P. J Med Chem. 2009 52(2):469-83. Agrafiotis DK et al., J Med Chem. 2007 50(24):5926-37 Below: SA/Report on PubChem pyruvate kinase screen, Assay ID 361
  • 5. What is a Structure-Activity Report? SAR analysis and visualization tool in MOE (chemcomp.com) Input: MOE database (created from CSV, SD-file, etc.) – Structure and multiple activity/property columns – Pick/guess column data types (pIC50, IC50, percent,…) Scaffolds: Auto-detect or specify Output: tabbed web page – Summary tab: arranges molecules by scaffolds and R-groups, showing details on mouse-over or clicking on R-groups – Activity tab: grid, R1 vs. R2 or scaffold vs. R1. – Multiple activities visualized simultaneously as color bars or concentric pie charts (“cartwheels”) Clark AM, Labute P. J Med Chem. 2009 52(2):469-83. Agrafiotis DK et al., J Med Chem. 2007 50(24):5926-37 Below: SA/Report on PubChem pyruvate kinase screen, Assay ID 361
  • 6. SA/Report: auto-detect on HTS data Auto-detect does not find all frequently-occurring series in diverse datasets (eg. HTS hits, >4000 compds, >10 series) –Eg. PubChem AssayID 361, 4265 Pyruvate Kinase inhibitor hits – Two scaffolds found; known series with more exemplars missed What to do?: –Specify manually OR –Use automated or interactive method to find scaffolds Clark AM, Labute P. J Med Chem. 2009 52(2):469-83.
  • 7. Outline SA/Report Background –Problem with out-of-box analysis of HTS data  Frequent fragment scaffold selection – Automated and interactive solutions  Customizations for project data delivery – Custom units to visualize arbitrary data types – KNIME workflows for automated generation Case studies (project and public datasets) Conclusion
  • 8. Scaffolds from Fragment Decomposition Use frequent fragments as scaffolds –Schuffenhauer hierarchical decomposition  –Compounds sorted by frequency of fragment at each level. A. Schuffenhauer et al., J. Chem. Inf. Modeling 47:47-58, 2007
  • 9. Interactive scaffold picking Users prefer scaffold suggestions, not full automation – Exclude known nuisance or cross-target-active fragments – Exclude scaffolds that don’t make chemical sense – Prefer one among overlapping or multiple scaffolds in a molecule – Want to analyze a subset of the scaffolds found Interactive “common fragment selection” GUI –“Analyze…” button next to “Browse…” on patched version of SA/Report cmnfrag.svl (A. Clark/A. Deschenes, CCG; *available* on SVL exchange)
  • 10. Interactive scaffold picking, step 1 The top 12 frequent fragments are presented to the user to choose from – Rank combines frequency, heavy atom count and (1 + similarity to existing scaffolds) – User picks #2 PubChem dataset: AID 893, HSD17B4, hydroxysteroid (17-beta) dehydrogenase 4
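A hedged Python sketch of how such a ranking could be computed. How the three terms are combined is an assumption: here frequency and heavy atom count are multiplied and the product is divided by (1 + similarity), so that larger, more frequent fragments rank higher and fragments resembling already-picked scaffolds rank lower. The `Candidate` class, `rank_score` and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    frequency: int         # number of compounds containing the fragment
    heavy_atoms: int       # fragment size
    max_similarity: float  # 0..1 similarity to the closest existing scaffold


def rank_score(c):
    # ASSUMPTION: multiply frequency by size, penalize similarity to
    # existing scaffolds. The original tool may combine these differently.
    return c.frequency * c.heavy_atoms / (1.0 + c.max_similarity)


def top_candidates(candidates, n=12):
    """Return the n highest-scoring candidates, best first."""
    return sorted(candidates, key=rank_score, reverse=True)[:n]


# Hypothetical example values.
cands = [
    Candidate("A", frequency=100, heavy_atoms=10, max_similarity=0.0),
    Candidate("B", frequency=120, heavy_atoms=10, max_similarity=1.0),
    Candidate("C", frequency=50, heavy_atoms=14, max_similarity=0.0),
]
```

Under this assumed formula, candidate B is the most frequent but drops below A and C because it duplicates an existing scaffold.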
  • 11. Frequent scaffold picking, iterative step 1. Add picked fragment to scaffold list 2. Remove molecules that map to it from consideration 3. Re-analyze remaining molecules for frequent scaffolds 4. Repeat until satisfied
  • 12. Frequent scaffold picking, final iteration 1. Add picked fragment to scaffold list 2. Remove molecules that map to it from consideration 3. Re-analyze remaining molecules for frequent scaffolds 4. Repeat until satisfied
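The four-step loop above can be sketched in Python as follows. This is a simplified illustration, not the cmnfrag.svl implementation: each molecule is represented by a precomputed set of fragment labels, "maps to" is approximated by set membership rather than a substructure match, and the `choose` callback stands in for the user's pick in the GUI. All names are hypothetical.

```python
from collections import Counter


def pick_scaffolds(mol_fragments, choose, max_scaffolds=10, min_count=2):
    """Iteratively pick frequent fragments as scaffolds.

    mol_fragments: dict of molecule id -> set of fragment labels.
    choose: callable taking the ranked suggestion list and returning
            one fragment, or None to stop (the "until satisfied" step).
    Returns (picked scaffolds, sorted ids of molecules left unassigned).
    """
    remaining = dict(mol_fragments)
    scaffolds = []
    while remaining and len(scaffolds) < max_scaffolds:
        # Step 3: re-analyze the remaining molecules for frequent fragments.
        counts = Counter(f for frags in remaining.values() for f in frags)
        suggestions = sorted(
            (f for f, n in counts.items() if n >= min_count),
            key=lambda f: (-counts[f], f),
        )
        if not suggestions:
            break
        picked = choose(suggestions)
        if picked is None:
            break
        # Step 1: add the picked fragment to the scaffold list.
        scaffolds.append(picked)
        # Step 2: remove molecules that map to it from consideration.
        remaining = {m: fr for m, fr in remaining.items() if picked not in fr}
    return scaffolds, sorted(remaining)


# Hypothetical toy data; a fully automatic run always takes the top suggestion.
toy_mols = {
    "m1": {"A", "B"}, "m2": {"A", "B"}, "m3": {"A"},
    "m4": {"C"}, "m5": {"C"}, "m6": {"D"},
}
result = pick_scaffolds(toy_mols, choose=lambda s: s[0])
```

Passing a different `choose` function (for example one that skips known nuisance fragments) mimics the interactive mode, while `lambda s: s[0]` corresponds to full automation.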
  • 13. HTS SAR analysis Run SA/Report with scaffolds picked from the frequent fragment hierarchy, automatically or interactively
  • 14. Outline SA/Report Background –Problem with out-of-box analysis of HTS data  Frequent fragment scaffold selection – Automated and interactive solutions  Customizations for project data delivery – Custom units to visualize arbitrary data types – KNIME workflows for automated generation Case studies (project and public datasets) Conclusion
  • 15. Customization 1: units for visualization SA/Report built to visualize activity (pIC50/pKi, IC50/Ki, percent, fractions) New applications: – Visualize data where weak actives are significant – Optimize compound properties along with activity Solution: – Define custom units for all commonly measured/calculated properties in a GUI – Examples: CLogP (5/3/1); Permeability: 0/100/300; Solubility (uM): 0/100/300 … SAReport_custom_units.svl, A. Deschenes, *available* from SVL exchange Figure: 6 pie sectors = 6 compounds with these R-groups; labels: Scaffold, R6, pIC50, cLogP, permeability
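A small Python sketch of what a three-point custom unit might do. It assumes, without confirmation from the slide, that the three numbers are the property values mapped to the worst, middle and best ends of the color scale, with linear interpolation in between and clamping outside. The `unit_score` function and the `CUSTOM_UNITS` table are hypothetical and are not the SAReport_custom_units.svl interface.

```python
def unit_score(value, worst, mid, best):
    """Piecewise-linear map: worst -> 0.0, mid -> 0.5, best -> 1.0 (clamped)."""
    # Descending scales (e.g. CLogP 5/3/1, lower is better) are handled
    # by negating everything so the scale becomes ascending.
    if worst > best:
        value, worst, mid, best = -value, -worst, -mid, -best
    if value <= worst:
        return 0.0
    if value >= best:
        return 1.0
    if value <= mid:
        return 0.5 * (value - worst) / (mid - worst)
    return 0.5 + 0.5 * (value - mid) / (best - mid)


# ASSUMED interpretation of the slide's examples as (worst, mid, best).
CUSTOM_UNITS = {
    "CLogP": (5, 3, 1),
    "Permeability": (0, 100, 300),
    "Solubility_uM": (0, 100, 300),
}


def score_property(name, value):
    """Look up a property's custom unit and return its 0..1 score."""
    return unit_score(value, *CUSTOM_UNITS[name])
```

A normalized score like this lets properties with different ranges and directions share one color scale next to pIC50 in the same report.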
  • 16. Customization 2: Dynamic SA/Reports SA/Reports need to be regenerated in MOE whenever new compounds are synthesized – In an active project, this happens relatively frequently… One solution to stay current: automated workflow – KNIME, an open source workflow tool, with comp chem nodes available from multiple vendors
  • 17. Automating SA/Report production SA/Report KNIME node –Inputs: data (port 0), scaffolds (optional, port 1) –Activity fields can be configured –Custom units can be defined and incorporated
  • 18. Example KNIME workflow for SA/Report Many aspects can be customized. Workflow node labels: Input molecule data; Input scaffolds; Data manipulation; Filter by scaffold / properties; Generate SA/Report; Save URL (cron job to run this nightly or weekly)
  • 19. Outline SA/Report Background –Problem with out-of-box analysis of HTS data  Frequent fragment scaffold selection – Automated and interactive solutions  Customizations for project data delivery – Custom units to visualize arbitrary data types – KNIME workflows for automated generation Case studies (project and public datasets) Conclusion
  • 20. GSK project example 1: HTS data analysis 28 scaffolds found in data by interactive scaffold analysis – Prioritized for follow-up based on aggregate properties and believable SAR trends – Color patterns: spot good R-group combinations – Example inference for benzothiophene scaffold: R6=OMe favored over H; R8=NH2 active with >½ of the other substituents; combine to fill SAR holes…
  • 21. GSK project example 2: Mitigating hERG Lead series has hERG liability – Find R-groups that reduce hERG while maintaining activity and selectivity Figure labels: activity, selectivity, hERG; R3, R10; H, CH3, Cl, NH2
  • 22. PubChem example: Pyruvate Kinase screen Primary assay: AID 361: Pyruvate kinase (PyK, 4265 inhibitors) Five secondary assays: – 2 orthologs: AID 1631 (human muscle isoform 2 PyK), 1721 (L. mexicana PyK) – 2 assays to eliminate false positive hits (luciferase, cytotoxicity) – 1 selectivity cross-target (MT1-MMP) Interactive scaffold selection – Chose 25, covering >50% of compounds Final report: – 6 pIC50s (listed above) – Several calculated properties with custom units: MolWt, ClogP, LogD, predicted solubility/permeability
  • 23. PubChem SAR trend elucidation Biaryl amide scaffold: compounds with R6 = H, Me, OMe or OEt often hit the luciferase/cytotoxicity cross-screens and are false positives; R6 = Et or F do not hit these assays. Columns shown: 361_PyK_pIC50, 411_lucif_pIC50, 924_p53cyTox_pIC50
  • 24. PubChem example: SAR trend elucidation SAR trends across similar scaffolds: – Active/selective R-groups on one scaffold (e.g. R10=OMe on benzothiazole) are used to suggest analogs with the same R-group on related scaffolds.
  • 25. Conclusions MOE SA/Reports can be intuitive and valuable for project SAR analysis: – Extensions to find scaffolds – Visualize physicochemical properties – Automated generation using project data Interactive scaffold analysis enables: – Quick identification of interesting series among HTS hits – Understanding any SAR – Comparing them to existing series from other hit ID methods, the literature and public datasets. Automated generation of SA/Reports from current data greatly enhances their appeal as a user-friendly SAR analysis tool
  • 27. Semi-automated frequent fragment scaffold picking Plot scalar fields “freq_1”, “freq_2” etc. – Pick a compound in each freq plateau above a threshold (e.g. 50 out of 4000) – Choose the largest fragment size i with freq_i > threshold as scaffold. Plotted fields: freq_1, freq_2, freq_3, freq_4
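The selection rule on this slide can be written as a short Python sketch. It assumes the per-compound values freq_1, freq_2, … are available as a list ordered by increasing fragment size; the function name `pick_fragment_level` and the example numbers are hypothetical.

```python
def pick_fragment_level(freqs, threshold):
    """Return the largest fragment size i with freq_i > threshold.

    freqs: list where freqs[i - 1] holds freq_i for one compound.
    Returns None if no level exceeds the threshold.
    """
    best = None
    for i, freq in enumerate(freqs, start=1):
        if freq > threshold:
            best = i  # keep overwriting so the largest qualifying i wins
    return best


# Hypothetical values of freq_1..freq_4 for one compound, threshold of 50.
level = pick_fragment_level([900, 400, 60, 12], threshold=50)
```

Taking the largest qualifying size favors the most specific fragment that is still shared by enough compounds to be worth treating as a series.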