Principles of Peak Picking and Alignment in Pictures and further "doing". ASMS Fall Metabolomics Informatics Workshop 2018.
https://www.asms.org/conferences/fall-workshop/program
3. 3
Presenting Peak Picking: Plan
o Why Peak Pick
o Terminology
• Peak Picking vs Centroid vs Profile …
o Peak Picking & Peak Pickers
• “best of” xcms and enviPick
• Peak Picking in Pictures
• Peak Picking Parameters
• Alleviating Peak Picking Parameter Panic
o Alignment ( / Profiling)
• “best of” xcms and enviMass
o Peak Picking Pointers
o Don’t just listen to me … do it!
6. 6
Why Peak Pick (III)
Identification = turning numbers into structures
[Figure: chemical structure drawings]
massbank.eu
8. 8
Mass: Centroid vs Profile Data (enviPat)
https://www.envipat.eawag.ch/index.php and Loos et al. (2015), Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
9. 9
Mass: Centroid vs Profile Data (enviPat)
https://www.envipat.eawag.ch/index.php and Loos et al. (2015), Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
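The difference between profile and centroid data can be sketched in a few lines. This is a minimal illustration, not the enviPat method: the function name `centroid`, the example data points, and the choice of peak height (rather than area) as the centroid intensity are all assumptions made for this sketch.

```python
def centroid(profile_points):
    """Collapse one profile-mode peak into a single centroid (m/z, intensity).

    profile_points: list of (mz, intensity) pairs covering one peak.
    The centroid m/z is the intensity-weighted mean of the profile points.
    The centroid intensity is the maximum height (an assumption for this
    sketch; peak area is another common choice).
    """
    total = sum(intensity for _, intensity in profile_points)
    if total <= 0:
        raise ValueError("peak has no signal")
    mz = sum(m * intensity for m, intensity in profile_points) / total
    height = max(intensity for _, intensity in profile_points)
    return mz, height


# Hypothetical profile peak: five data points sampled across one mass peak.
profile_peak = [
    (200.0990, 10.0),
    (200.0995, 60.0),
    (200.1000, 100.0),
    (200.1005, 60.0),
    (200.1010, 10.0),
]
centroid_mz, centroid_height = centroid(profile_peak)
```

Five profile points become one "stick", which is what most peak pickers expect as input.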
11. 11
Peak Picking (in time)
Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504
o Peak picking along time axis (chromatographic peaks)
12. 12
Peak Picking
Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504
o Peak picking along time axis (chromatographic peaks)
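To make "peak picking along the time axis" concrete, here is a deliberately naive sketch: it reports local maxima above an intensity threshold in one extracted ion chromatogram. It is not centWave, which applies a continuous wavelet transform to regions of interest. The function name `pick_peaks`, the threshold, and the data are hypothetical.

```python
def pick_peaks(eic, min_intensity):
    """Return (rt, intensity) for local maxima at or above min_intensity.

    eic: list of (retention_time_seconds, intensity), sorted by time.
    A point counts as a peak apex if it is higher than its left neighbour
    and at least as high as its right neighbour.
    """
    peaks = []
    for k in range(1, len(eic) - 1):
        rt, intensity = eic[k]
        if (
            intensity >= min_intensity
            and intensity > eic[k - 1][1]
            and intensity >= eic[k + 1][1]
        ):
            peaks.append((rt, intensity))
    return peaks


# Hypothetical chromatogram with two chromatographic peaks.
eic = [
    (0.0, 1.0), (1.0, 5.0), (2.0, 50.0), (3.0, 8.0), (4.0, 2.0),
    (5.0, 30.0), (6.0, 120.0), (7.0, 25.0), (8.0, 3.0),
]
picked = pick_peaks(eic, min_intensity=20.0)
```

Real data are noisier than this, which is exactly why dedicated peak pickers (and their parameters) exist.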
13. 13
Peak Picking
Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
o Peak picking along time axis (chromatographic peaks)
14. 14
Peak Picking
Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
o Peak picking along time axis (chromatographic peaks)
Several Samples Overlaid
Red = KO
Blue = wild type
Rectangle = chromatographic peaks identified per sample
15. 15
Peak Picking
o Several options for peak picking
• XCMS and centWave
• Tautenhahn et al 2008 DOI: 10.1186/1471-2105-9-504
• http://bioconductor.org/packages/xcms/
• MZmine 2
• Pluskal et al 2010 DOI: 10.1186/1471-2105-11-395
• http://mzmine.github.io/
• enviPick / enviMass
• Loos 2018 DOI: 10.5281/zenodo.1213098
• http://www.looscomputing.ch/eng/enviMass/overview.htm
• Plenty of other open, research and vendor options ...
28. 28
Peak Picking Parameters: centWave
ppm: maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million)
NOTE: dependent on your mass spectrometer
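A quick worked example of what a ppm tolerance means in absolute m/z units. The helper names `ppm_window` and `within_ppm` and the numbers are illustrative assumptions, not part of xcms.

```python
def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a given ppm tolerance."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(mz_a, mz_b, ppm):
    """True if mz_a deviates from mz_b by no more than ppm parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6 <= ppm


# 5 ppm at m/z 200 is a window of +/- 0.001 m/z units.
low, high = ppm_window(200.0, 5.0)

# A centroid 4 ppm away in the next scan would still belong to the same
# mass trace; one 10 ppm away would not.
same_trace = within_ppm(200.0008, 200.0, 5.0)
other_trace = within_ppm(200.0020, 200.0, 5.0)
```

The same ppm value gives a wider absolute window at higher m/z, so check what your instrument actually delivers across the mass range.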
29. 29
Peak Picking Parameters: centWave
peakwidth: chromatographic peak width, given as range (min,max) in seconds
NOTE: highly dependent on your chromatography!
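One way to choose a sensible peakwidth range is to measure a few real peaks in your own chromatograms first. The sketch below estimates the width at base of a single peak and checks it against a candidate (min,max) range. It is only a reality check on your data, not how centWave uses peakwidth internally; `peak_width_seconds`, `fits_peakwidth`, and the data are hypothetical.

```python
def peak_width_seconds(eic, baseline=0.0):
    """Width (in seconds) of the contiguous above-baseline region around the apex.

    eic: list of (retention_time_seconds, intensity), sorted by time.
    """
    apex = max(range(len(eic)), key=lambda k: eic[k][1])
    left = apex
    while left > 0 and eic[left - 1][1] > baseline:
        left -= 1
    right = apex
    while right < len(eic) - 1 and eic[right + 1][1] > baseline:
        right += 1
    return eic[right][0] - eic[left][0]


def fits_peakwidth(width, peakwidth):
    """True if the measured width lies inside the (min, max) range."""
    minimum, maximum = peakwidth
    return minimum <= width <= maximum


# Hypothetical chromatographic peak sampled every 2 seconds.
eic = [
    (100.0, 0.0), (102.0, 5.0), (104.0, 40.0), (106.0, 90.0),
    (108.0, 35.0), (110.0, 4.0), (112.0, 0.0),
]
width = peak_width_seconds(eic)
```

Here the peak is 8 s wide at base, so a range such as (5, 20) would cover it, while (10, 60) would not.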
31. 31
Peak Picking Parameters: centWave
prefilter: prefilter=c(k,I). Prefilter step for the first phase. Mass traces are only retained if they contain at least k peaks with intensity >= I
Figure note: a trace with only one “stick” will fail the recommended prefilter settings
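The prefilter rule as stated above can be written as a one-line check. This is a sketch of the rule as worded on the slide, not the xcms source code; `passes_prefilter`, the values k=3 and I=100, and the two traces are chosen only for illustration.

```python
def passes_prefilter(intensities, k, I):
    """True if the mass trace has at least k data points with intensity >= I."""
    return sum(1 for value in intensities if value >= I) >= k


# A single intense "stick" surrounded by nothing: fails with k=3.
single_stick = [0.0, 0.0, 5000.0, 0.0, 0.0]

# A trace with several consecutive points above the threshold: passes.
real_trace = [50.0, 200.0, 1500.0, 4000.0, 1800.0, 300.0, 40.0]

stick_ok = passes_prefilter(single_stick, k=3, I=100.0)
trace_ok = passes_prefilter(real_trace, k=3, I=100.0)
```

However intense the single stick is, it is discarded, because one data point is not a chromatographic peak.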
32. 32
Too Many Peak Picking Parameters ???????
https://bioconductor.org/packages/release/bioc/vignettes/IPO/inst/doc/IPO.html
o IPO to the rescue!
o Parameter optimization for xcms-based workflows …
o Libiseller et al. 2015, DOI: 10.1186/s12859-015-0562-8
IPO = Isotopologue Parameter Optimization
34. 34
RECAP: Why Peak Pick?
Identification = turning numbers into structures
[Figure: chemical structure drawings]
massbank.eu
35. 35
o Instruments change over time …
o Before we can do fancy statistics, we need to make sure our samples are comparable!
38. 38
Alignment ~= Retention Time Correction
http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#3_initial_data_inspection
o Many algorithms and methods …
o Before:
39. 39
Alignment ~= Retention Time Correction
http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment
o Many algorithms and methods …
o After (Obiwarp algorithm in xcms)
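To show what retention time correction does in principle, here is a much simpler stand-in than Obiwarp: a piecewise-linear mapping built from a few anchor compounds whose retention times are known in both the sample and a reference run. The function `make_rt_corrector` and the anchor values are hypothetical, and this is not the algorithm used in xcms.

```python
from bisect import bisect_right


def make_rt_corrector(anchors):
    """Build a function mapping observed retention times onto the reference scale.

    anchors: list of (observed_rt, reference_rt) pairs, sorted by observed_rt,
    with at least two entries. Between anchors the mapping is linear; outside
    the anchor range the nearest segment is extrapolated.
    """
    observed = [obs for obs, _ in anchors]
    reference = [ref for _, ref in anchors]

    def correct(rt):
        k = bisect_right(observed, rt) - 1
        k = max(0, min(k, len(observed) - 2))  # clamp to a valid segment
        slope = (reference[k + 1] - reference[k]) / (observed[k + 1] - observed[k])
        return reference[k] + slope * (rt - observed[k])

    return correct


# Hypothetical anchors: this sample elutes ~2-4 s late early in the run
# and matches the reference again at 500 s.
anchors = [(100.0, 98.0), (300.0, 296.0), (500.0, 500.0)]
correct = make_rt_corrector(anchors)
corrected = correct(200.0)
```

After correction, the same compound in different samples should land at (nearly) the same retention time, which is what makes grouping across samples possible.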
42. 42
Some advice …
o Peak pickers are designed to pick the perfect peak
• But life is never perfect and peaks are no different
o Pick the peak picker that is best for your situation
• Convenience, ease of use, designed for your data, …
• The optimal choice is usually a compromise
o Be sceptical (visualise your data, reality check it, etc.)
• But don’t go overboard in evaluating peak pickers … remember your (real) goal …
44. 44
Verify with EIC Extraction [these are NOT picked]
https://github.com/schymane/ReSOLUTION/blob/master/R/RMB_EIC_prescreen.R
No peak at all
Nice peak, MSMS
Peak, no MSMS
Noise with MSMS (careful!)
Isobars with MSMS (careful!)*
Looking for chemicals known to be present in the sample
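Extracting an EIC needs no peak picker at all: for each scan, keep the strongest signal inside a ppm window around the target m/z. The sketch below is a minimal stand-alone illustration and is not the RMB_EIC_prescreen code linked above; `extract_eic`, the target m/z, and the scans are hypothetical.

```python
def extract_eic(scans, target_mz, ppm):
    """Return [(rt, intensity)] for the target m/z within a ppm tolerance.

    scans: list of (retention_time, [(mz, intensity), ...]) centroid scans.
    For each scan the most intense centroid inside the window is reported,
    or 0.0 if nothing falls inside it.
    """
    tolerance = target_mz * ppm / 1e6
    eic = []
    for rt, centroids in scans:
        intensity = max(
            (i for mz, i in centroids if abs(mz - target_mz) <= tolerance),
            default=0.0,
        )
        eic.append((rt, intensity))
    return eic


# Three hypothetical scans; the last one only contains a nearby but
# different mass, which must not end up in the EIC.
scans = [
    (10.0, [(200.0997, 100.0), (250.1000, 900.0)]),
    (11.0, [(200.1001, 800.0)]),
    (12.0, [(200.1080, 5000.0)]),
]
eic = extract_eic(scans, target_mz=200.1000, ppm=5.0)
```

Plotting such an EIC for each expected chemical quickly shows which of the cases above (no peak, nice peak, noise, isobars) you are dealing with.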
45. 45
Just because you find a peak …
ENTACT Project: https://www.epa.gov/sites/production/files/2018-06/documents/comptox_cop_6-28-18.pdf
o Mix 505: One candidate with this mass/formula
• DTXSID9040001, C9H8O4
o One chemical…
How many peaks?
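One formula gives exactly one monoisotopic mass, so the mass alone cannot say which of several chromatographic peaks is the chemical. A small worked calculation for C9H8O4, using standard monoisotopic atomic masses rounded to 8 decimals; the helper `monoisotopic_mass` is a hypothetical sketch that handles only simple formulas without brackets.

```python
import re

# Monoisotopic masses of the most abundant isotopes (rounded).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}
PROTON = 1.00727647  # mass of a proton, for [M+H]+ and [M-H]-


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C9H8O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


neutral = monoisotopic_mass("C9H8O4")   # about 180.0423
mz_positive = neutral + PROTON          # [M+H]+, about 181.0495
mz_negative = neutral - PROTON          # [M-H]-, about 179.0350
```

Every peak in an EIC at one of these m/z values matches the formula equally well, so retention time, MS/MS, or a reference standard is needed to decide between them.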
48. 48
Further "doing" (not just reading)! [Vendor independent]
o Don’t just take my word for it … don’t just read about it … DO IT. There are so many ways to try it out … complete with sample data! [Open Science!]
o http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
o http://www.looscomputing.ch/eng/enviMass/overview.htm
o An interface that many enjoy, likely comes with example data but requires a login …
o https://xcmsonline.scripps.edu/
49. 49
Further "doing" (not just reading)! [Vendor independent]
o http://mzmine.github.io/
o http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/
o MS-DIAL
52. 52
Quality Control of Data
Slide c/o Michael Stravs
o Always visualise results … never take anything for granted
53. 53
Homologues: Challenge Peak Pickers but are Present!
Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131
[Figure: chemical structure of SPA-9C, m+n=6]
www.massbank.eu ACCESSIONS (LAS, SPACs):
Literature MS/MS LIT00034, LIT00037
Std Mix., Sample ETS00012, ETS00018
https://github.com/MassBank/RMassBank/
Tentatively Identified Spectra:
http://goo.gl/0t7jGp
54. 54
Be wary of instrument specific phenomena!
o R package nontarget: satellite peak removal
55. 55
Be wary of instrument specific phenomena II
o Orbitrap-specific calibration issues (not observed in TOF)