An introduction to RNA-seq data analysis
Sonika Tyagi
Australian Genome Research Facility
1 August 2012
Outline
• Transcriptomics using RNA-seq:
  Applications
• Gene expression profiling workflows
• Design Challenges
RNA sequencing (mRNA-seq or RNA-seq)
“An experimental protocol that uses next-generation sequencing technologies to sequence the RNA molecules within a biological sample in an effort to determine the primary sequence and relative abundance of each RNA”
A typical RNA-seq experiment
[Figure: library preparation and sequencing, followed by bioinformatics analysis]
Nature Reviews Genetics, November 2008; doi:10.1038/nrg2484
RNA-seq applications
• Allele-specific expression: prevalence of transcribed SNPs
• Fusion transcripts: e.g., in cancer
• Abundance estimation: alternative splicing, RNA editing, novel transcripts
• Gene expression profiling
1. Raw sequences (fastq files)
2. Quality control (QC)
3. Spliced read alignment
4. Transcript reconstruction
5. Differential expression analysis
6. Biology
RNA-seq workflows
Reference available?
• Annotated genome: reads mapping, then transcript reconstruction
• Assembled/predicted transcriptome: reads mapping, then transcript reconstruction
• Annotated de novo transcriptome assembly: de novo assembly or reference-assisted assembly
Common downstream steps:
1. Summarization (by CDS, exon, gene, splice junctions)
2. Tables of counts (digital expression)
3. DE analysis
4. Biology (GO/Pathways)
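The summarization step that produces a table of counts can be sketched as follows. This is a minimal illustration, not any particular tool's method: the function name, the half-open gene intervals on a single chromosome, and the rule of skipping reads that overlap zero or several genes are all assumptions made for the example.

```python
from collections import Counter


def count_reads_per_gene(read_starts, genes, read_length):
    """Build a table of raw counts (digital expression) per gene.

    read_starts: iterable of 0-based alignment start positions.
    genes: dict mapping gene name -> (start, end), half-open intervals.
    A read is counted only if it overlaps exactly one gene (assumed rule).
    """
    counts = Counter({name: 0 for name in genes})
    for start in read_starts:
        end = start + read_length
        # Two half-open intervals overlap if each starts before the other ends.
        hits = [
            name
            for name, (g_start, g_end) in genes.items()
            if start < g_end and end > g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return dict(counts)


if __name__ == "__main__":
    genes = {"geneA": (100, 200), "geneB": (300, 400)}
    table = count_reads_per_gene([90, 150, 195, 250, 310, 399], genes, 20)
    print(table)
```

The same idea extends to counting by exon, CDS, or splice junction by changing which intervals are supplied.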
QC tools
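One common QC check is the mean base quality of each read. The sketch below assumes four-line FASTQ records and Phred+33 quality encoding; the function names and the threshold of 20 are illustrative choices, and dedicated QC tools report many more metrics than this.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        sequence = next(it).strip()
        next(it)  # '+' separator line, ignored
        quality = next(it).strip()
        yield header[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


def filter_reads(lines, min_mean_quality=20):
    """Keep (read_id, sequence) pairs whose mean quality passes the cutoff."""
    return [
        (read_id, sequence)
        for read_id, sequence, quality in parse_fastq(lines)
        if mean_phred(quality) >= min_mean_quality
    ]


if __name__ == "__main__":
    records = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "!!!!"]
    print(filter_reads(records))
```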
Alignments / mapping splice junctions
Unspliced read aligners
• Seed methods. Examples: MAQ, Stampy, ELAND
• Burrows-Wheeler methods. Examples: BWA, Bowtie
• Ideal for mapping reads against cDNA databases.
• Splice junctions/events are not picked up.
Spliced read aligners
• Exon first. Examples: TopHat, MapSplice, SpliceMap
• Seed-extend method. Examples: GSNAP, QPALMA, ELANDv2e
• Novel splice junctions can be detected.
• Perform better for polymorphic regions and aligning pseudogenes.
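A toy illustration of why contiguous (unspliced) matching misses junction-spanning reads, while splitting the read recovers them. This is a sketch under strong assumptions: exact string matching only, a made-up genome, and hypothetical function names. Real aligners such as those listed above use indexed search and tolerate mismatches.

```python
def align_unspliced(read, genome):
    """Return the start of an exact contiguous match, or None."""
    position = genome.find(read)
    return position if position != -1 else None


def align_spliced(read, genome, min_anchor=4):
    """Try each split point and return (left_start, right_start, split).

    The left part must match first, and the right part must match further
    downstream with a gap of at least one base (the putative intron).
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_start = genome.find(left)
        if left_start == -1:
            continue
        right_start = genome.find(right, left_start + split + 1)
        if right_start != -1:
            return left_start, right_start, split
    return None


if __name__ == "__main__":
    exon1, intron, exon2 = "ACGTACGGTCA", "GTAAGTTTTTTTTTCAG", "TTGCCATGGAC"
    genome = exon1 + intron + exon2
    read = exon1[-6:] + exon2[:6]  # read spanning the exon-exon junction
    print(align_unspliced(read, genome))
    print(align_spliced(read, genome))
```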
Transcript reconstruction
• Genome guided. Examples: G-Mo.R-Se (short reads), Cufflinks and Scripture (for long reads)
• Genome independent. Examples: Trans-ABySS, Velvet+Oases, MIRA, Cufflinks*
Genome guided transcriptome assembly
Martin J and Wang Z, Nat Rev Gen 2011; doi:10.1038/nrg3068
Normalisation and DE
• Normalisation: library size, RPKM, FPKM, TMM, upper quartile
  – Examples: ERANGE, Cuffdiff, edgeR, Myrna
• DE models: Poisson GLM, negative binomial
  – Examples: DEGseq, Myrna, edgeR, baySeq, Cuffdiff
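As one example of the normalisation methods listed, upper-quartile scaling can be sketched as below. The assumptions here are mine: each sample is divided by the 75th percentile of its non-zero counts, and the quantile uses simple linear interpolation. Published implementations may differ in both details.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, 0 <= q <= 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def upper_quartile_normalise(counts):
    """Scale each sample by the upper quartile of its non-zero counts.

    counts: dict mapping sample name -> list of per-gene raw counts.
    """
    normalised = {}
    for sample, values in counts.items():
        nonzero = [value for value in values if value > 0]
        if not nonzero:
            raise ValueError(f"sample {sample!r} has no non-zero counts")
        scale = quantile(nonzero, 0.75)
        normalised[sample] = [value / scale for value in values]
    return normalised


if __name__ == "__main__":
    # s2 is the same library as s1 sequenced twice as deeply.
    counts = {"s1": [0, 10, 20, 30, 40, 50], "s2": [0, 20, 40, 60, 80, 100]}
    print(upper_quartile_normalise(counts))
```

In the usage example the two samples differ only in sequencing depth, so their normalised values coincide.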
Quantification and normalisation
1. Digital expression or raw count: number of reads mapping to a region (exon/transcript/novel region)
2. Normalised counts*: number of reads per kb per million reads
3. Splice junction detection
4. Compare to existing gene models
Nat Meth 2008; doi:10.1038/NMETH.1226
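The normalised count in step 2 (reads per kb per million reads) can be written as a one-line calculation. The function and parameter names are illustrative.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads.

    Equivalent to read_count / (region_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # 500 reads on a 2 kb region in a library of 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```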
Differential expression
• Normalised gene expression value as RPKM:
  – reads per kilobase of exon model per million mapped reads

• Or FPKM:
  – fragments per kilobase of exon model per million mapped reads

• Compare RPKM/FPKM across conditions or tissues




                                           Nat Meth DOI:10.1038/NMETH.1226
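Comparing RPKM/FPKM values across two conditions is often summarised as a log2 fold change. The sketch below is only that summary, with a pseudocount of 1 assumed to avoid dividing by zero. It is not a statistical test; the DE tools named earlier model counts and replicates.

```python
from math import log2


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """log2((B + pseudocount) / (A + pseudocount)) for genes in both inputs.

    expr_a, expr_b: dicts mapping gene name -> RPKM or FPKM value.
    """
    shared = expr_a.keys() & expr_b.keys()
    return {
        gene: log2((expr_b[gene] + pseudocount) / (expr_a[gene] + pseudocount))
        for gene in sorted(shared)
    }


if __name__ == "__main__":
    control = {"g1": 3.0, "g2": 7.0}
    treated = {"g1": 15.0, "g2": 7.0}
    print(log2_fold_change(control, treated))
```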
Systems biology: beyond the list of DE genes
• Ontologies: GO enrichment, goseq (R package)
• DAVID (http://david.abcc.ncifcrf.gov)
• Pathway analysis
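GO enrichment is commonly framed as an over-representation test. A minimal sketch using a plain hypergeometric tail probability follows; the function and parameter names are hypothetical, and this simple test does not include the corrections that dedicated packages may apply.

```python
from math import comb


def hypergeom_enrichment_p(n_genes, n_in_term, n_de, n_de_in_term):
    """P(X >= n_de_in_term) for a hypergeometric draw.

    n_genes: genes in the background set.
    n_in_term: background genes annotated to the GO term.
    n_de: differentially expressed genes (the draw).
    n_de_in_term: DE genes annotated to the term.
    """
    total = comb(n_genes, n_de)
    upper = min(n_in_term, n_de)
    tail = sum(
        comb(n_in_term, k) * comb(n_genes - n_in_term, n_de - k)
        for k in range(n_de_in_term, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 20000 genes, 200 in the term, 500 DE genes, 15 of them in the term.
    print(hypergeom_enrichment_p(20000, 200, 500, 15))
```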
RNA-seq experiment design challenges
• NGS biases:
    – Library prep (GC content, 5’ or 3’ depletion, random hexamer primers, RNA species, bias towards 3’ end …).
    – Transcript length
• Sequencing depth
• Single or paired end
• Biological or technical replicates
• Validation
Briefings in Bioinformatics, vol. 12, no. 3, 280-287
RNA-seq and other transcriptomics methods
[Figure not recovered]
Nature Reviews Genetics, November 2008; doi:10.1038/nrg2484
Summary
• RNA-seq: more versatile, comprehensive with
  superior reproducibility and resolution.
• Not dependent on prior sequence information:
  suitable for non-model organisms.
• Potentially provides information for all RNA
  species in the cell and allows discovery of novel
  ones.
• Still an actively developing field, with research areas that need refinement.
• Gold standards for experimental design and validation are yet to be set.
TopHat-Cufflinks pipeline reference
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7(3), 562-78.
R/Bioconductor-based RNA-seq packages
• edgeR
• voom
• DESeq

http://bioconductor.org/packages/release/BiocViews.html#___Software