
Read Alignment and Quality Control

Alexandre Thiery
Eva Hamrud
Centre for Craniofacial and Regenerative Biology
Learning Objectives:
• Understand the steps of pre-processing sequenced reads
• Be familiar with some FastQC plots
• Understand when to use different types of aligners, with BWA and STAR as examples
• Be familiar with peak calling and transcript quantification
How sequenced reads are processed
FastQ -> FastQC -> Adapter trimming -> FastQC
GTF + Fasta -> Build reference genome
Trimmed reads + reference genome -> Alignment (BWA, STAR) -> (BAM)
Application specific:
• Peak calling (MACS) -> Peak count matrix
• Quantification -> Transcript abundance
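
To make the adapter trimming step concrete, here is a minimal Python sketch. It is an illustration only: it matches the adapter exactly, whereas real trimming tools also tolerate sequencing errors. The function name `trim_adapter`, the `min_overlap` parameter, and the example reads are hypothetical; the adapter string is used purely as an example sequence.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove an adapter from the 3' end of a read (exact matching only)."""
    # Case 1: the whole adapter occurs in the read; keep what precedes it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends part-way through the adapter, so only a prefix
    # of the adapter is present. Try the longest prefix first.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


ADAPTER = "AGATCGGAAGAGC"  # example adapter sequence (assumption for this sketch)

full = trim_adapter("ACGTACGT" + ADAPTER + "TTT", ADAPTER)
partial = trim_adapter("ACGTACGT" + ADAPTER[:5], ADAPTER)
clean = trim_adapter("ACGTACGT", ADAPTER)
print(full, partial, clean)
```

All three calls return the same insert sequence, `ACGTACGT`, whether the adapter is fully present, partially present, or absent.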


Input file formats
• GTF (example: https://en.wikipedia.org/wiki/File:GTF_file_example.jpg)
• Fasta (example: https://compgenomr.github.io/book/fasta-and-fastq-formats.html)
• FASTQ (example: https://www.researchgate.net/figure/A-sample-of-the-FASTQ-file_fig2_309134977)
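
As a rough illustration of the FASTQ layout, the sketch below reads records of four lines each (header, sequence, separator, quality string) and decodes the quality characters into Phred scores. It assumes the Phred+33 encoding and unwrapped four-line records; the function name `read_fastq` and the example read are made up for this sketch.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip()
        # Assumed Phred+33 encoding: score = ASCII code of the character minus 33.
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores  # drop the leading '@'


# Hypothetical single-read FASTQ file held in memory.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
print(records)
```

Each base gets one quality character, so the sequence and the list of scores always have the same length. These per-base scores are what the FastQC quality plots summarise.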
MultiQC Report
ATAC-seq FastQC plots:
https://moodle.learninghub.kingshealthpartners.org/pluginfile.php/184577/mod_resource/content/2/multiqc_report.html#fastqc

RNA-seq FastQC plots:
https://moodle.learninghub.kingshealthpartners.org/pluginfile.php/184578/mod_resource/content/1/multiqc_report.html#fastqc

Alignment using BWA
• BWA is not a splice-aware aligner

• Genomic read: aligns to the closest match in the reference genome, whilst allowing for a certain number of indels/mismatches
• Transcripts: will not align due to missing exons
Alignment using STAR
• STAR is a splice-aware aligner – it can ignore introns

https://hbctraining.github.io/Intro-to-rnaseq-hpc-O2/lessons/03_alignment.html
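
The difference between the two kinds of aligner can be shown with a toy example. This is not the algorithm used by BWA or STAR; it is only a sketch of the idea, using exact string matching. The sequences and the function names `contiguous_hit` and `spliced_hit` are invented for illustration.

```python
def contiguous_hit(read, genome):
    """Stand-in for a non-splice-aware aligner: one ungapped exact match, or -1."""
    return genome.find(read)


def spliced_hit(read, genome, min_anchor=3):
    """Stand-in for a splice-aware aligner: allow one gap (an intron) in the genome.

    Returns (left_start, intron_start, intron_end) or None.
    """
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        start = genome.find(left)
        while start != -1:
            intron_start = start + len(left)
            # The right anchor must start at least one base after the left anchor ends.
            right_start = genome.find(right, intron_start + 1)
            if right_start != -1:
                return start, intron_start, right_start
            start = genome.find(left, start + 1)
    return None


# Hypothetical reference: exon 1, intron, exon 2.
exon1, intron, exon2 = "ACGTACGGTC", "GTAAGTTTTTCAG", "TTGACCATGA"
genome = exon1 + intron + exon2

genomic_read = genome[5:17]             # crosses the exon/intron boundary, as in ATAC-seq
spliced_read = exon1[-6:] + exon2[:6]   # intron spliced out, as in RNA-seq

genomic_pos = contiguous_hit(genomic_read, genome)
spliced_contiguous = contiguous_hit(spliced_read, genome)
spliced_split = spliced_hit(spliced_read, genome)
print(genomic_pos, spliced_contiguous, spliced_split)
```

The genomic read is found as a single contiguous match. The spliced read has no contiguous match, but it is placed once the search is allowed to skip the intron.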

ATAC-seq processing: Peak Calling

MACS2
RNA-seq processing: Quantification

Salmon
featureCounts

https://hbctraining.github.io/Intro-to-rnaseq-hpc-O2/lessons/05_counting_reads.html
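
Counting-based quantification can be sketched as assigning each aligned read to the gene it overlaps. The example below is a simplified illustration, not the implementation of featureCounts or Salmon. It assumes 1-based inclusive coordinates (as in GTF) and skips reads that overlap more than one gene, which is one common convention. The gene names, coordinates, and the function name `count_reads` are hypothetical.

```python
def count_reads(genes, alignments):
    """Count reads per gene; reads overlapping zero or several genes are not counted.

    genes:      {gene_name: (chrom, start, end)}, 1-based inclusive (assumption)
    alignments: iterable of (chrom, start, end), same coordinate convention
    """
    counts = {name: 0 for name in genes}
    for chrom, start, end in alignments:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:  # unambiguous assignment only
            counts[hits[0]] += 1
    return counts


# Hypothetical annotation with two overlapping genes.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
alignments = [
    ("chr1", 110, 150),  # geneA only
    ("chr1", 190, 195),  # overlaps both genes: ambiguous, skipped
    ("chr1", 250, 290),  # geneB only
    ("chr2", 110, 150),  # different chromosome: no gene
    ("chr1", 150, 179),  # geneA only
]
gene_counts = count_reads(genes, alignments)
print(gene_counts)
```

The resulting per-gene counts form one column of a count matrix; repeating this for every sample gives the full matrix used downstream.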
Resources
• YouTube video with details of each step of RNA-seq analysis: https://www.youtube.com/watch?v=lG11JjovJHE
• Explanation of how the STAR aligner works: https://hbctraining.github.io/Intro-to-rnaseq-hpc-O2/lessons/03_alignment.html
• Paper introducing the BWA aligner: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2705234/
• nf-core RNA-seq pipeline explanation of MultiQC report plots: https://nf-co.re/rnaseq/3.11.2/output
• nf-core ATAC-seq pipeline explanation of MultiQC report plots: https://nf-co.re/atacseq/2.0/output
