
18/09/2019 GitHub - guigolab/bamstats: A command line tool to compute mapping statistics from a BAM file

guigolab / bamstats


A command line tool to compute mapping statistics from a BAM file

223 commits 3 branches 7 releases 1 contributor BSD-3-Clause


emi80 Bump version to 0.3.3 for release Latest commit a97df7c 14 days ago

annotation Fix wrong feature merging and interleaving with unsorted annotation - c… 14 days ago

cmd/bamstats Switch to Go Modules - close #22 15 days ago

config Add missing config package 3 years ago

data Add check for int8 NH tag in general stats - fix #18 11 months ago

sam Switch to Go Modules - close #22 15 days ago

scripts Add preprocessing scripts to repo last year

stats Add check for int8 NH tag in general stats - fix #18 11 months ago

utils Add method to stats.Map to output JSON - close #16 last year

.gitignore [ci skip] Update gitignore file last year

.travis.yml Switch to Go Modules - close #22 15 days ago

LICENSE Update license 3 years ago

Makefile Use tag description as GitHub release name 10 months ago

README.md [ci skip] Update installation instructions in readme to use latest re… 10 months ago

go.mod Switch to Go Modules - close #22 15 days ago

go.sum Switch to Go Modules - close #22 15 days ago

process.go Switch to Go Modules - close #22 15 days ago

process_test.go Add check for int8 NH tag in general stats - fix #18 11 months ago

version.go Bump version to 0.3.3 for release 14 days ago

README.md

Bamstats
build passing coverage 85%

Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.

Installation instructions
Use one of the following methods to install Bamstats.

Install a released version

https://github.com/guigolab/bamstats 1/3

The easiest way is to download a pre-compiled binary from GitHub releases. The following example installs version 0.3.2
on 64-bit Linux; set VERSION, OS and ARCH to match the release and platform you need:

export VERSION=0.3.2 OS=linux ARCH=amd64 BIN=/usr/local/bin

wget -O - https://github.com/guigolab/bamstats/releases/download/v${VERSION}/bamstats-v${VERSION}-${OS}-${ARCH}.tar.bz2 | tar xj --strip-components 3 -C ${BIN}

Install the latest version with go


The following command will install the latest version from the master branch into $GOPATH:

go get github.com/guigolab/bamstats/cmd/bamstats

Provided statistics
Bamstats can currently compute the following mapping statistics:

general
genome coverage
RNA-seq

General
The general mapping statistics include:

Total number of reads
Number of unmapped reads
Number of mapped reads grouped by number of multimaps (NH tag in BAM file)
Number of mappings
Ratio of mappings vs mapped reads
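
As a rough illustration of how the metrics above relate to each other, here is a minimal Go sketch. It is not taken from the Bamstats source: the `Summarize` function, the `GeneralStats` type and the input (one NH value per read, with 0 standing for an unmapped read) are assumptions made for this example.

```go
package main

import "fmt"

// GeneralStats mirrors the general metrics listed above (hypothetical type).
type GeneralStats struct {
	Total    int         // total number of reads
	Unmapped int         // reads without any alignment
	Mapped   map[int]int // mapped reads grouped by NH value
	Mappings int         // total number of alignments
}

// Ratio returns the number of mappings per mapped read, or 0 if nothing mapped.
func (s GeneralStats) Ratio() float64 {
	mapped := s.Total - s.Unmapped
	if mapped == 0 {
		return 0
	}
	return float64(s.Mappings) / float64(mapped)
}

// Summarize takes one NH value per read; a value <= 0 marks an unmapped read.
// This input shape is an assumption chosen to keep the example short.
func Summarize(nh []int) GeneralStats {
	s := GeneralStats{Mapped: make(map[int]int)}
	for _, n := range nh {
		s.Total++
		if n <= 0 {
			s.Unmapped++
			continue
		}
		s.Mapped[n]++   // one more read with n reported alignments
		s.Mappings += n // each such read contributes n mappings
	}
	return s
}

func main() {
	s := Summarize([]int{1, 1, 2, 0, 3})
	fmt.Println(s.Total, s.Unmapped, s.Mapped, s.Mappings, s.Ratio())
}
```

A ratio of exactly 1 would mean that every mapped read is uniquely mapped; larger values indicate more multimapping.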

If the data is paired-end, a section for read pairs is also reported. In addition to the above metrics, this section contains a
map from each insert size to its support, expressed as the number of reads with that insert size.
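
The insert size map can be pictured as a simple histogram. The Go sketch below is an assumption-laden illustration, not the Bamstats implementation: it takes the signed template length (the TLEN field of a SAM/BAM record) of each read as input, and the `InsertSizes` name is invented for this example.

```go
package main

import "fmt"

// InsertSizes maps each insert size to the number of reads supporting it.
// The input is assumed to hold one signed TLEN value per read.
func InsertSizes(tlens []int) map[int]int {
	hist := make(map[int]int)
	for _, t := range tlens {
		if t == 0 {
			continue // TLEN is 0 when the template length is unavailable
		}
		if t < 0 {
			t = -t // mates carry the same length with opposite signs
		}
		hist[t]++
	}
	return hist
}

func main() {
	fmt.Println(InsertSizes([]int{200, -200, 0, 350}))
}
```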

Genome coverage
The genome coverage statistics are computed for RNA-seq data and include counts for the following genomic regions:

exon
intron
exonic_intronic
intergenic
others

The above metrics are computed separately for continuous and split mapped reads. A total aggregated across elements
and read types is also reported.

The --uniq (or -u) command line flag allows reporting of genome coverage statistics for uniquely mapped reads too.

RNA-seq
The RNA-seq statistics follow IHEC recommendations for RNA-seq data quality metrics. They include counts for the following
regions:

intergenic (different from coverage stats)
ribosomal RNA (rRNA)

They also include fractional metrics for the following read types:

mapped
intergenic
rRNA
duplicates

Output examples:

Some examples of the program output can be found in the data folder of this GitHub repository:

General Stats
Genomic coverage stats
Genomic coverage stats with uniquely mapped reads (Note that the coverageUniq stats are reported as an additional
JSON object)
RNA-seq stats
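
Since the output is JSON, a downstream program can check which statistics a report contains before reading them. The Go sketch below only lists the top-level keys of a report. The sample document is hypothetical and heavily trimmed: `coverageUniq` is the name mentioned above, while the `coverage` key and the empty objects are assumptions, so check the files in the data folder for the real layout.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// TopLevelKeys returns the sorted names of the top-level members of a
// JSON report, without decoding the nested statistics.
func TopLevelKeys(report []byte) ([]string, error) {
	var sections map[string]json.RawMessage
	if err := json.Unmarshal(report, &sections); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(sections))
	for k := range sections {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Hypothetical sample; real reports contain actual counts.
	sample := []byte(`{"coverage": {}, "coverageUniq": {}}`)
	keys, err := TopLevelKeys(sample)
	if err != nil {
		fmt.Println("invalid report:", err)
		return
	}
	fmt.Println(keys)
}
```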

License

This software is released under a BSD-style license. Please check the LICENSE file for more details.
