Professional Documents
Culture Documents
Download textbook Computational Exome And Genome Analysis 1St Edition Jager ebook all chapter pdf
Download textbook Computational Exome And Genome Analysis 1St Edition Jager ebook all chapter pdf
https://textbookfull.com/product/computational-mathematics-and-
variational-analysis-1st-edition-nicholas-j-daras-editor/
https://textbookfull.com/product/functional-analysis-of-the-
human-genome-first-edition-cooper/
https://textbookfull.com/product/computational-and-experimental-
analysis-of-functional-materials-1st-edition-reshetnyak/
https://textbookfull.com/product/indian-mujahideen-computational-
analysis-and-public-policy-1st-edition-v-s-subrahmanian/
Computational EEG Analysis Methods and Applications
Chang-Hwan Im
https://textbookfull.com/product/computational-eeg-analysis-
methods-and-applications-chang-hwan-im/
https://textbookfull.com/product/nonlinear-analysis-problems-
applications-and-computational-methods-zakia-hammouch/
https://textbookfull.com/product/computational-methods-for-
processing-and-analysis-of-biological-pathways-1st-edition-
anastasios-bezerianos/
https://textbookfull.com/product/computational-retinal-image-
analysis-tools-applications-and-perspectives-1st-edition-
emanuele-trucco-editor/
https://textbookfull.com/product/fractals-in-probability-and-
analysis-1st-edition-christopher-j-bishop/
Computational Exome
and Genome Analysis
CHAPMAN & HALL/CRC
Mathematical and Computational Biology Series
Series Editors
N. F. Britton
Department of Mathematical Sciences
University of Bath
Xihong Lin
Department of Biostatistics
Harvard University
Nicola Mulder
University of Cape Town
South Africa
Mona Singh
Department of Computer Science
Princeton University
Proposals for the series should be submitted to one of the series editors above or directly to:
CRC Press, Taylor & Francis Group
3 Park Square, Milton Park
Abingdon, Oxfordshire OX14 4RN
UK
Published Titles
An Introduction to Systems Biology: Statistical Methods for QTL Mapping
Design Principles of Biological Circuits Zehua Chen
Uri Alon An Introduction to Physical Oncology:
Glycome Informatics: Methods and How Mechanistic Mathematical
Applications Modeling Can Improve Cancer Therapy
Kiyoko F. Aoki-Kinoshita Outcomes
Computational Systems Biology of Vittorio Cristini, Eugene J. Koay,
Cancer and Zhihui Wang
Emmanuel Barillot, Laurence Calzone, Normal Mode Analysis: Theory and
Philippe Hupé, Jean-Philippe Vert, and Applications to Biological and Chemical
Andrei Zinovyev Systems
Python for Bioinformatics, Second Edition Qiang Cui and Ivet Bahar
Sebastian Bassi Kinetic Modelling in Systems Biology
Quantitative Biology: From Molecular to Oleg Demin and Igor Goryanin
Cellular Systems Data Analysis Tools for DNA Microarrays
Sebastian Bassi Sorin Draghici
Methods in Medical Informatics: Statistics and Data Analysis for
Fundamentals of Healthcare Microarrays Using R and Bioconductor,
Programming in Perl, Python, and Ruby Second Edition
Jules J. Berman Sorin Drăghici
Chromatin: Structure, Dynamics, Computational Neuroscience:
Regulation A Comprehensive Approach
Ralf Blossey Jianfeng Feng
Computational Biology: A Statistical Mathematical Models of Plant-Herbivore
Mechanics Perspective Interactions
Ralf Blossey Zhilan Feng and Donald L. DeAngelis
Game-Theoretical Models in Biology Biological Sequence Analysis Using
Mark Broom and Jan Rychtář the SeqAn C++ Library
Computational and Visualization Andreas Gogol-Döring and Knut Reinert
Techniques for Structural Bioinformatics Gene Expression Studies Using
Using Chimera Affymetrix Microarrays
Forbes J. Burkowski Hinrich Göhlmann and Willem Talloen
Structural Bioinformatics: An Algorithmic Handbook of Hidden Markov Models
Approach in Bioinformatics
Forbes J. Burkowski Martin Gollery
Spatial Ecology Meta-analysis and Combining
Stephen Cantrell, Chris Cosner, Information in Genetics and Genomics
and Shigui Ruan Rudy Guerra and Darlene R. Goldstein
Cell Mechanics: From Single Scale- Differential Equations and Mathematical
Based Models to Multiscale Modeling Biology, Second Edition
Arnaud Chauvière, Luigi Preziosi, D.S. Jones, M.J. Plank, and B.D. Sleeman
and Claude Verdier Knowledge Discovery in Proteomics
Bayesian Phylogenetics: Methods, Igor Jurisica and Dennis Wigle
Algorithms, and Applications Introduction to Proteins: Structure,
Ming-Hui Chen, Lynn Kuo, and Paul O. Lewis Function, and Motion
Amit Kessel and Nir Ben-Tal
Published Titles (continued)
RNA-seq Data Analysis: A Practical Introduction to Bio-Ontologies
Approach Peter N. Robinson and Sebastian Bauer
Eija Korpelainen, Jarno Tuimala, Dynamics of Biological Systems
Panu Somervuo, Mikael Huss, and Garry Wong Michael Small
Introduction to Mathematical Oncology Genome Annotation
Yang Kuang, John D. Nagy, and Jung Soh, Paul M.K. Gordon, and
Steffen E. Eikenberry Christoph W. Sensen
Biological Computation Niche Modeling: Predictions from
Ehud Lamm and Ron Unger Statistical Distributions
Optimal Control Applied to Biological David Stockwell
Models Algorithms for Next-Generation
Suzanne Lenhart and John T. Workman Sequencing
Clustering in Bioinformatics and Drug Wing-Kin Sung
Discovery Algorithms in Bioinformatics: A Practical
John D. MacCuish and Norah E. MacCuish Introduction
Spatiotemporal Patterns in Ecology Wing-Kin Sung
and Epidemiology: Theory, Models, Introduction to Bioinformatics
and Simulation Anna Tramontano
Horst Malchow, Sergei V. Petrovskii, and
The Ten Most Wanted Solutions in
Ezio Venturino
Protein Bioinformatics
Stochastic Dynamics for Systems Biology Anna Tramontano
Christian Mazza and Michel Benaïm
Combinatorial Pattern Matching
Statistical Modeling and Machine Algorithms in Computational Biology
Learning for Molecular Biology Using Perl and R
Alan M. Moses Gabriel Valiente
Engineering Genetic Circuits Managing Your Biological Data with
Chris J. Myers Python
Pattern Discovery in Bioinformatics: Allegra Via, Kristian Rother, and
Theory & Algorithms Anna Tramontano
Laxmi Parida Cancer Systems Biology
Exactly Solvable Models of Biological Edwin Wang
Invasion Stochastic Modelling for Systems
Sergei V. Petrovskii and Bai-Lian Li Biology, Second Edition
Computational Hydrodynamics of Darren J. Wilkinson
Capsules and Biological Cells Big Data Analysis for Bioinformatics and
C. Pozrikidis Biomedical Discoveries
Modeling and Simulation of Capsules Shui Qing Ye
and Biological Cells Bioinformatics: A Practical Approach
C. Pozrikidis Shui Qing Ye
Cancer Modelling and Simulation Introduction to Computational
Luigi Preziosi Proteomics
Computational Exome and Genome Golan Yona
Analysis
Peter N. Robinson, Rosario M. Piro,
and Marten Jäger
Computational Exome
and Genome Analysis
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted
a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To Elisabeth, Sophie & Anna
-PNR
Preface xv
Contributors xxi
Part I Introduction
Chapter 1 Introduction 3
Chapter 4 Data 47
Chapter 5 FASTQ 57
Chapter 8 Alignment 95
ix
x Contents
Part VI Prioritization
References 503
Index 547
Who is this book for?
xiii
Preface
xv
xvi Preface
$ ls
Some commands are too long to fit onto a single line. In these cases,
we use a slash (\) to indicate that the command is continued on the
next line.
$ command argument-1 \
argument-2 ...
We have assumed that readers will adjust the paths of the com-
mands shown in the book as needed. We have also assumed they are
familiar with the basic Unix utilities such as wc and cat or are able to
find information about them on the Web.
We will define the abbreviations used in each chapter — with the
following exceptions that are used throughout the book.
• nt: nucleotides
Thanks
This book, and indeed much of the work done on genomics in Berlin,
would not have been possible without the work, conversations, and ad-
vice of many people. Professor Stefan Mundlos, the Director of Med-
ical Genetics and Human Genetics of the Charité University Hospital
in Berlin, provided a challenging and rewarding environment for the
Robinson lab, and well deserves our special thanks. Dr. Jochen Hecht
set up the NGS lab in Berlin from the beginning, solving one problem
after another by persistence, experience, and, yes, inspiration. Dr. Se-
bastian Köhler, who began as a Master student in the group, moving
up through the ranks as a graduate student and a postdoc, was in-
dispensable for multiple projects and is the co-founder and co-driving
force behind the HPO. We have been extremely lucky with the people
who became members of the lab, and would like to thank (in no special
order) Peter Hansen, Peter Krawitz, Leon Kuchenbecker, Verena Hein-
rich, Na Zhu, Raghu Bhushan, Claus-Eric Ott, Johannes Grünhagen,
Layal Abo Khayal, Valerie Johnston, Angelika Pletschacher, Robin
Steinhaus, Max Schubach, Michal Schweiger, and Manuel Holtgrewe.
These thanks also form a kind of farewell for Peter Robinson, who
has relocated to the Jackson Laboratory (JAX) for Genomic Medicine
in Farmington, CT, in August 2016 to pursue translational genomics
and computational biology in a new setting. And finally, . . . thanks to
Daniel Danis, Hannah Blau, and Leigh Carmody and the entire Jack-
son Laboratory for being so awesome!
We would also like to extend thanks to Rosario Piro’s colleagues at
the DKFZ (in no special order) Rainer König, Natalie Jäger, Barbara
Hutter, Roland Eils, Peter Lichter, Stefan Pfister, Andreas von Deim-
ling, Susanne Gröbner, Sonja Hutter, Marc Zapatka, Volker Hoves-
tadt, Bernhard Radlwimmer, David Reuss, David Capper, Felix Sahm,
Leonille Schweizer, Yvonne Schweizer, David Jones, Marcel Kool, Paul
Northcott, Benedikt Brors, Matthias Schlesner, Frank Westermann,
Marcus Oswald, Hendrik Witt, Josephine Bageritz, Violaine Goidts,
Martje Tönjes, Moritz Aschoff, and many others that were not directly
involved in his research projects but provided invaluable backing, guid-
ance and above all friendship.
Our book is based on over 10 years of experience in exome and
genome analysis. This book grew out of course notes prepared for a
course on exome analysis held in the Bioinformatics Department of the
Free University of Berlin, and we would like to thank the students of
Preface xix
that course for insightful questions that kept us on our toes. Many of
the exercises in this book were adapted from exercises in our course.
Finally, we would like to extend special thanks to Knut Reinert,
Alexander Bockmayr, and all the other colleagues at the Institutes of
Bioinformatics, Computer Science and Mathematics of Free Univer-
sity of Berlin (FUB) for creating an amazing atmosphere for learning,
teaching, and practicing bioinformatics.
xxi
I
Introduction
1
CHAPTER 1
Introduction: Whole
Exome and Genome
Sequencing
3
4 Computational Exome and Genome Analysis
— Ei, suomalaisia.
— Ei.
Matruusi hymyilee.
*****
Kello puoli kaksitoista kiintyvät katseemme laivan savutorvissa
oleviin merkinantolaitteisiin. Niissä alkaa punainen sähkölamppuvyö
kipinöidä. Torpeedo ilmoittaa nähtävästi tulostaan.
Se on risteilijä Augsburg.
— Sota-aika, vastataan.
*****
— Ajatelkaa, pojat, miten kamalata olisi ollut, jos joku olisi jäänyt
ammuttavaan laivaan?
On saapunut sotasähkösanomia.
*****
*****
Vihdoin komennetaan:
— Seis!