
Hydrocarbon and Lipid Microbiology Protocols: Genetic, Genomic and System Analyses of Communities
1st Edition
Terry J. McGenity
Terry J. McGenity
Kenneth N. Timmis
Balbina Nogales Editors

Hydrocarbon and
Lipid Microbiology
Protocols
Genetic, Genomic and System
Analyses of Communities
Springer Protocols Handbooks

More information about this series at http://www.springer.com/series/8623



Scientific Advisory Board


Jack Gilbert, Ian Head, Mandy Joye, Victor de Lorenzo,
Jan Roelof van der Meer, Colin Murrell, Josh Neufeld,
Roger Prince, Juan Luis Ramos, Wilfred Röling,
Heinz Wilkes, Michail Yakimov
Editors
Terry J. McGenity
School of Biological Sciences
University of Essex
Colchester, Essex, UK

Kenneth N. Timmis
Institute of Microbiology
Technical University Braunschweig
Braunschweig, Germany

Balbina Nogales
Department of Biology
University of the Balearic Islands
and Mediterranean Institute for Advanced
Studies (IMEDEA, UIB-CSIC)
Palma de Mallorca, Spain

ISSN 1949-2448 ISSN 1949-2456 (electronic)


Springer Protocols Handbooks
ISBN 978-3-662-50449-9 ISBN 978-3-662-50450-5 (eBook)
DOI 10.1007/978-3-662-50450-5

Library of Congress Control Number: 2016938230

© Springer-Verlag Berlin Heidelberg 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on
microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and
therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be
true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or
implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer-Verlag GmbH Berlin Heidelberg
Preface to Hydrocarbon and Lipid Microbiology Protocols¹

All active cellular systems require water as the principal medium and solvent for their metabolic and
ecophysiological activities. Hydrophobic compounds and structures, which tend to exclude water,
although providing inter alia excellent sources of energy and a means of biological compartmentalization,
present problems of cellular handling, poor bioavailability and, in some cases, toxicity.
Microbes both synthesize and exploit a vast range of hydrophobic organics, which includes biogenic
lipids, oils and volatile compounds, geochemically transformed organics of biological origin
(i.e. petroleum and other fossil hydrocarbons) and manufactured industrial organics. The underlying
interactions between microbes and hydrophobic compounds have major consequences not only for
the lifestyles of the microbes involved but also for biogeochemistry, climate change, environmental
pollution, human health and a range of biotechnological applications. The significance of this
“greasy microbiology” is reflected in both the scale and breadth of research on the various aspects
of the topic. Despite this, there was, as far as we know, no treatise available that covered the subject.
In an attempt to capture the essence of greasy microbiology, the Handbook of Hydrocarbon and
Lipid Microbiology (http://www.springer.com/life+sciences/microbiology/book/978-3-540-77584-3)
was published by Springer in 2010 (Timmis 2010). This five-volume handbook is, we believe,
unique and of considerable service to the community and its research endeavours, as evidenced by
the large number of chapter downloads. Volume 5 of the handbook, unlike volumes 1–4 which
summarize current knowledge on hydrocarbon microbiology, consists of a collection of experimental
protocols and appendices pertinent to research on the topic.
A second edition of the handbook is now in preparation and a decision was taken to split off
the methods section and publish it separately as part of the Springer Protocols program
(http://www.springerprotocols.com/). The multi-volume work Hydrocarbon and Lipid Microbiology
Protocols, while rooted in Volume 5 of the Handbook, has evolved significantly, in terms of
range of topics, conceptual structure and protocol format. Research methods, as well as
instrumentation and strategic approaches to problems and analyses, are evolving at an
unprecedented pace, which can be bewildering for newcomers to the field and to experienced
researchers desiring to take new approaches to problems. In attempting to be comprehensive
– a one-stop source of protocols for research in greasy microbiology – the protocol volumes
inevitably contain both subject-specific and more generic protocols, including sampling in the
field, chemical analyses, detection of specific functional groups of microorganisms and
community composition, isolation and cultivation of such organisms, biochemical analyses and
activity measurements, ultrastructure and imaging methods, genetic and genomic analyses,
systems and synthetic biology tool usage, diverse applications, and the exploitation of
bioinformatic, statistical and modelling tools. Thus, while the work is aimed at researchers working
on the microbiology of hydrocarbons, lipids and other hydrophobic organics, much of it will be
equally applicable to research in environmental microbiology and, indeed, microbiology in
general. This, we believe, is a significant strength of these volumes.

¹ Adapted in part from the Preface to Handbook of Hydrocarbon and Lipid Microbiology.

We are extremely grateful to the members of our Scientific Advisory Board, who have
made invaluable suggestions of topics and authors, as well as contributing protocols
themselves, and to generous ad hoc advisors like Wei Huang, Manfred Auer and Lars Blank. We also
express our appreciation of Jutta Lindenborn of Springer who steered this work with
professionalism, patience and good humour.

Colchester, Essex, UK Terry J. McGenity


Braunschweig, Germany Kenneth N. Timmis
Palma de Mallorca, Spain Balbina Nogales

Reference

Timmis KN (ed) (2010) Handbook of hydrocarbon and lipid microbiology. Springer, Berlin, Heidelberg
Contents

Introduction to Genetic, Genomic, and System Analyses for Communities
Jack A. Gilbert and Nicole M. Scott

Genomic Analysis of Pure Cultures and Communities
Stepan V. Toshchakov, Ilya V. Kublanov, Enzo Messina, Michail M. Yakimov, and Peter N. Golyshin

Protocols for Metagenomic Library Generation and Analysis in Petroleum Hydrocarbon Microbe Systems
Stephanie M. Moormann, Jarrad T. Hampton-Marcell, Sarah M. Owens, and Jack A. Gilbert

Preparation and Analysis of Metatranscriptomic Libraries in Petroleum Hydrocarbon Microbe Systems
Jarrad T. Hampton-Marcell, Angel Frazier, Stephanie M. Moormann, Sarah M. Owens, and Jack A. Gilbert

Prokaryotic Metatranscriptomics
Danilo Pérez-Pantoja and Javier Tamames

Defining a Pipeline for Metaproteomic Analyses
Joseph A. Christie-Oleza, Despoina Sousoni, Jean Armengaud, Elizabeth M. Wellington, and Alexandra M.E. Jones

Metabolic Profiling and Metabolomic Procedures for Investigating the Biodegradation of Hydrocarbons
Vincent Bonifay, Egemen Aydin, Deniz F. Aktas, Jan Sunner, and Joseph M. Suflita

Generating Enriched Metagenomes from Active Microorganisms with DNA Stable Isotope Probing
Carolina Grob, Martin Taubert, Alexandra M. Howat, Oliver J. Burns, Yin Chen, Josh D. Neufeld, and J. Colin Murrell

DNA- and RNA-Based Stable Isotope Probing of Hydrocarbon Degraders
Tillmann Lueders

Protocol for Performing Protein Stable Isotope Probing (Protein-SIP) Experiments
Nico Jehmlich and Martin von Bergen

Protein Extraction from Contaminated Soils and Sediments
Mercedes V. Del Pozo, Mónica Martínez-Martínez, and Manuel Ferrer

Analysis of the Regulation of the Rate of Hydrocarbon and Nutrient Flow Through Microbial Communities
Wilfred F.M. Röling, Lucas Fillinger, and Ulisses Nunes da Rocha

Constructing and Analyzing Metabolic Flux Models of Microbial Communities
José P. Faria, Tahmineh Khazaei, Janaka N. Edirisinghe, Pamela Weisenhorn, Samuel M.D. Seaver, Neal Conrad, Nomi Harris, Matthew DeJongh, and Christopher S. Henry

Protocol for Evaluating the Permissiveness of Bacterial Communities Toward Conjugal Plasmids by Quantification and Isolation of Transconjugants
Uli Klümper, Arnaud Dechesne, and Barth F. Smets

About the Editors

Terry J. McGenity is a Reader at the University of Essex, UK. His Ph.D., investigating the
microbial ecology of ancient salt deposits (University of Leicester), was followed by
postdoctoral positions at the Japan Marine Science and Technology Centre (JAMSTEC, Yokosuka)
and the Postgraduate Research Institute for Sedimentology (University of Reading). His
overarching research interest is to understand how microbial communities function and interact
to influence major biogeochemical processes. He worked as a postdoc with Ken Timmis at the
University of Essex, where he was inspired to investigate microbial interactions with
hydrocarbons at multiple scales, from communities to cells, and as both a source of food and
stress. He has broad interests in microbial ecology and diversity, particularly with respect
to carbon cycling (especially the second most abundantly produced hydrocarbon in the atmosphere,
isoprene), and is driven to better understand how microbes cope with, or flourish in, hypersaline,
desiccated and poly-extreme environments.

Kenneth N. Timmis read microbiology and obtained his Ph.D. at Bristol University, where he
became fascinated with the topics of environmental microbiology and microbial pathogenesis, and
their interface, pathogen ecology. He undertook postdoctoral training at the Ruhr-University
Bochum with Uli Winkler, Yale with Don Marvin, and Stanford with Stan Cohen, at the latter two
institutions as a Fellow of the Helen Hay Whitney Foundation, where he acquired the tools and
strategies of genetic approaches to investigate mechanisms and causal relationships underlying
microbial activities. He was subsequently appointed Head of an Independent Research Group at the
Max Planck Institute for Molecular Genetics in Berlin, then Professor of Biochemistry in the
University of Geneva Faculty of Medicine. Thereafter, he became Director of the Division of
Microbiology at the National Research Centre for Biotechnology (GBF), now the Helmholtz Centre
for Infection Research (HZI), and Professor of Microbiology at the Technical University
Braunschweig. His group has worked for many years, inter alia, on the biodegradation of oil
hydrocarbons, especially the genetics and regulation of toluene degradation, pioneered the
genetic design and experimental evolution of novel catabolic activities, discovered the new group
of marine hydrocarbonoclastic bacteria, and conducted early genome sequencing of bacteria that
became paradigms of microbes that degrade organic compounds (Pseudomonas putida and Alcanivorax
borkumensis). He has had the privilege and pleasure of working with and learning from some
of the most talented young scientists in environmental microbiology, a considerable number of
whom are contributing authors to this series, and in particular Balbina and Terry. He is Fellow
of the Royal Society, Member of the EMBO, Recipient of the Erwin Schrödinger Prize, and Fellow of
the American Academy of Microbiology and the European Academy of Microbiology. He founded the
journals Environmental Microbiology, Environmental Microbiology Reports and Microbial
Biotechnology. Kenneth Timmis is currently Emeritus Professor in the Institute of Microbiology at
the Technical University of Braunschweig.

Balbina Nogales is a Lecturer at the University of the Balearic Islands, Spain. Her Ph.D. at
the Autonomous University of Barcelona (Spain) investigated antagonistic relationships in
anoxygenic sulphur photosynthetic bacteria. This was followed by postdoctoral positions in the
research groups of Ken Timmis at the German National Biotechnology Institute (GBF, Braunschweig,
Germany) and the University of Essex, where she joined Terry McGenity as postdoctoral scientist.
During that time, she worked on different research projects on community diversity analysis of
polluted environments. Since moving to her current position, her research has focused on
understanding microbial communities in chronically hydrocarbon-polluted marine environments, and
elucidating the role in the degradation of hydrocarbons of certain groups of marine bacteria not
recognized as typical degraders.
Introduction to Genetic, Genomic, and System Analyses
for Communities
Jack A. Gilbert and Nicole M. Scott

Abstract
Complex microbial ecosystems present unique challenges to understanding, especially with regard to the
metabolic relationships that make up the network of interactions defining the systems ecology of
an environment. Using the wealth of available techniques, we can now explore the genomic, transcriptomic,
proteomic, and metabolomic components of this system, with each component providing a window into a
stage of the network of interactions. With these techniques we are just starting to map and validate the
mechanisms of catabolism, anabolism, and metabolite cross talk that enable microorganisms to sense and
interact with their environment. Translating this information into useful knowledge is the next major
challenge, with the end goal of producing models that can capture the variance and complexity of these
systems to faithfully reproduce observed characteristics of an ecosystem. We will explore some of these
techniques and tools in this section.

Keywords Metabolic networks, Metabolomics, Metagenomics, Metaproteomics, Metatranscriptomics,
Systems models

Much can be said about the extraordinarily rapid development of the field of microbial ecology
over the last ten years within the world of sequencing. Since 2005 our ability to interrogate
microbial communities through high-throughput sequencing of their DNA and RNA has revolutionized
our understanding of their diversity, assembly, structure, and function. Subsequent developments
in understanding the post-genomic and transcriptomic biology of microbial worlds have led to the
development of proteomics and metabolomics, which are also now being applied to whole
assemblages to determine functional enzyme abundances, and metabolic relationships between
different organisms. Essentially we are now just starting to map out the information pathways
that constitute the “system” in “ecosystem.” In this section we will explore the techniques and
tools that are now being applied to elucidate the ecological relationships that define the
interaction space within a dynamic biological assemblage of microorganisms. The complexity
of these interactions often defies even our most advanced technological and analytical efforts
to understand them, yet with every new study that explores both simple and complex communities
alike, we are gaining new ground in mapping these processes.

T.J. McGenity et al. (eds.), Hydrocarbon and Lipid Microbiology Protocols, Springer Protocols Handbooks, (2017) 1–4,
DOI 10.1007/8623_2014_5, © Springer-Verlag Berlin Heidelberg 2014, Published online: 19 November 2014

This section will explore the multiple techniques being applied to elucidate the structure of
communities, map interactions between organisms, and create predictive models of these
dynamics. The section is therefore divided into the methodologies being used to generate data
from the genome, transcriptome, and proteome; the analytical techniques being leveraged to
translate these data into information to test complex hypotheses; and finally, the approaches
to model the complex dynamics within each community so as to be able to predict how these
assemblages will respond to changes in their environment. We will cover understanding the
community composition and abundance through amplicon sequencing and shotgun metagenomics. The
data from these two approaches can be used to map the diversity and composition of communities,
understand the phylogenetic and functional relationships between taxa, and elucidate ecological
relationships between organisms and environmental factors. Metagenomics is also becoming an
extremely valuable tool to reconstruct genomes of key bacterial, archaeal, and viral members of
the community, providing access to the uncultivable. Such organism-specific gene repertoires can
be used to refine understanding of organism interactions. In fact, being able to reconstruct
whole or partial genotypes from complex community metagenomic data has revolutionized our
ability to explain how organisms coexist, share resources, compete, and cooperate within complex
ecosystems. This above all else is helping us to create conceptual models of microbial
interactions within different environments that can shed light on the processes that lead to
observed biogeochemical dynamics for that system.
Amplicon sequencing, defined here as the sequencing of specific gene fragments amplified using
PCR, has primarily focused on the 16S rRNA gene, providing an unprecedented exploration of the
phylogenetic diversity of the bacterial and archaeal domains. However, amplicon sequencing is
now also being applied to 18S rRNA genes and ITS regions to help characterize the eukaryotic and
fungal components (respectively) of ecosystems. The technology can also be applied to specific
functional genes and viral markers, so that the diversity of these elements can be better
catalogued. Metatranscriptomics, defined as the transcriptional profile of a community of
organisms, has become extremely valuable in illuminating how communities of organisms are
responding to short-term changes in the environment. The application of next-generation
sequencing platforms has enabled us to map gene regulation so that we can explore the potential
physiological adaptations microbes use to deal with shifts in the availability of nutrients,
environmental stressors, physical interaction, or physicochemical variability. Analytical
techniques allow us to explore how organisms in an assemblage respond, by either mapping back to
assembled reads and/or comparing mapped reads to metagenomic results, providing the potential to
model gene regulation and predict the flux of metabolites that those genes may mediate once
expressed as proteins. Recently, developments in proteomics have led to its application to
understand the protein diversity and structure of whole communities, providing a solid
understanding of the functional components of ecosystems [1]. This is the final stage of
biological translation from the genome and provides a key aspect of the interface between the
genome and the world of metabolites.
In analyzing these data one of the most important stages is annotation; in this section we
outline different approaches for annotating particular genes, functional attributes, and/or
pathways of interest from amplicon, metagenomic, and metaproteomic datasets. Though these
techniques and methods are limited by the contents of current databases and available genomes,
this set is ever expanding. The ability to annotate genes of interest within a genome,
transcriptome, or proteome is fundamentally important if we want to interpret what these data
mean. Knowing the abundance of a protein sequence or transcript across a gradient of methane
production is mostly useless if we don’t know what function that sequence codes for. Much focus
has been given to trying to improve annotation of genes, transcripts, and proteins of interest
using our existing understanding of the functional repertoire of cells. Primarily this is
because our ability to generate data of ecological relevance now outstrips our ability to
biochemically assess the actual function of a given gene, transcript, or protein, which requires
heterologous expression, functional screening, isolation, and characterization of the protein
structure. These processes are expensive and time-consuming, and only work for a subset of
sequences. While the functional characterization and annotation research community catches up
with our ability to generate data, we are left with using what we have to interpret the data we
can generate. This section will explore informatics methods for predicting biodegradation
enzymes and metabolic pathways from metagenomic data, annotating functions in metatranscriptomic
data, and annotating genes undergoing lateral gene transfer between taxonomic lineages. While
this is only a tiny subset of the tools that could be and should be applied to understand
microbial ecological dynamics, it goes some way toward providing a toolbox for exploring the
prevalence of functions associated with hydrocarbon and lipid metabolism.
Finally we will also explore how to model microbial communities using metagenomic data. Much
has already been written about modeling amplicon data describing microbial dynamics to be able
to predict how ecosystems may respond to changing environmental conditions. However, the
research community is only just starting to leverage functional information about microbial
assemblages to predict how they will shift in response to these changes [2]. The ability of an
assemblage, and more importantly the components of that assemblage, to physiologically respond
to changing environmental conditions is fundamental to ecology, and being able to model these
responses in a predictive way provides the researcher with unparalleled opportunities to explore
the stability and adaptability of microbial ecosystems. These models should enable us to start a
new era of hydrocarbon and lipid microbial biotechnology by designing, in silico, assemblages of
microorganisms that can mediate key reactions, perform specific remediation, and generate key
products of interest to industry and society. It is through these models that we will be able to
design the next generation of biofuels and leverage a system-scale understanding of genetic
components to design more efficient organisms and assemblages to perform key biotechnological
roles in industrial processes.
The key hurdles that are slowing progress in both data generation and data analysis are
multifold, but not insurmountable. Sample acquisition and preparation are slow and laborious;
automation of sampling and sample processing would significantly improve our capacity to
generate datasets that have the potential to appropriately test specific hypotheses. Similarly,
being able to analyze generated data requires improved algorithms, better standardization of
data treatment to enable comparability, and a move toward shared computing resources to improve
everyone’s capacity to process multi-terabase-pair datasets. Of course, we also, as a community,
need to move beyond purely descriptive studies, which, while useful for providing data, are
limited in their ability to further our understanding of specific processes. We need to focus
much more on how communities of microorganisms associated with hydrocarbon and lipid metabolism
assemble, interact, and mediate their physicochemical environment. This requires a much more
concerted effort to better describe the individual organisms and then to leverage the suite of
meta’omic tools to parameterize their contextual interactions within different ecosystem
scenarios.
We face a bold new future of microbial assemblage analysis, one
with the potential to change virtually every aspect of our lives on
this planet. With the proper suite of tools and technologies, we can
harness knowledge of the inner workings of the microbial machine
and define new ways to exploit the seemingly infinite biochemical
potential of these systems.

References

1. Williams TJ, Cavicchioli R (2014) Marine metaproteomics: deciphering the microbial metabolic
   food web. Trends Microbiol 22(5):248–260. doi:10.1016/j.tim.2014.03.004
2. Larsen PE, Gibbons SM, Gilbert JA (2012) Modeling microbial community structure and
   functional diversity across time and space. FEMS Microbiol Lett 332(2):91–98.
   doi:10.1111/j.1574-6968.2012.02588.x

Genomic Analysis of Pure Cultures and Communities
Stepan V. Toshchakov*, Ilya V. Kublanov*, Enzo Messina,
Michail M. Yakimov, and Peter N. Golyshin

Abstract
Oil-degrading bacteria and their communities have been a focus of research for the past few decades for
a number of reasons. First, studying them fills gaps in our knowledge of the major mechanisms
facilitating oil biodegradation, identifies the key organisms playing significant roles in these processes
and, furthermore, shows how to manage their performance in situ effectively to enhance the rates of
biodegradation. Historically, of particular interest for genomic studies were the so-called marine
hydrocarbonoclastic bacteria, the petroleum biodegradation specialists with very restricted substrate
profiles. Apart from their utility in environmental cleanup, oil-degrading bacteria possess an array of
enzymes and pathways of great potential for further biotechnological applications: biopolymer production,
oxidation-reduction reactions, chiral synthesis, biosurfactant production, etc. In this chapter we describe
current methods for genome and metagenome sequencing and annotation. Importantly, these are not limited to
a particular group of microorganisms and are thus almost universally applicable. We focus exclusively on
methods and tools that everyone can use on a non-commercial basis. Because numerous alternative methods
and approaches are available, we have arbitrarily chosen reliable protocols that can be used by a typical
biologist without a great deal of computational biology background.

Keywords: Assembly, Comparative genomics, Functional annotation, Genome, Metagenome sequencing

1 Introduction

*Author contributed equally with all other contributors.

T.J. McGenity et al. (eds.), Hydrocarbon and Lipid Microbiology Protocols, Springer Protocols Handbooks, (2017) 5–27,
DOI 10.1007/8623_2015_126, © Springer-Verlag Berlin Heidelberg 2015, Published online: 09 August 2015

Genomic analysis of oil-degrading bacteria started with the publication of the genome of
Alcanivorax borkumensis SK2 [1], which used Sanger sequencing, the common technique at that
time, for genomic data production. Since then, a few dozen genome and metagenome sequencing
projects on oil-degrading microbes and their communities have been carried out, with an
increasing use of newly emerging technologies for genome sequencing and data analysis. The
impact of the next-generation
sequencing techniques (NGS) on microbial genomics can be compared with that of the invention of
PCR. While up to the late 2000s experimental genome science was a privilege of big sequencing
centres, currently available benchtop second-generation sequencers make whole-genome analysis
affordable for smaller research labs and individuals. With a machine cost equal to, or lower
than, that of a multicapillary automated unit for Sanger sequencing, small NGS sequencers output
up to 15 billion nucleotides of high-quality data in a single run, which is usually more than
enough for microbial genomics studies. At the same time, the extremely high throughput of NGS
machines gives rise to data analysis challenges. Currently, there is a great variety of software
packages for NGS data analysis, each of which is usually aimed at performing a certain step in
the genomic analysis pipeline, and they cannot all be reviewed in a book chapter. Therefore,
here we describe the pipeline we routinely use for the analysis of NGS data of microbial pure
cultures and communities.
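
As a rough illustration of the throughput figure above, the following Python sketch estimates the mean coverage obtained when one run is shared evenly among several samples, and how many genomes could be multiplexed at a chosen depth. The 5 Mb genome size and the 100-fold target coverage are hypothetical values chosen only for illustration, not recommendations from this chapter. The function names are likewise illustrative, and real runs lose some yield to quality filtering and uneven pooling.

```python
def expected_coverage(run_yield_bases, genome_size_bases, n_samples=1):
    """Mean per-sample coverage if the run yield is split evenly across samples."""
    if run_yield_bases < 0 or genome_size_bases <= 0 or n_samples <= 0:
        raise ValueError("yield must be >= 0; genome size and sample count must be > 0")
    return run_yield_bases / (genome_size_bases * n_samples)


def max_multiplexed_samples(run_yield_bases, genome_size_bases, target_coverage):
    """Largest number of equally pooled samples that still reach the target coverage."""
    if genome_size_bases <= 0 or target_coverage <= 0:
        raise ValueError("genome size and target coverage must be > 0")
    return int(run_yield_bases // (genome_size_bases * target_coverage))


if __name__ == "__main__":
    run_yield = 15_000_000_000   # 15 billion nucleotides per run (figure from the text)
    genome_size = 5_000_000      # hypothetical 5 Mb bacterial genome (assumption)
    target = 100                 # hypothetical target coverage (assumption)

    print(expected_coverage(run_yield, genome_size))                # one genome per run
    print(max_multiplexed_samples(run_yield, genome_size, target))  # genomes per run
```

Under these assumed values a single genome would be sequenced far more deeply than needed, which is why pooling several barcoded samples in one run is usually considered.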

1.1 Experiment Planning: Considerations of Library Type, Sequencing Platform, Minimal Coverage and Sample Multiplexing Strategy

1.1.1 General Comments

The importance of the appropriate choice of sequencing strategy for the production of
high-quality genomic or metagenomic data cannot be overstated. A thorough experimental design
can minimize the effort needed for genome finishing (in the case of pure cultures) or taxonomic
sequence assignment (environmental samples), as opposed to suboptimal strategies, which may lead
to a significant increase in the monetary cost and working time needed to address the research
question.
Firstly, one should understand what degree of sequence assembly contiguity is required. For
instance, to get information about the coding potential of the microorganism(s) present in the
sample, there is no need to obtain an ungapped genome sequence, and this can be achieved by a
standard shotgun sequencing protocol with a simple assembly pipeline. This will result in very
short unordered contigs, but nevertheless information about proteins encoded in the DNA sample
can be easily extracted. On the other hand, comparative genomic analysis demands a more
sophisticated approach including mate-paired library preparation, scaffolding and assembly
validation. In this section we review the principal experimental parameters that need to be
defined before proceeding with wet lab work.

1.1.2 Library Type

There are two major kinds of DNA libraries generally used for de
novo (meta)genomic assembly: fragment (shotgun or short-insert)
and mate-paired (or long-insert) libraries. The standard fragment
library protocol includes ultrasonic or enzymatic fragmentation of
the DNA molecule into short pieces 150–800 nt long, ligation of
sequencing adapters, size selection and PCR. Currently available
fragment library preparation kits are quite robust and allow
preparation of an NGS library from a DNA sample within 1 working
day. The final fragment library can be sequenced using a single-end
or paired-end approach, depending upon the chosen platform.
However, even short repeated sequences in bacterial genomes, such
as IS elements (700 to 2,500 bp long), usually cannot be resolved
with data generated from this kind of library. To resolve a repeat
with current de novo assembly algorithms, the initial fragment
size should be larger than the length of the repeat. To overcome
this issue, the mate-paired library preparation protocol is used.

For the mate-paired library approach, large genome fragments
(3–20 kbp) are generated using hydrodynamic shearing. The DNA
fragments are then subjected to intramolecular circularization in
a highly diluted solution to avoid intermolecular ligation events.
In the next step, the larger part of the circularized DNA molecule
is enzymatically degraded, leaving only short DNA fragments
(corresponding to the termini of the initial long fragment)
adjacent to the internal adapter. Finally, the standard NGS library
adapters are ligated to both sides of the fragment. This mate-paired
library can be sequenced with single- or paired-end reads, giving
the nucleotide sequences of two fragments positioned in the genome
at a predefined distance corresponding to the initial fragmentation
size range. This information can be used effectively to determine
the order and orientation of contigs in the final genome sequence.

This powerful approach demands, however, a relatively large
quantity of starting material, mainly due to the low efficiency of
intramolecular ligation (Table 1). In our experience, the quantity
of starting material is the main factor to consider when deciding
on the library preparation method. Thus, renewable samples, such as
pure or enrichment cultures, should definitely be sequenced with a
combination of fragment and mate-paired library protocols, while
scarce and rare environmental samples are usually analysed with the
fragment library only.

Table 1
Comparison of fragment and mate-paired library preparation protocols

                             Fragment (short-insert) library       Mate-paired (long-insert) library
Fragment size                150–1,000 bp                          1.5–20 kb
Minimal amount of            1 ng                                  1 µg
starting material
Preparation time             0.5                                   2–3
(working days)
Advantages                   Fast processing, less PCR bias        Longer scaffolds, less misassemblies
Disadvantages                Maximum length of repeat which        Big amount of starting material needed,
                             can be resolved equal to              not always possible to use with
                             fragment length                       non-culturable samples; longer processing
                                                                   time and higher library prep costs

1.1.3 Sequencing Platform

Each NGS platform currently available on the market can be
characterized by several distinctive features, along with some
advantages and disadvantages. For example, Roche 454 systems
provided the longest reads among all second-generation platforms,
the SOLiD 5500 system gives the highest accuracy, Illumina HiSeq is
characterized by the deepest coverage, etc. All these
characteristics of the different sequencing platforms are
thoroughly analysed in a number of excellent reviews [2, 3]. In
this chapter, we draw the reader's attention mainly to tabletop
sequencers, which are considered the systems of choice for
individual microbiology research groups.

Introduced in 2011, both the Ion Torrent Personal Genome Machine
(PGM) (Life Technologies) and the MiSeq Desktop Sequencer
(Illumina) made next-generation sequencing a routine procedure in
microbiology. While Illumina MiSeq is a reduced version of HiSeq
with the same reversible terminator sequencing chemistry, PGM is
built on a new principle based upon pH detection during the
incorporation of nucleotides into the growing chain. Current
specifications of both systems are summarized in Table 2. Although
MiSeq outperforms Ion Torrent in throughput per run, the short
running time of PGM makes both systems comparable in terms of data
production rate per day, assuming a 24/7 PGM load.

1.1.4 Minimal Coverage

Reference-free de novo assembly requires significantly more
coverage than reference-based (or mapping) assembly does. The
generally accepted read coverage for de novo assembly of pure
cultures is 100-fold. This implies that for de novo assembly of a
bacterium with a genome of five million bases, one should generate
at least 500 Mbp of sequencing data.
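
The arithmetic behind this estimate can be written down as a minimal sketch. The function names and the 300 bp read length in the example are our own illustrative assumptions; only the 100-fold coverage and the five-million-base genome come from the text above.

```python
import math


def required_bases(genome_size_bp: int, coverage: float) -> int:
    """Total number of sequenced bases needed for a target average coverage."""
    return math.ceil(genome_size_bp * coverage)


def required_reads(genome_size_bp: int, coverage: float, read_length_bp: int) -> int:
    """Number of reads of a given (assumed uniform) length needed for the target coverage."""
    return math.ceil(genome_size_bp * coverage / read_length_bp)


# Example from the text: 5 Mbp genome at 100-fold coverage.
print(required_bases(5_000_000, 100))        # 500000000 bases, i.e. 500 Mbp
# Hypothetical read length of 300 bp (an assumption for illustration).
print(required_reads(5_000_000, 100, 300))   # 1666667 reads
```

Real reads vary in length and lose bases during trimming, so the figure obtained this way should be treated as a lower bound.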
For metagenomic assembly, one cannot make a reliable judgment
about the total size of the genomes present in the sample. The
literature offers some practical approaches based on computational
(statistical) methods, which typically count selected genes within
metagenome sequences in order to produce an estimation formula or
numeric results. Raes and colleagues [4], for example, introduce a
formula to predict the effective genome size (EGS) from short
sequencing reads of environmental metagenomics projects, based on
counts of selected marker genes from several completely sequenced
genomes. The web-based COVER tool, in contrast, may be used for an
a priori estimation of coverage for metagenomic sequencing [5]. It
uses a set of selected 16S rRNA gene sequences to estimate the
number of OTUs (Operational Taxonomic Units) in the sample and to
calculate the amount of sequencing needed to achieve a certain
coverage. However, these methods cannot be considered universal,
although they do demonstrate a correlation between the quantity of
sequences produced and the abundance (or coverage) of a given
species. Therefore, in an ideal case, a small-scale pilot
estimation of the metagenome complexity would be very helpful [6].

Table 2
Comparison of MiSeq and PGM sequencing systems (based upon February 2015 system specs)

Parameter                     Illumina MiSeq    Ion Torrent PGM
Maximum output                15 Gb             1.2–2 Gb
Maximum reads                 22–25 million     4–5.5 million
Maximum read length           300 bp            400 bp
Runtime for maximal output    55 h              7.3 h

2 De Novo Assembly Pipeline

2.1 General Comments

De novo assembly of a genome sequence from raw NGS data is a
multistep process summarized in Fig. 1. Currently, one can find
numerous commercial and academic (usually free) software packages,
each of which can execute a particular assembly step. Academic
software is usually designed to work with a specific kind of data
and for a specific application (Table 3). Below we describe the
analysis pipeline used to determine the genome sequence of a pure
culture from Illumina paired-end (fragment library) and PGM
mate-paired reads with free software packages. However, this
particular protocol can be adjusted depending upon the goal of the
study and the particular NGS dataset. It should also be noted that,
although the general format of NGS data is usually standard, simple
scripts might be required to adjust specific features of the input
files demanded by different parts of the pipeline. Such scripts are
widely available on the web; the most popular and well-documented
collections are as follows:
• FastX Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/)
• fastq-tools (http://homes.cs.washington.edu/~dcjones/fastq-tools/)
• SAM tools (http://samtools.sourceforge.net/)

[Figure: flowchart with the boxes Raw files, Quality control, Quality trimming, Adapter trimming, Error correction (optional), De novo assembly, Scaffolding, Gapfilling, Assembly validation]
Fig. 1 General stages and steps in the de novo assembly

Table 3
Commercial and freely available software packages for particular steps of de novo assembly pipeline

Pipeline module               Name of software  NGS platform(s)  Web-based/standalone  OS compatibility (standalone version)  References  Link
Quality control and trimming  FastQC            Any              Standalone            Win/MacOS X/Linux                      –           http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
                              Prinseq           Any              Both                  MacOS X/Linux                          [7]         http://prinseq.sourceforge.net/
                              Trim Galore!      Any              Standalone            MacOS X/Linux                          –           http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/
Decontamination               Deconseq          Any              Both                  MacOS X/Linux                          [8]         http://deconseq.sourceforge.net/index.html
Error correction              Quake             Illumina         Standalone            MacOS X/Linux                          [9]         http://www.cbcb.umd.edu/software/quake/
                              Coral             454/IonTorrent   Standalone            Linux                                  [10]        http://www.cs.helsinki.fi/u/lmsalmel/coral/
De novo assembly              MIRA              Any              Standalone            MacOS X/Linux                          [11]        http://sourceforge.net/projects/mira-assembler/
                              Newbler           Any              Standalone            Linux                                  [12]        http://www.454.com/products/analysis-software/
                              Velvet            Any              Standalone            MacOS X/Linux                          [13]        https://www.ebi.ac.uk/~zerbino/velvet/
Scaffolding                   SSPACE            Any              Standalone            MacOS X/Linux                          [14]        http://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE
                              Velvet            Any              Standalone            MacOS X/Linux                          [13]        https://www.ebi.ac.uk/~zerbino/velvet/
Filling of sequence gaps      GapFiller         Any              Standalone            MacOS X/Linux                          [15]        http://www.baseclear.com/genomics/bioinformatics/basetools/gapfiller
                              GMcloser          Any              Standalone            MacOS X/Linux                          Unpublished http://sourceforge.net/projects/gmcloser/

2.2 Raw Read Files

In most cases, raw next-generation sequencing data are presented
as fastq files, which store each read as a nucleotide sequence
together with the corresponding quality value for each nucleotide
in the read (Fig. 2).

For sequencers that use single-nucleotide flows for sequence
detection (IonTorrent and 454-based systems), data can also be
presented as *.sff (Standard Flowgram Format) files. If for some
reason the user is supplied only with *.sff files by the sequencing
centre, they can easily be converted to fastq with a variety of
tools (e.g. the sff to fastq script
https://github.com/indraniel/sff2fastq).
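
To make the four-line fastq record structure concrete, here is a minimal parsing sketch. It assumes the Phred+33 quality encoding (quality = ASCII code minus 33) and strictly four-line records; both are assumptions that hold for most current data but should be checked for a given dataset. The function name is our own.

```python
from typing import Iterator, List, Tuple


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_name, sequence, phred_scores) from four-line fastq records.

    Assumes Phred+33 encoding; incomplete trailing records are ignored.
    """
    for i in range(0, len(lines) - 3, 4):
        header, seq, sep, qual = (line.rstrip("\n") for line in lines[i:i + 4])
        if not header.startswith("@") or not sep.startswith("+"):
            raise ValueError(f"malformed fastq record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Each quality character encodes one Phred score.
        yield header[1:], seq, [ord(ch) - 33 for ch in qual]


example = ["@read1\n", "ACGT\n", "+\n", "II5!\n"]
for name, seq, scores in parse_fastq(example):
    print(name, seq, scores)   # read1 ACGT [40, 40, 20, 0]
```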

2.3 Quality Control

Extensive quality control of raw NGS data is a crucial step for all
kinds of downstream applications.

Although there is a great variety of NGS quality control packages,
one of the easiest approaches is to install and run FastQC,
developed at Babraham Bioinformatics
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). FastQC
is written in Java, which makes it possible to use it with any OS
that has a suitable Java environment installed.

Analysis is started simply by opening the raw fastq files, and the
results are presented as a number of reports. Reports on quality
tests that passed normally, with a warning or with an error are
highlighted in green, yellow or red, respectively (Fig. 3).

The results of the read analysis provide a clue for adjusting the
filtering and trimming parameters in such a way that the average
quality of the data improves significantly while as little data as
possible is lost.

Fig. 2 MiSeq raw data presented in fastq format. String containing read name (header) marked with (A);
nucleotide sequence (B); ‘+’ is a separator string; sequence of quality values in ASCII format (C)

Fig. 3 Results of FastQC analysis. In the example presented here, the read quality values are fine, but TruSeq
Adapter Index 5 sequences present at the beginning of some reads result in a red flag on “Overrepresented
sequences”, “Per base sequence content” and “Kmer Content”

2.4 Quality Read Trimming

In our experience, among the numerous software packages used for
filtering and trimming next-generation sequencing reads, the
package of choice should work with almost all kinds of sequencing
data and be highly adaptable in terms of filtering and trimming
parameters. For routine NGS applications, we use the Trim Galore!
package, which performs both quality and adapter trimming in one
step. It can be downloaded from the Babraham Bioinformatics website
(http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/)
and run from a command shell on a Linux or MacOS X system. In the
first step, the software trims low-quality base calls using
thresholds specified by the user. Next, the adapter sequences are
trimmed. By default, Trim Galore! deletes sequences identical to
the Illumina Universal adapter, but any other adapter sequence can
also be specified. It is very important to set the stringency of
the 3′ adapter search, since the default setting, which removes
even a single base identical to the first base of the adapter,
might be too strict. At the end, all reads (read pairs) shorter
than the specified threshold are filtered out.
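
The two-stage logic described above (3′ quality trimming followed by a length filter) can be illustrated with a deliberately simplified sketch. This is not the algorithm implemented in Trim Galore!; it merely removes trailing bases until one meets the quality threshold. The function names and the default thresholds of 20 are illustrative assumptions.

```python
from typing import List, Tuple

Read = Tuple[str, str, List[int]]  # (name, sequence, Phred scores)


def trim_3prime(seq: str, quals: List[int], min_quality: int = 20) -> Tuple[str, List[int]]:
    """Remove bases from the 3' end until a base with quality >= min_quality is found."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    return seq[:end], quals[:end]


def filter_and_trim(reads: List[Read], min_quality: int = 20, min_length: int = 20) -> List[Read]:
    """Quality-trim every read and discard those shorter than min_length afterwards."""
    kept = []
    for name, seq, quals in reads:
        trimmed_seq, trimmed_quals = trim_3prime(seq, quals, min_quality)
        if len(trimmed_seq) >= min_length:
            kept.append((name, trimmed_seq, trimmed_quals))
    return kept


reads = [
    ("r1", "ACGTAC", [30, 30, 30, 10, 30, 5]),  # loses only its last base
    ("r2", "ACGT", [30, 5, 5, 5]),              # trimmed to 1 base, then discarded
]
print(filter_and_trim(reads, min_quality=20, min_length=4))
```

Note that the low-quality base inside r1 (position 4) survives, because this simple scheme only looks at the 3′ end; production trimmers use more robust running-sum criteria.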
Another useful tool, worth mentioning especially because its
web-based service avoids any software installation, is PrinSeq [7].
PrinSeq is capable of duplicate read removal, filtering by length,
quality and complexity, and read trimming of any flavour. A further
useful feature of PrinSeq is its ability to convert between
different read formats.

2.5 Decontamination

Sequences obtained from symbiotic microbial communities usually
contain large numbers of host DNA reads. In addition, the library
preparation process includes steps susceptible to contamination
(such as library size selection by agarose gel electrophoresis).
Sequence contamination may seriously affect subsequent de novo
assembly, leading to misassemblies and wrong experimental
conclusions. Therefore, it is always useful to remove potential
contamination from the read pool. An obvious way to remove
contaminating sequences is to map all the reads against the host
reference sequence (if available) and collect the unmapped reads.
Deconseq, developed at San Diego State University [8], allows fast
and robust removal of contaminating sequences from the dataset.
This package implements the BWA-SW algorithm [16] to map reads
against the genome of the contamination source. Deconseq exists in
both web-based and standalone versions, but due to the high server
load, web-based analysis may take up to several weeks. Therefore,
it makes sense to install this package locally.

To run Deconseq, the user must first create a BWA index database
of the contamination source genome with the BWA index command.
Subsequently, the software maps the reads against this database.

The current version of Deconseq does not support paired reads;
therefore, after decontamination, the two files of every pair need
to be synchronized. In other words, all reads in one file that do
not have a mate in the other should be removed, which can be done
with a number of scripts available on the web.
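
A minimal sketch of such a synchronization step is given below. It assumes that mates share the same read name apart from an optional "/1" or "/2" suffix or a comment after the first whitespace; real naming schemes differ between platforms, so the `base_id` helper (our own hypothetical function) may need adjusting.

```python
from typing import List, Tuple

Record = Tuple[str, str, str]  # (read name, sequence, quality string)


def base_id(name: str) -> str:
    """Strip the comment after the first whitespace and a trailing /1 or /2 mate suffix."""
    name = name.split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def synchronize_pairs(forward: List[Record], reverse: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Keep only reads whose mate is present in the other file, in forward-file order."""
    reverse_by_id = {base_id(rec[0]): rec for rec in reverse}
    kept_forward, kept_reverse = [], []
    for rec in forward:
        mate = reverse_by_id.get(base_id(rec[0]))
        if mate is not None:
            kept_forward.append(rec)
            kept_reverse.append(mate)
    return kept_forward, kept_reverse


fwd = [("r1/1", "AAA", "III"), ("r2/1", "CCC", "III"), ("r3/1", "GGG", "III")]
rev = [("r1/2", "TTT", "III"), ("r3/2", "CCC", "III"), ("r4/2", "GGG", "III")]
f, r = synchronize_pairs(fwd, rev)
print([x[0] for x in f], [x[0] for x in r])
```

Holding one file in a dictionary is convenient for bacterial-scale datasets; for very large files a sorted, streaming comparison would use less memory.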

2.6 Read Error Correction

Although the accuracy of next-generation sequencing data is
continuously improving owing to advances in sequencing chemistry
and base calling algorithms, the percentage of errors in NGS reads
is still much higher than in Sanger sequencing outputs. For this
reason, a number of read error-correction algorithms have been
developed [17]. In general, error-correction algorithms work by
splitting all the reads into nucleotide stretches of a fixed length
(k-mers), aligning the k-mers to each other and correcting errors
at specific k-mer positions using the information from the majority
of reads in the alignment. Due to differences in raw data format
and platform-specific biases, error-correction packages are much
more specific to the sequencing platform than any other software
used in the de novo assembly pipeline. It should also be noted that
this step should be performed with a great deal of caution, or not
performed at all, when analysing metagenomic datasets: reads
corresponding to minor members of the population might be
erroneously corrected.
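
The k-mer idea, and the reason for caution with metagenomes, can be illustrated with a short sketch. It only counts k-mers and flags the rare ones; it does not correct anything and is not the method used by any particular package. The function names and the `min_count` threshold are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Set


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every overlapping substring of length k across all reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def rare_kmers(counts: Counter, min_count: int = 2) -> Set[str]:
    """k-mers seen fewer than min_count times: candidate errors OR low-abundance genomes."""
    return {kmer for kmer, n in counts.items() if n < min_count}


# Two identical reads and one read differing by a single base (A -> T at position 5).
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
print(sorted(rare_kmers(counts)))   # ['CGTT', 'GTTC']
```

In a pure culture, the two rare k-mers most likely stem from a sequencing error. In a metagenome, exactly the same signal is produced by a genuine variant from a low-abundance organism, which is why majority-based correction can erase real minority sequences.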
In particular, we usually use Quake for Illumina read correction
[9] and Coral for PGM reads [10]. Both programs run on Linux/MacOS
X machines and are quite simple to operate. Detailed manuals can be
found on the software web pages (Table 3).

2.7 De Novo Assembly

De novo assembly is a computational process aimed at building
long contiguous sequences (contigs) from a set of overlapping
fragments (reads). This is the central step of the whole NGS
project pipeline, and any errors that emerge at this step may
affect all downstream analysis. The main bottlenecks here are the
comparatively high error rates of NGS data, the short read length
and the repetitive elements widespread in genomes of any kind.

Currently, there are many sequence assembly packages, well
reviewed in several papers [18, 19]. The algorithms implemented in
these packages may be subdivided into three classes: greedy
algorithms, OLC (overlap-layout-consensus) algorithms and de Bruijn
graph-based algorithms. While greedy algorithms were widely used at
the beginning of the NGS era, they were shown to be very prone to
repeat-associated misassemblies and therefore gave way to
graph-based algorithms. The most effective and reliable de novo
assemblers are based on OLC algorithms. These are the MIRA [11] and
Newbler [12] assemblers, extensively used for prokaryotic de novo
assembly. Finally, the de Bruijn graph-based algorithm is
implemented in the Velvet assembler [13] and is mainly used for
high-coverage, very short read datasets (typical of Illumina
HiSeq). The above packages can be run from the command shell in
Unix-based operating systems and work well with default parameters.
It should be considered, however, that the memory needed to run an
OLC assembler is proportional to the number of input reads.
Furthermore, excessive coverage can lead to a more fragmented
assembly. For example, the normal coverage for the MIRA assembler
is about 50- to 70-fold for genome projects. Given the huge amount
of data the user gets from the sequencer, NGS reads should
sometimes be randomly downsampled before the assembly step in order
to save RAM and storage capacity.
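
Random downsampling to a target coverage can be sketched as follows. Read pairs are kept or dropped together so that pairing information is preserved. The function name, the fixed random seed and the way coverage is estimated (total bases divided by an expected genome size supplied by the user) are illustrative assumptions.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (forward read sequence, reverse read sequence)


def downsample_pairs(pairs: List[Pair], genome_size_bp: int,
                     target_coverage: float, seed: int = 0) -> List[Pair]:
    """Randomly keep the fraction of read pairs that gives roughly the target coverage."""
    total_bases = sum(len(fwd) + len(rev) for fwd, rev in pairs)
    current_coverage = total_bases / genome_size_bp
    if current_coverage <= target_coverage:
        return list(pairs)  # nothing to remove
    fraction = target_coverage / current_coverage
    n_keep = max(1, round(len(pairs) * fraction))
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    return rng.sample(pairs, n_keep)


# Toy data: 1,000 pairs of 2 x 100 bp on a 1,000 bp "genome" give 200-fold coverage.
pairs = [("A" * 100, "C" * 100)] * 1000
subset = downsample_pairs(pairs, genome_size_bp=1000, target_coverage=50)
print(len(subset))   # 250
```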
The result of the assembly is presented as a contig file in FASTA
format together with a quality value file, which contains an
estimate of the reliability of each assembled nucleotide.

2.8 Contig Scaffolding and Gapfilling

Scaffolding is the process of determining the relative orientation
of contigs and the distances between them. Scaffolding is usually
performed using paired-read information. Thus, two adjacent contigs
may be placed in the proper orientation by considering the position
and orientation of each read of a pair split between the two
contigs. Some assemblers (e.g. Newbler or Velvet) have built-in
scaffolding procedures, while for others (MIRA) scaffolding needs
to be done separately. A very useful and simple standalone
scaffolding tool is SSPACE by BaseClear BV (Netherlands), freely
distributed on request to academic users [14]. SSPACE is compatible
with Linux and MacOS machines and is operated from the command
line. As input, it uses contig files and paired sequence reads.
Information on the relative orientation of, and distance between,
the reads is supplied separately in a parameters file. As a result,
the user obtains scaffolds: sets of oriented contigs with Ns
between them, corresponding to the length of the unresolved
sequence. These blocks of unknown nucleotides are called gaps and
can also be resolved computationally using paired-read information.
This method is implemented in the GapFiller software, also
distributed by BaseClear BV [15]. GapFiller uses local assembly,
based upon extending contig ends with paired reads partially mapped
to the termini of adjacent contigs. As input, GapFiller uses
scaffold files together with paired reads, similar to SSPACE. A
similar approach is used by the more recent GMcloser software
(http://sourceforge.net/projects/gmcloser/): GMcloser uses
preassembled contig sets or long read sets as the sequences to
close gaps and uses paired-end (PE) reads and a likelihood-based
algorithm to improve the accuracy and efficiency of gap closure.

Gaps that cannot be closed computationally should be resolved by
experimental procedures: designing primers for the ends of the
contigs, amplifying the missing region and Sanger sequencing.
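
Because gaps are written as runs of Ns, the gaps remaining in a scaffold, and thus the regions for which primers would have to be designed, can be listed with a few lines of code. This is a minimal sketch; the function name is our own.

```python
import re
from typing import List, Tuple


def find_gaps(scaffold: str) -> List[Tuple[int, int]]:
    """Return (0-based start position, length) for every run of Ns in a scaffold."""
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(r"N+", scaffold.upper())]


scaffold = "ACGTNNNNNACGTNNAC"
print(find_gaps(scaffold))   # [(4, 5), (13, 2)]
```

Running this before and after gap filling gives a simple count of how many gaps were closed.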

2.9 Pre-annotation Assembly Evaluation

Computational methods used to validate the assembly rely on
mapping all reads back to the assembly and analysing the mapping.
The best, but also the most challenging, approach is to inspect the
mapping with visualization tools, which reveal the enrichment of:
• Mismatches between reads and the assembled sequence
• Broken read pairs: regions where read pairs map at a distance
different from that expected
• Unaligned read ends: regions where only part of a read is
mapped to the assembled sequence

Examples of such genome mapping visualization tools are Consed
[20] and the Hawkeye assembly viewer [21]. Regions enriched in such
features might contain misassembled sequences and should be
re-analysed experimentally by amplification and Sanger sequencing.

In addition to visual analysis, the FRCbam tool was recently
developed to analyse BAM mapping files for the enrichment of
misaligned reads [22, 23]. It can be downloaded at
https://github.com/vezzi/FRC_align and run on any Unix-based
system. Although it does generally the same job as the
visualization tools, it can also be used effectively to compare the
assemblies generated by different de novo assemblers and to choose
the best one.
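
As an illustration of one of the signals listed above, the following sketch counts broken read pairs per window along a contig. It is not how FRCbam or the viewers work internally; it assumes that the leftmost mapping position and the observed insert size of each pair have already been extracted from the mapping file, and all names and thresholds are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def broken_pair_windows(pairs: Iterable[Tuple[int, int]], expected_insert: int,
                        tolerance: int, window: int = 1000) -> Dict[int, int]:
    """Count read pairs whose insert size deviates from the expectation, per window.

    pairs: (leftmost mapping position, observed insert size) for each read pair.
    Returns {window index: number of broken pairs}; windows without broken pairs are omitted.
    """
    counts: Counter = Counter()
    for left_pos, insert_size in pairs:
        if abs(insert_size - expected_insert) > tolerance:
            counts[left_pos // window] += 1
    return dict(counts)


pairs = [(100, 500), (200, 3000), (1500, 480), (2100, 50), (2500, 5000)]
print(broken_pair_windows(pairs, expected_insert=500, tolerance=100))   # {0: 1, 2: 2}
```

Windows with unusually high counts are the candidates for closer inspection and, if necessary, experimental verification.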

3 General Genomic Features Prediction

3.1 Origin/Skew Plots

In bacteria, replication origins can be predicted from cumulative
nucleotide (e.g. GC or AT) skews, the localization of DnaA boxes,
and genes often associated with the Ori site: those coding for
tRNAses, gyrases, RNA and DNA polymerases and ribosomal proteins
[24]. All these features provide the basis of origin prediction by
the Ori-Finder server (http://tubic.tju.edu.cn/Ori-Finder/ [25]).
Archaeal origins of replication (several archaea have 2–4 Ori
sites) can be predicted using the Ori-Finder 2 server
(http://tubic.tju.edu.cn/Ori-Finder2/ [26]). The results can be
checked against the DoriC database by manually comparing the
presumed Ori of the target sequence with the sequences present in
the database, including those confirmed experimentally.
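
The cumulative GC skew mentioned above can be computed with a short sketch. In many bacterial genomes the minimum of the cumulative skew, (G − C)/(G + C) summed over consecutive windows, lies near the replication origin and the maximum near the terminus; this is a heuristic only and ignores DnaA boxes and gene context, which the dedicated servers also use. Function names and the window size are illustrative assumptions.

```python
from typing import List


def cumulative_gc_skew(seq: str, window: int = 1000) -> List[float]:
    """Cumulative sum of (G - C) / (G + C) over consecutive non-overlapping windows."""
    skews = []
    total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:  # windows without G or C contribute nothing
            total += (g - c) / (g + c)
        skews.append(total)
    return skews


def predicted_origin_position(seq: str, window: int = 1000) -> int:
    """End coordinate of the window at which the cumulative GC skew is minimal."""
    skews = cumulative_gc_skew(seq, window)
    best = min(range(len(skews)), key=skews.__getitem__)
    return (best + 1) * window


# Toy sequence: a C-rich half followed by a G-rich half; the skew switches at 2,000.
toy = "C" * 2000 + "G" * 2000
print(cumulative_gc_skew(toy))          # [-1.0, -2.0, -1.0, 0.0]
print(predicted_origin_position(toy))   # 2000
```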

3.2 Non-coding RNA Predictions

Quick predictions of rRNA-coding sequences can be made using
RNAmmer (http://www.cbs.dtu.dk/services/RNAmmer/ [27]). The
RNASpace server (http://rnaspace.org/ [28]) and the Rfam
(http://rfam.xfam.org/ [29]) database and search tool (similar in
architecture to Pfam, see below) are better suited to a more
comprehensive search and annotation of non-coding RNAs. A fast way
of predicting tRNA sequences is to use the tRNAscan-SE web-based
tool [30, 31].

4 Tools and Databases for Protein Annotations and Analysis

4.1 General Comments

Currently, plenty of tools are available, both online and offline,
and each step can be accomplished by different tools with similar
functionality. Here, we introduce those that are the most
convenient in our experience. While the majority of tools are
hosted individually, there are several multifunctional servers for
comprehensive large-scale annotation and analysis, such as IMG/ER
and IMG/MER, or RAST and MG-RAST, for genomic and metagenomic
annotation, respectively. Although the growing functionality of
these servers makes it possible to perform the majority of analyses
remotely, it is still necessary to use various individual
applications in the subsequent rigorous manual curation steps. In
our experience, the most effective strategy is to use the servers
for structural annotation and the initial steps of automatic
functional analysis, and then to use the results for manual
curation with the help of individual software packages. The number
of these software packages and servers is large and growing
rapidly; thus, there is no single unique algorithm, and each
scientist might have their own preferences. We therefore describe
here the tools that are functional and reliable in our hands, which
should not prevent anyone from using alternative tools. The general
overview of the workflow is depicted in Fig. 4.

[Figure: flowchart with the boxes genomic sequence; ORFs prediction; in silico translation; BLAST against SwissProt; BLAST against GenBank nr or UniProt (if no SwissProt hits or low-quality hits); domain HMM search using Pfam or HMMER; BLAST against specialized databases (e.g. COGs, MEROPS, dbCAN, TCDB); multiple alignment and phylogenetic analysis; protein localization (SignalP, TatP, SecretomeP, TMHMM, Phobius etc.); protein structure prediction; metabolic pathways reconstruction; protein-protein interaction]
Fig. 4 The general scheme of the functional genome analysis pipeline

4.2 IMG/ER and IMG/MER

As stated on its website (https://img.jgi.doe.gov/#IMGMission),
the mission of the Integrated Microbial Genomes (IMG) system is to
support the annotation, analysis and distribution of microbial
genome and metagenome datasets sequenced at DOE’s Joint Genome
Institute (JGI). However, users can also upload and annotate their
own private genomes using the IMG/ER (Expert Review) server, or
metagenomes using the IMG/MER server. After uploading, the
genomes/metagenomes remain private, while the IMG server provides
its full functionality for their annotation and analysis.

The RAST (Rapid Annotation using Subsystems Technology;
http://rast.nmpdr.org/ [32]) and MG-RAST (metagenomic RAST;
https://metagenomics.anl.gov/ [33]) servers, which are part of the
SEED server, were developed for automatic SEED-quality annotation
of more or less complete bacterial and archaeal genomes and of
metagenomes, respectively. The SEED server was developed by the
Fellowship for Interpretation of Genomes (FIG) with the main
purpose of developing curated genomic data: subsystems or FIG
families of proteins (FIGfams). After the genomes and metagenomes
have been uploaded to the servers and annotated, the proteins found
are integrated into the FIGfams and become available for analysis
with the tools integrated within the servers.

4.3 Genomic/Assembly Structural Annotation

ORF prediction can be performed using the above servers. In RAST,
two pipelines are available: RAST ORF prediction and GLIMMER ORF
prediction. RAST has more options but usually has a higher
overprediction rate; thus, we recommend GLIMMER.

4.4 Analysis of De Novo Annotated Genomes/Metagenomes

Once ORFs have been predicted, several approaches can be used for
further verification of protein function. In general, these
approaches are individual protein prediction, metabolic pathway
prediction, and gene colocalization and coregulation analysis. The
latter is an important part of the analysis; however, we do not
discuss it here since it is well reviewed in several papers
[34–36]. For the protein-level analyses, ORFs must first be
translated into polypeptides because of the degeneracy of the
genetic code.
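
In silico translation, and the degeneracy that motivates it, can be illustrated with a minimal sketch. It uses the standard codon assignments (which the bacterial and archaeal translation table shares, differing mainly in the permitted start codons) and ignores alternative start codons and reading-frame detection; these simplifications and the function name are our own assumptions.

```python
from itertools import product

# Standard codon table in the conventional TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate a nucleotide ORF codon by codon; unknown codons become 'X', stops '*'."""
    orf = orf.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(orf[i:i + 3], "X")
                   for i in range(0, len(orf) - 2, 3))


print(translate("ATGGCCTAA"))       # MA*
# Degeneracy: four different codons, one amino acid.
print(translate("GCTGCCGCAGCG"))    # AAAA
```

The second example shows why protein-level comparison is more sensitive than nucleotide-level comparison: sequences that differ at every third position can still encode identical proteins.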

4.4.1 Basic Local Alignment Search Tool (BLAST) [37]

Two strategies are (a) to BLAST an entire database with your
protein query and (b) to BLAST your genome/metagenome with a
characterized protein.

For the first one, there are many BLAST servers available, both
online and offline, the most popular being the NCBI/NIH
(http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome)
and Uniprot (http://www.uniprot.org/blast/) BLAST servers. In most
cases, the default parameters are suitable. It is important to
check the e-values of the hits and to define acceptable cut-offs
(e.g. below 0.01 or below 10^-6), as well as the coverage of both
hit and query. Ideally, for a full-length single-domain protein
search, the coverage should not be far from 100%. The main
“need-to-know” features are:
(a) The GenBank server supports multiple-sequence queries, while
Uniprot does not.
(b) The GenBank server offers advanced search algorithms, e.g.
position-specific iterated PSI-BLAST [38], pattern-hit
initiated PHI-BLAST [39] and domain enhanced lookup time
accelerated DELTA-BLAST [40], which are suitable for proteins
with low similarity to those present in the databases.
(c) Both servers support searches against the SwissProt database
of well-curated and often experimentally characterized
proteins. BLAST against this database is a highly recommended
step for a better prediction of protein function. However, it
should be mentioned that even within this database, more than
70% of protein functions were inferred by sequence similarity
(http://web.expasy.org/docs/relnotes/relstat.html).
(d) If servers hosting specialized databases provide BLAST
services, it is quite useful to use them (e.g. TCDB [41] for
transporters, MEROPS [42] for peptidases, dbCAN [43] for
CAZymes). Alternatively, in many cases it is possible to
download the database and to run a local BLAST against it
using various software packages, of which we would recommend
the BioEdit freeware [44].

Analysis Algorithm
I. BLAST against the SwissProt database. Manual curation of
positive results; negative results go to the next step
II. BLAST against the nr protein database. Negative results go to
the next step
III. PSI-BLAST, DELTA-BLAST and PHI-BLAST for a more sensitive
BLAST search
IV. BLAST against COGs (Clusters of Orthologous Groups of
proteins) using the EggNog server
(http://eggnog.embl.de/version_3.0/index.html) or the NCBI
CD-search server
(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)
V. TCDB BLAST (http://www.tcdb.org/progs/blast.php) against the
specific transporter database for transporter predictions
VI. MEROPS BLAST
(http://merops.sanger.ac.uk/cgi-bin/blast/submitblast/merops/advanced)
against the peptidase database for peptidase search and
prediction of their families and function
VII. dbCAN (http://csbl.bmb.uga.edu/dbCAN/index.php) BLAST or HMM
search against the CAZy database for GHs, GTs, PLs, CEs and
CBMs
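
The e-value and coverage checks recommended above are easy to automate when BLAST results are saved in the 12-column tabular format (query id, subject id, % identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The sketch below assumes that format and a user-supplied dictionary of query lengths; the function name and the default thresholds are illustrative assumptions, not recommendations beyond those given in the text.

```python
from typing import Dict, Iterable, List, Tuple


def filter_blast_hits(lines: Iterable[str], query_lengths: Dict[str, int],
                      max_evalue: float = 1e-6,
                      min_query_coverage: float = 0.8) -> List[Tuple[str, str, float, float]]:
    """Keep hits with e-value below max_evalue and sufficient query coverage.

    Returns (query id, subject id, e-value, query coverage) for every retained hit.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id, subject_id = fields[0], fields[1]
        q_start, q_end = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        coverage = (abs(q_end - q_start) + 1) / query_lengths[query_id]
        if evalue < max_evalue and coverage >= min_query_coverage:
            kept.append((query_id, subject_id, evalue, coverage))
    return kept


hits = [
    "orf1\tsp|P1\t85.0\t200\t30\t0\t1\t200\t5\t204\t1e-50\t300",  # good hit
    "orf1\tsp|P2\t40.0\t50\t30\t0\t10\t59\t1\t50\t0.5\t30",       # e-value too high
    "orf2\tsp|P3\t90.0\t60\t6\t0\t1\t60\t1\t60\t1e-20\t120",      # covers only 20% of query
]
print(filter_blast_hits(hits, {"orf1": 210, "orf2": 300}))
```

Queries left without any retained hit are the ones that move on to the next, more sensitive step of the analysis algorithm; retained hits still require manual curation.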

4.4.2 HMM and Pfam

Another approach to protein functional characterization is a
search for conserved domain families against the curated Pfam
family database, which is built on multiple-sequence alignments and
hidden Markov models (HMMs). It should be mentioned that several
servers, such as dbCAN (see above) or HMMER (see below), provide
both BLAST and HMM search tools.
Another random document with
no related content on Scribd:
"O sir, you wouldn't be so wicked, surely!" Dick broke in, in
accents of alarm. "We should starve outright, I believe,
without mother's Wednesday and Saturday earnings at the
Manor House. And the children ain't half fed as it is!" He
wound up with another flood of tears.

"Then hold your tongue, now that you know what your
silence is worth," replied Stephen. "I'm sure you needn't
make such a cry-baby of yourself. I haven't hurt you, and
I've given you a jolly little box."

"But the box isn't any use to me," Dick argued. "Please—
please give me back my shilling, Master Stephen. 'Tis
dreadful to be hungry; and mother started off to work this
morning without a bit of anything inside her lips, because
she knew if she ate breakfast there wouldn't have been
enough for the little ones."

"Don't trouble to tell lies," the squire's son said, as he
turned contemptuously away. "Pick up your bundle and go
home, or the bogies that hang about these woods after dark
will have you."

Without another word or look, he then strode off, and was
quickly out of sight. When he was visible no longer, Dick
Wilkins sat down on his load of sticks, hid his face in his
hands, and wept long and piteously.

Dick was a brave-hearted lad, and at last recovered himself.
He determined that he would keep the treatment he had
received at Stephen Filmer's hands a secret from his
mother. He would be brave, and bear his trouble alone.

Up went the fagot on the child's thin shoulders. Try as he
might, Dick could not whistle to-day, as he usually did,
because his eyes were so full of tears that he had all he
could do to see where he was going. He trudged on, fighting
against his grief, and by the time he reached home, he had
quite composed himself.

To his surprise, Dick found that Molly had already kindled a
fire with some of the wood he had gathered earlier in the
day, and had set the tea-things out upon the snow-white
cloth.

"O Dick!" the little girl exclaimed, "What a long time you've
been! And how red the wind has made your poor eyes look
—just as if you had been crying!"

"Mother isn't back yet, I suppose," remarked the boy, taking
no heed of the comment his sister had made about his
appearance.

"No; I expect, though, she'll be here soon now. Come close
to the fire, Dick—do! and warm yourself. The sticks you
fetched this morning blaze up splendidly; they give out
better heat than any we've had as yet."

"That's right!" in gratified accents. "I'll bring home some
more to-morrow."

And Dick Wilkins took a stool, a sharp knife, and a basketful
of sticks, and sat down making clothes' pegs in the poor but
well-warmed kitchen; whilst Molly stood knitting by the
firelight; and the twins and Stranger occupied a prominent
position on the hearth, and watched the lifting cover of the
already boiling kettle.

CHAPTER IV.
TEN SHILLINGS REWARD.

"MOTHER, how fast the days go by!" remarked little Dick
one evening after the other children had gone to bed. "The
year is nearly out—only a few days left of it now. O mother,
don't you hope the next will be a better one for us all than
this has been?"

"Indeed I do!" sighed Mrs. Wilkins, and a hot tear fell upon
her work; she was knitting to-night by the uncertain light of
the fire. "Life's a struggle at the best of times for poor
people," she went on; "and when the father of a family is
taken, it's bound to go hard with those he leaves behind."

"Ain't you straining your eyes?" asked the boy anxiously.
"Do let me light the lamp for you! We've been more sparing
than ever over oil of late, and I can't bear to think you may
be hurting your sight."

"I don't need the use of my eyes to knit, dear," was the
widow's return. "If I was sewing, 'twould be different."

"But the room looks so dark and gloomy," persisted Dick.
"And for some reason or other, it seems more silent than
usual. I wonder," turning his head to look about him, "what
it is I miss. Oh! oh!" To Mrs. Wilkins's dire dismay, he
started to his feet and pointed at an empty corner near the
door—"I know now!" he gasped forth. "It's the clock that's
gone! O mother! Mother! Where is it? What has become of
it? 'Twas the one thing that father prized above all else we
had, 'cos grandfather gave it to him on his wedding-day."

"Ah, my child," sobbed the poor woman, "I have been
forced to sell it to Squire Filmer in order to pay the rent.
The landlord was here yesterday, and he threatened to sell
us up if the money wasn't paid by to-morrow. It's a great
blow to me, but we must live."

There was a long pause; then Dick said: "O mother,
however can we get the money for poor Stranger's tax? O
mother! Mother! Whatever happens, we can't part with our
dog."

Laying aside her knitting, Dick's mother placed a tender
hand upon his heaving shoulder. "My dear," she said, "the
thought of it has worried me nearly as much as the trouble
about the rent; but I can't see any chance of our being able
to get the money to pay his tax."

"Then you really think we shall have to part with him?" cried
Dick. "Oh! God must be very cruel if He lets it come to that.
I know our Stranger wouldn't ever love other folks as he
loves you and me and the children. And if we sold him or
gave him away, his new owner might kick him about as—as
some people do their dogs."

"Well, there's all next month for us to look around and try
to scrape the money together, dear," the widow summoned
heart enough to remind her little boy. "As long as it's paid
by the last day of January it will be in time; and if 'tis right
for us to keep our dog, why then we shall find ways and
means for doing so. Don't fret, child, more than you can
help. Whatever happens will be sure to be for the best.
Now, dry your eyes, and we'll have our supper cosy like in
front of the fire. If you lose heart, Dick, what'll become of
us all?"

But though the old year died and the new one took its
place, no sign of better fortune could Mrs. Wilkins or Dick
see. Stranger must be disposed of—that seemed certain
beyond a doubt; and if no one could be induced to offer him
a home, why then he would have to be killed. It would be
terrible indeed to part with so faithful a friend.

One evening at the end of January, little Dick was walking
homeward through the village by his mother's side, when a
large, square piece of paper, placed in a conspicuous
position in the post-office window, attracted his attention,
and he paused abruptly, saying,—

"Wait half a minute, mother; I want to read this notice."

Mrs. Wilkins stopped at once, and together they approached
the window, whereupon Dick read aloud:—

"LOST, in this neighbourhood (probably a month
or six weeks ago), a small carved ivory match-box.
Finder will receive TEN SHILLINGS REWARD
by returning same to Colonel Flamank, Leigh
Grange."

"Dick, Dick, my little boy, what's the matter with you? Are
you ill?" demanded Mrs. Wilkins; for the small face at her
side had grown suddenly as pale as death, and the child
had clutched convulsively at her arm.

"Ill?—No! No! No!" was the emphatic reply. "I'm well
enough; only I can scarce believe 'tis true!"

"What's true, Dick? I don't know what 'tis you're talking of."

"Why, the box, to be sure—the little carved ivory match-box
that the colonel's offering ten shillings reward for. See!"
drawing it from his pocket, where he had thrust it in disgust
weeks and weeks ago. "Here it is! Now we can claim the
reward, pay dear old Stranger's tax, and keep him; besides
having a whole half-crown to spend as we're minded to
afterwards."

"O Dick, how wonderful! How like a miracle!" ejaculated the
woman, with a sob of thankfulness. "But are you sure
there's no mistake? Are you sure that that's the right box?"

"As good as sure," declared Dick. "Anyhow, I'll very soon
make certain. I shall go to Leigh Grange at once and ask to
see the colonel. Then if it's the right one, we'll pay
Stranger's tax the first thing to-morrow morning, and after
that's done, we shall feel he's safe."

"Very well, my dear; I daresay, we shall all sleep the better
to-night for having the anxiety about the poor dog taken off
our minds. But why didn't you tell me you had found that
match-box, Dick? You're not generally so close about
things."

"I didn't tell you because I didn't find it, and I could not
bring myself to worry you by saying how it got into my
hands," was the child's admission. Then, as they walked on
side by side in the direction of Leigh Grange, Dick narrated
the story of his meeting in the woods with Stephen Filmer,
adding, "And I thought God was so cruel to let that great
bully rob me of the shilling when I wanted it so badly. I little
dreamed that things would turn out as they have."

And now that the silver lining had appeared to his cloud,
Dick laughed merrily at the thought of how vexed the
squire's son would be when he discovered what he had lost
by not being able to restore the box.

"How shall you account to Colonel Flamank for having the
match-box in your possession?" Mrs. Wilkins presently
interrogated. "If he asks questions, you'll be bound to tell
him the whole story that you've just told me."

Dick hesitated a minute, after which he said,—

"I hadn't thought of that; and I shouldn't like to tell tales on
Master Stephen, though he did serve me shamefully bad
that afternoon in the woods. But there! Like as not, the
colonel won't want particulars; and if he doesn't, why then,
I needn't give him any."

Arrived at the entrance of Leigh Grange, Dick bade his
mother not to wait for him, lest she should take cold by
standing. Some seconds later, he was walking up the
colonel's trim carriage-drive, his heart beating, his legs
shaking beneath him, with nervousness and excitement.

CHAPTER V.
DICK'S INTERVIEW WITH THE COLONEL.

UPON Dick's stating the nature of his business at Leigh
Grange, he was admitted at once and shown into the
library, a handsomely-furnished room, the walls of which
were lined by rows and rows of books. For many minutes he
was left alone, and during that time, he feasted his eyes on
his surroundings. At length, however, he heard a footstep,
and a second afterwards Colonel Flamank came in.

"Good-evening," said that gentleman, in pleasant tones.
"What do you want with me, little boy?"

"Please, sir, I've brought back the match-box that you lost
some weeks ago," said Dick Wilkins, his heart beating so
loud that he fancied his questioner must hear it.

"You have brought back my match-box!" exclaimed the
colonel. "Come, now, this is very strange. Squire Filmer's
son came to me but a half-hour since, and said he had
found it, and would let me have it to-morrow. But what is
the matter?" he added, in surprise. "Surely you are not
going to cry!"

And the speaker took the match-box from the child's
shaking hand, whilst the latter burst into tears.

"O sir! O sir! Please to believe me when I tell you all about
it," sobbed poor Dick, "'cos Master Stephen's treated me
shameful, he has! He's the biggest bully in the place, and
he stole a shilling from me when he found me alone in the
woods."

Then seeing it was useless to keep back anything, the little
boy recounted the story of how the match-box had been
forced upon him in exchange for the coin that the artist had
given him for fetching his paints from the church. And so
earnest was his voice, and straightforward his manner, that
his hearer was inclined to think he told the truth.

"And you mean to tell me that you submitted to be robbed
by young Filmer?" questioned the colonel. "Why did you not
report the matter to his father? The squire would not have
shielded him, I am sure, if you had told him what you have
now told me."

"I—I threatened Master Stephen that I would; and he said if
I did, he'd get my mother out of her charing and washing at
the Manor House," sobbed the child bitterly. "And if you
don't believe me about the shilling, sir, please to ask the
artist gentleman, and he'll tell you that he gave it to me."

"And supposing I prove your story to be correct, and give
you the ten shillings reward, how shall you spend the
money?" asked the colonel.

"I shall pay the tax for our Stranger, sir. We should have
had to get rid of him if—if it hadn't been for this."

Dick Wilkins's countenance brightened.

"Stranger is a dog, I suppose?"

"Yes, sir; a retriever. He came to our cottage one awful
stormy night. His paw was cut and bleeding; and somebody'd
been trying to drown him, 'cos he'd got a rope with a
stone at the end of it tied around his neck; and mother and
I let him in, and did what we could for him."

"So he's stayed with you ever since! I believe I have seen
him in the village on several occasions—a handsome
creature he seemed too."

"Yes! Yes!" assented Dick enthusiastically. "And if I'd had
him with me when Master Stephen came along that day, he
wouldn't have let him bully me—not he!"

The colonel remained silent for some minutes after this. He
put on his glasses and examined the match-box closely. At
length he turned towards Dick Wilkins again.

"I feel much troubled by what you have told me," he
commented. "And at the expense of some pain to the squire
and his wife, I mean to see you righted. It is not so very
late in the evening yet; therefore you and I will go down to
the farm together, and see the gentleman who, you say,
gave you the shilling."

"Yes, sir," agreed Dick, without the least hesitation. "He's
almost sure to be in, 'cos the daylight's too far gone for him
to be painting still."

Accordingly the two set out to pay a visit to the Smerdons'
lodger. But scarce had they gone a hundred yards in the
direction of the farm, when they came face to face with
Stephen Filmer. A strange expression overspread the bully's
features as he recognized the pair, and he would have slunk
past without speaking had not Colonel Flamank thought fit
to stop him.

"Wait, Stephen," said he; "I wish to speak to you about the
lost match-box that you assured me you would bring back
to-morrow. This lad has already brought it to me. What light
can you throw upon the matter?"

"I found the match-box," answered the boy sulkily. "I only
lent it to Dick Wilkins, and I suppose he's been dishonest
enough to claim the reward."

"Oh!" cried Dick, in shocked accents. "Oh! how can you say
so, Master Stephen?"

At this point, Colonel Flamank interposed and bade Dick be
silent.

"Stephen," he afterwards said, "you are telling me a
deliberate falsehood! You did not lend the match-box to this
child; you forced it on him, in return for his shilling, which
you had stolen."

"Well, he agreed to the exchange," said young Filmer,
forgetting that a moment since he had stated that the
match-box had been lent. "Do you think, if I'd treated him
as he says, that he wouldn't have made a fuss and told my
father about it?"

"You know what threats you employed to silence him,
Stephen," rejoined the colonel. "You know that you sealed
his lips by saying you would get his poor widowed mother
out of her work at the Manor House, if he carried his story
to the squire."

"And so I will, now he has sneaked upon me," was the
savage response.

"No, indeed, you will not," Colonel Flamank assured him.
"Had you shown any regret for your cowardly conduct, I
might have been inclined to spare you by letting the matter
pass. But you are far from repentant; and it is time a stop
was put to your tyranny. I shall thoroughly investigate this
affair, and prove or disprove the truth of Dick Wilkins's
statement. After that, I shall make it my business to lay the
facts of the case before both your parents. Now you can go
your way."

And the squire's son passed out of sight, for once in his life
really frightened and abashed.

"O sir!" gasped Dick, when he had gone, "I'm 'fraid he'll do
some mischief even now if he can. Supposing he should get
my mother out of her work at the Manor House! We should
starve! And—and our landlord's a hard man, he is!"

"Don't fear, my lad," returned the colonel reassuringly.
"Squire Filmer will see that no injustice is done. But here,"
he added, "we are already at Farmer Smerdon's gate. You
shall stay where you are, whilst I go in and interview the
donor of the shilling. And if I return satisfied with what he
tells me, I will at once hand you the ten shillings reward."

"Yes, sir, thank you. I'll wait here."

Nor had he long to wait, for Colonel Flamank returned to
him a few minutes later with a smile of encouragement
upon his face.

"Have you seen him, sir?" asked little Dick, scarcely able to
suppress his anxiety.

"Yes, and I am ready to give you what I promised."

So saying, the colonel laid a half-sovereign in Dick Wilkins's
hand. It was the first gold coin the child had ever touched.

CHAPTER VI.
HARD TIMES.

How proud Dick felt next day when he walked into the
grocery establishment that was also the post-office, laid his
half-sovereign on the counter, and said he had come to pay
his dog-tax. Stranger was with him, and in such high spirits
that he found it hard to believe the dog did not understand
the nature of their errand.

"So you're not going to get rid of the retriever after all,
then," remarked the post-mistress, after filling in and
handing Dick the receipt for his money.

"No," said the little boy; and then he pointed at the notice
that had not yet been removed from the window, and
added, "That's how I got my half-sovereign, Mrs.
Mortimore. The colonel gave it to me for bringing his match-
box back to him last evening."

"You don't say so, Dick Wilkins!" ejaculated the woman,
with good-natured interest. "Well, you are lucky, and no
mistake! Some one told me only yesterday how upset all
you children were at the thought of parting with your dog.
See! here's your two-and-sixpence change; and here's a
quarter-pound packet of tea that you can take home to your
mother as a present from me. Tell her I hope she'll enjoy it.
She was looking shocking thin and pale, I thought, when
last I saw her."

"Thank you very much for the tea," said Dick gratefully.
"Mother'll be glad of it, I'm sure." And with this he turned
towards the entrance of the shop, and would have gone his
way had not the talkative post-mistress called him back to
the counter again.

"If you take my advice, Dick Wilkins," she went on, "you'll
get that mother of yours to go and see the doctor. She's a
failing woman—you mark my words. Get Dr. Rogers to give
your mother something—there's a good boy!—or, in my
belief, you won't have a mother to care for you much
longer."

Now Mrs. Mortimore was a kind woman and a well-meaning
one. But she lacked discretion, as she would have
realized could she have heard Dick Wilkins sob himself to
sleep in his own little room when night-time came. Never
did child love parent more devotedly than this one did his
mother. Therefore the post-mistress's words of warning
sank deep into his heart, and haunted him increasingly
during the long hours of the night.

Days passed, and work became even scarcer than hitherto.
The cold got more intense; and great was Dick's distress
one evening on finding his mother employed in cutting up
her warm shawl to make bodices for the twins.

"Mother," he burst forth, "oh, please, don't do it! You'll catch
your death of cold if you go out in this bitter wind without
anything over your shoulders. Let me go to the rector's wife
and ask her for a couple of cast-off wraps for Willie and
Joe."

"No, no! I couldn't think of it, Dick! I never begged in my
life!" was the widow's answer.

"Do you feel bad this evening?" asked the boy in anxious
tones. "I mean—does your side ache worse than usual?"

"No, dear, not worse than usual. Why, Dick, folks would
think I was a grand body, if they knew how careful you were
of me."

"I want you to see the doctor, mother. You do look ill and
bad!" declared Dick gravely.

"That's nonsense! It's the cold that nips me up," was the
prompt return. "'Tis freezing hard to-night again. I shouldn't
be surprised if the ice on the lake bears soon. Then you and
the children'll be able to go and watch the skating between
whiles. Lord Bentford is certain to throw his grounds open
to the public as usual. O—oh!"

"What made you cry out like that? Why, you've got your
hand tied up! What's amiss with it?"

"There's a sore place on one of the fingers; and when I
knocked it against the table, it made me cry out. 'Twill be
easier in a minute;" and Mrs. Wilkins turned her face aside
that Dick might not see it was drawn with pain.

"How long has your finger been bad?" the little boy
demanded.

"Not more than a few days. I hurt it on Tuesday with a pin
that one of the servants at the Manor House left in her
apron when she gave it to me to wash; but I didn't bind it
up till an hour ago."

"And you've been working with it sore all day!" cried Dick,
in much concern. "Hasn't it pained you, mother?"

His mother confessed that it had been painful, but that a
pennyworth of ointment would soon put it to rights. Dick,
however, insisted on her seeing the doctor, who told her
that her finger had been poisoned by the stab of the pin. He
told her, too, that her blood had got into an unhealthy state,
and that she must have plenty of good food if she was to
get well.

The poor woman was in despair. One by one her few
remaining sticks of furniture were sold for bread. Poor Dick
was sure that God would never desert them, and that help
would soon arrive.

Then all at once a bright idea flashed into Dick's mind. To-
morrow would be Saturday, and school holiday. He would
put a gimlet in his pocket, go to Lord Bentford's lake, which
by now was bearing, and try to earn a few coppers by
putting on the gentlefolks' skates. He would not breathe
one word of his intention to any one; no, not even to his
mother. So he went supperless to bed that night, full of
hope for the success of his new venture on the morrow.

CHAPTER VII.
A GALLANT RESCUE.

JUMPING out of bed early next morning, Dick dressed
himself in haste and went downstairs. It did not take him
long to sweep the kitchen, dust it, and kindle a bright fire;
and by the time that Mrs. Wilkins, Molly, and the twins put
in their appearance, the table-cloth was laid and the kettle
was singing cheerily.

The Wilkins's repast that morning was a poor, poor meal,
and Dick did not stop long over it. Before half-past nine, he
set out, gimlet in pocket, for Lord Bentford's lake.

Although he was early in getting there, he found at least
two dozen skaters already arrived. It had been freezing
hard all night, and the ice was in excellent condition—as
smooth as a sheet of glass.

"Blest if there isn't Widow Wilkins's youngster setting up in
opposition to us, Bill!" exclaimed a rough-looking idler to an
equally rough-looking companion.

The two men were standing on the edge of the lake,
whither they had come to earn a few shillings by putting on
people's skates, an employment needing but little exertion.
Turning a scowling countenance upon the child, the speaker
then asked with an oath,—

"What's your charge, young professional? Penny a pair, eh?
And chain the gentlefolks' attention whilst that sharp-nosed
retriever of yours makes off with a rabbit from the
plantation hard by."

Dick started and looked round quickly.

Not till this minute had it dawned on him that his dog had
followed. Had he loitered on his way, or even glanced once
behind, he must certainly have seen Stranger stealthily
tracking him. But he had done neither; and now as he
stared in vexation at the animal, his common sense told him
that he must take him home before Lord Bentford or his
gamekeepers had a chance of seeing him.

This, however, was not to be. For no sooner had Dick
determined to retrace his footsteps than a heavy hand was
laid on his shoulder, and an angry voice demanded,—

"Have you had permission to bring that dog of yours here?"

"No," returned Dick, "I haven't."

It was Lord Bentford's head gamekeeper who had put the
question.

"I've only this minute seen him. He must have followed me
without my noticing. But I'm going to take him away at
once. He hasn't done any harm.—Stranger, old man, come
on."

But for once, in a way, the retriever was pleased to be
disobedient. He had caught sight of a couple of Lady
Bentford's collies scampering across the frozen lake, and
with a bark of delight had set off to join in their play—
behaviour that filled his young master's heart with dismay
and humiliation.

"Let me catch the lawless brute as much as looking into one
of the plantations and I'll shoot him, as sure as my name is
what it is," cried the exasperated gamekeeper, turning
angrily away.

Dick trembled at the threat, and set off after his wayward
property. But the ice was slippery, and he fell once or twice
and hurt himself badly. He had just picked himself up, when
a piercing shriek rang through the air, and was followed by
a woman's scream of alarm and a man's loud shout for
help.

The refreshment tent was deserted, and every one made a
dash for the spot whence the cries had come. Even
disheartened Dick and his retriever followed.

"A rope, a rope!" some one was calling. "Bring a rope this
minute. There's a child in the water, near the boathouse,
where the ice has been broken for the swans. Quickly,
quickly, or we shall be too late!"

"No, a ladder will be better," declared a second voice. "A
long ladder and a rope." Thereupon, a third informed the
crowd that it was Lord Bentford's little boy who was in peril
—his only child, indeed, and the heir to all his land.

"'Tis a wonder if the kid ain't drowned, for he's tumbled in
at the deepest part," was the grim remark of one of the
idlers who, a couple of hours since, had jeered at little Dick.
"But then Death don't make no distinctions. And it's no
more for his lordship to lose his youngster than 'twould be
for me to lose one of mine."

"Oh, my child! My darling! He will be drowned—I know he
will!" wailed the distracted mother. "Can nothing be done to
save him? Oh, he will get beneath the boathouse, and—"

"Please—please, your ladyship," gasped Dick, elbowing his
way through the crowd to the place where both parents
were standing, "Stranger'll do it, if you let him try.
Stranger'll save the little gentleman."

"Stranger?" Lord Bentford panted. "Who is Stranger, child?"

"My dog, your lordship. Here he is. He's a first-class
swimmer is our Stranger."

Then leading the retriever to the brink of the ice, Dick
pointed out the spot where the child had sunk.

"Fetch him," he cried incitingly. "Fetch him, good dog, good
dog!"

And needing no further bidding, Stranger plunged into the
lake and kept himself afloat while he looked eagerly about
him.

For several seconds there was breathless silence. The
unfortunate little boy had not yet risen, and there was the
chance that when he did he might come up at a spot that
was completely covered by ice. Happily, however, this
contingency had not to be met; for presently a dark object
rose a few feet from the boathouse, and the keen-sighted
dog struck out gallantly towards it. A moment later,
Stranger had fastened his long white teeth into the child's
kilted skirts, and set out snorting for the bank.

"Bravo! Bravo!" burst from at least a dozen lips.

Then as the dog, well-nigh exhausted, came within reach,
willing hands were stretched forth to relieve him of his
burden; and the snow-sweepers, making their reappearance
with ropes and a long ladder, saw that their assistance was
not wanted after all.

"Tell me," cried Lady Bentford, wringing her hands over the
dripping form of her child, "does he still breathe?"

"Yes, he is living," came the answer.

Hearing which, Lord Bentford, almost beside himself with
gratitude, turned impulsively aside to address the owner of
the dog.

The long spell of misery and privation, however, coupled
with the terrible excitement of the morning, had proved too
great a strain for Dick Wilkins's endurance. He had borne up
until the safety of Lord Bentford's son had been
accomplished. He had kept his senses whilst the crowd had
cheered and commended his dog; but now, he sank down
with a groan upon the bank close to the boathouse, and ere
his lordship reached his side consciousness left him, and he
fainted.

CHAPTER VIII.
STRANGER'S MISSION FULFILLED.

IT was not until several weeks later, that entire
consciousness returned to little Dick. And then, to his great
amazement, he found himself lying in a strange bed, too
weak to move either hand or foot, whilst a cheerful looking
nurse, clad in a dark dress and white cap, cuffs, and apron,
sat on a chair near the window watching him.
