The eighth edition of Bio::Blogs was originally posted online on the 2nd of February at:
Bio::Blogs #8
From my blog I picked an entry on network reconstruction (see page 4). I think the increasing amounts of omics data should be better explored than they currently are, and reconstruction methods are a very good way of achieving this. In this paper, Jason Ernst and colleagues used expression data to reconstruct dynamic transcription regulatory interactions. I will try to continue blogging about the topic in future posts.

Commentaries

PLoS ONE was launched about two months ago and it has so far produced an amazing stream of publications. However, the initially proposed goal of generating discussions online to promote post-publication review has been lagging. Stew (on page 6) and Alf (on page 7) wrote two commentaries (submitted by Greg) regarding the progress of PLoS ONE. They both discuss the current lack of infrastructure to stimulate online discussion at the PLoS ONE site. Stew goes even further by providing to anyone interested a nice Greasemonkey script to add blog comments to the PLoS ONE papers. I hope Chris Surridge and the rest of the PLoS ONE team soon start deploying some of the tools that they have talked about on their blog. They need to make ONE feel like home, a place where a community of people can discuss their papers.

From Deepak we have a post (on page 8) dedicated to what he dubs EcoInformatics: the importance of using computational methods to analyze ecological changes, from the molecules to the ecosystems. The problems range from data access to data management and analysis. The complexity and the different scales of network organization (i.e. molecules, environment, ecosystems, disease) make this a very promising field for computational biologists.

From ecological changes we move on to human evolution. On page 10, Phil explores the possibility of humanity using technology to improve itself. Having a strong interest myself in synthetic biology and man-machine interfaces, I would say that we are still far away from having such control. It is nevertheless useful to discuss the implications of emerging technologies to better prepare for the possible changes.

Reviews and tips

I start this section with a post from a brand new bioinformatics blog called Bioinformatics Zen. Michael Barton submitted his post on useful tips for getting organized as a dry-lab scientist (see page 12). I agree with most of his suggestions. I have tried slightly fancier methods of organizing my work using project management tools, but I end up returning to a more straightforward folder-based approach as well.

Finally, on page 14, Neil Saunders presents a nice tutorial on building AJAX pages for bioinformatics. It is a very well explained introduction with annotated code. If you were ever interested in learning the basics of AJAX but never invested time in it, here is a good chance to try it.
Researchers have discovered that moth antennae have gyroscope-like sensors to help them control their flight through the air. Because moths fly at night and cannot rely on visual cues, the source of their smooth and graceful flight was a mystery.

But a research team headed by Sanjay Sane, a biologist at the University of Washington, Seattle, found a structure at the base of the antennae that senses when the moth's body begins to pitch or roll; it relays this information to the brain, which causes the body to compensate.

"Whenever a creature is moving about, it has to have sensory information to tell it what it has done," said Sane. "If a person unintentionally turns around, the inner ear system or eyes will provide that information and allow for a course correction. Flying creatures need to know that information too, and when the light is low, and the visual cues are hard to see, they have to depend more on the mechanosensory system."

Removing the antennae caused the moths to collide with walls, to fly backwards, or to crash to the floor. However, when the antennae were glued back in place, the moths regained their maneuverability.

Closer examination revealed that a structure called Johnston's organ, found at the base of the moths' antennae, was crucial for flight stability. This organ relies on vibrations from the antennae, which remain in a fixed position during flight, to detect the spatial relationship of the moth's body to its antennae, behaving like a gyroscope.

Johnston's organ sends this information to the moth's brain, which then tells the moth to shift its body back to the correct spatial position.

Previous studies found that two-winged insects, such as house flies or mosquitoes, also use gyroscope-like sensors to control their flight. These are the "halteres" attached to their hindwings.
In my last post I commented on a paper that tried to find the best mathematical model for a cellular pathway. In that paper they used information on known and predicted protein interactions. This time I want to mention a paper, published in Molecular Systems Biology, that attempts to reconstruct gene regulatory networks from gene expression data and ChIP-chip data.

The authors were interested in determining how and when transcription factors regulate their target genes over time. One novelty introduced in this work was the focus on bifurcation events in gene expression. They looked for cases where a group of genes clearly bifurcated into two groups at a particular time point. Combining these patterns of bifurcation with experimental binding data for transcription factors, they tried to predict which transcription factors regulate these groups of genes. There is a simple example shown in figure 1, reproduced below.

In this toy example there is a bifurcation event at the 1h time point and another at the 2h time point. All of the genes are assigned to a gene expression path. In this case, the red genes are those that are very likely to show a down-regulation between the 1st and 2nd hour and stay at the same level of expression from then on. Once the genes have been assigned, it is possible to search for transcription factors that are significantly associated with each gene expression path. For example, in this case TF A is strongly associated with the pink trajectory. This means that many of the genes in the pink group have a known binding site for TF A in their promoter region.
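The paper's actual method is HMM-based (mentioned later in the post), but the core idea of grouping genes by shared expression trajectories can be sketched with a toy example. The gene names, expression values, and threshold below are all invented for illustration and are not from the paper:

```javascript
// Toy sketch of assigning genes to discrete expression "paths": each step
// between consecutive time points is labeled up/down/flat, and genes that
// share the same sequence of labels land in the same path. This only
// illustrates the idea; it is not the authors' HMM algorithm.
function pathLabel(levels, eps) {
  const steps = [];
  for (let i = 1; i < levels.length; i++) {
    const d = levels[i] - levels[i - 1];
    steps.push(d > eps ? "up" : d < -eps ? "down" : "flat");
  }
  return steps.join("-");
}

// Invented expression values at 0h, 1h, and 2h for three hypothetical genes.
const genes = {
  geneA: [1.0, 1.5, 0.2],  // up, then down
  geneB: [1.0, 0.3, 0.3],  // down, then flat
  geneC: [1.0, 0.4, 0.35], // down, then flat
};

// Group genes that follow the same path.
const paths = {};
for (const [name, levels] of Object.entries(genes)) {
  const label = pathLabel(levels, 0.1);
  (paths[label] = paths[label] || []).push(name);
}
// paths is now { "up-down": ["geneA"], "down-flat": ["geneB", "geneC"] }
```

Once genes are grouped this way, one can ask whether a TF's known targets are over-represented in a given path, which is what associates TF A with the pink trajectory in the figure.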
To test their approach, the authors studied amino acid starvation in S. cerevisiae. In figure 2 they summarize the reconstructed dynamic map. The result is the association of TFs to groups of genes and the changes in expression of these genes over time during amino acid starvation.

One interesting finding from this map was that Ino4 activates a group of genes related to lipid metabolism starting at the 2h time point. Since Ino4 binding sites had only been profiled by ChIP-chip in YPD media and not in amino acid starvation, this is a novel result obtained using their method.

To further test the significance of their observation, they performed ChIP-chip assays of Ino4 in amino acid starvation. They confirmed that Ino4 binds many more promoters during amino acid starvation as compared to synthetic complete glucose media. Out of 207 genes bound by Ino4 (specifically during amino acid starvation), 34 were also among the genes assigned to the Ino4 gene path obtained from their approach.

These results confirmed the usefulness of this computational approach to reconstruct gene regulatory networks from gene expression data and TF binding site information. The authors then go on to study the regulation of other conditions.

For anyone curious about the method, this was done using Hidden Markov Models (see here for a primer on HMMs).
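The post reports the raw overlap (34 of the 207 bound genes fall in the path). A common way to gauge whether such an overlap beats chance is a hypergeometric test. The sketch below is my own illustration, not the authors' analysis; the total gene count (~6000 for yeast) and the path size (300) are placeholder assumptions:

```javascript
// log of the binomial coefficient C(n, k), accumulated as a sum of logs
// to avoid overflow at genome-scale counts.
function logChoose(n, k) {
  let s = 0;
  for (let i = 1; i <= k; i++) s += Math.log(n - k + i) - Math.log(i);
  return s;
}

// P(X >= x) under the hypergeometric distribution:
// N genes in total, K assigned to the path, n bound by the TF, x in both.
function hypergeomPValue(N, K, n, x) {
  let p = 0;
  for (let k = x; k <= Math.min(K, n); k++) {
    p += Math.exp(logChoose(K, k) + logChoose(N - K, n - k) - logChoose(N, n));
  }
  return p;
}

// 34 of 207 Ino4-bound genes in the path (numbers from the post);
// N = 6000 and K = 300 are invented placeholders. With these sizes the
// expected overlap is only about 10 genes, so 34 is a large excess.
const p = hypergeomPValue(6000, 300, 207, 34);
```

Swapping in the real genome and path sizes would give the actual significance; the point is only that an overlap well above the expected n*K/N is what makes the confirmation convincing.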
PLoS One
This post was originally posted by Alf on the 13th of February 2007 at:
http://hublog.hubmed.org/archives/001447.html
PLoS One's been launched in beta for a while now, and the technical problems it had seem to have been fixed. It's a great idea: a much lower barrier of entry to article acceptance, publishing articles of any length, with peer review provided by readers' comments on the article after it's published, and publication charges paid for by the authors' funding agencies.

Technically, PLoS should be built on solid foundations, as it uses the NLM Journal Publishing DTD to store articles as XML and retrieves all objects (articles, figures, etc.) using DOIs. The underlying TOPAZ software is supposed to be released as open source, though nothing has been made available yet. Hopefully this project should cross-pollinate well with OJS, which recently had its own annotation system added thanks to Geof Glass's work on Marginalia. PLoS producing only RSS feeds, and still not even getting them right, doesn't exactly inspire confidence though.

As far as published articles go, some people publish long, technical papers similar to those found in existing journals; others publish short, one-experiment papers (which will hopefully get even shorter, if methods and introductions can be referenced elsewhere).

There was one article that caught my attention because of the wording of the funding section: "The authors claim they did not receive any financial funding." - suggesting, perhaps, that the publishers weren't entirely sure about that claim. This article in particular has quite a few style errors (even one in the title), so while peer review may come afterwards (and hasn't in this case, yet, but maybe someone out there is repeating the experiment themselves), there's still a role for publishers in copy-editing articles for readability. It would be good if authors had enough control over their published papers that corrections could be made at a later date; with archiving of open access articles using LOCKSS, updated articles could feasibly be distributed to multiple archives.

It's a shame that Chris Surridge is already lamenting the lack of comments on papers, when the infrastructure isn't in place to properly handle discussions at the moment. It's not surprising that people are more willing to comment on papers within their own communities, where they can see that discussion threads are treated as important, permanent content and displayed appropriately.
data in combination with different studies at a later date can help ecologists gain better insight. For the uninitiated, i.e. yours truly, this screams for some data standards at the minimum and, in a perfect world, the development of an ecological ontology.

Currently, ecological data is spreadsheet based, i.e. it is still document centric. A number of ecologists also use packages such as R and SAS, since most ecological hypotheses are generated via statistical modeling. Most people will tell you that this is a recipe for data disaster. There is a need for quality databases and a data-centric approach. Whether for integrative analysis or synthetic analysis, having data in well-designed databases will only help ecologists in the long run. The authors spend some time talking about metadata. In a field like ecology, metadata is critical, especially in cases where studies are re-used at a later point in time in conjunction with newer studies. The authors seem to treat metadata-driven data collections as separate from vertical databases (data warehouses). That seems too simplistic a view. Combining the two paradigms is probably a more powerful approach, one that has been discussed here in the past. There is a role for core databases in the mode of GenBank, which can be combined with data on the edges. While the data on the edges might lack the structure of a comprehensive database, by building semantic intelligence and developing appropriate standards and ontologies, one can combine the knowledge in metadata-driven datasets with the core knowledge housed in structured data warehouses. Ecological projects are very diverse, crossing species, societies, data types, data volume, and data quality. All of this makes the nature of metadata rather complex, and it will require ecologists to spend a considerable amount of time developing data standards and, better still, ontologies to enable the interoperability of the datasets so that high-quality data analysis becomes possible.

The good news for the field is that there are a number of existing resources, e.g. the Knowledge Network for Biocomplexity, which seems to be a fairly modern resource for ecological data. There are attempts to provide a unified interface to many ecological data sources, but one could argue that their time is better spent enabling the creation of search engines and interfaces to any resource of ecological information, since information will be generated by a variety of sources. The dynamic nature of ecological data will be a significant challenge for data integration, especially since it involves a lot of continuous modeling and re-modeling. There needs to be a way to store and version different studies, to make sure people are not making incorrect decisions.

For someone with only a peripheral knowledge of ecology, but a good understanding of bioinformatics, the review by Jones et al. is a very useful and interesting read. Ecologists are trying to understand several critical problems facing our society and planet. How they access data, interpret it, and publish their results should be a problem with more eyeballs on it. Given the very public interest in sustainable development and the environment these days, hopefully there will be more informatics-savvy people working in the field to develop high-quality databases, data standards, and ontologies.
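To make the metadata discussion concrete, here is a sketch of the kind of record that lets a spreadsheet-era dataset be re-used years later. The field names and values are invented for illustration; a real effort would follow a community standard such as the Ecological Metadata Language rather than ad hoc fields like these:

```javascript
// A hypothetical metadata record for an ecological dataset. All fields
// are illustrative inventions, not taken from EML or any real standard.
const datasetMetadata = {
  title: "Stream invertebrate counts, 2005-2006",
  site: { name: "Example Creek", lat: 47.6, lon: -122.3 },
  period: { start: "2005-04-01", end: "2006-10-31" },
  // Explicit types and units are what make later re-analysis possible.
  variables: [
    { name: "taxon", type: "string", description: "species or genus observed" },
    { name: "count", type: "integer", units: "individuals per sample" },
    { name: "water_temp", type: "number", units: "degrees Celsius" },
  ],
  methods: "kick-net sampling, three replicates per site visit",
  version: 2, // versioning studies, as argued above, guards against stale analyses
};
```

Even a minimal record like this, attached to every dataset, would go a long way toward the interoperability the review calls for.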
Reference: Jones, M.B., Schildhauer, M.P., Reichman, O.J., and Bowers, S. The New Bioinformatics: Integrating Ecological Data from the Gene to the Biosphere. Annual Review of Ecology, Evolution, and Systematics 37: 519-544 (2006).
Compared with the progress of modern science, evolution is too slow and error prone. We are entering a period when scientists will be able to greatly surpass natural evolution.

First, genetic engineering will allow doctors to modify existing genes or replace them entirely to remove genetic defects, treat diseases, and augment natural abilities. For instance, plants and animals could be modified to yield better, healthier, and more food. What if your eyesight could be fixed without surgery? What if all your future children were guaranteed not to inherit your bad genes for poor vision or a genetic disease? What if people chose to have their vision enhanced far beyond the natural human range, such as a bird’s vision? How about people with the reflexes and speed of a cat? Now imagine if a large number of our children had perfect memory thanks to genetic modification. These children would find school very easy and would most likely be far more successful than their parents.

Second, cybernetics could artificially replace human body parts. People could replace limbs with fully functional machines seamlessly connected directly to the human brain. The human brain could be enhanced too. Imagine if people could immediately look up information just by thinking about it. For example, people would instantly know what to do in medical emergencies or how to solve complex problems that they had never encountered before, by relying on the knowledge and experience of other people. Eventually, people could become far more intelligent, knowledgeable, and capable than anybody who has ever lived. This could eventually lead to completely artificial and intelligent life forms that are not limited by evolution at all, since they would not have any organic parts. Instead, they could just upgrade themselves.

We already discussed the forced evolution of plants and animals for the sake of humanity, such as better sources of food. Now consider modifying non-intelligent and non-sentient (sentient means self-aware or self-conscious) animals into an intelligent and sentient species through the methods described above. This very possible idea is called "uplifting". Humanity could artificially uplift lowly species to the ranks of civilized and productive people, for better or worse. Imagine the philosophy, music, and literature that intelligent animals, such as cats and dolphins, would invent. I am sure they would be creative in ways that humans are not… yet.

These methods of leapfrogging evolution will ultimately create hyper-intelligent super beings that are far more capable than human beings. Hopefully these future people will care more for humanity than we do for less fortunate people. Maybe these super beings will be able to solve some of the world’s greatest problems, such as
global warming, pollution-free energy, diseases, etc.

The only problem that I see with these methods of artificial evolution is that the variety or diversity of people would lessen. The more people become alike, the less likely we are to have another Mozart or Einstein, if we start restricting variation in our children before they are even born. As a result, our children will be geniuses but not unique.

Whatever methods humanity uses to go beyond natural evolution, the purpose should always be to better humanity, our children, and ourselves.
So you’ve heard about this wondrous thing called AJAX. You’re dimly aware that it can generate interactive, user-friendly and dynamic websites without the usual cycle of reloading the page after submitting a request to the server. You know that Google Maps and Flickr are two excellent examples. You’re keen to explore the possibilities for your bioinformatics web applications. What you need is a “minimal example”. Where do you start?

That’s the situation that I was in last weekend and here’s what I did.

I’ll start by making it clear that much of what follows is lifted from the W3Schools AJAX tutorial, with minimal adaptation to make it relevant for bioinformaticians. Please go there and read their excellent work.

When I figured out how AJAX works my response was: “Oh. Is that all it is?” AJAX, you see, is nothing new. In fact, if you’re familiar with web programming and know a little about HTML, server-side scripting, javascript and XML - well, that’s all it is. It’s just combined in a clever way to produce a pleasing result.

Here’s what we’re going to do. We’re going to construct a simple form with a drop-down list of options. The options will be UIDs from the NCBI Gene database. When we select an option, our form will display the protein name associated with the UID - without the need to reload the page. It’s the AJAX equivalent of “Hello World” for bioinformatics, but it should give you an idea.

1. Getting set up

First, I went to the place where I do my web testing (e.g. /var/www/testing/) and created 3 directories: php for the PHP, js for the javascript and xml for - you guessed right. In fact no XML files were saved in this example but I like to be organised. The HTML files just go in the /testing root, right above these 3 directories.

2. The HTML form

There’s nothing special about the form. I named the file form.html and it goes like this:

<html>
<head>
<script src="js/ncbi.js"></script>
</head>
<body>
<h3>Get protein name from NCBI Gene DB ID</h3>
<form>
<b>Select a Gene ID:</b>
<select name="geneID" onchange="showName(this.value)">
<option value="none" selected="selected">-----</option>
<option value="54123">54123</option>
<option value="21354">21354</option>
<option value="11988">11988</option>
</select>
</form>
<p>
<div id="geneName"><b>Gene info will be listed here.</b></div>
</body>
</html>
To recap then:

- We select a gene UID from a drop-down list in a normal HTML form
- Javascript and PHP interact to perform an EUtils query at the NCBI and return an XML file
- The XML is parsed and appropriate values retrieved
- Using asynchronous requests to the server (that's the first 'A' in AJAX), javascript updates the page with progress and displays the result
- All without reloading the page

That's it. That's AJAX. It's a particularly stupid example - fetching a huge XML file to parse out one element, but hopefully you get the idea. You can imagine all sorts of uses for this in bioinformatics applications: fetching the most recent data rather than local storage, XML-SQL interconversions, real-time BLAST results and so on. As ever, the only limits are your creativity and requirements.

You can see it in action for a short time at this location. Feel free to grab the files and/or copy-paste from here to try it on your own server. I added a little extra to the javascript at that location to display a "spinning disk" progress indicator - see if you can figure out where the addition goes. Finally, this is all new and exciting to me, so if you spot any shocking errors, do let me know.