Bioinformatics: A Definition
What is a Genome?
A genome is the complete set of genetic information of an organism. It contains all the
instructions for creating and maintaining life. Every living organism has a genome.
The human genome consists of nuclear and mitochondrial DNA. In contrast, the genome of
a virus may be made of either DNA or RNA, depending on the virus.
The human genome contains around 20,000 protein-coding genes. Their coding sequences make up
only a small fraction of the genome, roughly 1-2%. Part of the DNA between the genes is
involved in gene regulation.
Genome Sequencing
The genome of each individual is a unique sequence of DNA. Sequencing machines read this
sequence, which can help to identify the cause of a particular disease. Some diseases are
caused by very small variations in the DNA, sometimes a change in a single base.
Sequencing the genome can help us identify which DNA changes are causing the problem.
The genome of tumour cells is altered compared with that of normal cells. By comparing the
genomes of normal and cancer cells, we can get clues about ways to treat the cancer.
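The comparison of normal and tumour DNA can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned and of equal length, and it only reports single-base substitutions; the function name and the short example sequences are hypothetical, and real variant calling works on sequencing reads with far more elaborate tools.

```python
def find_substitutions(normal, tumour):
    """Return (position, normal_base, tumour_base) for every mismatch.

    Assumes both sequences are pre-aligned and the same length.
    Positions are 1-based.
    """
    if len(normal) != len(tumour):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, n, t)
        for i, (n, t) in enumerate(zip(normal, tumour))
        if n != t
    ]


# Hypothetical toy sequences, not real patient data.
normal_seq = "ATGGCCTTAGC"
tumour_seq = "ATGACCTTGGC"
variants = find_substitutions(normal_seq, tumour_seq)
print(variants)
```

Each reported position is a candidate DNA change that could then be examined for a possible role in the disease.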
Sequencing a human genome takes about a day. However, analysing the data takes longer.
Applications of Genomics
Medical Applications
DNA and transgenes are used to create oral plant vaccines that stimulate immunity. Precision
medicine uses information about the genetic makeup of a patient to direct the type of
treatment they receive.
Biotechnology Applications
Genome sequencing is used to analyse the factors involved in the conservation of
species. For example, the genetic diversity of a population can be used to assess the health
of a species and to guide its conservation.
This helps in analysing the consequences of evolutionary processes and in identifying genetic
patterns in a specific population. Analysis of these patterns can help to devise ways of
conserving the species.
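One common way to put a number on the genetic diversity of a population is expected heterozygosity, H = 1 - sum(p_i^2), where p_i is the frequency of allele i at a locus. The sketch below is only an illustration of that formula; the function name and the sample allele lists are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity H = 1 - sum(p_i ** 2) at a single locus.

    `alleles` is a list of allele labels sampled from the population.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical samples: one population with a single allele, one with two.
uniform_pop = ["A", "A", "A", "A"]
mixed_pop = ["A", "A", "B", "B"]
print(expected_heterozygosity(uniform_pop))
print(expected_heterozygosity(mixed_pop))
```

A value near zero means that almost every individual carries the same allele, which is often read as a warning sign for a small or inbred population.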
Smith-Waterman
The Smith-Waterman algorithm was published in 1981 and is very similar to
the Needleman-Wunsch algorithm. It differs in that it is a
local sequence alignment algorithm. Instead of aligning the entire length of two protein
sequences, it finds the region of highest similarity between the two proteins. This is
potentially more biologically relevant because the ends of proteins tend to be less
highly conserved than the middle portions, leading to higher mutation, deletion, and insertion
rates at the ends of the protein. The Smith-Waterman algorithm allows us to align proteins more
accurately without having to align the ends of related proteins, which may differ greatly. It
can be implemented by changing only two things in the Needleman-Wunsch algorithm: negative
cell scores are replaced with zero, and the traceback starts at the highest-scoring cell and
stops when it reaches a cell with a score of zero.
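The local alignment described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes a simple match/mismatch score and a linear gap penalty (the default values of 3, -3, and -2 are arbitrary choices for the example) rather than a substitution matrix, and it returns only one best-scoring local alignment.

```python
def smith_waterman(a, b, match=3, mismatch=-3, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # First row and column stay at zero, as do all cells that would go negative.
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 is the key difference from Needleman-Wunsch.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("TGTTACGG", "GGTTGACTA")
print(score)
print(top)
print(bottom)
```

For protein sequences, the match/mismatch term would normally be replaced by a lookup in a substitution matrix such as PAM or BLOSUM.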
Q4:
PAM and BLOSUM matrices are both used to score alignments between protein sequences, but they
differ in how they are built and where they are applied. PAM matrices are derived from
alignments of closely related protein sequences, and matrices for larger evolutionary distances
are extrapolated from them, while BLOSUM matrices are derived directly from alignments of more
divergent protein sequences [1]. PAM matrices are based on an explicit evolutionary model, while
BLOSUM matrices are based on an implicit model of evolution. PAM matrices are based on
mutations observed throughout a global alignment, while BLOSUM matrices are based on the
frequencies of amino acid substitutions in a database of conserved, ungapped local alignments
(blocks) of protein sequences.
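To show how substitution frequencies become matrix entries, here is a simplified sketch of a log-odds score in half-bit units, the scale used for BLOSUM62. The function name, its parameters, and the frequency values are hypothetical, and the expected frequency is taken simply as the product of the two background frequencies; the published BLOSUM procedure includes further details (such as sequence clustering) that are omitted here.

```python
import math


def log_odds_score(pair_freq, freq_a, freq_b, scale=2.0):
    """Rounded log-odds score: scale * log2(observed / expected).

    pair_freq : observed frequency of the aligned amino acid pair
    freq_a, freq_b : background frequencies of the two amino acids
    scale=2.0 gives half-bit units.
    """
    expected = freq_a * freq_b
    return round(scale * math.log2(pair_freq / expected))


# Hypothetical frequencies, chosen only to make the arithmetic easy to follow.
favoured = log_odds_score(0.02, 0.05, 0.05)       # pair seen 8x more than chance
disfavoured = log_odds_score(0.00125, 0.05, 0.05)  # pair seen half as often as chance
print(favoured, disfavoured)
```

A positive score means the pair appears in alignments more often than chance would predict, and a negative score means it appears less often.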