Algorithm.
A finite sequence of well-defined instructions given to a computer to perform
a task.

Accession number.
A unique identifier, often a combination of letters and numbers, that is
permanently assigned to an entry in a database. The entry could be a DNA or
protein sequence or another type of molecule. Accession numbers can also be
assigned to experiments in databases. Accession numbers are stable through
time.

Conceptual translation.
Of a DNA/mRNA sequence into a protein sequence. This is the process of
predicting the amino acid sequence of a polypeptide from the nucleotide
sequence of its mRNA/DNA. The prediction is guided by the genetic code.
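
As an illustration of conceptual translation, here is a minimal Python sketch. It assumes the standard genetic code, reads the sequence in frame 1 from its first nucleotide, and stops at the first stop codon. The function name `conceptual_translation` and the use of "X" for unrecognised codons are choices made for this example only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def conceptual_translation(seq, to_stop=True):
    """Predict a protein sequence from a DNA or mRNA sequence (frame 1)."""
    seq = seq.upper().replace("U", "T")  # treat mRNA and DNA alike
    protein = []
    for i in range(0, len(seq) - 2, 3):
        # Codons containing ambiguous letters are reported as "X" (assumption).
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and to_stop:
            break  # stop codon reached
        protein.append(aa)
    return "".join(protein)


print(conceptual_translation("ATGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"))
```

A real gene prediction would also have to choose the correct reading frame and strand, which this sketch does not attempt.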

Homology annotation.
In bioinformatics, this term refers to the use of evolutionary conservation

as a basis for extrapolating functional characteristics from one gene or

protein to another.

Score (in BLAST).
This parameter describes how good the alignment between the query and the
subject is. It depends on the number of “good” and “bad” matches: identical or
similar residues raise the score, while mismatches and gaps lower it. The
higher the score, the better the alignment.
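
The idea of a score built from "good" and "bad" matches can be sketched in a few lines of Python. This is a simplified illustration, not BLAST's actual scoring: the reward and penalty values below are assumptions chosen for the example, gaps are charged a flat cost per position, and protein searches use a substitution matrix rather than a single match/mismatch pair.

```python
def raw_score(aligned_query, aligned_subject, match=1, mismatch=-2, gap=-3):
    """Score two already-aligned, equal-length strings; '-' marks a gap.

    The match, mismatch and gap values are illustrative assumptions.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            score += gap        # "bad": a gap in either sequence
        elif q == s:
            score += match      # "good": identical residues
        else:
            score += mismatch   # "bad": different residues
    return score


print(raw_score("ACGTAC", "AC-TTC"))
```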

Expected Value (E-value).
In sequence similarity searches, this parameter describes the number of hits
with an equal or better score that could be found by chance, given the length
of the query sequence and the size of the database. The lower the E-value, the
less likely it is that the observed alignment arose by chance, and the more
plausible it is that it reflects homology. Learn more about E-values in this
BLAST help page and in this tutorial.
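
To make the dependence on sequence length and database size concrete, the sketch below uses the relationship E = m × n × 2^(−S′), where S′ is the bit score, m the query length and n the total database length. It is a simplification: BLAST applies corrections (effective lengths) that are ignored here, and the function and parameter names are made up for this example.

```python
def expected_value(bit_score, query_length, database_length):
    """Approximate E-value from a bit score, ignoring length corrections."""
    search_space = query_length * database_length
    return search_space * 2.0 ** (-bit_score)


# Same alignment, two database sizes: a tenfold larger database gives a
# tenfold larger E-value for the same bit score.
small = expected_value(bit_score=50, query_length=250, database_length=10**8)
large = expected_value(bit_score=50, query_length=250, database_length=10**9)
print(small, large)
```

This is why the same hit can look more or less significant depending on the database searched.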

Percentage identity.
In BLAST results, this value represents the number of residues (amino acids or
nucleotides) that match exactly at the same position between the query and the
subject, expressed as a percentage of the length of the alignment.
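
A small Python sketch of this calculation follows. It assumes the two sequences are already aligned (equal length, with "-" marking gaps) and divides by the alignment length, gap columns included. The function name is hypothetical.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percentage of alignment columns where both residues are identical."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"  # a gap never counts as a match
    )
    return 100.0 * identical / len(aligned_query)


print(percent_identity("AC-T", "ACGT"))
```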

Primary database.
A resource database to which researchers can submit experimentally derived
data, often sequenced DNA or mRNA, to be archived and made available to the
wider community. Other primary databases hold three-dimensional structures of
proteins. More on databases from the European Bioinformatics Institute here.

Secondary database.
A resource where entries from primary databases are processed computationally
to derive new information from them (for example, the prediction of protein
topology). Secondary databases provide “digested” information. More on
databases from the European Bioinformatics Institute here.

Conserved domain.
Of a protein. It is a part of a protein that, by assuming a defined
three-dimensional structure, confers a given function on the protein. Proteins
can have more than one conserved domain and, at the same time, a given
conserved domain may appear in different proteins. The amino acid sequence of
a conserved domain is less likely to change (is more conserved) than that of
regions outside conserved domains; that is to say, its structure is better
maintained throughout evolution.

https://en.wikipedia.org/wiki/Protein_domain

Flat file.
A plain text file containing records with no structured interrelationship.
The records themselves may have an internal structure. Also known as a flat
file database.
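
As an illustration of records that have an internal structure but no links between them, the Python sketch below parses a small, made-up flat file. The layout is an assumption for the example: each line starts with a short field code followed by its value, and a line containing only "//" ends a record. The field codes, identifiers and the `parse_flat_file` name are hypothetical.

```python
def parse_flat_file(text):
    """Split flat-file text into a list of {field_code: value} records."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line == "//":  # record terminator (assumed convention)
            if current:
                records.append(current)
            current = {}
            continue
        code, _, value = line.partition(" ")
        current[code] = value.strip()  # a repeated code overwrites the earlier value
    if current:  # tolerate a final record without a terminator
        records.append(current)
    return records


EXAMPLE = """\
ID SEQ0001
DE hypothetical example protein
//
ID SEQ0002
DE another made-up entry
//
"""

for record in parse_flat_file(EXAMPLE):
    print(record["ID"], "-", record["DE"])
```

Each record is read on its own; nothing in the file relates one record to another, which is what distinguishes a flat file from a relational database.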
