
BIF401 MID PREPARATION BY

BADSHA ALI

Difference between local and global alignment?

Local Alignment:
 A local alignment aligns a substring of the query sequence to a substring of the target
sequence.
 It gives the highest-scoring local match between the query and target sequences.
 Examples of local alignment tools: BLAST, EMBOSS Water, LALIGN

Global Alignment:
 A global alignment contains all letters from both the query and target sequences.
 It maximizes the number of matches between the query and target sequences along the
entire length of both sequences.
 Examples of global alignment tools: EMBOSS Needle, Needleman-Wunsch Global Align
Nucleotide Sequences (Specialized BLAST)

4 steps involved in the FASTA algorithm:

 Find local regions of identity.
 Rescore the local regions using a PAM or BLOSUM matrix.
 Eliminate short diagonals below a cutoff score.
 Create a gapped alignment in a narrow segment and then perform a Smith-Waterman
alignment.
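The first and third steps above can be sketched in Python as follows. This is only a simplified illustration, not the actual FASTA implementation: the function names, the word length `k=2` and the hit-count cutoff are assumptions made for the example, and the matrix rescoring and Smith-Waterman steps are left out.

```python
from collections import defaultdict

def ktuple_diagonals(query, target, k=2):
    """Count exact k-letter word matches per diagonal (diagonal = i - j)."""
    # Index every k-letter word of the query by its start positions.
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)

    # Every word shared with the target is one hit on the diagonal i - j.
    hits = defaultdict(int)
    for j in range(len(target) - k + 1):
        for i in index.get(target[j:j + k], []):
            hits[i - j] += 1
    return dict(hits)

def best_diagonals(hits, cutoff=2):
    """Drop diagonals below the (hypothetical) cutoff, best diagonal first."""
    kept = [(diag, n) for diag, n in hits.items() if n >= cutoff]
    return sorted(kept, key=lambda item: -item[1])
```

Diagonals with many hits mark the local regions of identity that the later steps would rescore and extend.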

Systems biology and its applications:


Systems biology is the computational and mathematical modeling of complex biological
systems. It is a biology-based interdisciplinary field of study that focuses on complex
interactions within biological systems.

Applications:
 Model protein and gene interactions
 Dynamical analysis of such models
 Understand system properties
 Predict system-level behaviors

Define gene:
A gene is the basic physical and functional unit of heredity. Genes are made up of DNA. Some
genes act as instructions to make molecules called proteins.

What ORF stands for and its function:


ORF stands for Open Reading Frame. The ORF Finder (Open Reading Frame Finder) is a graphical
analysis tool that finds all open reading frames of a selectable minimum size in a user's
sequence or in a sequence already in the database. This tool identifies all open reading
frames using the standard or alternative genetic codes.
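A minimal Python sketch of the idea behind an ORF search is shown below. It is not the ORF Finder tool itself: it assumes the standard genetic code's stop codons, scans only the three forward-strand frames, and treats ATG as the only start codon. The function name and `min_len` parameter are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)

def find_orfs(dna, min_len=30):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a new reading frame
            elif start is not None and codon in STOP_CODONS:
                end = i + 3                    # include the stop codon
                if end - start >= min_len:     # apply the minimum size
                    orfs.append((start, end, frame))
                start = None
    return orfs
```

For example, `find_orfs("CCATGAAATTTTAGGG", min_len=6)` reports one ORF spanning `ATGAAATTTTAG` in the third frame.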

Types of RNA and their functions:


The three major types of RNA are mRNA, or messenger RNA, which serves as a temporary copy of
the information found in DNA; rRNA, or ribosomal RNA, which serves as a structural component
of the protein-making structures known as ribosomes; and finally tRNA, or transfer RNA, which
ferries amino acids to the ribosome to be assembled into proteins.

Categories of RNA:
RNA can be divided into two categories:
 Coding RNAs: as the name suggests, these code for proteins.
 Non-coding RNAs: these regulate or assist in the process of translation.

Why unpaired nucleotides affect the stability of RNA structure:


RNA structure forms through folding of the nucleotide chain within the RNA molecule, but
after folding some nucleotides remain unpaired and open for interaction. These unpaired
nucleotides of the secondary (2’) structure can form hydrogen bonds with other unpaired
nucleotides, producing a third level of structure called the tertiary (3’) structure. For
example, the 4 unpaired nucleotides in a hairpin loop can interact in this way.

About amino acids and proteins: which 3 codons are the stop codons?


The three stop codons have names: UAG is amber, UGA is opal (sometimes also called umber),
and UAA is ochre. Stop codons are also called "termination" or "nonsense" codons.

FASTA algorithm:
FASTA can search sequence databases and identify unknown sequences by comparing them to
the known sequence databases. This can help obtain information on the parent organism,
function and evolutionary history.

What are Substitutions & Indels?


A substitution replaces one nucleotide with another, while an indel is an insertion or
deletion of one or more nucleotides. Indels can also be contrasted with Tandem Base Mutations
(TBM), which may result from fundamentally different mechanisms. A TBM is defined as a
substitution at adjacent nucleotides (primarily substitutions at two adjacent nucleotides,
but substitutions at three adjacent nucleotides have been observed).

Why the hydroxyl group makes RNA less stable than DNA (with diagram):


Unlike DNA, RNA in biological cells is predominantly a single-stranded molecule. While DNA
contains deoxyribose, RNA contains ribose, characterised by the presence of the 2'-hydroxyl
group on the pentose ring. This hydroxyl group makes RNA less stable than DNA because it is
more susceptible to hydrolysis.

Complete the protein sequence F-x(3)-x-R-F-K-x(4-5)-D-E-R?


FXXXXRFKXXXXDER, FXXXXRFKXXXXXDER

What does the acronym Expasy stand for:


Expasy stands for Expert Protein Analysis System.
Expasy provides access to a variety of online databases and tools. Depending upon your
requirement, you can find sequence information from Expasy.

How DNA and RNA work together:


DNA (deoxyribonucleic acid) is the genetic material. It functions by storing information
regarding the sequence of amino acids in each of the body's proteins. This "list" of amino
acid sequences is needed when proteins are synthesized. Before protein can be synthesized, the
instructions in DNA must first be copied to another type of nucleic acid called messenger RNA.

Two main types of BLAST:


Nucleotides = BLASTN: Compares a nucleotide query sequence against a nucleotide database.
Proteins = BLASTP: Compares an amino acid query sequence against a protein database.

DOMAIN SHUFFLING:
Aligned portions of a sequence can be considered in varying orders, and this process is called
domain shuffling.

ADVANTAGES
 We can compare sequences of different lengths.
 Conserved domains can be determined from proteins.
 Common functional features can be identified.

Scoring matrices:
Scoring matrices are used to determine the relative score made by matching two characters in a
sequence alignment. These are usually log-odds of the likelihood of two characters being derived
from a common ancestral character. There are many flavors of scoring matrices for amino acid
sequences, nucleotide sequences, and codon sequences, and each is derived from the alignment
of "known" homologous sequences. These alignments are then used to determine the likelihood
of one character being at the same position in the sequence as another character.
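As a rough illustration of the log-odds idea, the sketch below scores one pair of characters from the frequency with which the pair is seen aligned in known homologous sequences and the background frequency of each character. The function name, the `scale` factor (2.0, i.e. half-bit units) and the numbers in the usage note are assumptions chosen for the example, not values taken from PAM or BLOSUM.

```python
import math

def log_odds(p_ab, q_a, q_b, scale=2.0):
    """Scaled log-odds score: scale * log2(p_ab / (q_a * q_b)), rounded.

    p_ab: observed frequency of a aligned with b in homologous sequences
    q_a, q_b: background frequencies of a and b (expected by chance)
    """
    return round(scale * math.log2(p_ab / (q_a * q_b)))
```

With hypothetical frequencies, `log_odds(0.02, 0.1, 0.1)` is positive because the pair is seen twice as often as chance predicts, and `log_odds(0.005, 0.1, 0.1)` is negative because it is seen half as often.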

What do ORF and FASTA stand for?


 ORF (Open Reading Frame)
 FASTA (Fast-All, i.e. fast alignment for all sequence types)

Progressive alignment?
Progressive alignment (Feng and Doolittle, 1987) is a heuristic for multiple sequence alignment
that does not optimize any obvious alignment score. The idea is to do a succession of pairwise
alignments, starting with the most similar pairs of sequences and proceeding to less similar ones.

Dynamic programming:
Dynamic programming is both a mathematical optimization method and a computer
programming method. In computer science, if a problem can be solved optimally by
breaking it into sub-problems and then recursively finding the optimal solutions to the
sub-problems, then it is said to have optimal substructure.
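Global sequence alignment is a common example of optimal substructure: the best score for two prefixes depends only on the best scores for shorter prefixes. The sketch below fills a Needleman-Wunsch score table in Python. The scoring values (match +1, mismatch -1, gap -2) and the linear gap penalty are assumptions picked for illustration, and only the score is returned, not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per letter.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell reuses the already-solved sub-problems around it.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = score[i - 1][j - 1] + pair   # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap          # gap in b
            left = score[i][j - 1] + gap        # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]
```

For instance, aligning `ACGT` with `AGT` under these values needs one gap and keeps three matches, so the score is 3 - 2 = 1.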

Identity and similarity, and the formula for identity in alignments?


 Identity is the count of exact matches between two aligned sequences.
 Gaps are excluded from the count.
 Similarity is a comparison between sequences calculated using an alignment approach.
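A commonly used form of the identity formula is: percent identity = (number of identical aligned positions / number of aligned positions compared) x 100. The Python sketch below follows the note above by leaving gap columns out of the comparison. That choice of denominator is an assumption, since some tools divide by the full alignment length or by the shorter sequence instead, and the function name is made up for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)
```

For the aligned pair `ACGT-A` and `ACCTGA`, five columns are compared and four of them match, giving 80% identity.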

Difference between TBLASTN and TBLASTX:


TBLASTN:
 Compares a protein query sequence against a nucleotide sequence database.
 The nucleotide sequences are dynamically translated into all six reading frames.
TBLASTX:
 Compares the six-frame translated proteins of a nucleotide query sequence against the
six-frame translated proteins of a nucleotide sequence database.
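The six-frame translation that both programs rely on can be sketched in Python as below. This is an illustration of the concept, not BLAST's own code: it assumes the standard genetic code, an uppercase A/C/G/T alphabet, and frame labels `+1` to `+3` and `-1` to `-3` chosen for this example. Codons with other characters are written as `X`.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translation(dna):
    """Return the three forward and three reverse-strand translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For `ATGGCCTAA`, frame `+1` reads `MA*` and frame `-1`, taken from the reverse complement `TTAGGCCAT`, reads `LGH`.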

Difference between acidic and basic amino acids?


Acidic amino acids have acidic side chains at neutral pH, while basic amino acids have basic
side chains at neutral pH. The side chain of an acidic amino acid contains a carboxylic acid
group, and the side chain of a basic amino acid contains nitrogen-containing groups. Aspartic
acid and glutamic acid are acidic amino acids. Lysine, arginine and histidine are basic amino
acids.

2’ RNA structures:
2’ RNA structures form as a result of bonding between complementary nucleotides within an
RNA molecule. However, some nucleotides are still left open for interaction. There are
structural patterns in RNA 2’ structure. These include: Helices, Loops, Bulges & Junctions.

Difference between BLAST and FASTA?


BLAST is the most widely used tool for the local alignment of nucleotide and amino acid
sequences. FASTA is a fine similarity searching tool which uses sequence patterns or words.

What are dot plots?


In bioinformatics a dot plot is a graphical method for comparing two biological sequences and
identifying regions of close similarity. It is a type of recurrence plot.
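A very small text version of a dot plot can be built in Python as sketched below. It marks every position where the two sequences carry the same letter, with no window or threshold filtering, which real dot plot programs usually add. The function name and the `*` and `.` symbols are choices made for this example.

```python
def dot_plot(seq_a, seq_b):
    """Return one text row per letter of seq_a: '*' for a match, '.' otherwise."""
    return ["".join("*" if x == y else "." for y in seq_b) for x in seq_a]

if __name__ == "__main__":
    # Comparing a sequence with itself shows the main diagonal.
    for row in dot_plot("GATTACA", "GATTACA"):
        print(row)
```

Runs of `*` along a diagonal correspond to regions of close similarity between the two sequences.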

Central dogma:
DNA has four nucleotide bases (A, C, T & G). RNA contains A, C, U & G. Proteins contain
20 different amino acids. The flow of information from DNA to RNA and then to protein is
called the central dogma. It includes transcription, translation and protein modifications.
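The two main steps of the central dogma can be sketched in Python as below. This is a simplified illustration under stated assumptions: the input is the coding (sense) strand of DNA, so transcription is modelled as replacing T with U, the standard genetic code is used, and translation starts at the first base and stops at the first stop codon. Protein modifications are not modelled.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code on the RNA alphabet; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Read codons from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `transcribe("ATGGCCTAA")` gives `AUGGCCUAA`, and translating that mRNA gives the two-residue peptide `MA`.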

Components of MS:
 Sample Injection
 Ionization Source
 Mass Analyzer
 Ion Detector
 Spectra search using computational tools

Name some methods to form a phylogenetic tree:


There are several widely used methods for estimating phylogenetic trees: Neighbor Joining,
UPGMA, Maximum Parsimony, Bayesian Inference, and Maximum Likelihood (ML).

How does dynamic programming (DP) create scoring functions which deal with
matches, mismatches and gaps? Also calculate:
