

Probability is a branch of mathematics, or rather science, which deals with the phenomenon of chance. Widely investigated by individuals of the highest caliber, this subject is not recent. It is ancient. Our forefathers, the pristine seers, knew of it in the greatest detail.

Genetics is a branch of biology that deals with the phenomenon of continuity of life: concepts of hereditary transmission and variation of inherited characteristics among similar or related organisms.


Certain definitions must be understood before proceeding deeper into the areas of gene mechanics.

1. Gene: The basic biological unit of heredity. A segment of deoxyribonucleic acid (DNA) needed to contribute to a function.

2. Chromosome: A visible carrier of the genetic information.

3. DNA: Deoxyribo-Nucleic Acid. One of two types of molecules that encode genetic information.

4. RNA: Ribo-Nucleic Acid. The other type of molecule that encodes genetic information.

5. Genotype: The genetic constitution (the genome) of a cell, an individual, or an organism. The genotype of a person is his or her genetic makeup. It can pertain to all genes or to a specific gene.

6. Phenotype: The appearance of an individual, which results from the interaction of the person’s genetic makeup and his or her environment.

7. Heterozygous: Possessing two different forms of a particular gene, one inherited from each parent. A person who is heterozygous is called a heterozygote or a gene carrier.

8. Monohybrid Cross: A cross between two individuals identically heterozygous at one gene pair, for example Aa x Aa.

9. Dihybrid Cross: Hybridization using two traits with two alleles each.


Heredity is the phenomenon of genetic transmission of characters from parent to offspring.

During breeding it is observed that genetic characters occur in a predictable fashion. For example, two pea plants (one having green seeds and the other yellow) are cross-pollinated. In the succeeding generations they are self-pollinated. This results in a peculiar 3:1 ratio. The reason turns out to be that the system follows combinatorics: the various combinations of the alleles generate the 3:1 ratio.

Further clarification follows.

The tree above demonstrates the mathematical nature of heredity.

Another constraint that exists here is the dominance of the allele. It so happens that there are two types of alleles: dominant and recessive. If a dominant allele and a recessive allele exist together, the individual shows traits of the dominant allele since the dominant allele masks the presence of the recessive allele.

Two principles of inheritance become clear from the above discussion.

1. The principle of segregation

2. The principle of independent assortment

According to the principle of segregation, for any particular trait, the pair of alleles of each parent separate and only one allele passes from each parent on to an offspring. Which allele in a parent's pair of alleles is inherited is a matter of chance. It is now known that this segregation of alleles occurs during the process of sex cell formation.

According to the principle of independent assortment, different pairs of alleles are passed to offspring independently of each other. The result is that new combinations of genes present in neither parent are possible. For example, a pea plant's inheritance of the ability to produce purple flowers instead of white ones does not make it more likely that it will also inherit the ability to produce yellow pea seeds in contrast to green ones. Likewise, the principle of independent assortment explains why the human inheritance of a particular eye color does not increase or decrease the likelihood of having 6 fingers on each hand. Today, it is known that this is due to the fact that the genes for independently assorted traits are located on different chromosomes.


One of the easiest ways to calculate the mathematical probability of inheriting a specific trait was invented by an early 20th century English geneticist named Reginald Punnett. His technique employs what we now call a Punnett square. This is a simple graphical way of discovering all of the potential combinations of genotypes that can occur in children, given the genotypes of their parents. It also shows us the odds of each of the offspring genotypes occurring. The procedure of constructing a Punnett square is as follows:

The genotype of one parent is written along the topmost row. Similarly, the genotype of the other parent is written along the leftmost column. For example, if parent pea plant genotypes were YY and GG respectively, the setup would be:

Filling up the above table by combining the alleles gives us the predicted frequency of all of the potential genotypes among the offspring each time reproduction occurs.

In this example, 100% of the offspring will likely be heterozygous (YG). Since the Y (yellow) allele is dominant over the G (green) allele for pea plants, 100% of the YG offspring will have a yellow phenotype.
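The construction procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `punnett_square` and `format_square` and the `allele_order` parameter are hypothetical, introduced here so that each box is written dominant allele first (YG rather than GY), as in the text.

```python
def punnett_square(parent1, parent2, allele_order="YG"):
    """Return the grid of offspring genotypes for a single-gene cross.

    parent1 supplies the alleles along the topmost row, parent2 the alleles
    down the leftmost column. Genotypes are two-letter strings such as "YG".
    allele_order is an illustrative assumption: it fixes how the two alleles
    in a box are written, so "GY" and "YG" count as the same genotype.
    """
    def combine(a, b):
        return "".join(sorted(a + b, key=allele_order.index))

    return [[combine(top, left) for top in parent1] for left in parent2]


def format_square(parent1, parent2, allele_order="YG"):
    """Render the square as plain text with the parental alleles as headers."""
    grid = punnett_square(parent1, parent2, allele_order)
    lines = ["    " + "   ".join(parent1)]
    for left, row in zip(parent2, grid):
        lines.append(left + "   " + "  ".join(row))
    return "\n".join(lines)


# Cross of a YY parent (top row) with a GG parent (left column).
print(format_square("YY", "GG"))
# Prints:
#     Y   Y
# G   YG  YG
# G   YG  YG
```

Every box holds YG, which matches the statement that all offspring of this cross are heterozygous.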

In another example (shown below), if the parent plants both have heterozygous (YG) genotypes, there will be 25% YY, 50% YG, and 25% GG offspring on average. These percentages are determined based on the fact that each of the 4 offspring boxes in a Punnett square is 25% (1 out of 4). As to phenotypes, 75% will be Y and only 25% will be G. These will be the odds every time a new offspring is conceived by parents with YG genotypes.

P(YY) = ¼, P(YG) = ½, P(GG) = ¼
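The percentages for the YG x YG cross can be checked by counting the four equally likely boxes. The sketch below is a minimal illustration, not a standard library routine; the helper names `genotype_probabilities` and `phenotype_probabilities` are assumptions made for this example, and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

DOMINANT, RECESSIVE = "Y", "G"  # allele symbols as used in the text


def genotype_probabilities(parent1, parent2):
    """Probability of each offspring genotype, one box per allele pairing."""
    order = DOMINANT + RECESSIVE
    counts = Counter(
        "".join(sorted(a + b, key=order.index))
        for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())  # 4 boxes for a single-gene cross
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def phenotype_probabilities(parent1, parent2):
    """Collapse genotypes to phenotypes: one dominant allele is enough."""
    result = Counter()
    for genotype, p in genotype_probabilities(parent1, parent2).items():
        phenotype = "yellow" if DOMINANT in genotype else "green"
        result[phenotype] += p
    return dict(result)


print(genotype_probabilities("YG", "YG"))   # YY: 1/4, YG: 1/2, GG: 1/4
print(phenotype_probabilities("YG", "YG"))  # yellow: 3/4, green: 1/4
```

The 3:1 phenotypic ratio follows directly: three of the four boxes contain at least one dominant Y allele.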

An offspring's genotype is the result of the combination of genes in the sex cells or gametes (sperm and ova) that came together in its conception. One sex cell came from each parent. Sex cells normally have only one copy of the gene for each trait (e.g., one copy of the Y or G form of the gene in the example above). Each of the two Punnett square boxes in which the parent genes for a trait are placed (across the top or on the left side) actually represents one of the two possible genotypes for a parent sex cell. Which of the two parental copies of a gene is inherited depends on which sex cell is inherited; it is a matter of chance. Placing each of the two copies in its own box has the effect of giving it a 50% chance of being inherited.

One more example: Suppose that a husband and wife are both carriers of a genetically inherited disease, cystic fibrosis. Let us define ‘A’ as the dominant normal allele and ‘a’ as the abnormal recessive allele responsible for causing the disease.

P(AA) = ¼, P(Aa) = ½, P(aa) = ¼

Once again we get a binomial probability distribution with P(AA) = 25%, P(Aa) = 50%, and P(aa) = 25%.

If both parents are carriers of the recessive allele for a disorder, each of their children will face the following odds of inheriting it:

25% chance of having the recessive disorder

50% chance of being a healthy carrier

25% chance of being healthy and not having the recessive allele at all

Or, if one parent has a recessive disorder and the other is heterozygous:

If only one parent has a single copy of a dominant allele for a dominant disorder, their children will have a 50% chance of inheriting the disorder and 50% chance of being entirely normal.
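Because the same odds apply each time a child is conceived, the number of affected children in a family can be worked out with the binomial formula. The sketch below assumes that births are independent of one another; the function name `prob_k_affected` and its parameters are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import comb


def prob_k_affected(n_children, k, p_affected=Fraction(1, 4)):
    """Probability that exactly k of n children inherit the disorder.

    Assumes each child independently has probability p_affected, e.g. 1/4
    when both parents are carriers of a recessive disorder, or 1/2 when one
    parent carries a single copy of a dominant disorder allele.
    """
    p = Fraction(p_affected)
    return comb(n_children, k) * p**k * (1 - p) ** (n_children - k)


# Two carrier parents with three children.
for k in range(4):
    print(k, prob_k_affected(3, k))
# 0 27/64
# 1 27/64
# 2 9/64
# 3 1/64
```

Even with a 25% risk per child, a family of three has a 27/64 chance of having no affected child at all, which is one reason pedigrees alone can hide carrier status.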

Punnett squares are standard tools used by genetic counselors. Theoretically, the likelihood of inheriting many traits, including useful ones, can be predicted using them. It is also possible to construct squares for more than one trait at a time.

However, some traits are not inherited with the simple mathematical probability suggested here. These are biological exceptions.

The above examples dealt with monohybrid crosses. It is time to take an example of a dihybrid cross.

In the pea plant, two characteristics of the peas, shape and color, will be used to demonstrate a dihybrid cross in a Punnett square. R is the dominant allele for round shape, and lower-case r stands for the recessive wrinkled shape. Y stands for the dominant yellow color, and lower-case y stands for the recessive green color. Using a Punnett square (parents RrYy x RrYy, each producing the gametes RY, Ry, rY, and ry):

The result of this cross is a 9:3:3:1 phenotypic ratio, as shown by the colors, where yellow represents a round yellow phenotype (both dominant alleles), green represents a round green phenotype, red represents a wrinkled yellow phenotype, and blue represents a wrinkled green phenotype (both recessive alleles).

For the figure see below.


























Punnett Square for Dihybrid Cross
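The 9:3:3:1 ratio can also be reproduced by enumerating the 16 boxes of the RrYy x RrYy square. The following is a minimal sketch; the helper names `gametes` and `phenotype` are assumptions for this example, and the code relies on independent assortment of the two genes, as the text does.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes of a two-gene genotype such as "RrYy": one allele per gene."""
    shape_alleles, color_alleles = genotype[:2], genotype[2:]
    return [s + c for s, c in product(shape_alleles, color_alleles)]


def phenotype(gamete1, gamete2):
    """Phenotype of the offspring formed by two gametes (dominance applies)."""
    shape = "round" if "R" in gamete1[0] + gamete2[0] else "wrinkled"
    color = "yellow" if "Y" in gamete1[1] + gamete2[1] else "green"
    return f"{shape} {color}"


print(gametes("RrYy"))  # ['RY', 'Ry', 'rY', 'ry']

# Each of the 4 x 4 = 16 gamete pairings is one box of the square.
counts = Counter(
    phenotype(g1, g2) for g1, g2 in product(gametes("RrYy"), repeat=2)
)
for name, n in counts.most_common():
    print(name, n)
```

The counts come out as 9 round yellow, 3 round green, 3 wrinkled yellow, and 1 wrinkled green, matching the ratio stated above.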

Situations where Punnett squares do not apply

The phenotypic ratios of 3:1 and 9:3:3:1 are theoretical predictions based on the assumptions of segregation and independent assortment of alleles. Deviations from expected ratios can occur if any of the following conditions exists:

1. The alleles in question are on the same chromosome and linked.

2. One parent lacks a copy of the gene, e.g. human males have only one X chromosome, from their mother, so only the maternal alleles have an effect on the organism.

3. The survival rate of different genotypes is not the same, e.g. one combination of alleles may be incompatible with life so that the affected offspring dies in utero.

4. Alleles may show incomplete dominance or co-dominance.

5. There are genetic interactions (epistasis) between alleles of different genes.

6. The trait is inherited on genetic material from only one parent, e.g. mitochondrial DNA is only inherited from the mother.

7. The alleles are imprinted.


The genetic code is the set of rules by which information encoded in genetic material (DNA or RNA sequences) is translated into proteins (amino acid sequences) by living cells. The code defines a mapping between tri-nucleotide sequences, called codons, and amino acids. A triplet codon in a nucleic acid sequence usually specifies a single amino acid (though insertion of two amino acids at one codon can occur unambiguously in different places in the same protein). Because the vast majority of genes are encoded with exactly the same code, this particular code is often referred to as the canonical or standard genetic code, or simply the genetic code, though in fact there are many variant codes. Thus the canonical genetic code is not universal. For example, in humans, protein synthesis in mitochondria relies on a genetic code that varies from the canonical code.

It is important to know that not all genetic information is stored using the genetic code. All organisms' DNA contains regulatory sequences, intergenic segments, and chromosomal structural areas, which can contribute greatly to phenotype but operate using distinct sets of rules that may or may not be as straightforward as the codon-to-amino acid paradigm that usually underlies the genetic code.

This code uses four letters: A (Adenine), G (Guanine), C (Cytosine), and T (Thymine) or U (Uracil). DNA uses Thymine (T) whereas RNA uses Uracil (U). These letters stand for nitrogenous bases, i.e. they are chemicals. They occur in the genetic code as triplets, e.g. UUU, GAT, ACA.

What DNA Does

DNA carries all of the information for your physical characteristics, which are essentially determined by proteins. So, DNA contains the instructions for making a protein. In DNA, each protein is encoded by a gene (a specific sequence of DNA nucleotides that specifies how a single protein is to be made). Specifically, the order of nucleotides within a gene specifies the order and types of amino acids that must be put together to make a protein.

DNA contains the information to make proteins, which carry out all the functions and characteristics of living organisms.

A protein is made of a long chain of chemicals called amino acids. Proteins have many functions:

1. Enzymes that carry out chemical reactions (such as digestive enzymes)

2. Structural proteins that are building materials (such as collagen and nail keratin)

3. Transport proteins that carry substances (such as oxygen-carrying hemoglobin in blood)

4. Contraction proteins that cause muscles to contract (such as actin and myosin)

5. Storage proteins that hold on to substances (such as albumin in egg whites and iron-storing ferritin in your spleen)

6. Hormones - chemical messengers between cells (including insulin, estrogen, testosterone, cortisol, et cetera)

7. Protective proteins - antibodies of the immune system, clotting proteins in blood

8. Toxins - poisonous substances, (such as bee venom and snake venom)

The particular sequence of amino acids in the chain is what makes one protein different from another. This sequence is encoded in the DNA where one gene encodes for one protein.

The genetic code consists of 3-base "words" or codons that specify particular amino acids. The order of the codons designates the order of the amino acids in the protein.

How does DNA encode the information for a protein? There are only four DNA bases, but there are 20 amino acids that can be used for proteins. So, groups of three nucleotides form a word (codon) that specifies which of the 20 amino acids goes into the protein. A 3-base codon yields 64 possible patterns (4*4*4), which is more than enough to specify 20 amino acids. Because there are 64 possible codons and only 20 amino acids, there is some redundancy in the genetic code: several codons can specify the same amino acid. Also, the order of codons in the gene specifies the order of amino acids in the protein. It may require anywhere from 100 to 1,000 codons (300 to 3,000 nucleotides) to specify a given protein. Each gene also has codons to designate the beginning (start codon) and end (stop codon) of the gene.
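The counting argument for why codons are three bases long can be verified by brute-force enumeration. This is a small sketch; the names `BASES`, `AMINO_ACIDS` and `words` are illustrative choices, and the RNA alphabet is used.

```python
from itertools import product

BASES = "ACGU"      # RNA alphabet; DNA would use T in place of U
AMINO_ACIDS = 20    # number of amino acids the code has to distinguish


def words(length):
    """All base sequences of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


for length in (1, 2, 3):
    n = len(words(length))
    verdict = "enough" if n >= AMINO_ACIDS else "too few"
    print(f"{length}-base words: {n} ({verdict})")
# 1-base words: 4 (too few)
# 2-base words: 16 (too few)
# 3-base words: 64 (enough)
```

Two-base words give only 16 patterns, which falls short of 20, so three is the smallest word length that works, at the cost of leaving 64 patterns for 20 amino acids plus the stop signal.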

The first step of building a protein is transcription. It involves a series of chemical reactions. The RNA synthesized from DNA plays a major role in this process.

The next step is translation. The whole process works through codons and anti-codons, which pair according to the concept of complementary base pairs. The various steps in constructing a protein are as follows.


1. A ribosome binds to mRNA with the AUG codon in the P-site and the UUU codon in the A-site.

2. An amino acyl-tRNA (anti-codon = UAC) with an attached methionine comes into the P-site of the ribosome

3. An amino acyl-tRNA (anti-codon = AAA) with an attached phenylalanine comes into the A-site of the ribosome

4. A chemical bond forms between the methionine and phenylalanine (in a protein, this covalent bond is called a peptide bond).

5. The methionine-specific tRNA leaves the P-site and goes off to gather another methionine

6. The ribosome shifts so that the P-site now contains the UUU codon with the attached phenyl-alanine tRNA and the next codon (ACA) now occupies the A-site.

7. An amino acyl-tRNA (anti-codon = UGU) with an attached threonine comes into the A-site of the ribosome.

8. A peptide bond forms between the phenylalanine and the threonine.

9. The phenylalanine-specific tRNA leaves the P-site and goes off to find another phenylalanine.

10. The ribosome shifts down one codon so that the stop sequence is now in the A-site. Upon encountering the stop sequence:

10.1 The ribosome detaches from the mRNA and splits into its two parts

10.2 The threonine-specific tRNA releases its threonine and leaves

10.3 The new protein floats away
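The steps above can be mimicked with a toy translation routine. It is a simplified sketch, not a model of the real ribosome: the codon table is deliberately partial (only the three codons from the walkthrough), the choice of UAG as the stop codon in the example message is an assumption, since the text does not say which stop codon is used, and anti-codons are written as position-by-position complements, following the convention of the steps above.

```python
# Complementary base pairs in RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Partial codon table: only the codons that appear in the walkthrough.
CODON_TABLE = {"AUG": "methionine", "UUU": "phenylalanine", "ACA": "threonine"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def anticodon(codon):
    """Anti-codon written as the base-by-base complement of the codon."""
    return "".join(COMPLEMENT[base] for base in codon)


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # the ribosome detaches and the chain is released
        chain.append(CODON_TABLE[codon])
    return chain


message = "AUGUUUACAUAG"  # AUG UUU ACA followed by an assumed stop codon
print([anticodon(message[i:i + 3]) for i in range(0, 9, 3)])
# ['UAC', 'AAA', 'UGU']
print(translate(message))
# ['methionine', 'phenylalanine', 'threonine']
```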

Several ribosomes can attach to a molecule of mRNA one after another and begin making proteins. So several proteins can be made from one mRNA. In fact, in E. coli bacteria, translation of the mRNA begins even before transcription is finished.

The figure for synthesis of protein is on the next page.



Two mathematical models are:

1. Deterministic

2. Stochastic


The deterministic method is based on the approximation of an infinitely large population size. Here fluctuations of gene frequencies can be neglected, and population dynamics can be described in terms of mean gene frequencies.


Stochastic models describe probabilistic processes in finite-size samples.

Owing to the inherent complexity of this area of mathematical biology, it is not dealt with here.


Influence on Computer Science

The science of genetics is having a profound influence on computer science. It has given rise to a class of algorithms known as evolutionary algorithms. These algorithms mimic chromosomes and various biological processes in order to solve a computer science problem.

Genetic algorithms have been developed that provide a paradigm shift in the field of computing. A specific style of programming called genetic programming has evolved from genetic algorithms. These algorithms come under the field of Artificial Intelligence and can also be said to be a form of biomimicry.
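To make the idea concrete, here is a minimal sketch of a genetic algorithm. It is one simple variant among many, not a reference implementation: the toy problem (maximizing the number of 1s in a bit string), the parameter values, and the function names `fitness`, `crossover`, `mutate` and `evolve` are all assumptions chosen for illustration.

```python
import random


def fitness(chromosome):
    """Toy objective: the number of 1 bits in the chromosome."""
    return sum(chromosome)


def crossover(mother, father, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(mother))
    return mother[:point] + father[point:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with a small probability."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def evolve(length=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = [population[0]]             # elitism: best survives unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rate, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print("best fitness per generation:", history)
```

Because the best individual is always copied into the next generation, the recorded best fitness can never decrease; selection, crossover and mutation play the roles of their biological namesakes.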

The Human Genome Project

Genome and its importance

A genome is the entire DNA in an organism, including its genes. Genes carry information for making all the proteins required by all organisms. These proteins determine, among other things, how the organism looks, how well its body metabolizes food or fights infection, and sometimes even how it behaves.

DNA is made up of four similar chemicals (called bases and abbreviated A, T, C, and G) that are repeated millions or billions of times throughout a genome. The human genome, for example, has 3 billion pairs of bases.
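As a rough back-of-the-envelope illustration of how much information this represents, one can count the bits needed to write the sequence down. The calculation below assumes an idealized encoding of two bits per base and counts each base pair once (one strand determines the other); the variable names are illustrative only.

```python
from math import log2

BASES = "ATCG"
bits_per_base = log2(len(BASES))   # 4 symbols need 2 bits each
base_pairs = 3_000_000_000         # human genome size quoted in the text

total_bits = base_pairs * bits_per_base
megabytes = total_bits / 8 / 1_000_000

print(bits_per_base)   # 2.0
print(megabytes)       # 750.0
```

Under these assumptions the raw sequence fits in about 750 megabytes, roughly the capacity of a compact disc.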

The particular order of As, Ts, Cs, and Gs is extremely important. The order underlies all of life's diversity, even dictating whether an organism is human or another species such as yeast, rice, or fruit fly, all of which have their own genomes and are themselves the focus of genome projects. Because all organisms are related through similarities in DNA sequences, insights gained from nonhuman genomes often lead to new knowledge about human biology.

Project goals were to

identify all the approximately 20,000-25,000 genes in human DNA,

determine the sequences of the 3 billion chemical base pairs that make up human DNA,

store this information in databases,

improve tools for data analysis,

transfer related technologies to the private sector, and

address the ethical, legal, and social issues (ELSI) that may arise from the project.

The project was completed in the year 2003. It had lasted for 13 years.

There is a great future for the field of genetics. Israeli scientists have already made the world’s first DNA computer, a trillion of which would fit inside a drop of water. Such a computer could enter the human body and diagnose diseases from inside, a vision the scientists called a “Doctor in a cell”. The computer, a liquid solution of DNA and enzymes, was programmed to detect the kind of RNA (a DNA cousin) that would be present if particular genes associated with a disease were active.

On the mathematical side, DNA has been used by scientists to solve mathematical problems. DNA’s role is to store and process information, so strands of DNA would be like a gigantic microprocessor. This path has led to the creation of a field known as DNA computing. It is a form of computing that involves DNA, biochemistry and molecular biology instead of traditional silicon-based computer technologies. It is fundamentally similar to parallel computing.

The whole field of genetics is ever growing. Moreover, it is the basis of the continuity of life, and of life itself. It is amazing to think that our physical body is governed, ultimately, by program code!