
Dr. Anjil K Srivastava Department of Biotechnology NIT, Durgapur - 9

Biostatistics

Biostatistics has been defined as "the application of statistical methods to biological sciences". Biostatistics developed during the period of Sir Francis Galton (1822-1911).

He applied statistical methods to the analysis of biological variation, correlation and regression. Karl Pearson (1857-1936), regarded as the father of modern statistics, was motivated by the researches of Sir Francis Galton. For measuring correlation, Karl Pearson's method, popularly known as Pearson's coefficient of correlation, is the most widely used in practice.

Central tendency

Generally, in any distribution, the values of the variable tend to congregate around a central value of the distribution. This tendency of the distribution is known as central tendency, and the values that describe it are called measures of central tendency. There are usually five basic measures of central tendency: arithmetic mean, median, mode, geometric mean and harmonic mean.

Arithmetic Mean

The most familiar and widely used measure of central tendency is the arithmetic mean. It represents the entire data by one value, which is obtained by adding together all the values and dividing this total by the number of observations.

The sample mean is the average of a set of data and is computed as the sum of all the observed outcomes from the sample divided by the total number of events. We use $\bar{x}$ as the symbol for the sample mean.

Arithmetic Mean for a series of individual observations

For observations $x_1, x_2, x_3, \ldots, x_n$:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x}{n}$$

Example

12, 15, 11, 11, 7, 13

First, find the sum of the data: 12 + 15 + 11 + 11 + 7 + 13 = 69
Then divide by the number of data: 69 / 6 = 11.5
The mean is 11.5.

Example

An electronics store sells CD players at the following prices: Rs. 3500, Rs. 2750, Rs. 5000, Rs. 3250, Rs. 1000, Rs. 3750 and Rs. 3000. What is the mean price?

3500 + 2750 + 5000 + 3250 + 1000 + 3750 + 3000 = 22250
22250 / 7 ≈ 3178.57
The mean or average price of a CD player is about Rs. 3178.57.
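The two worked examples above can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the slides.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty series is undefined")
    return sum(values) / len(values)


# Data from the two examples in the text.
observations = [12, 15, 11, 11, 7, 13]
prices = [3500, 2750, 5000, 3250, 1000, 3750, 3000]

mean_observations = arithmetic_mean(observations)   # 69 / 6
mean_price = round(arithmetic_mean(prices), 2)      # 22250 / 7, to two decimals

print(mean_observations)
print(mean_price)
```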

Arithmetic Mean for discrete series

$$\bar{x} = \frac{\sum fx}{\sum f}$$

where
$\bar{x}$ = arithmetic mean
$\sum f$ = sum of the frequencies (= n)
$\sum fx$ = sum of the products of the values of the variable and their corresponding frequencies

Example

The data recorded on the number of chlorophyll deficient plants in a lentil population are given below. Calculate the mean.

Number of chlorophyll deficient plants    Number of plants
0                                         34
1                                         14
2                                         20
3                                         24
4                                         25
5                                         33

Number of chlorophyll deficient plants (x)    Number of plants (f)    fx
0                                             34                      0
1                                             14                      14
2                                             20                      40
3                                             24                      72
4                                             25                      100
5                                             33                      165
Total                                         ∑f = 150                ∑fx = 391

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{391}{150} = 2.61$$
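The discrete-series calculation can be sketched in Python as follows. The helper name `frequency_mean` is an illustrative assumption; the data are the lentil counts from the example.

```python
def frequency_mean(values, frequencies):
    """Mean of a discrete series: sum(f * x) divided by sum(f)."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    total_f = sum(frequencies)
    if total_f == 0:
        raise ValueError("the total frequency must be positive")
    total_fx = sum(x * f for x, f in zip(values, frequencies))
    return total_fx / total_f


deficient_plants = [0, 1, 2, 3, 4, 5]        # x
number_of_plants = [34, 14, 20, 24, 25, 33]  # f

mean_deficient = frequency_mean(deficient_plants, number_of_plants)
print(round(mean_deficient, 2))
```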

Merits of Mean

It is easy to understand and easy to calculate. It is rigidly defined. It is based on all the observations. It is less affected by sampling fluctuations than the other measures. It provides a good basis for comparison. It is amenable to further mathematical treatment.

Demerits of Mean

The mean is unduly affected by extreme items. It cannot be accurately determined if even one of the values is not known.

Median

A median is the middle value of the observations, or the value which divides a distribution so that an equal number of items occur on either side of it.

Example

12, 15, 11, 11, 7, 13

First, arrange the data in numerical order: 7, 11, 11, 12, 13, 15
Then find the number in the middle or the average of the two numbers in the middle: 11 + 12 = 23, 23 / 2 = 11.5
The median is 11.5.

Median in a series of individual observations

Arrange the data in ascending or descending order. The median is located by finding the size of the $\frac{n+1}{2}$th item:

$$M = \text{size of the } \left(\frac{n+1}{2}\right)\text{th item}$$

where M = median and n = number of observations.

Example

Find the median from the data recorded on the number of clusters per plant in a pulse crop. Nine values were recorded; arranged in ascending order they are:

Sl. No.                            1   2   3   4   5   6   7   8   9
Data arranged in ascending order   10  10  11  12  15  17  17  18  19

$$M = \text{size of the } \left(\frac{n+1}{2}\right)\text{th item} = \text{size of the } \left(\frac{9+1}{2}\right)\text{th item}$$

Median = size of the 5th item = 15
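A short Python sketch of the median rule, covering both the odd case (the (n+1)/2th item) and the even case (average of the two middle items). The function name `median` is illustrative.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty series is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1) / 2-th item, which is index mid when counting from 0.
        return ordered[mid]
    # Even n: average of the two middle items.
    return (ordered[mid - 1] + ordered[mid]) / 2


clusters = [10, 10, 11, 12, 15, 17, 17, 18, 19]   # pulse crop example, n = 9
observations = [12, 15, 11, 11, 7, 13]            # earlier example, n = 6

print(median(clusters))
print(median(observations))
```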

Merits of Median

It is easy to define and easy to understand. The median is not affected by the size of extreme items. It is also recommended for unequal class distributions. The value of the median can be determined graphically, whereas the value of the mean cannot be ascertained graphically.

Demerits of Median

It is not based on all observations, since it is a positional average. It may be unsuitable when the large and small items themselves are of interest. The median is affected more by sampling fluctuations than the mean. If the number of observations is even, we cannot locate a single middle value; in this case the mean of the two middle values is taken as the estimate of the median.

Mode

The mode is another measure of central tendency which is conceptually very useful. The mode is the most typical value of a distribution because it is repeated the highest number of times in the series.

Definition

"The most commonly occurring value". According to Croxton and Cowden, "the mode of a distribution is the value at the point around which the items tend to be most heavily concentrated". A set of data may have a single mode, in which case it is said to be "unimodal". When concentration of data occurs at two or more points, the series is called bimodal or multimodal.

Example

12, 15, 11, 11, 7, 13

The mode is 11.

Sometimes a set of data will have more than one mode. For example, in the following set both the numbers 5 and 7 appear twice:

5, 7, 6, 7, 5, 2, 8, 9, 4

5 and 7 are both the mode, and this set is said to be bimodal.

Sometimes there is no mode in a set of data:

11, 3, 12, 1, 6, 8, 7, 2

All the numbers in this set occur only once, therefore there is no mode in this set.

Example: Find the Mean, Median and Mode of Ungrouped Data

The weekly pocket money for 9 first year pupils was found to be:

3, 12, 1, 4, 4, 8, 5, 2, 6

Mean = 5
Median = 4
Mode = 4
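The three measures for the pocket money data, and the unimodal, bimodal and no-mode cases above, can be reproduced with Python's standard library. This is a sketch: the helper `modes` is a name introduced here, and the order in which the values are listed is not significant.

```python
import statistics
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


pocket_money = [3, 12, 1, 4, 4, 8, 5, 2, 6]
bimodal_set = [5, 7, 6, 7, 5, 2, 8, 9, 4]
no_mode_set = [11, 3, 12, 1, 6, 8, 7, 2]

print(statistics.mean(pocket_money), statistics.median(pocket_money), modes(pocket_money))
print(modes(bimodal_set))
print(modes(no_mode_set))
```

Note that `statistics.multimode` would return every value for the last set, which is why the sketch treats "nothing repeats" as a separate case.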

Mode of Grouped Data

$$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$$

where
$L_1$ = lower boundary of the modal class
$\Delta_1$ = difference of frequency between the modal class and the class before it
$\Delta_2$ = difference of frequency between the modal class and the class after it
$h$ = class interval

Example: Calculate the mode

Number of grains/panicle    Number of plants
100-110                     11
110-130                     40
130-140                     27
140-160                     34
160-170                     12
170-180                     6

The same data regrouped into equal class intervals of 10:

Number of grains/panicle    Number of plants
100-110                     11
110-120                     20
120-130                     20
130-140                     27
140-150                     17
150-160                     17
160-170                     12
170-180                     6

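The mode lies in the class 130-140.

$$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$$

$L_1$ = 130, $\Delta_1$ = 27 - 20 = 7, $\Delta_2$ = 27 - 17 = 10, $h$ = 10

$$\text{Mode} = 130 + \frac{7}{7 + 10} \times 10 = 130 + \frac{70}{17} = 130 + 4.12 = 134.12$$

Mode = 134.12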
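The grouped-mode calculation can be sketched as below. The function `grouped_mode` and its argument layout are assumptions made for illustration; it expects equal class intervals, as in the regrouped table, and treats a missing neighbouring class as having frequency 0.

```python
def grouped_mode(lower_bounds, frequencies, width):
    """Mode of grouped data: L1 + d1 / (d1 + d2) * h for the modal class."""
    if len(lower_bounds) != len(frequencies) or not frequencies:
        raise ValueError("need one frequency per class")
    # Index of the modal class (first class with the highest frequency).
    k = max(range(len(frequencies)), key=lambda i: frequencies[i])
    f_before = frequencies[k - 1] if k > 0 else 0
    f_after = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    d1 = frequencies[k] - f_before
    d2 = frequencies[k] - f_after
    if d1 + d2 == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower_bounds[k] + d1 / (d1 + d2) * width


lower_bounds = [100, 110, 120, 130, 140, 150, 160, 170]
plants = [11, 20, 20, 27, 17, 17, 12, 6]

mode_grains = grouped_mode(lower_bounds, plants, width=10)
print(round(mode_grains, 2))
```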

Merits of mode

The mode is easy to calculate and can be determined by mere observation. It is simple and precise. It is not unduly affected by extreme items. It is the point where the frequencies are most concentrated.

Demerits of mode

The mode is not based on all the observations. It is not a rigidly defined measure. The value of the mode cannot be determined in a bimodal distribution. Sometimes the exact value of the modal class can't be known by inspection of the data. Therefore it is necessary to prepare a grouping table and an analysis table to find out the modal class.

Standard Deviation

The standard deviation formula is very simple: it is the square root of the variance. It is the most commonly used measure of spread. It was first introduced by Karl Pearson in 1893. The problem of algebraic signs in the mean deviation is overcome by taking the square of the deviations, thereby making them all positive.

$$\text{Standard Deviation } (s) = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where $\bar{x}$ = arithmetic mean and n = number of observations.

Variance

Variance is also called mean square deviation. The term "variance" is used to describe the square of the standard deviation. The term was first coined by R. A. Fisher in 1918. It helps us in isolating the effects of various factors.

The variance is defined as the mean of the squares of the deviations.

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

where $\bar{x}$ = arithmetic mean and n = number of observations.
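A minimal Python sketch of the sample variance and standard deviation with the n - 1 divisor used above. The function names are illustrative, and the data set is borrowed from the earlier mean example rather than from this slide.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))


observations = [12, 15, 11, 11, 7, 13]   # mean is 11.5

variance = sample_variance(observations)  # squared deviations sum to 35.5; 35.5 / 5
sd = sample_sd(observations)
print(round(variance, 2), round(sd, 2))
```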

Probability

In the nineteenth century, Pierre Simon de Laplace compiled the first general theory of probability. R. A. Fisher and von Mises introduced the empirical approach to probability. The modern theory of probability was developed by Chebyshev, A. Markov and A. N. Kolmogorov.

Definition

Probability is the likelihood of occurrence of an event.

Example

            Animals other than poultry        Poultry
Parents     Male (XY)      Female (XX)        Male (XX)      Female (XY)
Gametes     X or Y         X                  X              X or Y
Progeny     XX (female) or XY (male)          XX (male) or XY (female)

Statistical Explanation

If an event can happen in "a" ways and fail to happen in "b" ways, then the probability "p" of its happening is:

$$p = \frac{\text{Number of events occurring}}{\text{Total number of trials}} = \frac{a}{a + b}$$

Example

A surgeon transplants a kidney in 400 cases and succeeds in 160 cases. Calculate the probability of survival after the operation.

$$p = \frac{\text{Number of survivals after the operation}}{\text{Total number of patients operated}} = \frac{160}{400} = \frac{2}{5}$$
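The ratio a / (a + b) can be kept exact with Python's `fractions` module. A small sketch, in which the function name `probability` is illustrative and b = 240 is simply 400 - 160:

```python
from fractions import Fraction


def probability(ways_to_happen, ways_to_fail):
    """p = a / (a + b), kept as an exact fraction."""
    total = ways_to_happen + ways_to_fail
    if total <= 0:
        raise ValueError("there must be at least one trial")
    return Fraction(ways_to_happen, total)


# Kidney transplant example: 160 successes out of 400 operations.
p_survival = probability(160, 400 - 160)
print(p_survival, float(p_survival))
```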

Event

Any possible outcome of a random experiment is called an event. Performing an experiment is called a trial, and the outcome is termed an event. In simple terms, "an event is the occurrence of something". Ex.: in a coin toss, the occurrence of a head or a tail is an event.

The events due to chance are grouped in two categories: mutually exclusive events and independent events.

Mutually Exclusive Events

Events are said to be mutually exclusive if the occurrence of one event excludes the possibility of the other; in other words, two events are mutually exclusive if both cannot occur simultaneously. Examples: coin toss, baby born.

Independent Events

A set of events is said to be independent if the occurrence of any event does not affect the chance of the occurrence of any other event of the set. Example: toss of two different coins.

Theorems of Probability

There are two basic rules of chance: the Addition Rule and the Multiplication Rule.

Addition Rule (for mutually exclusive events)

Suppose two events A and B are mutually exclusive. Then the probability of the occurrence of either A or B is the sum of their individual probabilities:

$$p(A \text{ or } B) = p(A) + p(B)$$

The same rule can be extended to three or more events:

$$p(A \text{ or } B \text{ or } C) = p(A) + p(B) + p(C)$$

Example

From a pack of 52 cards, one card is drawn at random. What is the probability that it is either a king or a queen?

The events are mutually exclusive. There are 4 kings and 4 queens in a pack of 52 cards, so the probability of a king is 4/52 and the probability of a queen is also 4/52.

The probability that the card is either a king or a queen:

$$p(A \text{ or } B) = p(A) + p(B) = \frac{4}{52} + \frac{4}{52} = \frac{8}{52} = \frac{2}{13}$$

Addition Rule (for events that are not mutually exclusive)

When events A and B are not mutually exclusive, it is possible for both events to occur, so the rule must be modified:

$$p(A \text{ or } B) = p(A) + p(B) - p(A \text{ and } B)$$
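Both forms of the addition rule can be checked by counting cards in a simulated pack. This is a sketch only: the helper names are introduced here, and the "king or heart" case is an added illustration of overlapping events that does not appear in the slides.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))   # 52 (rank, suit) pairs


def prob(event):
    """Probability of an event as favourable cards / total cards."""
    favourable = sum(1 for card in DECK if event(card))
    return Fraction(favourable, len(DECK))


def is_king(card):
    return card[0] == "K"


def is_queen(card):
    return card[0] == "Q"


def is_heart(card):
    return card[1] == "hearts"


# Mutually exclusive: a card cannot be both a king and a queen.
p_king_or_queen = prob(lambda c: is_king(c) or is_queen(c))

# Not mutually exclusive: the king of hearts is counted in both events.
p_king_or_heart = prob(lambda c: is_king(c) or is_heart(c))
p_king_and_heart = prob(lambda c: is_king(c) and is_heart(c))

print(p_king_or_queen, p_king_or_heart)
```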

Multiplication Rule (for independent events)

Under this rule, if two events "A" and "B" are independent, the probability of their joint occurrence is given by the product of their separate probabilities:

$$p(A \text{ and } B) = p(A) \times p(B)$$

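Example

What is the probability of heads on two or three successive tosses?

p(A) = probability of a head in the first toss = ½ = 0.5
p(B) = probability of a head in the second toss = ½ = 0.5

Combined probability for two tosses:

$$p(A \text{ and } B) = p(A) \times p(B) = \tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{4} = 0.25$$

For three successive tosses the same rule gives ½ × ½ × ½ = 1/8 = 0.125.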
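The product rule for independent tosses can be confirmed by listing every possible sequence of outcomes. A minimal sketch, with `prob_all_heads` as an illustrative name:

```python
from fractions import Fraction
from itertools import product


def prob_all_heads(tosses):
    """Share of equally likely toss sequences that consist only of heads."""
    outcomes = list(product("HT", repeat=tosses))
    favourable = [o for o in outcomes if all(face == "H" for face in o)]
    return Fraction(len(favourable), len(outcomes))


p_head = Fraction(1, 2)

two_heads = prob_all_heads(2)     # compare with p_head * p_head
three_heads = prob_all_heads(3)   # compare with p_head ** 3
print(two_heads, three_heads)
```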

Multiplication Rule (for dependent events)

If two events "A" and "B" are dependent, the probability of occurrence of one event depends on the occurrence of the other event. Writing p(B|A) for the probability of B given that A has occurred:

$$p(A \text{ and } B) = p(A) \times p(B|A)$$

$$p(A, B \text{ and } C) = p(A) \times p(B|A) \times p(C|A \text{ and } B)$$

Example

A bag contains 7 red and 3 black balls. Two balls are drawn at random, one after the other, without replacement. What is the probability that both the balls drawn are black?

Probability of drawing a black ball first:

$$p(A) = \frac{3}{7 + 3} = \frac{3}{10}$$

Probability of drawing a second black ball, given that the first was black:

$$p(B|A) = \frac{2}{7 + 2} = \frac{2}{9}$$

The probability that both balls drawn are black:

$$p(A \text{ and } B) = p(A) \times p(B|A) = \frac{3}{10} \times \frac{2}{9} = \frac{1}{5} \times \frac{1}{3} = \frac{1}{15}$$
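The bag example can be verified two ways: by multiplying the conditional probabilities, and by brute-force enumeration of every ordered pair of draws without replacement. This is a sketch with variable names chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations

balls = ["red"] * 7 + ["black"] * 3

# Rule for dependent events: p(A and B) = p(A) * p(B given A).
p_first_black = Fraction(3, 10)
p_second_black_given_first = Fraction(2, 9)
p_both_black = p_first_black * p_second_black_given_first

# Enumeration: all ordered pairs of distinct balls (10 * 9 = 90 draws).
draws = list(permutations(range(len(balls)), 2))
favourable = sum(
    1 for i, j in draws if balls[i] == "black" and balls[j] == "black"
)
p_enumerated = Fraction(favourable, len(draws))

print(p_both_black, p_enumerated)
```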

Probability Application

It is useful for finding out the results of the next generation. We can also use probability in predicting the ratio of boys and girls. It helps us to find out the probability of genetic diseases like albinism. It can also be applied in solving Mendel's problems of heredity. It also helps breeders in analyzing pedigrees.

Probability Distribution

When the frequency distribution (observations, like central tendency measures) of a certain population needs to be derived mathematically, such distributions are called "Probability Distributions" or "Theoretical Distributions". They are not obtained by actual observation but are mathematically deduced on certain assumptions which are based on probability.

These distributions may be discrete or continuous. There are three main types of probability distribution which are widely used in different studies.

Discrete Probability Distribution: Binomial Distribution, Poisson Distribution
Continuous Probability Distribution: Normal Distribution

Binomial Distribution

It is one of the most widely used probability distributions of a discrete random variable. This distribution is also known as the "Bernoulli Distribution", since it was introduced by the Swiss mathematician J. Bernoulli. It is applied where only one of two mutually exclusive outcomes, such as success or failure, dead or alive, male or female, is possible.

It means the binomial distribution describes the distribution of probabilities where there are only two possible outcomes for each trial or experiment. If a coin is tossed once there are two possible outcomes, the head or the tail. The probability of obtaining a head (p) is ½ and the same ½ for a tail (q). Thus (p + q) = 1 and the binomial is $(p + q)^n$.

Example

If two coins are tossed simultaneously, there will be four possible outcomes:

First coin    Second coin    Probability
H             H              pp = p²
H             T              pq
T             H              qp
T             T              qq = q²

The two mixed outcomes together give pq + qp = 2pq.

Binomial expansion: $(p + q)^2 = p^2 + 2pq + q^2$

Assumptions of Binomial Distribution

Each trial has only two possible outcomes, "success" or "failure". The probabilities of success (p) and failure (q) remain constant for each experiment or trial. All trials must be independent of each other. There should not be any relation between two experiments or trials.

Formulation

In "n" trials, the probability of obtaining "r" successes and (n - r) failures is:

$$p(r) = \frac{n!}{r!\,(n - r)!}\; p^r q^{\,n - r}$$

where p = probability of success, q = probability of failure, and ! = factorial, e.g. 5! = 5 × 4 × 3 × 2 × 1. The factorial of 0 is always 1.
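The binomial formula can be sketched in Python with `math.comb`, which computes n! / (r! (n - r)!). The function name `binomial_pmf` is illustrative, and the two-coin case from the earlier example is used as a check.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)


p_head = Fraction(1, 2)

# Two coins: the terms of (p + q)^2 = p^2 + 2pq + q^2, with r = number of heads.
two_coin_terms = [binomial_pmf(r, 2, p_head) for r in range(3)]
print(two_coin_terms)
```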

Poisson Distribution

It is also a discrete probability distribution and is used very widely. It was derived by the Frenchman S. D. Poisson in 1837. It is applied where the event is very rare: deaths due to a rare disease or the number of defective articles produced by a high quality machine are rare events, in the sense that the probability of their happening is very small.

In these cases "p" is very small and "n", the number of trials, is large, so that "np" is a fixed number. The distribution has a single parameter, the mean of the distribution, denoted by "m" = np, which remains constant.

Formulation

Probability of "r" successes:

$$p(r) = \frac{e^{-m}\, m^r}{r!}$$

where p = probability, r = 0, 1, 2, 3, ... successes, and e = 2.7183 (constant).
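A short Python sketch of the Poisson formula. The values n = 1000 and p = 0.002 are hypothetical, chosen only to show a rare event with m = np close to 2; the exact binomial probability is computed alongside for comparison.

```python
import math


def poisson_pmf(r, m):
    """Probability of r occurrences when the mean number of occurrences is m."""
    if r < 0 or m <= 0:
        raise ValueError("r must be non-negative and m must be positive")
    return math.exp(-m) * m**r / math.factorial(r)


# Hypothetical rare event: n trials, each with a small probability p.
n, p = 1000, 0.002
m = n * p

poisson_zero = poisson_pmf(0, m)
binomial_zero = math.comb(n, 0) * p**0 * (1 - p) ** n   # exact binomial value
print(round(poisson_zero, 4), round(binomial_zero, 4))
```

With a small p and a large n the two values should be close, which is the situation in which the text recommends the Poisson distribution.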

Normal Distribution

The most important distribution dealing with continuous variables is the normal distribution. It is also called the normal probability distribution. It was first discovered by De Moivre in 1733. It is extremely useful in the analysis of agricultural and biological data.

This technique helps us in drawing inferences about the population from the sample. By this method we get a "curve" with a peak, with the items evenly distributed on either side of the peak. Such a "curve", with important statistical properties, is called the "Normal Distribution Curve", which denotes a normally distributed population.

Importance of the Normal Distribution

In most biological analyses, values are often distributed in accordance with the normal distribution. In large samples, it serves as a good approximation of discrete distributions such as the binomial and Poisson. As the sample size increases, the distribution of the mean of a random sample approaches the normal distribution.

Properties of Normal Distribution

The normal curve is "bell shaped" and symmetrical in appearance, having a single peak. The mean of a normally distributed population lies at the centre of its normal curve. The height of the curve declines on either side of the peak, which occurs at the mean. The mean, median and mode are all equal in a normal distribution. The two tails never touch the base.

Formulation

Normal distribution (for a sample):

$$z = \frac{x - \bar{x}}{s}$$

where
z = number of standard deviations
x = value of the random variable
$\bar{x}$ = mean of this distribution
s = standard deviation of this distribution
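The z formula translates directly into code. In this sketch the function name `z_score` and the sample figures (mean 11.5, standard deviation 2.5) are hypothetical values used only for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above or below the mean."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


# Hypothetical sample summary: mean = 11.5, standard deviation = 2.5.
z_above = z_score(15, 11.5, 2.5)    # value above the mean, positive z
z_at_mean = z_score(11.5, 11.5, 2.5)
z_below = z_score(9, 11.5, 2.5)     # value below the mean, negative z
print(z_above, z_at_mean, z_below)
```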

Correlation

Correlation was first investigated by Sir Francis Galton. Karl Pearson introduced a method of assessing correlation by means of the coefficient of correlation. By this coefficient, we can measure the extent of the relationship between two sets of data.

Correlation measures the closeness of the relationship between two variables. These sets of variables may show a certain relationship or may not show any. But when both variables move together we say they are related. Example: height of husbands and wives, 100 seed weight.

The statistical tool with the help of which the relationship between two variables is studied is called "Correlation". That is, the term correlation refers to the study of the relationship between two variables. If a relationship exists, it has to be expressed quantitatively, showing the degree of association between the sets of variables.

Reasons behind correlation

The correlation may be due to pure chance. It may be due to the influence of some external factors on the two variables. It may be due to the influence of the two variables on each other (mutual influence). It may be due to the influence of one variable upon the other.

Types of Correlation

Positive / negative correlation
Simple / partial / multiple correlation
Linear / non-linear correlation

Methods of studying Correlation

Scatter diagram method
Graphical method
Correlation coefficient

Correlation Coefficient

The first two methods do not provide any numerical measure of correlation. The degree of relationship can be established by calculating a coefficient called the correlation coefficient, which always gives a quantitative measure of the degree of closeness between the two attributes. Karl Pearson developed this theory, so it is also called the "Pearsonian Coefficient of Correlation", denoted by "r".
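The slides do not state the formula for "r". The sketch below assumes the standard product-moment form, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² Σ(y - ȳ)²), and the height figures are hypothetical data invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient between two paired series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one series is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (cm) of husbands and wives.
husbands = [165, 170, 175, 180, 185]
wives = [155, 160, 162, 168, 170]

r_heights = pearson_r(husbands, wives)
print(round(r_heights, 3))
```

A value near +1 indicates that the two variables move together, a value near -1 that they move in opposite directions, and a value near 0 that there is little linear relationship.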

Regression

Regression analysis is concerned with measuring the probable form of the relationship between two variables. The term was first used by Sir Francis Galton while studying the relationship between the heights of fathers and sons. The method which helps us to estimate the unknown value of one variable from the known value of a related variable is called regression.

Galton studied the average relationship between two variables graphically and called the line describing the relationship, the line of regression.

The regression technique is applicable only where two or more related variables have the tendency to go back to the mean.

Test of Significance

Two samples drawn from the same population will show differences in their mean values. This difference between the samples can be reduced but cannot be eliminated. A procedure to assess the significance of this difference is known as the "Test of Significance". It helps us to determine whether the observed differences between two samples are actually due to chance or are really significant.

Procedure for significance test

Laying down of hypothesis: null hypothesis and alternative hypothesis
Level of significance
One or two tailed hypothesis

Good Luck!
