
Dr. Anjil K Srivastava Department of Biotechnology NIT, Durgapur - 9

Biostatistics

Biostatistics has been defined as "the application of statistical methods to biological sciences". Biostatistics developed during the period of Sir Francis Galton (1822–1911).

He applied statistical methods to the analysis of biological variation, correlation and regression. Karl Pearson (1857–1936), regarded as the father of modern statistics, was motivated by the researches of Sir Francis Galton. For measuring correlation, Karl Pearson's method, popularly known as Pearson's coefficient of correlation, is the most widely used in practice.

Central tendency

Generally, in any distribution, the values of the variable tend to congregate around a central value. This tendency of the distribution is known as central tendency, and the values that describe it are called measures of central tendency. There are five basic measures of central tendency: arithmetic mean, median, mode, geometric mean and harmonic mean.

Arithmetic Mean

The most familiar and widely used measure of central tendency is the arithmetic mean. It represents the entire data by one value, which is obtained by adding together all the values and dividing this total by the number of observations.

We use $\bar{x}$ as the symbol for the sample mean. The sample mean is the average of a set of data and is computed as the sum of all the observed outcomes from the sample divided by the total number of observations.

Arithmetic mean for a series of individual observations $x_1, x_2, x_3, \dots, x_n$:

$\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$

Example

Data: 12, 15, 11, 11, 7, 13

First, find the sum of the data: 12 + 15 + 11 + 11 + 7 + 13 = 69
Then divide by the number of data: 69 / 6 = 11.5
The mean is 11.5.

Example

An electronics store sells CD players at the following prices: Rs. 3500, Rs. 2750, Rs. 5000, Rs. 3250, Rs. 1000, Rs. 3750 and Rs. 3000. What is the mean price?

3500 + 2750 + 5000 + 3250 + 1000 + 3750 + 3000 = 22250
22250 / 7 = 3178.57 (to two decimal places)
The mean or average price of a CD player is Rs. 3178.57.
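As a quick check of the two worked examples, the arithmetic mean can be computed in a few lines of Python. This is only a sketch: the names `data`, `prices` and `arithmetic_mean` are chosen here for illustration and do not come from the original notes.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their number."""
    return sum(values) / len(values)

# Observations from the first worked example
data = [12, 15, 11, 11, 7, 13]
# CD player prices in Rs. from the second worked example
prices = [3500, 2750, 5000, 3250, 1000, 3750, 3000]

mean_data = arithmetic_mean(data)
mean_price = arithmetic_mean(prices)

print(mean_data)             # 11.5
print(round(mean_price, 2))  # 3178.57
```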

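Arithmetic Mean for discrete series

$\bar{x} = \frac{\sum fx}{\sum f}$

where $\bar{x}$ = arithmetic mean, $\sum f$ = sum of the frequencies (the total number of observations, n) and $\sum fx$ = sum of the products of the values of the variable and their corresponding frequencies.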

Example

The data recorded on the number of chlorophyll deficient plants in a lentil population is given below. Calculate the mean.

Number of chlorophyll deficient plants (x)    Number of plants (f)    fx
0                                             34                      0
1                                             14                      14
2                                             20                      40
3                                             24                      72
4                                             25                      100
5                                             33                      165
Total                                         150                     391

$\sum f = 150$, $\sum fx = 391$

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{391}{150} = 2.61$
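The frequency-weighted mean of the lentil example can be reproduced as follows. This is a minimal sketch; the variable names are illustrative only.

```python
# x = number of chlorophyll deficient plants, f = frequency of each value
x = [0, 1, 2, 3, 4, 5]
f = [34, 14, 20, 24, 25, 33]

# Multiply each value by its frequency and add the products
sum_fx = sum(xi * fi for xi, fi in zip(x, f))
sum_f = sum(f)
mean_x = sum_fx / sum_f

print(sum_fx, sum_f, round(mean_x, 2))  # 391 150 2.61
```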

Merits of Mean

It is easy to understand and easy to calculate. It is rigidly defined. It is based on all the observations. It provides a good basis for comparison. It is amenable to further mathematical treatment. Of the measures of central tendency, it is the one least affected by fluctuations of sampling.

Demerits of Mean

The mean is unduly affected by extreme items. It cannot be accurately determined if even one of the values is not known.

Median

The median is the middle value of the observations, or the value which divides a distribution so that an equal number of items occur on either side of it.

Example

Data: 12, 15, 11, 11, 7, 13

First, arrange the data in numerical order: 7, 11, 11, 12, 13, 15
Then find the number in the middle, or the average of the two numbers in the middle: 11 + 12 = 23, 23 / 2 = 11.5
The median is 11.5.

Median in a series of individual observations

Arrange the data in ascending or descending order. The median is located by finding the size of the (n + 1)/2 th item:

$M = \text{size of the } \left(\frac{n+1}{2}\right)^{\text{th}} \text{ item}$

where M = median and n = number of observations.

Example

Find the median from the data recorded on the number of clusters per plant in a pulse crop.

Number of clusters: 10, 18, 17, 19, 10, 15, 11, 17, 12

Sl. No.                             1   2   3   4   5   6   7   8   9
Data arranged in ascending order    10  10  11  12  15  17  17  18  19

M = size of the (n + 1)/2 th item = size of the (9 + 1)/2 th item = size of the 5th item = 15
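The positional rule can be sketched in Python as below. The function name `median_by_position` is an assumption made for this illustration. For an even number of observations it averages the two middle items, as in the first median example.

```python
def median_by_position(values):
    """Median as the size of the (n + 1) / 2 th item of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # (n + 1) / 2 is a whole number; convert the 1-based position to an index
        return ordered[(n + 1) // 2 - 1]
    # Even n: take the mean of the two middle items
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

clusters = [10, 18, 17, 19, 10, 15, 11, 17, 12]  # clusters per plant
first_example = [12, 15, 11, 11, 7, 13]

print(median_by_position(clusters))       # 15
print(median_by_position(first_example))  # 11.5
```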

Merits of Median

It is easy to define and easy to understand. The median is not affected by the size of extreme items. The value of the median can be determined graphically, whereas the value of the mean cannot be graphically ascertained. It is also recommended for distributions with unequal classes.

Demerits of Median

It is not based on all observations, since it is a positional average. The median is affected more by sampling fluctuation than the mean is. If the number of observations is even, the median cannot be located exactly; in this case the mean of the two middle values is taken as the estimate of the median. It may be unsuitable when the series contains both very large and very small items.

Mode

The mode is another measure of central tendency which is conceptually very useful. The mode is the most typical value of a distribution because it is repeated the highest number of times in the series.

Definition

"The most commonly occurring value."

According to Croxton and Cowden, "the mode of a distribution is the value at the point around which the items tend to be most heavily concentrated". A set of data may have a single mode, in which case it is said to be "unimodal". When concentration of data occurs at two or more points, the series is called bimodal or multimodal.

Example

Data: 12, 15, 11, 11, 7, 13
The mode is 11.

Sometimes a set of data will have more than one mode. For example, in the following set both the numbers 5 and 7 appear twice:
2, 9, 5, 7, 8, 6, 4, 7, 5
5 and 7 are both the mode, and this set is said to be bimodal.

Sometimes there is no mode in a set of data:
3, 8, 7, 6, 12, 11, 2, 1
All the numbers in this set occur only once, therefore there is no mode in this set.

Example: Find the Mean, Median and Mode of Ungrouped Data

The weekly pocket money for 9 first year pupils was found to be: 3, 12, 4, 6, 1, 4, 2, 5, 8

Mean = 5
Median = 4
Mode = 4
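The three measures for the pocket money data can be checked with Python's standard library. The helper `modes` is a hypothetical name introduced here. It returns an empty list when every value occurs only once, which matches the "no mode" convention used in these notes rather than the behaviour of any library function.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]
money_mean = mean(pocket_money)      # 45 / 9
money_median = median(pocket_money)  # 5th item of the sorted data
money_modes = modes(pocket_money)    # 4 occurs twice

bimodal = [2, 9, 5, 7, 8, 6, 4, 7, 5]    # 5 and 7 both occur twice
no_repeat = [3, 8, 7, 6, 12, 11, 2, 1]   # every value occurs once

print(modes(bimodal))    # [5, 7]
print(modes(no_repeat))  # []
```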

Mode of Grouped Data

$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$

where
$L_1$ = lower boundary of the modal class
$\Delta_1$ = difference of frequency between the modal class and the class before it
$\Delta_2$ = difference of frequency between the modal class and the class after it
h = class interval

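Example

Calculate the mode.

Number of grains/panicle    Number of plants
100-110                     11
110-130                     40
130-140                     27
140-160                     34
160-170                     12
170-180                     6

The class intervals are unequal, so the two wide classes are first split into classes of width 10, with their frequencies divided equally:

Number of grains/panicle    Number of plants
100-110                     11
110-120                     20
120-130                     20
130-140                     27
140-150                     17
150-160                     17
160-170                     12
170-180                     6

The mode lies in the class 130-140.

$L_1$ = 130, $\Delta_1$ = 27 - 20 = 7, $\Delta_2$ = 27 - 17 = 10, h = 10

$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h = 130 + \frac{7}{7 + 10} \times 10 = 130 + \frac{70}{17} = 130 + 4.12 = 134.12$

Mode = 134.12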
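The grouped-mode calculation can be sketched as a small function. This assumes equal class widths and a single modal class that is neither the first nor the last class; the function and parameter names are illustrative, not taken from the notes.

```python
def grouped_mode(lower_bounds, frequencies, width):
    """Mode = L1 + d1 / (d1 + d2) * h for equal-width classes."""
    i = frequencies.index(max(frequencies))  # position of the modal class
    if i == 0 or i == len(frequencies) - 1:
        raise ValueError("modal class must have a class on both sides")
    d1 = frequencies[i] - frequencies[i - 1]  # modal class minus class before
    d2 = frequencies[i] - frequencies[i + 1]  # modal class minus class after
    return lower_bounds[i] + d1 / (d1 + d2) * width

# Equal-width classes 100-110, 110-120, ..., 170-180 from the example
lower_bounds = [100, 110, 120, 130, 140, 150, 160, 170]
frequencies = [11, 20, 20, 27, 17, 17, 12, 6]

mode_value = grouped_mode(lower_bounds, frequencies, 10)
print(round(mode_value, 2))  # 134.12
```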

Merits of Mode

The mode is easy to calculate and can often be determined by mere observation. It is simple and precise. It is not unduly affected by extreme items. It is the point where there is the greatest concentration of frequencies.

Demerits of Mode

The mode is not based on all the observations. It is not a rigidly defined measure. The value of the mode cannot be determined uniquely in a bimodal distribution. Sometimes the modal class cannot be identified by inspection of the data; it is then necessary to prepare a grouping table and an analysis table to find the modal class.

Standard Deviation

The standard deviation is the most commonly used measure of spread. It was first introduced by Karl Pearson in 1893. The formula is very simple: it is the square root of the variance. The problem of algebraic signs that arises in the mean deviation is overcome by squaring the deviations, thereby making them all positive.

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

where s = standard deviation, $\bar{x}$ = arithmetic mean and n = number of observations.

Variance

The term "variance" is used to describe the square of the standard deviation. Variance is also called mean square deviation. The term was first coined by R. A. Fisher in 1918. It helps us in isolating the effects of various factors.

The variance is defined as the mean of the squares of the deviations from the mean; for a sample, the sum of squares is divided by n - 1:

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

where $\bar{x}$ = arithmetic mean and n = number of observations.
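To make the two formulas concrete, the sketch below computes the sample variance and standard deviation with the n - 1 divisor. It reuses the six observations from the first arithmetic mean example; applying them here is a choice made for illustration, and the function names are hypothetical.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation = square root of the variance."""
    return sqrt(sample_variance(values))

data = [12, 15, 11, 11, 7, 13]  # mean is 11.5
s2 = sample_variance(data)      # 35.5 / 5
s = sample_sd(data)

print(round(s2, 2), round(s, 2))  # 7.1 2.66
```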

Probability

In the nineteenth century, Pierre Simon de Laplace compiled the first general theory of probability. R. A. Fisher and Von Mises introduced the empirical approach to probability. The modern theory of probability was developed by Chebychev, A. A. Markov and A. N. Kolmogorov.

Definition

Probability is the likelihood of occurrence of an event.

Example

            Animals other than poultry       Poultry
Parents     Male (XY)      Female (XX)       Male (XX)      Female (XY)
Gametes     X or Y         X                 X              X or Y
Progeny     XX (female) or XY (male)         XX (male) or XY (female)

In both cases half of the gamete combinations give a male and half give a female.

Statistical Explanation

If an event can happen in "a" ways and fail to happen in "b" ways, then the probability of its happening, "p", is

$p = \frac{\text{number of events occurring}}{\text{total number of trials}} = \frac{a}{a + b}$

Example

If a surgeon transplants a kidney in 400 cases and succeeds in 160 cases, calculate the probability of survival after the operation.

$p = \frac{\text{number of survivals after the operation}}{\text{total number of patients operated}} = \frac{160}{400} = \frac{2}{5}$
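The ratio a / (a + b) can be kept exact with Python's `fractions` module. In this sketch `probability` is a hypothetical helper name; the counts are those of the kidney transplant example (160 successes, so 240 failures out of 400).

```python
from fractions import Fraction

def probability(ways_to_happen, ways_to_fail):
    """p = a / (a + b), returned as an exact fraction in lowest terms."""
    return Fraction(ways_to_happen, ways_to_happen + ways_to_fail)

p_survival = probability(160, 400 - 160)

print(p_survival)         # 2/5
print(float(p_survival))  # 0.4
```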

Event

Any possible outcome of a random experiment is called an event. Performing an experiment is called a trial, and the outcome is termed an event. In simple terms, "an event is the occurrence of something". Example: the occurrence of a head or a tail is an event.

The events due to chance are grouped in two categories:
- Mutually exclusive events
- Independent events

Mutually Exclusive Events

Events are said to be mutually exclusive if the occurrence of one event excludes the possibility of the other; in other words, two events are mutually exclusive if both cannot occur simultaneously.
Examples: coin toss (head or tail), baby born (boy or girl).

Independent Events

A set of events is said to be independent if the occurrence of any one event does not affect the chance of occurrence of any other event of the set.
Example: toss of two different coins.

Theorems of Probability

There are two basic rules of chance:
- Addition Rule
- Multiplication Rule

Addition Rule (for mutually exclusive events)

Suppose two events A and B are mutually exclusive. Then the probability of the occurrence of either A or B is the sum of their individual probabilities:

p(A or B) = p(A) + p(B)

The same rule can be extended to three or more mutually exclusive events:

p(A or B or C) = p(A) + p(B) + p(C)

Example

From a pack of 52 cards, one card is drawn at random. What is the probability that it is either a king or a queen?

There are 4 kings and 4 queens in a pack of 52 cards, so the probability of a king is 4/52 and that of a queen is also 4/52. The events are mutually exclusive.

The probability that the card is either a king or a queen:
p(A or B) = p(A) + p(B) = 4/52 + 4/52 = 8/52 = 2/13

Addition Rule (for events that are not mutually exclusive)

When events A and B are not mutually exclusive, it is possible for both events to occur together, so the rule must be modified:

p(A or B) = p(A) + p(B) - p(A and B)
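Both forms of the addition rule can be checked by counting cards in a 52-card pack. The king-or-queen case is the example from these notes; the king-or-heart case is an extra illustration added here to show events that are not mutually exclusive. All names in the code are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def prob(event):
    """Probability of an event = favourable cards / total cards."""
    favourable = sum(1 for card in DECK if event(card))
    return Fraction(favourable, len(DECK))

def is_king(card):
    return card[0] == "K"

def is_queen(card):
    return card[0] == "Q"

def is_heart(card):
    return card[1] == "hearts"

# Mutually exclusive: a card cannot be both a king and a queen
p_king_or_queen = prob(is_king) + prob(is_queen)

# Not mutually exclusive: the king of hearts is counted in both events,
# so its probability is subtracted once
p_king_and_heart = prob(lambda card: is_king(card) and is_heart(card))
p_king_or_heart = prob(is_king) + prob(is_heart) - p_king_and_heart

print(p_king_or_queen)  # 2/13
print(p_king_or_heart)  # 4/13
```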

Multiplication Rule (for independent events)

If two events "A" and "B" are independent, the probability of their joint occurrence is given by the product of their separate probabilities:

p(A and B) = p(A) x p(B)

Example

What is the probability of heads on two or three successive tosses?

- p(A) = probability of a head in the first toss = 1/2 = 0.5
- p(B) = probability of a head in the second toss = 1/2 = 0.5

Combined probability for two tosses: p(A and B) = p(A) x p(B) = 1/2 x 1/2 = 1/4 = 0.25
For three tosses the same rule gives 1/2 x 1/2 x 1/2 = 1/8 = 0.125.
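The product rule for successive tosses can be confirmed by listing every equally likely outcome. This is an illustrative sketch; `prob_all_heads` is a name invented here.

```python
from fractions import Fraction
from itertools import product

def prob_all_heads(n_tosses):
    """Count the outcomes that are all heads among all 2**n outcomes."""
    outcomes = list(product("HT", repeat=n_tosses))
    favourable = sum(1 for o in outcomes if all(face == "H" for face in o))
    return Fraction(favourable, len(outcomes))

p_two = prob_all_heads(2)
p_three = prob_all_heads(3)

print(p_two, p_three)  # 1/4 1/8
```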

Multiplication Rule (for dependent events)

If two events "A" and "B" are dependent, the probability of occurrence of one event depends on the occurrence of the other. Writing p(B|A) for the probability of B given that A has occurred:

p(A and B) = p(A) x p(B|A)
p(A, B and C) = p(A) x p(B|A) x p(C|A and B)

Example

A bag contains 7 red and 3 black balls. Two balls are drawn at random, one after the other, without replacement. What is the probability that both the balls drawn are black?

Probability that the first ball drawn is black: p(A) = 3/(7 + 3) = 3/10
Probability that the second ball drawn is black, given that the first was black: p(B|A) = 2/(7 + 2) = 2/9
Probability that both balls drawn are black: p(A and B) = p(A) x p(B|A) = 3/10 x 2/9 = 6/90 = 1/15
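The result for drawing without replacement can be cross-checked by enumerating every ordered pair of distinct balls. The sketch below is illustrative; the variable names are not from the notes.

```python
from fractions import Fraction
from itertools import permutations

balls = ["red"] * 7 + ["black"] * 3

# Multiplication rule for dependent events
p_first_black = Fraction(3, 10)
p_second_black_given_first = Fraction(2, 9)
p_both_black = p_first_black * p_second_black_given_first

# Brute-force check: all ordered draws of two different balls (10 x 9 = 90)
draws = list(permutations(range(len(balls)), 2))
both_black = sum(
    1 for i, j in draws if balls[i] == "black" and balls[j] == "black"
)
p_enumerated = Fraction(both_black, len(draws))

print(p_both_black, p_enumerated)  # 1/15 1/15
```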

Probability Application

It is useful for predicting the results in the next generation. It helps us to find the probability of genetic diseases like albinism. We can also use probability in predicting the ratio of boys and girls. It can be applied in solving Mendel's problems of heredity. It also helps breeders in analyzing pedigrees.

Probability Distribution

Sometimes the frequency distribution of a population (and observations derived from it, such as measures of central tendency) needs to be derived mathematically. Such distributions are not obtained by actual observation but are mathematically deduced on certain assumptions which are based on probability. They are called "Probability Distributions" or "Theoretical Distributions".

These distributions may be discrete or continuous. There are three main types of probability distribution which are widely used in different studies:

- Discrete probability distributions: Binomial Distribution, Poisson Distribution
- Continuous probability distribution: Normal Distribution

Binomial Distribution

It is one of the most widely used probability distributions of a discrete random variable. It applies where only two mutually exclusive outcomes are possible, such as success or failure, dead or alive, male or female. This distribution is also known as the "Bernoulli Distribution", since it was introduced by the Swiss mathematician J. Bernoulli.

In other words, the binomial distribution describes the distribution of probabilities where there are only two possible outcomes for each trial or experiment. If a coin is tossed once there are two possible outcomes, head or tail. The probability of obtaining a head (p) is 1/2 and that of a tail (q) is also 1/2. Thus p + q = 1, and for n trials the binomial is $(p + q)^n$.

Example

If two coins are tossed simultaneously, there will be four possible outcomes:

First coin    Second coin    Probability
H             H              pp = p^2
H             T              pq
T             H              qp
T             T              qq = q^2

The two mixed outcomes together have probability pq + qp = 2pq.

The binomial expansion is $(p + q)^2 = p^2 + 2pq + q^2$.

Assumptions of Binomial Distribution

Each trial has only two possible outcomes, "success" or "failure". The probabilities of success (p) and failure (q) remain constant for each experiment or trial. All trials must be independent of each other: there should not be any relation between two experiments or trials.

Formulation

In "n" trials, the probability of obtaining "r" successes and (n - r) failures is:

$p(r) = \frac{n!}{r!\,(n - r)!} \, p^r q^{n - r}$

where p = probability of success, q = probability of failure and ! = factorial, e.g. 5! = 5 x 4 x 3 x 2 x 1. The factorial of 0 is always 1.
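The binomial formula translates directly into code. The sketch below uses `math.comb` (Python 3.8 or later) for n! / (r! (n - r)!) and applies it to the two-coin case, where p = q = 1/2; the function name `binomial_pmf` is an assumption made here.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, r) * p**r * q ** (n - r)

# Two coins: probabilities of 0, 1 and 2 heads, i.e. q^2, 2pq and p^2
two_coins = [binomial_pmf(r, 2, 0.5) for r in range(3)]

print(two_coins)  # [0.25, 0.5, 0.25]
```

The three values are the terms of the expansion of (p + q)^2 and add up to 1.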

Poisson Distribution

It is also a discrete probability distribution and is used very widely. It was derived by the Frenchman S. D. Poisson in 1837. It applies where the event is very rare, in the sense that the probability of its happening is very small; deaths due to a rare disease and the number of defective articles produced by a high quality machine are such rare events.

In these cases "p" is very small and "n", the number of trials, is very large, so that "np" is a fixed number. The distribution has a single parameter, the mean of the distribution, denoted by "m" = np, which remains constant.

Formulation

Probability of "r" successes:

$p(r) = \frac{e^{-m} m^r}{r!}$

where p(r) = probability of r successes, r = 0, 1, 2, 3, ..., n, m = np and e = 2.7183 (constant).
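A minimal sketch of the Poisson formula follows. The values n = 1000 and p = 0.002 are hypothetical numbers chosen only to give m = np = 2; they do not come from the notes.

```python
from math import exp, factorial

def poisson_pmf(r, m):
    """Probability of exactly r occurrences when the mean is m."""
    return exp(-m) * m**r / factorial(r)

n, p = 1000, 0.002  # hypothetical: many trials, rare event
m = n * p           # mean of the distribution

probs = [poisson_pmf(r, m) for r in range(4)]
print([round(x, 4) for x in probs])  # [0.1353, 0.2707, 0.2707, 0.1804]
```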

Normal Distribution

The most important distribution dealing with continuous variables is the Normal Distribution. It is also called the Normal Probability Distribution. It was first discovered by De Moivre in 1733. It is extremely useful in the analysis of agricultural and biological data. This technique helps us in drawing inferences about the population from the sample.

By this method we get a "curve" with a peak and with the items evenly distributed on either side of the peak. Such a curve, with important statistical properties, is called the "Normal Distribution Curve", which denotes the normally distributed population.

Importance of the Normal Distribution

In most biological analyses, values are often distributed in accordance with the normal distribution. In large samples, it serves as a good approximation of discrete distributions such as the Binomial and Poisson. As the sample size increases, the distribution of the mean of a random sample approaches the normal distribution.

Properties of Normal Distribution

The normal curve is "bell shaped" and symmetrical in appearance, having a single peak. The height of the curve declines on either side of the peak, which occurs at the mean. The mean of a normally distributed population lies at the centre of its normal curve. The mean, median and mode are all equal in a normal distribution. The two tails never touch the base.

Formulation

Normal distribution (for a sample):

$z = \frac{x - \bar{x}}{s}$

where z = number of standard deviations from the mean, x = value of the random variable, $\bar{x}$ = mean of the distribution and s = standard deviation of the distribution.
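The standardisation formula is a one-line computation. In the sketch below the mean of 50, standard deviation of 10 and observed value of 65 are hypothetical numbers used only to show the arithmetic.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical values: mean 50, standard deviation 10, observation 65
z = z_score(65, 50, 10)

print(z)  # 1.5
```

A value equal to the mean gives z = 0, and values below the mean give a negative z.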

Correlation

Correlation was first investigated by Sir Francis Galton. Karl Pearson introduced a method of assessing correlation by means of the coefficient of correlation. By this coefficient, we can measure the extent of the relationship between two sets of data.

Correlation measures the closeness of the relationship between two variables. Two sets of variables may or may not show a relationship, but when both variables move together we say they are related. Examples: heights of husbands and wives; 100-seed weight.

If a relationship exists, it has to be expressed quantitatively, showing the degree of association between the sets of variables. The statistical tool with the help of which this relationship between two variables is studied is called "correlation". In other words, the term correlation refers to the study of the relationship between two variables.

Reasons behind Correlation

The correlation may be due to:
- pure chance
- the influence of one variable upon the other
- the influence of two variables on each other (mutual influence)
- the influence of some external factor on both variables

Types of Correlation

- Positive / negative correlation
- Simple / partial / multiple correlation
- Linear / non-linear correlation

Methods of Studying Correlation

- Scatter diagram method
- Graphical method
- Correlation coefficient

Correlation Coefficient

The first two methods do not provide any numerical measure of correlation. The degree of relationship can be established by calculating a coefficient called the correlation coefficient, which always gives a quantitative measure of the degree of closeness between the two attributes. Karl Pearson developed this theory, so it is also called the "Pearsonian Coefficient of Correlation", denoted by "r".
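The notes do not spell out the formula for r. The sketch below assumes the standard product-moment form, r = Σ(x - x̄)(y - ȳ) / sqrt(Σ(x - x̄)² Σ(y - ȳ)²), and applies it to small hypothetical data sets invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
print(round(r, 2))  # 0.8
```

With this formula r lies between -1 and +1: a series that rises exactly in step with x gives +1, and one that falls exactly in step gives -1.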

Regression

Regression analysis is concerned with measuring the probable form of the relationship between two variables. The term was first used by Sir Francis Galton while studying the relationship between the heights of fathers and sons. The method which helps us to estimate the unknown value of one variable from the known value of a related variable is called regression.

Galton studied the average relationship between two variables graphically and called the line describing the relationship, the line of regression.
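The notes do not say how the line of regression is fitted. The sketch below assumes the ordinary least squares line y = a + bx, with b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b x̄, and uses hypothetical data to estimate y for a new value of x.

```python
def fit_line(xs, ys):
    """Return intercept a and slope b of the least squares line y = a + b x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

a, b = fit_line(x, y)
estimate = a + b * 6  # estimated y for the known value x = 6

print(round(a, 2), round(b, 2), round(estimate, 2))  # 0.6 0.8 5.4
```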

The regression technique is applicable only where two or more related variables have a tendency to go back to the mean.

Test of Significance

Two samples drawn from the same population will show differences in their mean values. This difference between the samples can be reduced but cannot be eliminated. A procedure to assess the significance of this difference is known as a "Test of Significance". It helps us to determine whether observed differences between two samples are actually due to chance or are really significant.

Procedure for significance test

- Laying down of hypotheses: null hypothesis and alternative hypothesis
- Level of significance
- One or two tailed hypothesis

Good Luck!
