
Bahir Dar University

College of Agriculture and Environmental Sciences


Department of Plant Sciences
MSc Programs

Advanced Biometry and Software Applications, Plag601 (3+0)


Advanced Biometry and Software Applications, Hort601 (3+0)
Principles, Designs and Analysis of Agricultural Experiments
General Introduction
• Is there anything new in research?
• What the researcher does is uncover or discover hidden things and present them to the audience, that is, unfold the hidden reality
• The quality, effectiveness, success or failure of a program, policy, package, etc. can be determined by research

Problem → Researcher (who conducts research that can solve the problem) → Solution

• Research is a process comprising defining and redefining the problem, formulating a hypothesis and testing the hypothesis using appropriate techniques
– Defining – conceptualizing the existing problem
– Redefining – adjusting/modifying the problem as per the existing
General introduction …
• Effective research execution requires a basic understanding of biometry
• Biometry is the statistical analysis of biological observations and phenomena
• In research we may take samples that represent the population of interest in order to draw conclusions
• We collect data from samples, subject them to analysis and interpretation, and draw conclusions
• Proper problem identification, design, implementation, analysis and interpretation are critically important for drawing appropriate conclusions
General introduction …
• The main objective of this course is to teach us
– How to initiate research questions/objectives/hypotheses
– How to test the set objectives
– Which methodologies we have to follow to meet the set objectives
– Which designs are appropriate for our objectives
– Which variables have to be collected to address our objectives
– Which data analysis techniques are appropriate
– How we should present our results
– What confounding factors may be introduced in our experiment
– What the sources of error are
– What techniques are used to minimize error
– Which statistical software packages we can use
Definition of terms
Introduction
• Statistics: The science of collecting, describing, and interpreting data
• Biometry is the statistical analysis of biological observations and
phenomena
• Biometrics: the measurement and analysis of unique physical or behavioral characteristics (such as fingerprint or voice patterns), especially as a means of verifying personal identity
• Experiment: A planned activity whose results yield a set of data
• Parameter: A numerical value summarizing all the data of an entire population
• Population: A collection, or set, of individuals, objects or events whose properties are to be analyzed
– Two kinds of populations: finite or infinite
Definition…
Introduction…
• Sample – a subset of the population
• Variable – a characteristic of each individual element of a population or sample that can be measured
• Statistic – a numerical value summarizing the sample data
• Data – the value of the variable associated with one element of a population or sample
– This value may be a number, a word, or a symbol


Classification of statistics
1. Descriptive statistics = (data description)

2. Inferential statistics = (parametric tests)

1. Descriptive statistics

• It is the branch of statistics that consists of the collection, organization, summarization, and presentation of data
• It describes the properties of a sample with respect to a given variable or variables
• It includes the mean, median, mode, percentiles, standard deviation, variance, coefficient of variation, correlation coefficient, etc.
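Several of the descriptive measures listed above can be computed in a few lines. The following is only a minimal sketch using Python's standard `statistics` module; it reuses the six yield values (t/ha) that appear in Table 3.3 of these notes, and the variable names are illustrative assumptions, not part of the course material.

```python
import statistics

# Yield (t/ha) values taken from Table 3.3 of these notes
yields = [10, 8, 12, 15, 13, 11]

mean = statistics.mean(yields)
median = statistics.median(yields)
variance = statistics.variance(yields)  # sample variance, divisor n - 1
sd = statistics.stdev(yields)           # sample standard deviation
cv = sd / mean * 100                    # coefficient of variation, in percent

print(f"mean = {mean}, median = {median}")
print(f"variance = {variance:.2f}, sd = {sd:.2f}, CV = {cv:.1f}%")
```

The sample (n − 1) versions of the variance and standard deviation are used here; `statistics.pvariance` and `statistics.pstdev` give the population versions.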
Classification of statistics…
2. Inferential statistics

• Concerned with making decisions and drawing conclusions about populations
• These include measures such as standard errors
– which are not restricted to the limits of a sample, unlike descriptive statistics
– which go beyond the sample and help to make inferences and generalize them from the sample to the entire population
• They find application in testing hypotheses, finding the significance of differences between statistics of different parameters, and working out confidence intervals of parameters
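To make the standard error and confidence interval concrete, here is a hedged sketch that again uses the Table 3.3 yields. It assumes a 95% interval based on Student's t with n − 1 = 5 degrees of freedom; the critical value 2.571 is read from a standard t-table and hard-coded so that no external library is needed.

```python
import math
import statistics

yields = [10, 8, 12, 15, 13, 11]  # yield (t/ha) from Table 3.3 of these notes

n = len(yields)
mean = statistics.mean(yields)
se = statistics.stdev(yields) / math.sqrt(n)  # standard error of the mean

# Two-sided 95% critical value of Student's t for 5 degrees of freedom,
# taken from a standard t-table (an assumption of this sketch)
T_CRIT_95_DF5 = 2.571

lower = mean - T_CRIT_95_DF5 * se
upper = mean + T_CRIT_95_DF5 * se

print(f"mean = {mean}, SE = {se:.3f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f}) t/ha")
```

The interval describes the population mean, not the sample, which is what distinguishes it from the descriptive measures.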
Kinds of variables
1. Qualitative/Attribute/Categorical Variable
• A variable that categorizes or describes an element of a population
• Arithmetic operations such as addition and averaging are not meaningful for data resulting from a qualitative variable
2. Quantitative/Numerical Variable
• A variable that quantifies an element of a population
• Arithmetic operations such as addition and averaging are meaningful for data resulting from a quantitative variable
Kinds of variables…
Example: Identify each of the following examples as attribute
(qualitative) or numerical (quantitative) variables

1. Skin color for shallot bulbs (-----------------------)

2. Marketable bulb weight per shallot plant (------------)

3. Marketable yield of potato per hill (---------)

4. The color of etiolated stems of ornamentals in the house (--)

5. Thousand seed weight of wheat plant (--------)

6. The pungency status of onion (high, medium, mild, low) (---)
Kinds of variables…
Qualitative and quantitative variables may be further subdivided:
Variable
• Qualitative
– Nominal = naming
– Ordinal = ordering
• Quantitative
– Discrete = integers
– Continuous = infinite numbers
Dependent = response/outcome
Independent = predictor, explanatory, or exposure
Kinds of variables…
• Nominal Variable: categorizes (or describes, or names) an element of a population: fertilizer type, variety, leaf color, gender
• Ordinal Variable: incorporates an ordered position, or ranking: highest, lowest
• Discrete Variable: can assume a countable number of values: block number, tuber number, seed number
– there is a gap between any two values
• Continuous Variable: can assume an uncountable number of values
– can assume any value along a line interval, including every possible value between any two values
Kinds of variables…
• Independent variable
– a variable that can be manipulated by the researcher and that affects the response variable
• Dependent variable
– a variable whose value depends on the independent variable
Measurement scales

• 4 measurement scales
– Nominal
– Ordinal
– Interval
– Ratio
Measurement scales…

1. Nominal scales: like “names”


• Used for labeling variables, without any quantitative value
• Mutually exclusive (no overlap) and none of them have any
numerical significance
• A nominal scale with only two categories is called “dichotomous”
– Examples: male/female
Measurement scales…
2. Ordinal scales
• It is the order of the values that is important and significant, but the differences between them are not really known
• For example, is the difference between “OK” and “Unhappy” the same as the difference between “Very Happy” and “Happy”? We can’t say!
• Used to measure non-numeric concepts like satisfaction, happiness, discomfort, etc.
• “Ordinal” is easy to remember because it sounds like “order”
Measurement scales…
3. Interval scales
• Numeric scales in which we know the order and the
exact differences between the values
• Example:
– small (20–40 g),
– medium (40–60 g), and
– large (60–80 g)
Measurement scales…

4. Ratio scales
• Give the exact value between units: length, weight,
width
• Have an absolute zero–which allows for a wide range of
both descriptive and inferential statistics to be applied
• Provide a wealth of possibilities when it comes to
statistical analysis
• Can be meaningfully added, subtracted, multiplied,
divided (ratios)
Types of measurements and measurement scales…
Statistical analysis

• There are three kinds of statistical analysis:

1. Non-parametric statistics
2. Index numbers
3. Parametric tests
Statistical analysis

1. Non-parametric tests/distribution free statistics


• Used to analyze nominal and ordinal scales
• Used to test hypotheses that do not involve specific population parameters such as the mean, variance and ratio/proportion
• Commonly used non-parametric methods are:
a. The Sign Test
b. The Wilcoxon Signed Rank Test
c. Spearman’s Rank Correlation Coefficient
d. The Kruskal-Wallis Test
Statistical analysis …
2. Index numbers

• It is a technique that measures the change in a variable over time relative to the value of the variable during a specific base/reference period
• It is a special type of average that provides a measurement of relative changes from time to time or place to place
• Simple index = an index used to summarize a single item
• Aggregate index = an index used to summarize several items
• It can be a
– Price index (change in fertilizer price)
– Quantity index (change in quantity of agricultural production)
– Value index (change in total value from one period relative to the base period)
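A simple index is commonly expressed as the current value as a percentage of the base-period value. The sketch below assumes that convention; the function name `simple_index` and the fertilizer prices are hypothetical and serve only as an illustration.

```python
def simple_index(current_value, base_value):
    """Return the current value as a percentage of the base-period value."""
    return current_value / base_value * 100


# Hypothetical fertilizer prices per quintal, for illustration only
base_price = 1500.0     # price in the base/reference period
current_price = 1800.0  # price in the period being compared

price_index = simple_index(current_price, base_price)
print(f"Price index = {price_index:.1f} (base period = 100)")
```

An index above 100 indicates an increase relative to the base period, and an index below 100 a decrease.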
Statistical analysis …
3. Parametric tests
• Are statistical tests about the parameters (means, variances, and proportions) of the populations from which the samples were selected
• Statistical tests such as the z, t, and F tests are parametric tests
• Used to analyze interval and ratio scales
• Parametric statistics typically require
– Normality: interval or ratio variables with distributions shaped like the bell (normal) curve
– Some other assumptions: homogeneity of variance, randomness and independence
Kinds of distributions
• Probability distributions
– list all of the possible outcomes of a random variable (x) along with the probability associated with each outcome
• Frequency distributions summarize sample observations
– a frequency distribution is a listing of all of the possible outcomes of a variable that has been divided into classes
– along with the frequency associated with each class
Kinds of distributions…
Types of probability distributions
• Binomial distribution
– The outcomes of a binomial experiment (with 2
outcomes only) with their corresponding probabilities
• Poisson distribution
– A probability distribution used when a density of items is
distributed over a period of time
Types of probability distributions
• Standard Normal: A normal distribution in which the mean is 0 and the standard deviation (σ) is 1. It is denoted by Z
– Z-score: also known as z-value – a standardized score in which the mean is zero and the standard deviation is 1
• The z score tells how far a score is from the mean in standard deviation units
• The formula to convert any score (x) into its corresponding z score is $z = \frac{x - \bar{x}}{s}$, where s = sd; x = observation; $\bar{x}$ = mean of observations
• If the population parameters µ and σ are known, the z score can also be calculated as $z = \frac{x - \mu}{\sigma}$
Types of probability distributions
• The z score is a way of telling how far a score is from the mean in standard deviation units
Table 3.3. Yield (t/ha) of six inbred lines evaluated at Melkasa
______________________________
Yield (x)      $(x - \bar{x})^2$
______________________________
10             2.25
8              12.25
12             0.25
15             12.25
13             2.25
11             0.25
______________________________
Mean = 11.5    Sum = 29.5
______________________________
• Here, for x = 8, what will be the corresponding z score, using the above data?
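As a worked sketch of the question posed with Table 3.3, and assuming the sample standard deviation is computed with the n − 1 divisor (the notes do not state which divisor is intended), the z score for x = 8 follows from the table totals:

```latex
s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{29.5}{5}} \approx 2.43,
\qquad
z = \frac{x - \bar{x}}{s} = \frac{8 - 11.5}{2.43} \approx -1.44
```

That is, a yield of 8 t/ha lies about 1.44 standard deviations below the mean of the six lines.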
Types of probability distributions
• The z score is also used to make comparisons between different distributions
• Let the yield of a new variety be 78 q/ha at Koga, 67 q/ha at Geray, and 57 q/ha at Adiet upon evaluation with other varieties
• These values do not tell us anything about the performance of the new variety in relation to the rest of the varieties
• If these variables are normally distributed in the population, we can make direct comparisons by using the z score approach
• Let the overall mean of the varieties at each location be 75, 77 and 60 q/ha, respectively
Types of probability distributions
• Let the standard deviations at the locations be 6, 12 and 10 q/ha, in that order
• To compare the three locations, the z score should be calculated for each location
• At Koga: $z = (78 - 75)/6 = 0.50$
• At Geray: $z = (67 - 77)/12 \approx -0.83$
• At Adiet: $z = (57 - 60)/10 = -0.30$
• At which site does the new variety perform best?
Types of probability distributions…
Normal Distribution
• A continuous probability distribution that is symmetric about the mean: ½ of the values lie below the mean and ½ above it
• Much of the biological data we encounter is approximately normally distributed
• A normally distributed population has a continuous variable with an infinite range, written X ~ N(µ, σ²)
• It is a continuous, symmetric, bell-shaped distribution of the variable
• Sample means become approximately normally distributed as the sample size increases = Central Limit Theorem
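The Central Limit Theorem can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not part of the course material: it draws repeated samples from a right-skewed exponential population (mean 1, standard deviation 1) and shows that the sample means centre on the population mean while their spread shrinks towards σ/√n.

```python
import random
import statistics

random.seed(42)  # fixed seed so repeated runs give the same numbers


def sample_means(sample_size, n_samples=2000):
    """Return the means of n_samples random samples of the given size.

    The population is exponential with rate 1 (mean 1, sd 1), a strongly
    right-skewed distribution chosen purely for illustration.
    """
    means = []
    for _ in range(n_samples):
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


results = {}
for n in (2, 30):
    means = sample_means(n)
    results[n] = (statistics.mean(means), statistics.stdev(means))
    print(f"n = {n:2d}: mean of sample means = {results[n][0]:.3f}, "
          f"sd of sample means = {results[n][1]:.3f}, "
          f"sigma/sqrt(n) = {1 / n ** 0.5:.3f}")
```

Plotting a histogram of `means` for each sample size would show the shape moving from skewed towards bell-shaped as n grows.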
Kinds of distributions…
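Normal distribution…
• A distribution can be normal (symmetric) or skewed (positively or negatively)
• The normal density is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$, where µ = mean and σ = sd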
Distributions of sample means:
• A sampling distribution of sample means
– a distribution obtained by using means computed from random samples taken from a population
• If samples are not randomly selected with appropriate sizes => the sample means differ from the population mean
• These differences are caused by sampling error
• Sampling error is the difference between the sample measure and the corresponding population measure
– the sample is not a perfect representation of the population
Distributions of sample means…
• When the sample is representative of the population
1. The mean of the sample means = the population mean ($\mu_{\bar{x}} = \mu$)
2. The standard deviation of the sample means < the standard deviation of the population ($\sigma_{(s)} < \sigma_{(p)}$), since $\sigma_{(s)} = \sigma_{(p)} / \sqrt{n}$
– where,
– $\sigma_{(s)}$ = standard deviation of the sample means
– $\sigma_{(p)}$ = population standard deviation, and
– n = sample size
Kinds of distributions…
Sampling distribution…
• It describes the way in which a statistic (which is a function of the random variables X1, X2, …, Xn) will vary from one sample to another sample of the same size
• Such sampling distributions have opened the way to test statistics for hypothesis testing
• Sampling distributions and associated tests are:
1. Student t-distribution = t-test
2. F-distribution = F-test
3. Chi-square distribution = χ²-test
Sampling distributions ..
1. Student t-Distribution
• It is a probability distribution that arises when estimating the mean of a normally distributed population in situations where
– the sample size is small and
– the population standard deviation is unknown
• The t statistic is $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$, where
– n = sample size,
– $\bar{x}$ = sample mean,
– µ = population mean,
– s = sample standard deviation
• The t-distribution is used in testing
– the significance of sample means
– the difference between two sample means
– the sample correlation coefficient
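A one-sample t statistic, t = (x̄ − µ)/(s/√n), can be sketched as follows. The data are the Table 3.3 yields from these notes; the hypothesized population mean of 10 t/ha is a made-up value used only to illustrate the calculation, and the critical value is read from a standard t-table.

```python
import math
import statistics

yields = [10, 8, 12, 15, 13, 11]  # yield (t/ha) from Table 3.3 of these notes
mu_0 = 10.0                        # hypothetical hypothesized population mean

n = len(yields)
x_bar = statistics.mean(yields)
s = statistics.stdev(yields)       # sample standard deviation (n - 1 divisor)

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
df = n - 1

# Two-sided 5% critical value of t for 5 degrees of freedom (standard t-table)
T_CRIT_95_DF5 = 2.571
reject_h0 = abs(t_stat) > T_CRIT_95_DF5

print(f"t = {t_stat:.3f} with {df} df; reject H0 at 5%: {reject_h0}")
```

With these numbers the statistic falls below the critical value, so the hypothetical mean of 10 t/ha would not be rejected at the 5% level.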
Sampling distributions ..
1. Student t-Distribution…
• The types of research questions that can be addressed are:
– Is the mean of a single variable in a single group of individuals different from a particular hypothesized population value?
– Are the means of a single variable different between two different groups of individuals?
Example:
• Is there any difference in life expectancy between smokers and non-smokers?
• Is there a difference in wealth between technology adopters and non-adopters?
• Is there a yield difference between an improved variety and a local variety?
Sampling distributions ..
2. F-Distribution: developed and described by R.A. Fisher
• It is the probability distribution associated with the F statistic
• Let $s_1^2$ and $s_2^2$ represent the sample variances of two different populations
• If both populations are normal and the population variances are equal ($\sigma_1^2 = \sigma_2^2$), then the sampling distribution of $F = s_1^2 / s_2^2$, with $s_1^2 > s_2^2$, is called the F-distribution
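The variance ratio can be sketched in a few lines. The yields below for two varieties are hypothetical values invented for illustration; the larger sample variance is placed in the numerator, following the convention stated above.

```python
import statistics

# Hypothetical yields (t/ha) of two varieties, for illustration only
variety_a = [12.1, 14.3, 11.8, 15.2, 13.6]
variety_b = [12.9, 13.1, 13.4, 12.7, 13.0]

var_a = statistics.variance(variety_a)  # sample variances (n - 1 divisor)
var_b = statistics.variance(variety_b)

# Place the larger variance in the numerator so that F >= 1
if var_a >= var_b:
    f_ratio = var_a / var_b
    df_num, df_den = len(variety_a) - 1, len(variety_b) - 1
else:
    f_ratio = var_b / var_a
    df_num, df_den = len(variety_b) - 1, len(variety_a) - 1

print(f"F = {f_ratio:.2f} with d.f.N = {df_num} and d.f.D = {df_den}")
```

The computed F would then be compared with the tabulated F value for the same numerator and denominator degrees of freedom.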


Sampling distributions
Properties of F-distribution
1. Is determined by the degrees of freedom (DF) corresponding to the variance in the numerator (d.f.N) and the DF corresponding to the variance in the denominator (d.f.D)
2. F-distributions are positively skewed
3. The total area under each curve of an F-distribution = 1
4. F-values cannot be negative; when the larger variance is placed in the numerator, F ≥ 1
5. For all F-distributions, the mean value of F is approximately 1
Sampling distributions
2. F-Distribution…
• The F-distribution is used for testing different null hypotheses (H0) about population parameters
• It is used to test for equality of variances
• It is used for testing for differences in means among three or more groups
– It is the backbone of ANOVA
Sampling distributions
3. Chi-square (χ²) Distribution
• A well-known sampling distribution
• It is computed by $\chi^2 = \sum \frac{(O - E)^2}{E}$
– Where;
• O = observed frequency, and
• E = expected frequency
• The χ² test is applied when you have two categorical variables from a single population
Sampling distributions
3. Chi-square (χ²) Distribution…
• It is used to determine whether there is a significant association between two categorical variables
• Example: in a variety adoption survey, farmers are classified by
– Gender (male/female) and
– Variety preference (Improved, Local, or maturity class)
• The χ² test for independence is used to determine whether gender is related to variety preference
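The gender by variety preference example can be sketched as a χ² test of independence, with χ² computed as the sum of (O − E)²/E over all cells. The counts below are hypothetical survey results invented for illustration, restricted to two preference classes to keep the table small; the critical value for 1 degree of freedom is read from a standard χ² table.

```python
# Hypothetical survey counts, for illustration only.
# Rows: gender (male, female); columns: variety preference (improved, local).
observed = [
    [40, 20],  # male
    [25, 35],  # female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected frequency under independence: row total x column total / N
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# 5% critical value of chi-square for 1 degree of freedom (standard table)
CRITICAL_5PCT_DF1 = 3.841
associated = chi_square > CRITICAL_5PCT_DF1

print(f"chi-square = {chi_square:.3f} with {df} df")
print(f"Significant association at 5%: {associated}")
```

For these invented counts the statistic exceeds the critical value, so gender and variety preference would be judged to be associated.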
Sampling distributions
3. Chi-square (χ²) Distribution…
• The types of research questions that can be addressed are:
– Are two (or more) proportions in a single categorical variable different from hypothesized population values?
– Is there an association or dependence between downscaled and recorded RF?
– Are two (or more) proportions different from each other?
Basic statistical concepts…
Research can be
• Basic – for the sake of knowledge
• Applied – to solve an apparent problem
• Field – conducted in an uncontrolled environment
• Laboratory – conducted in a controlled environment
