----------------------------------------------------------------------------------------------------------------------------------------------------
Role and importance of statistics in analyzing assessment data; Population and Sample.
Data; Types of Data: Primary & Secondary, Quantitative & Qualitative.
Classification of Data; Frequency Table (Grouped & Ungrouped).
Graphical Representation of Data: need and importance; representing data using Bar Diagram and Pie
Diagram, Histogram, Frequency Polygon, Frequency Curve and Ogives; interpretation of graphical
representations.
Descriptive Statistical Measures:
Measures of Central Tendency: Mean, Median, Mode; concept and methods of finding each measure, and
when to use each measure.
Measures of Variability/Dispersion: Range, Mean Deviation, Quartile Deviation, Standard Deviation;
concepts and methods of finding each measure, and when to use each measure.
Correlation: meaning and importance; concept of Coefficient of correlation; Types of Correlation:
Positive, Negative, Zero and Perfect Correlation; Rank Difference Method of calculating Coefficient of
correlation; interpretation of correlation.
----------------------------------------------------------------------------------------------------------------------------------------------
The word statistics is derived from the Latin word ‘status’, which means a ‘political state’. It was originally
applied only to such facts and figures as the state required for its official purposes. Statistics is a body of
methods for making wise decisions in the face of uncertainty. It embodies a methodology for the collection,
classification, description and interpretation of data obtained through surveys and experiments. In recent
times statistics has come to be used in two senses: as numerical data and as statistical method. In the first
sense the word denotes numerical data, that is, numerical descriptions of the quantitative aspects of things,
which take the form of counts or measurements. In the second sense it refers to the principles and methods
used in the collection, analysis and interpretation of data.
Definition of statistics
“Statistics may be called the science of counting.” A. L. Bowley
“Statistics can be defined as the collection, presentation and interpretation of numerical data.” Croxton and
Cowden
Statistics as a subject or branch of knowledge is defined as the subject of study that helps us in
the scientific collection, presentation, analysis and interpretation of numerical facts.
“Aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated
or estimated according to reasonable standards of accuracy, collected in a systematic manner for a
predetermined purpose and placed in relation to each other.” Horace Secrist
The term statistics is used as a plural noun as well as a singular noun. In the plural sense it refers to
numerical data collected in a systematic manner with some definite aim or object in view. In the singular
sense it refers to the techniques and methods used in the collection, analysis and interpretation of data.
Characteristics
❖ Aggregate of facts
❖ Numerically expressed
❖ Affected to a marked extent by multiplicity of causes and not by a single cause
❖ Collected in a systematic manner
❖ Collected for a predetermined purpose
❖ Placed in relation to each other
❖ Maintained to a reasonable standard of accuracy
Functions (Steps of statistical analysis)
➢ Collection
➢ Classification
➢ Tabulation
➢ Analysis
➢ Interpretation
➢ Comparison
Importance of statistics
1. Statistics in business: Statistics is extensively used in modern business activities. A businessman must
make a proper analysis of past records to forecast future business conditions. Every businessman has to
make use of statistical tools to estimate the trend of prices and of economic activities.
2. Statistics and the state: Statistics are the eyes of the state, as they help in administration. The state
conducts the population census and collects data to estimate the national income and the prosperity of the
country.
3. Statistics in economic planning: In India various plans have been prepared and implemented. The
National Sample Survey Scheme was introduced to collect statistical data for use in planning.
4. Importance in defence and war: Statistical tools are very useful in the field of defence and war because
they help to compare the military strength of different countries in terms of manpower, tanks, warplanes,
missiles etc. They also help in planning the future military strategy of the country, in estimating the
losses due to war and in arranging war finance.
5. Importance in research: In the field of industry and commerce, research is carried out to find the
causes of variation in different products.
6. Importance in physical science: In sciences like physics, chemistry and botany, a large number of
measurements are taken which are found to vary from the actual results.
7. Statistical methods are vital in all educational problems: Books dealing with educational science,
educational articles in magazines and educational surveys are replete with statistics. A teacher who wants
to learn these matters must be familiar with statistical terminology.
8. Statistics in mathematics: The accuracy of conclusions based on statistical methods can be easily tested
and verified.
9. Study or comparison of groups of individuals: It is not possible to draw general conclusions merely
by examining a large set of individual scores. In such cases certain representative values or
norms have to be calculated.
Other uses of Statistics are
i. Statistics has developed powerful tools which enable us to make valid inferences regarding the
characteristics of a population by studying only a representative part of it, called a sample.
ii. A huge amount of quantitative information may be collected in reasonable time, at minimum expense
and with the desired degree of accuracy, using statistical methods.
iii. For a physician to test the effectiveness of a new drug.
iv. For a political commentator to make forecasts about a country at a future date.
v. For a sociologist to forecast the population of a country at a future date.
vi. To enable the investigator to find ratios, proportions etc.
Population: A population is the aggregate of all the units under study in any field of enquiry. It is a
collection of individuals or of their values which can be numerically specified. It is also called the
universe. A population can be finite or infinite.
Sample: A finite subset of a population, selected from it with the objective of investigating its
properties, is called a sample of that population. A sample is selected in such a manner that it represents
the population; it is a miniature model or replica of the population. The sample must be of sufficient
size to warrant statistical analysis.
Sampling: Sampling is the process by which a relatively small number of individuals, objects or events
(or measures of them) is selected and analyzed in order to find out something about the entire
population from which it was selected.
It helps to reduce expenditure, save time and energy, permit measurement of greater scope, and produce
greater precision and accuracy. Sampling procedures provide generalizations on the basis of a relatively
small proportion of the population.
Methods of sampling
Probability sampling or random sampling: Selection is based on a known probability of selection for each
item. It is also known as chance sampling.
Non-probability sampling: This is sampling which does not afford any basis for estimating the
probability that each item will be included in the sample.
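As a minimal sketch of probability sampling, the snippet below draws a simple random sample in which every unit has the same chance of selection. The function name `simple_random_sample`, the roll-number population and the seed are illustrative assumptions, not part of the notes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(list(population), n)

# Hypothetical population: roll numbers 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Fixing the seed is only for reproducibility in the example; in an actual survey the draw would normally not be seeded by hand.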
Differences between population and sample
Population | Sample
Refers to the collection of all elements possessing common characteristics, which comprises the universe. | Means a subgroup of the members of the population chosen for participation in the study.
The target population is the total group of individuals from which the sample might be drawn. | A sample is the group of people who take part in the investigation. The people who take part are referred to as “participants”.
Is always a large group. | Is a part of the population, so comparatively smaller.
Includes each and every unit of the group. | Includes only a handful of units of the population.
Data collection utilizes complete enumeration or census. | Data collection utilizes sample survey or sampling.
Focus is on identifying the characteristics. | Focus is on making inferences about the population.
DATA AND TYPES OF DATA
Statistics is the study of the collection, organization, analysis, interpretation and presentation of data.
The first step in statistical work is to obtain data. Data constitute the foundation of statistical analysis
and interpretation.
Data denotes raw facts and figures. Data can be defined as a collection of facts or information from
which conclusions may be drawn.
Selection Of Appropriate method For Collection Of Data
Nature and scope of enquiry
Availability of financial resources
Availability of time and money
Degree of accuracy desired
Status of the investigator
Education and level of the respondents
Classification of data
On the basis of who collects the data, data can be classified into two types.
Primary data: Primary data are those data which are collected for the first time and are original in
character. Primary data are in the shape of raw materials from which the investigator draws
conclusions by applying statistical methods of analysis and interpretation.
“By primary data we mean those data which are original, that is, those in which little or no grouping has
been made, the instances being recorded or itemized as encountered. They are essentially raw materials.”
- Horace Secrist
ADVANTAGES OF PRIMARY DATA
They are the first hand information
The data collected are reliable as they are collected by the investigator himself
The primary data are useful for knowing the opinions, qualities and attitudes of respondents
DISADVANTAGES OF PRIMARY DATA
Expensive and time consuming
Scope for personal bias
Selection of a representative sample is not an easy task
Methods Used For Collecting Primary Data
• Observation method
• Interview method
• Questionnaire method
• Schedule method
Secondary data: Secondary data are those which have been collected by some other person for his own
purpose and published. They are in the shape of finished products. “Secondary data are those already in
existence and which have been collected for some other purpose than the answering of the question in hand.”
- M. M. Blair
Advantages Of Secondary Data
The information can be collected at the least cost
The time required for obtaining the information is very short
Data are available in large quantities
It helps the researcher in defining the problem and formulating hypotheses
It helps in interpreting the primary data with more insight
Disadvantages Of Secondary Data
Inappropriate and inadequate
Inaccurate and unreliable
The secondary data may contain certain errors
Sources of secondary data
• External: personal and public
• Internal
• Official reports of central, state and local governments
• Official publications of foreign governments and international bodies like the UNO and its subordinate bodies
• Reports and publications of trade associations, banks, cooperative societies and similar semi-government
and autonomous organisations
• Publications of research organisations, centres and institutes, and reports submitted by
economists, research scholars etc.
• Technical journals, newspapers, books, periodicals etc.
Difference Between Primary And Secondary Data
Primary data | Secondary data
Primary data are original in character. | Secondary data are not original.
Primary data are in the form of raw material. | Secondary data are in the form of finished products.
The collection of primary data requires a large sum of money, energy and time. | Secondary data are easily available from secondary sources.
Primary data become secondary data after use. | Secondary data cannot be converted into primary data after use.
Precautions are not necessary in the use of primary data. | Precautions are necessary in the use of secondary data.
They can be collected by different methods, viz. the observation, interview, questionnaire and schedule methods. | They can be collected by copying from published and unpublished sources.
During the process of assessment or research a large amount of information is gathered, which can be
either qualitative or quantitative. On the basis of measurement, data can be classified into two types.
Qualitative data: Qualitative data is a categorical measurement expressed not in terms of numbers
but by means of a natural language description. When a person collects data in qualitative terms
the assessment is called qualitative. Qualitative observations are defined as any observations made using
the five senses. Because people often reach different interpretations when using only their senses,
qualitative evaluation is harder to reproduce with accuracy; two individuals collecting data
on the same thing may end up with different or conflicting results. In research and business,
qualitative data may involve value judgments and emotional responses. An example of
qualitative data is "Our company created more visually compelling projects last year than this year."
Qualitative data is more concerned with detailed descriptions of situations or performance; it can
therefore be much more subjective, but it can also be much more valuable in the hands of an experienced
person. Methods of qualitative data collection rely on descriptions rather than numbers. The data
collected are analyzed not by quantitative methods but by interpretive criteria. Informal
methods like observation, interview, field notes, diaries, document collection and anecdotes are used.
Examples: description of a procedure or skill demonstrated by a student (based on observation);
feedback on a demonstration or skill test, a case study or a written assignment.
Quantitative data: Quantitative data is a numerical measurement, expressed not by means of a
natural language description but in terms of numbers. When a person collects data in
quantitative terms the assessment is called quantitative. Quantitative observations are made using
scientific tools and measurements. The results can be measured or counted, and any other person trying
to quantitatively assess the same situation should end up with the same results. An example of a
quantitative evaluation would be "This year our company had a total of 12 clients and completed 36
different projects, an average of three projects per client." Quantitative assessment includes methods
that rely on numerical scores or ratings, and the data collected can be analyzed using quantitative
methods. The collection, analysis and interpretation of the data are all carried out in terms of numbers.
Quantitative data collection uses values from an instrument based on a standardized system, where the
data collected is limited to a selected or predetermined set of possible responses. The data is
collected using more formal methods like tests, questionnaires, inventories and rating scales.
Examples: number of correct responses on a test; ratings on an end-of-term course evaluation; number of
steps missed during a skill or procedure demonstration.
Quantitative data | Qualitative data
Collection and analysis of data in quantitative terms. Data are collected and analyzed in terms of numbers (numerical data). Raw data are numbers. | Collection and analysis of data in qualitative terms. Data are collected and analyzed in terms of descriptions (narrative data). Raw data are words.
More objective in nature | More subjective in nature
Uses numerical scores or ratings | Uses detailed descriptions of situations or performance
Can be considered an analytical approach | Can be considered a holistic approach
Uses more structured and well-constructed methods of data collection (formal and rigid) | Method of data collection is mostly unstructured (informal and flexible)
Response freedom is limited | More freedom of response
Objective scoring | Judgmental scoring
Easier analysis is possible and group generalizations can be reached | Difficult to analyze and arrive at generalizations
Gives insight into the child's cognitive and affective characteristics and skills | Gives insight into other behavioural characteristics
A person should utilize both qualitative and quantitative assessment for the complete evaluation of the
pupil. Both quantitative and qualitative data have their benefits, though one is usually more appropriate
than the other in any given situation. The two supplement each other. For example, a student's score on
an attitude scale can be corroborated by data collected through observation.
CLASSIFICATION OF DATA
Classification is a technique with the help of which the collected data are divided into various groups.
It helps to reduce the complexity of the data and to facilitate understanding, comparison, analysis and
interpretation.
A classification of data should be
1 Clearly understood.
2 Stable.
3 Flexible.
4 Clearly defined.
5 Such that qualities or attributes are expressed quantitatively.
TYPES OF CLASSIFICATION
• Geographical {population distribution}
• Qualitative {sex, colour, literacy etc.}
• Quantitative {height, weight, marks, income}
• Chronological {time period}
TABULATION OF DATA
“Tabulation is a process of an orderly arrangement of data in columns and rows.” - Blair
Tabulation of data is done for the systematic presentation of statistical data, to state the problem
briefly and simply, to facilitate interpretation, to present the data in the form of graphs, charts and
diagrams, and to help comparative study.
FREQUENCY DISTRIBUTION
A frequency distribution is an arrangement of the values that one or more variables take in a sample. Each
entry in the table contains the frequency of a value or class. A frequency distribution has a minimum of two
columns: the leftmost lists the values of the variable found in the data and the next gives the frequency of
each value.
TYPES OF FREQUENCY DISTRIBUTION
GROUPED FREQUENCY DISTRIBUTION: When there is a large number of scores, it is useful to group them
into a manageable number of intervals by creating intervals of equal width and computing the frequency
of scores falling into each interval. Such a distribution is called a grouped frequency distribution.
UNGROUPED FREQUENCY DISTRIBUTION: If the number of distinct values the variable takes is small,
classification can be done by preparing a table which has no classes and gives only the frequency of each
value. Such a table is called an ungrouped frequency distribution.
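The two kinds of table described above can be sketched in a few lines. The function names, the starting lower limit of 10, the class width of 5 and the scores are illustrative assumptions; classes are taken in the exclusive form (lower limit included, upper limit excluded).

```python
from collections import Counter

def ungrouped_frequency(scores):
    """Frequency of each distinct value, sorted by value."""
    return sorted(Counter(scores).items())

def grouped_frequency(scores, start, width):
    """Frequencies for equal-width classes [lower, lower + width).

    Assumes every score is >= start.
    """
    table = {}
    lower = start
    while lower <= max(scores):            # create classes up to the highest score
        table[(lower, lower + width)] = 0
        lower += width
    for s in scores:
        k = (s - start) // width           # index of the class containing s
        lo = start + k * width
        table[(lo, lo + width)] += 1
    return table

scores = [12, 15, 15, 18, 21, 24, 24, 24, 27, 33]   # hypothetical test scores
print(ungrouped_frequency(scores))
print(grouped_frequency(scores, start=10, width=5))
```

Changing `start` or `width` regroups the same scores into a different table, which is why the choice of class interval matters when two tables are compared.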
DISADVANTAGES
• If the frequency distribution is grouped, the identity of the individual observations is lost.
• The selection of the class interval and of the lower bound of the first class is to a certain extent
arbitrary, so different frequency tables into which the same data are classified may give contradictory
impressions.
GRAPHICAL REPRESENTATION OF DATA
Graphical representation of data means the pictorial representation and manipulation of data. A graphic
representation is the geometrical image of a set of data. It is a mathematical picture that enables us to think
about a statistical problem in visual terms. It is a creative process that combines art and technology to
communicate ideas. The graphic representation of data is an effective and economical device for the
presentation, understanding and interpretation of collected statistical data: complicated data can easily be
understood through a diagram or graph. Different types of graphs are used in data representation.
Some of them are listed below:
For ungrouped data or discrete data
• Line graph
• Bar graph
• Pie graph
• Pictogram
For grouped data
• Histogram
• Frequency Curve
• Frequency Polygon
• Ogive
b) Subdivided bar diagrams (Component Bar Chart). First a simple bar diagram is drawn with the lengths
of the bars proportional to the totals of the component parts. Each bar is then subdivided into parts of
length proportional to the component magnitudes, and each part is given a different colour or shading.
Used when the observations have different components and a comparison of the component parts is
needed.
c) Percentage bar diagrams. This is a modification of the subdivided bar diagram. Here the component
parts are expressed as percentages of the total and a component bar diagram is drawn with all bars
having equal length.
d) Multiple bar diagrams. Grouped bars are used to represent related sets of data. For
example, the imports and exports of a country are shown together in a multiple bar chart. Each bar in a
group is shaded or coloured differently for the sake of distinction. Used for representing two or more
interrelated sets of data to facilitate comparison.
e) Deviation bar diagrams. Used to represent net quantities like net profit, balance payable, deficit etc.
A base line is drawn horizontally in the middle of the paper. Positive values are indicated by bars of
proportional length drawn above the base line and negative values by bars of proportional length drawn
below it.
PIE DIAGRAM
• Pie diagrams or pie charts are circles drawn to represent statistical data. The data is represented
through the sectors of a circle, which bring out the relative importance of the various
components. To draw a pie diagram, we construct a circle of any diameter and divide it into
sectors. An angle of 360 degrees represents 100 per cent, so the angle for each component is found by
multiplying 360 degrees by the component's percentage share and dividing by 100.
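The angle calculation for a pie diagram can be sketched as follows. The function name `pie_angles` and the budget figures are hypothetical; each angle is the component's share of the total multiplied by 360 degrees.

```python
def pie_angles(components):
    """Return the sector angle (in degrees) for each component."""
    total = sum(components.values())
    return {name: value / total * 360 for name, value in components.items()}

# Hypothetical monthly budget, already expressed as percentages of the total.
budget = {"Food": 50, "Rent": 25, "Education": 15, "Other": 10}
angles = pie_angles(budget)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

Because the function divides by the total, it works equally for raw amounts and for percentages; the angles always add up to 360 degrees.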
HISTOGRAM
A histogram is a graphical display of a frequency distribution. The term histogram was coined by Karl
Pearson in 1895 for a common form of graphic representation. A histogram represents a continuous
frequency distribution through a special kind of vertical bar chart in which there are no
gaps between the bars. The scale on the x axis must be continuous, the upper boundary of one class coinciding
with the lower boundary of the next class. In a histogram, the class intervals should be in the exclusive form.
If the class intervals are in the inclusive form, they should be converted into the exclusive form.
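The conversion from inclusive to exclusive class intervals is usually done by splitting the gap between adjacent classes equally: half the gap is subtracted from each lower limit and added to each upper limit. The sketch below assumes equally spaced classes listed in ascending order; the function name `to_exclusive` is illustrative.

```python
def to_exclusive(classes):
    """Convert inclusive class limits, e.g. (10, 19), to exact boundaries.

    classes: list of (lower, upper) limits in ascending order.
    """
    gap = classes[1][0] - classes[0][1]    # e.g. 20 - 19 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in classes]

inclusive = [(10, 19), (20, 29), (30, 39)]
print(to_exclusive(inclusive))
```

After the conversion the upper boundary of each class coincides with the lower boundary of the next, so the histogram bars touch.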
FREQUENCY POLYGON
A frequency polygon is a graph of a frequency distribution. It is an improvement over the
histogram. It can be constructed either from a histogram or without drawing one. In the frequency
polygon, the midpoints of all the class intervals are taken and the frequencies corresponding to the midpoints
are marked. The points are joined by straight lines to get the frequency polygon.
LESS THAN OGIVE: In the less than ogive we start with the upper limits of the classes and go on adding the
frequencies. When these cumulative frequencies are plotted, we get a rising curve.
MORE THAN OGIVE: In the more than ogive we start with the lower limits of the classes and, beginning from the
total frequency, subtract the frequency of each class in turn. When these frequencies are plotted we get a
declining curve.
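The cumulative frequencies behind the two ogives can be computed as in this sketch. The function names and the class frequencies are hypothetical; the less-than values are plotted against the upper class limits and the more-than values against the lower class limits.

```python
from itertools import accumulate

def less_than_cumulative(freqs):
    """Running total of frequencies (plot against upper class limits)."""
    return list(accumulate(freqs))

def more_than_cumulative(freqs):
    """Total frequency minus the frequencies of all lower classes
    (plot against lower class limits)."""
    total = sum(freqs)
    below = [0] + list(accumulate(freqs))[:-1]
    return [total - b for b in below]

freqs = [2, 5, 8, 4, 1]                   # hypothetical class frequencies
print(less_than_cumulative(freqs))        # rises to the total frequency
print(more_than_cumulative(freqs))        # falls from the total frequency
```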
Measures of central tendency: In a large set of data we usually find that very few persons have very
high or very low scores. Most persons' scores lie between the
highest and the lowest scores. This tendency of the distribution to cluster around a middle value is
called central tendency, and the typical score around which most of the scores cluster, that is, the value
between the extreme scores that is shared by most of the persons, is referred to as a measure of central
tendency. It is a measurement that indicates where the middle of the data lies. Tate
(1955) defines a measure of central tendency as “a sort of average or typical value of the items in the
series and its function is to summarize the series in terms of this average value.” There are
three common measures of central tendency: the arithmetic mean (or mean), the median
and the mode.
Some of the common uses of a measure of central tendency are
Each of them is a representative characteristic of the whole group. The performance of the
group as a whole can be described by a measure of central tendency, in its own way.
They help in the comparison of two or more groups and samples in terms of their typical
performance.
They indicate where the center of the distribution tends to be located.
They tell us about the shape and nature of the distribution (for a normal distribution mean =
median = mode).
They give us a concise picture of large data.
They give a general picture of the whole group by use of the sample data alone.
They help to find the mathematical relationship between different groups.
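For grouped data the median is calculated using the formula
$$\text{Median} = l + \left[\frac{\frac{N}{2} - F}{f}\right] i$$
Where
l - Exact lower limit of the median class
F - Cumulative frequency up to the median class, i.e. the total frequency of all classes below it
f - Frequency of the median class
i - Class interval
N - Total frequency ($N = \sum f$)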
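A short sketch of the grouped-data median, using the symbols l, F, f, i and N defined above. It assumes the standard interpolation formula Median = l + ((N/2 - F) / f) × i, with classes given by their exact limits in ascending order; the function name and the data are hypothetical.

```python
def grouped_median(classes, freqs):
    """Median of grouped data by linear interpolation.

    classes: list of (exact lower limit, exact upper limit), ascending.
    freqs:   frequency of each class.
    """
    n = sum(freqs)                         # N, total frequency
    cum = 0                                # F, cumulative frequency below the class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum + f >= n / 2:     # this is the median class
            return lower + (n / 2 - cum) / f * (upper - lower)
        cum += f

classes = [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5)]
freqs = [2, 5, 9, 4]
print(round(grouped_median(classes, freqs), 2))
```

Here N = 20, so the median class is 29.5 to 39.5 with F = 7 and f = 9, and the median is 29.5 + (3/9) × 10.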
When to use median
Used to summarize ordinal scores, or highly skewed interval or ratio scores.
When the exact mid-point of the distribution is required, the median is computed.
When a series contains extreme measures, the median is a more representative measure than the mean.
In the case of open-ended distributions the mean cannot be computed, so the median is more
reliable.
When a measure of central tendency has to be found from a graph, the median is the most
suitable.
The median is used specifically for qualities like health, honesty and intelligence that cannot
be measured directly in quantities but can be ranked.
Advantages of Median
It is easily understood and determined and located with greater exactness than mode.
Median is a better measure of central tendency than mode.
Only one score can be the median.
It is the most representative measure of central tendency when the distribution contains extreme
scores.
It is useful in the case of open ended classes and skewed distributions.
It will always be around where the most scores are.
It can be calculated even if a value is missing if its relative position is known.
It can be computed from a graph.
Limitations of Median
It is a non-algebraic measure. We cannot calculate the total score or the combined median etc.
It is a less dependable measure of central tendency than mean.
It is not used in higher statistical analysis.
It cannot be used in the case of nominal data.
Mode: Mode is the value that occurs most frequently in a set of data. It is typically useful in describing
the central value when the scores reflect a nominal scale of measurement. It is the point on the scale
that corresponds to the maximum frequency of the distribution. In any series it is the value of the item
which is most characteristic or common and is usually repeated the maximum number of times.
For ungrouped data mode is the value repeating most or with highest frequency.
For grouped data mode is calculated using the formula
$$M_o = l + \left[\frac{f_s}{f_p + f_s}\right] i \quad \text{or} \quad M_o = l + \left[\frac{f_m - f_p}{2f_m - f_p - f_s}\right] i$$
Where, l - Exact lower limit of the modal class (the class in which the mode lies, i.e., the class
corresponding to the highest frequency)
fm - Frequency of the modal class
fp - Frequency of the class preceding the modal class (the next lower class)
fs - Frequency of the class succeeding the modal class (the next higher class)
i - Class interval
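The grouped-data mode can be sketched as below. The code assumes classes in ascending order, so that fp belongs to the next lower class and fs to the next higher class, and it uses the two standard forms Mo = l + (fs / (fp + fs)) × i and Mo = l + ((fm - fp) / (2fm - fp - fs)) × i. The function names and data are hypothetical, and a modal class at either end of the table is given a neighbouring frequency of zero.

```python
def _modal_parts(classes, freqs):
    """Return l, i, fm, fp, fs for the class with the highest frequency."""
    m = max(range(len(freqs)), key=freqs.__getitem__)    # index of modal class
    fp = freqs[m - 1] if m > 0 else 0                    # next lower class
    fs = freqs[m + 1] if m < len(freqs) - 1 else 0       # next higher class
    lower, upper = classes[m]
    return lower, upper - lower, freqs[m], fp, fs

def mode_simple(classes, freqs):
    """Mo = l + fs / (fp + fs) * i (undefined when fp + fs is zero)."""
    l, i, fm, fp, fs = _modal_parts(classes, freqs)
    return l + fs / (fp + fs) * i

def mode_by_differences(classes, freqs):
    """Mo = l + (fm - fp) / (2*fm - fp - fs) * i."""
    l, i, fm, fp, fs = _modal_parts(classes, freqs)
    return l + (fm - fp) / (2 * fm - fp - fs) * i

classes = [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5)]
freqs = [2, 5, 9, 4]
print(round(mode_simple(classes, freqs), 2))
print(round(mode_by_differences(classes, freqs), 2))
```

For these particular frequencies both forms give the same value; in general they are two different approximations and can differ slightly.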
When to use mode
In nominal data – Since we cannot use mean or median
Also in ordinal, interval or ratio data, along with mean and median
When a quick and approximate measure is to be determined, we compute mode.
Mode is a very useful measure in the manufacturing industry as the most sold item i.e., modal
value is given more priority.
When a histogram or frequency polygon is given, the measure that can be easily computed is
mode.
When we wish to know the most typical case.
Advantages of Mode
It is easily understood even by a common man.
The mode can easily be computed merely by looking at the data. All that one has to do is to find
the score which is repeated the maximum number of times.
It is an average widely used in everyday life. When we speak of average we generally refer to
mode e.g., average shoe size refers to that which is most sold.
It is useful in situations in which it is desirable to eliminate extreme cases.
It encourages attention to bimodal and multimodal distribution.
It can be computed from a graph.
Limitations of Mode
It is the most unstable measure of central tendency.
It is not at all reliable in small samples. E.g., if the modal salary of 50 workers is Rs. 500 per
month but 45 of them get different salaries, the mode is very unreal and gives a false
picture.
It is incapable of further algebraic treatment
A distribution can have more than one mode.
It is not used in higher statistical analysis.
Range (R)
Range is the simplest measure of variability or dispersion. It is calculated by subtracting the lowest
score from the highest score in the series or data. It takes only extreme scores into consideration and
ignores the variation of individual items.
Range = Highest value – Lowest value
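A minimal sketch of the range, with two hypothetical score lists chosen to show that it depends only on the two extreme scores:

```python
def value_range(scores):
    """Range = highest value - lowest value."""
    return max(scores) - min(scores)

clustered = [10, 50, 50, 50, 90]   # middle scores identical
spread = [10, 20, 50, 80, 90]      # middle scores widely spread
print(value_range(clustered), value_range(spread))
```

Both lists have the same range although their intermediate scores vary very differently.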
The computation of range is recommended when
We simply need to know the highest and lowest scores, i.e. the total spread.
The group or distribution is too small.
We want to know the variability within the group in very little time.
We require speed and ease in the computation of a measure of variability.
The distribution of the scores of the group is such that the computation of other measures of
variability is not of much use.
Merits of range
It is very easily determined and understood.
It is very useful as a supplementary measure. In addition to other measures it helps in the
description of data.
It is a moderately reliable measure in large unimodal samples.
It is a very simple measure of variability.
Demerits of Range
It is not a representative measure of variability.
It is based on only two extreme scores and tells nothing about the variation among other
intermediate scores.
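Quartile Deviation (Q)
Quartile deviation is half the difference between the third quartile (Q3) and the first quartile (Q1). For
grouped data the quartiles and the quartile deviation are calculated using the formulas
$$Q_1 = l_1 + \left[\frac{\frac{N}{4} - F_1}{f_1}\right] i, \qquad Q_3 = l_3 + \left[\frac{\frac{3N}{4} - F_3}{f_3}\right] i, \qquad Q = \frac{Q_3 - Q_1}{2}$$
Where
l1 - Exact lower limit of the Q1 class, l3 - Exact lower limit of the Q3 class
F1 - Cumulative frequency up to the Q1 class (total frequency of all classes below it)
F3 - Cumulative frequency up to the Q3 class (total frequency of all classes below it)
f1 - Frequency of the Q1 class, f3 - Frequency of the Q3 class
i - Class interval, N - Total frequency ($N = \sum f$)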
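The quartile deviation for grouped data can be sketched with the symbols defined above. The code assumes the standard interpolation formulas Q1 = l1 + ((N/4 - F1) / f1) × i and Q3 = l3 + ((3N/4 - F3) / f3) × i, with Q = (Q3 - Q1) / 2; the function names and the data are hypothetical.

```python
def grouped_quantile(classes, freqs, fraction):
    """Value below which the given fraction of cases lies (grouped data).

    classes: list of (exact lower limit, exact upper limit), ascending.
    """
    n = sum(freqs)
    target = fraction * n                  # N/4 for Q1, 3N/4 for Q3
    cum = 0                                # cumulative frequency below the class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f

def quartile_deviation(classes, freqs):
    """Q = (Q3 - Q1) / 2."""
    q1 = grouped_quantile(classes, freqs, 0.25)
    q3 = grouped_quantile(classes, freqs, 0.75)
    return (q3 - q1) / 2

classes = [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5)]
freqs = [2, 5, 9, 4]
print(round(quartile_deviation(classes, freqs), 2))
```

With N = 20, Q1 falls in the class 19.5 to 29.5 (F1 = 2, f1 = 5) and Q3 in the class 29.5 to 39.5 (F3 = 7, f3 = 9).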
The use of this measure is recommended when
The distribution is skewed, containing a few very extreme scores.
The measure of central tendency available is the median.
The distribution is truncated (irregular) or has some indeterminate end values.
We have to determine the concentration around the middle 50 per cent of the cases
The various percentiles and quartiles have been already computed.
Merits
It is more representative than the range as it is not dependent on the extreme values.
It is very easy to compute, to understand and to interpret.
It is the most useful measure of variability when the median is used as the measure of central tendency.
It is applicable even to frequency distributions which have unequal class intervals.
It is quite useful in small samples and when there are extreme measures in the distribution.
Demerits
25% of the scores fall below Q1 and 25% above Q3. Therefore Q1 and Q3 are measures of only
50% of the scores.
It is a non-algebraic measure and so less reliable than the standard deviation (SD).
Demerits of Mean Deviation
As it is based on all items, it may be inflated or depressed by a single extreme value which is very
high or very low.
As the signs are discarded and only absolute values are taken, it is not an algebraic measure and
so cannot be reliably used in further mathematical operations.
CORRELATION
In measures of central tendency and dispersion, our study was confined to one variable only. But we
often come across problems involving two or more variables, where the items of one variable bear some
relation to the items of the other variable or influence its values. Examples are rainfall and
agricultural yield, height and weight, and the ages of husband and wife. The term correlation is used to
indicate the relationship between two such variables, in which changes in the values of one variable are
accompanied by changes in the values of the other. Thus, if the demand for a commodity changes with a change
in its price, we would say that price and demand are related to each other. Correlation has been
described as “a connection or relationship between two or more things that is not caused by chance.”
Thus correlation analysis refers to the technique used in measuring the closeness of the relationship between
the variables.
L. R. Connor: “If two or more quantities vary in sympathy, so that movements in one tend to be accompanied by
corresponding movements in the other, then they are said to be correlated.”
A. M. Tuttle defined correlation as “an analysis of the co-variation of two or more variables”.
Importance of correlation
Most variables show some kind of relationship. For instance, there is a relationship between price
and supply, income and expenditure etc. With the help of correlation analysis we can measure the degree
of relationship in one figure.
It helps to ascertain the traits and capabilities of pupils while giving guidance or counselling.
Once we know that variables are closely related, we can estimate the value of one variable given the value
of another. This is done with the help of regression.
Correlation analysis contributes to the understanding of economic behaviour and aids in locating the
critically important variables on which others depend.
Progressive development in the methods of science and philosophy has been characterized by an increase
in the knowledge of relationships.
The effect of correlation is to reduce the range of uncertainty. A prediction based on correlation
analysis is likely to be more valuable and nearer to reality.
The coefficient of correlation is vital for all kinds of research work.
It helps in establishing the validity or reliability of an evaluation tool.
TYPES OF CORRELATION
Simple, partial and multiple correlation
The distinction between simple, partial and multiple correlation is based on the number of variables studied.
When the relationship between only two variables is studied, it is a case of SIMPLE CORRELATION.
When the relationship between any two out of three or more variables is studied, ignoring the effect of
the other related variables, it is a case of PARTIAL CORRELATION.
When the relationship between three or more variables is studied simultaneously, it is a case of MULTIPLE
CORRELATION.
Positive and negative correlation: A correlation may be positive or negative depending upon the direction of
change of the variables.
POSITIVE CORRELATION is one where the values of both the variables under study move in the same direction.
The data of a positive correlation, when plotted on graph paper, give an upward curve.
INTERPRETATION OF CORRELATION
By interpretation we mean judging how high a given coefficient of correlation is. Any coefficient
of correlation that is not zero and that is also statistically significant denotes some degree of relationship
between the two variables. As regards the strength of the relationship between the two variables, the
coefficient r does not directly give anything like a percentage. The
coefficient of correlation is an index number, not a measurement on a linear scale of equal units. There is
no denying the fact that correlation enables us to find the relationship between two variables. The value of
r reflects the strength of the relationship between the variables. The strength of the relationship
between the two variables can be described roughly as follows for various values of r:
less than .20 slight, almost negligible relationship
.20 to .40 low correlation
.40 to .70 moderate correlation
.70 to .90 high correlation
.90 to 1.00 very high correlation
It may be noted that the correlation may be either positive or negative, but in no case can the value
of r lie beyond plus or minus 1, that is, -1 ≤ r ≤ +1.
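The rough guide above can be turned into a small lookup. The function name `describe_correlation` is illustrative, and because the listed bands share their end points, the sketch assumes that a boundary value such as .40 belongs to the higher band.

```python
def describe_correlation(r):
    """Return (direction, strength) for a coefficient of correlation r."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)                          # strength ignores the sign
    if size < 0.20:
        strength = "slight, almost negligible relationship"
    elif size < 0.40:
        strength = "low correlation"
    elif size < 0.70:
        strength = "moderate correlation"
    elif size < 0.90:
        strength = "high correlation"
    else:
        strength = "very high correlation"
    direction = "positive" if r > 0 else "negative" if r < 0 else "zero"
    return direction, strength

print(describe_correlation(0.65))
print(describe_correlation(-0.92))
```

The sign of r gives only the direction of the relationship; the strength is read from its absolute value, and the labels remain rough descriptions rather than exact measurements.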