Week 10_1. On analysing Quantitative data

Quantitative data refer to all numerical data, or data that have been quantified, and can be a product of all research
strategies. They can range from simple counts such as the frequency of occurrences to more complex data such as test
scores, prices or rental costs.
To be useful these data need to be analysed and interpreted. Quantitative analysis techniques assist you in this
process. They range from creating simple tables or diagrams that show the frequency of occurrence and using
statistics such as indices to enable comparisons, through establishing statistical relationships between variables
to complex statistical modelling.

Quantitative data can be divided into two distinct groups: categorical and numerical.
Categorical data refer to data whose values cannot be measured numerically but can be either classified into
sets (categories) according to the characteristics that identify or describe the variable or placed in rank order.
They can be further sub-divided into descriptive and ranked. Descriptive data also include dichotomous data
(female-male, youngsters-adults, related-unrelated, English-non English, European-not European). Ranked (or
ordinal) data are a more precise form of categorical data. Rating or scale questions, such as where a respondent
is asked to rate how strongly she or he agrees with a statement, collect ranked (ordinal) data. Despite this, some
researchers argue that, where such data are likely to have similar size gaps between data values, they can be
analysed as if they were numerical interval data.
Numerical data, which are sometimes termed ‘quantifiable’, are those whose values are measured or counted
numerically as quantities. There are two possible ways of sub-dividing numerical data: into interval or ratio
data or, alternatively, into continuous or discrete data. If you have interval data you can state the difference or
‘interval’ between any two data values for a particular variable, but you cannot state the relative difference. This
means that values on an interval scale can meaningfully be added and subtracted, but not multiplied and
divided. In contrast, for ratio data, you can also calculate the relative difference or ratio between any two data
values for a variable (Figure 1).
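
The distinction between interval and ratio data can be illustrated with temperature, an example that does not come from the text above. The sketch below assumes two hypothetical readings in degrees Celsius (an interval scale, because its zero is arbitrary) and converts them to kelvin (a ratio scale, because its zero is absolute).

```python
# Hypothetical readings in degrees Celsius: an interval scale (arbitrary zero).
temps_celsius = [10.0, 20.0]

# Differences are meaningful on an interval scale.
difference = temps_celsius[1] - temps_celsius[0]

# The naive ratio suggests "twice as hot", which is not a valid statement.
naive_ratio = temps_celsius[1] / temps_celsius[0]

# Kelvin has a true zero, so it is a ratio scale and ratios are meaningful.
temps_kelvin = [t + 273.15 for t in temps_celsius]
true_ratio = temps_kelvin[1] / temps_kelvin[0]

print(f"Difference: {difference} degrees")
print(f"Naive Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio: {true_ratio:.3f}")
```

The 10-degree difference survives the change of scale, but the ratio does not, which is why multiplication and division are reserved for ratio data.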

Your initial analysis should explore data using both tables and diagrams. Your choice of table or diagram will
be influenced by your research question(s) and objectives, the aspects of the data you wish to emphasise, and
the scale of measurement at which the data were recorded. This may involve using:
– tables to show specific values;
– bar charts, multiple bar charts, histograms and, occasionally, pictograms to show highest and lowest values;
– line graphs to show trends;
– pie charts and percentage component bar charts to show proportions;
– box plots to show distributions;
– scatter graphs to show relationships between variables.
(see examples)
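
A table that shows specific values can be as simple as a frequency table. The following is a minimal sketch that assumes a hypothetical list of answers to one categorical question; the function name `frequency_table` is illustrative only.

```python
from collections import Counter

# Hypothetical responses to a single categorical question.
responses = ["agree", "disagree", "agree", "neutral", "agree", "disagree"]


def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]


rows = frequency_table(responses)
for category, count, percent in rows:
    print(f"{category:<10}{count:>5}{percent:>8.1f}%")
```

The same counts could feed a bar chart (highest and lowest values) or a pie chart (proportions), depending on what you wish to emphasise.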

All data should, with few exceptions, be recorded using numerical codes to facilitate analyses. Where possible,
you should use existing coding schemes to enable comparisons. For primary data you should include pre-set
codes on the data collection form to minimize coding after collection. For variables where responses are not
known, you will need to develop a codebook after data have been collected for the first 50 to 100 cases. You
should enter codes for all data values, including missing data. Coding in this way has several benefits:
– you save time;
– existing coding schemes are normally well tested;
– you can compare your results with those of other (often larger) surveys;
– you can check for errors more easily;
– you can record why data are missing, so that gaps do not undermine the whole research project. Data may be
missing because:
• The data were not required from the respondent, perhaps because of a skip generated by a filter question.
• The respondent refused to answer the question (a non-response).
• The respondent did not know the answer or did not have an opinion. Sometimes this is treated as implying an
answer; on other occasions it is treated as missing data.
• The respondent may have missed a question by mistake, or the respondent’s answer may be unclear.
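
A codebook with separate codes for each kind of missing data might be sketched as follows. The question, the code values and the function name `code_response` are all hypothetical and chosen for illustration; an existing coding scheme should be preferred where one is available.

```python
# Hypothetical codebook for one rating question; all codes are illustrative.
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "agree": 3,
    "strongly agree": 4,
}

# Separate (hypothetical) codes for each reason data may be missing.
MISSING_CODES = {
    "not applicable": -1,  # skipped because of a filter question
    "refused": -2,         # non-response
    "don't know": -3,      # no answer or no opinion
}
UNCLEAR = -4               # question missed by mistake or answer unclear


def code_response(raw):
    """Translate a raw answer into its numerical code."""
    if raw is None:
        return UNCLEAR
    answer = raw.strip().lower()
    if answer in CODEBOOK:
        return CODEBOOK[answer]
    return MISSING_CODES.get(answer, UNCLEAR)


raw_answers = ["Agree", "refused", None, "Strongly agree", "illegible scrawl"]
coded = [code_response(a) for a in raw_answers]
valid = [c for c in coded if c > 0]
print(coded)
print(f"{len(valid)} valid of {len(coded)} cases")
```

Keeping the missing-data codes negative makes it easy to exclude them from calculations while still recording why each value is absent.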

Subsequent analyses will involve describing your data and exploring relationships using statistics. As before,
your choice of statistics will be influenced by your research question(s) and objectives and the scale of
measurement at which the data were recorded. Your analysis may involve using statistics such as:
– the mean, median and mode to describe the central tendency;
– Chi square to test whether two variables are significantly associated, with Cramér’s V and Phi to measure the
strength of that association;
– Kolmogorov-Smirnov to test whether the values differ significantly from a specified population;
– t-tests to test whether the means of two groups differ significantly;
– Levene’s test to test whether the variances of groups differ significantly;
– ANOVA to test whether three or more groups differ significantly, or the Friedman test to detect
differences in treatments across multiple test attempts;
– correlation and regression to assess the strength of relationships between variables;
– regression analysis to predict values;
– descriptive analysis, simply to describe the variables.
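
Two of the statistics listed above can be sketched with the Python standard library. The scores and the 2 x 2 table below are hypothetical data, and the helper names `chi_square` and `cramers_v` are illustrative. The chi-square function computes only the test statistic (without a continuity correction); judging significance would still require comparing it with the chi-square distribution.

```python
import math
import statistics

# Hypothetical test scores for ten respondents.
scores = [52, 61, 61, 64, 70, 70, 70, 75, 82, 95]
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
print(f"mean={mean}, median={median}, mode={mode}")


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V: strength of association, from 0 (none) to 1 (perfect)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (k - 1)))


# Hypothetical cross-tabulation of two dichotomous variables.
table = [[30, 10],
         [20, 40]]
chi2 = chi_square(table)
v = cramers_v(table)
print(f"chi-square={chi2:.3f}, Cramér's V={v:.3f}")
```

For a 2 x 2 table such as this one, Cramér's V equals the absolute value of Phi.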

When designing your diagrams and tables, consider:

– Does it have a brief but clear and descriptive title?
– Are the units of measurement used stated clearly?
– Are the sources of data used stated clearly?
– Are there notes to explain abbreviations and unusual terminology?
– Does it state the size of the sample on which the values in the table are based?

For diagrams specifically:

– Does it have clear axis labels?
– Are bars and their components in the same logical sequence?
– Is more dense shading used for smaller areas?
– Have you avoided misrepresenting or distorting the data?
– Is a key or legend included (where necessary)?

For tables specifically:

– Does it have clear column and row headings?
– Are columns and rows in a logical sequence?

One of the questions you are most likely to ask in your analysis is: ‘How does a variable relate to another
variable?’ In statistical analysis you answer this question by testing the likelihood of the relationship (or one
more extreme) occurring by chance alone, if there really was no such relationship in the population from which the
sample was drawn. There are two main groups of statistical significance tests: non-parametric and parametric.
Non-parametric statistics are designed to be used when your data are not normally distributed. Not
surprisingly, this most often means they are used with categorical data. Non-parametric data are those which
make no assumptions about the population, usually because the characteristics of the population are unknown.
In contrast, parametric statistics are used with numerical data. Parametric data assume knowledge of the
characteristics of the population, so that inferences (implications) can be made securely; they often
assume a normal, Gaussian curve of distribution, as in reading scores. Non-parametric data are often
derived from questionnaires and surveys (though these can also gain parametric data), while parametric data
tend to be derived from experiments and tests (e.g. examination scores). Although parametric statistics are
considered more powerful because they use numerical data, a number of assumptions about the actual data
being used need to be satisfied if they are not to produce false results (Blumberg et al. 2008). These include:

• the data cases selected for the sample should be independent, in other words the selection of any one case for
your sample should not affect the probability of any other case being included in the same sample;
• the data cases should be drawn from normally distributed populations;
• the populations from which the data cases are drawn should have equal variances;
• the data used should be numerical.
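
The idea of asking how likely a result "or one more extreme" is by chance alone can be made concrete with an exact permutation test. This is a non-parametric technique that the text above does not name; it is used here only as an illustrative sketch on two small hypothetical groups. It reshuffles the pooled values into every possible pair of groups and counts how often the difference in means is at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores for two small, independent groups.
group_a = [12, 15, 14, 16]
group_b = [9, 11, 10, 13]


def permutation_p_value(a, b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    as_extreme = 0
    total = 0
    # Enumerate every way of assigning len(a) of the pooled cases to group A.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        new_a = [pooled[i] for i in chosen]
        new_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(new_a) - mean(new_b)) >= observed:
            as_extreme += 1
        total += 1
    return as_extreme / total


p_value = permutation_p_value(group_a, group_b)
print(f"p = {p_value:.4f}")
```

A small p-value indicates that a difference this large would rarely arise by chance if the group labels were unrelated to the scores. Unlike a t-test, this approach does not rely on normally distributed populations, but it still assumes that the cases are independent.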

Qualitative data are non-numerical data that have not been quantified. They result from the collection of non-
standardised data that require classification and are analysed through the use of conceptualisation. Qualitative
analysis generally involves one or more of: summarising data, categorising data and structuring data using
narrative to recognise relationships, develop and test propositions and produce well-grounded conclusions. It
can lead to reanalysing categories developed from qualitative data quantitatively.

The processes of data analysis and data collection are necessarily interactive. There are a number of aids that
you might use to help you through the process of qualitative analysis, including temporary summaries, self-
memos and maintaining a researcher’s diary. Qualitative analysis procedures can be related to using either a
deductively based or an inductively based research approach. The use of computer-assisted qualitative data
analysis software (CAQDAS) can help you during qualitative analysis with regard to project management and
data organisation, keeping close to your data, exploration, coding and retrieval of your data, searching and
interrogating to build propositions and theorise, and recording your thoughts systematically.

The differences between quantitative and qualitative data:

Quantitative data                                Qualitative data

Based on meanings derived from numbers           Based on meanings expressed through words

Collection results in numerical and              Collection results in non-standardised data
standardised data                                requiring classification into categories

Analysis conducted through the use of            Analysis conducted through the use of
diagrams and statistics                          conceptualisation

Ways of collecting data and methods of representing them, according to the purpose of the variables under
discussion.

[Example diagrams: Gaussian curve of distribution; pictogram; line graph; pie chart; scatter graph; percentage component bar chart; stacked bar chart]

The correlation coefficient enables you to quantify the strength and direction of the linear relationship between two variables.
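
As a minimal sketch, Pearson's correlation coefficient can be computed directly from its definition. The paired values below (hours of study and test scores) are hypothetical, and the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance scaled by the spread of both variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired observations for five respondents.
hours = [1, 2, 3, 4, 5]
test_scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, test_scores)
print(f"r = {r:.3f}")
```

The coefficient ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship).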

Ways of collecting data and methods of representing them, according to the result or outcome sought
from the variables under discussion.