
Topic: Data Analysis

Data Analysis

Topic Preview
The purpose of this section is to consider how to approach data
analysis by reflecting on the following points:
• how to organise, analyse and represent qualitative data
• counting, categorising, relating and predicting quantitative data

Topic Content

Introduction

Your choice of data analysis is connected directly to your data collection
methods. If you chose qualitative methods such as interviews, you will
need to analyse and present the data using qualitative analysis
techniques. Likewise, if you used quantitative collection techniques such
as questionnaires, you should use quantitative analysis techniques.
In terms of how your results should look, remember that if you collected
‘words’ (interviews, focus groups), you should be analysing and
presenting ‘words’. If you collected ‘numbers’ (questionnaires, surveys,
secondary statistics), your results should be analysed and presented as
‘numbers’ as well. The following is a brief description of some of the major
qualitative and quantitative data analysis techniques.

Qualitative Data Analysis

The qualitative data analysis process can be broken down into three
stages: reduction, organisation and interpretation of the data.

Data Reduction
Qualitative research provides a vast amount of data which needs to be
reduced into a manageable form. This can be done by coding the data.
Open interviews require a detailed coding system which allows for the
richness of the data to be processed. Coding is when key words or short
two or three letter codes are used to represent a theme. The researcher
methodically works through the transcript or copy of notes, line by line or
paragraph by paragraph, and assigns the codes in the margin of the text.
This enables the researcher to begin to see patterns, categories or
themes.

Data Organisation
The organisation part of the process is the systematic collating of coded
bits of data on the same theme from across all the interviews. If your
records are electronic then it is simply a matter of copying (not cutting)
and pasting the relevant bits of text within a new themed document.

© The Robert Gordon University 2018 1



Again, this is a laborious task but relatively simple if the coding has been
carried out efficiently.
There are a limited number of computer packages which help with
analysis of coded qualitative data by pulling together text associated with
each code. However, the researcher still has the laborious task of
examining the text in detail to assign the codes. Packages like this
require time to get to grips with and most researchers, particularly at the
Masters stage, find MS Word adequate.
If the notes or transcripts are on paper then the process is more difficult
and is carried out by copying and then cutting the paper or by rewriting
the information on another thematic sheet. Any manual rewriting like
this of course introduces the chance of error and care must be taken to
do a spot check to ensure that copying has been done accurately.
These new thematic documents of verbatim quotes and notes provide the
basis for the next stage - analysis or interpretation.
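
If your coded extracts are held electronically, the collating step described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the theme codes ("JS", "PAY"), the interview identifiers and the quotes are all invented for the example.

```python
from collections import defaultdict

# Hypothetical coded segments: (interview id, theme code, verbatim quote).
# The codes "JS" (job satisfaction) and "PAY" and all quotes are invented.
coded_segments = [
    ("interview_01", "JS", "I mostly enjoy the work itself."),
    ("interview_01", "PAY", "The salary has not moved in years."),
    ("interview_02", "JS", "Some days I dread coming in."),
    ("interview_03", "JS", "The team makes the job worthwhile."),
]


def collate_by_theme(segments):
    """Group verbatim quotes by theme code, keeping the source interview."""
    themes = defaultdict(list)
    for interview_id, code, quote in segments:
        # Copy (never move) the quote, so the original transcript stays
        # intact and each quote can be traced back to its interview.
        themes[code].append((interview_id, quote))
    return dict(themes)


thematic_documents = collate_by_theme(coded_segments)

# Print one "thematic document" per code.
for code, quotes in thematic_documents.items():
    print(code)
    for interview_id, quote in quotes:
        print(f"  [{interview_id}] {quote}")
```

Keeping the interview identifier next to each quote is what makes later cross-referencing to the original data possible.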

Interpretation and Presentation


This is the point at which you begin really analysing what is actually
being said. Each thematic document will contain quotes and notes from
across all the interviews which have been identified as relating to one
theme, for example "job satisfaction". Are these quotes saying the same,
similar or different things about job satisfaction? What do the quotes say
about employee job satisfaction in general or within each company or
type of position within companies?
The next consideration is how the data is presented to your reader and
how your interpretation is presented in a meaningful and clear manner.
Qualitative analysis is usually presented textually under thematic headings,
which are likely to follow the framework of your coding categories. This
text is then illustrated with one or two examples from the original data in
the form of quotes. A meaningful quote enables the reader to see the
evidence for your interpretation. At this stage you are likely to use cross-
referencing to check the original quote. Quotes should be verbatim, even
if they are grammatically incorrect. However, if you need to fill a gap to
make the statement clearer then the convention is that the word or
phrase is inserted within square brackets [ ] to indicate that this was not
in the original quote. It is important to think carefully before inserting
words to ensure that the real, intended sense is not compromised and
that your words are clearly identified with square brackets.
Sometimes in qualitative analysis it is possible and desirable to quantify
or semi-quantify data. For example, you might be in a position to say that
50% of employees interviewed expressed a positive reaction to an issue,
25% were negative and the remainder were neutral in their responses.
Again, it is useful to illustrate this with quotes, perhaps representing both
sides of the issue.
However, the majority of the findings will be written up thematically and
discursively, indicating how you have interpreted the findings and what
they mean in relation to the research problem or question. Even at this
stage, patterns and trends will begin to emerge that enable a different
perspective or new approach to the subject. If new issues or patterns
begin to emerge at this advanced stage, it is important to go back
through the data to ensure that the evidence associated with the issue
has been fully explored. In other words, the cyclical process begins
again. This will require time towards the end of the study, so ensure you
allow for this in the planning stage. The discussion section of the
dissertation will be where the implications for the subject as a whole are
written up and presented in the light of previous research or future
development.

Quantitative Data Analysis

Types of Quantitative Data


It is very important when applying any kind of statistical calculation to
consider the type of data you are dealing with. There are three types of
data:
• numerical (cardinal or interval) - data which has an absolute
  numeric value, and the difference or interval between two values is
  a meaningful measure itself, e.g. age, salary, weight
• nominal - data categories which have no actual numerical value
  (although you can code them as numbers for the purpose of
  analysis) and where the attributes are identified by name, e.g.
  gender, country of residence, etc.
• ordinal - data categories which have no absolute numerical value but
  where there is a scale or ranking order amongst the categories, e.g.
  priorities on a scale of 1 to 5.

Statistical calculations need to be appropriate to the type of data. For
example, it might be very interesting to calculate the highest, lowest and
average salary or age of respondents within a sample. However, it would
be meaningless to calculate the average gender or country of origin.
To sum up, basic descriptive statistics can be used to analyse ranges,
frequencies, and variation of data. This is often quite sufficient to enable
patterns in the data to be explored and conclusions drawn. Many
researchers, especially at the postgraduate stage, find that they do not
need more complex statistics. The output from such calculations can be
shown as tables and graphs – useful to the researcher when interpreting
the data and when reporting and writing up the research. However, it is
important to think carefully about the types of data you are dealing with
before applying even the most basic statistical calculations.
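
As a rough illustration of matching the calculation to the type of data, the sketch below uses Python's standard `statistics` module. The sample values are invented. Using the mode for nominal data and the median for ordinal data is a common convention that we assume here, not something this topic prescribes.

```python
import statistics

# Hypothetical sample; all values are invented for illustration.
ages = [21, 20, 20, 25, 31]                        # numerical
countries = ["UK", "France", "UK", "India", "UK"]  # nominal
priorities = [1, 3, 3, 4, 5]                       # ordinal (1 to 5 ranking)

# Numerical data: lowest, highest and average are all meaningful.
numerical_summary = {
    "lowest": min(ages),
    "highest": max(ages),
    "mean": statistics.mean(ages),
}

# Nominal data: an "average country" is meaningless, but the most
# frequent category (the mode) can be reported.
nominal_summary = {"most common": statistics.mode(countries)}

# Ordinal data: the categories can be ranked, so the middle value
# (the median) is a sensible summary.
ordinal_summary = {"median": statistics.median(priorities)}

print(numerical_summary)
print(nominal_summary)
print(ordinal_summary)
```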

Preparation for quantitative data analysis


Before quantitative data can be processed, particularly if a computer
package such as SPSS or Excel is used, there are various steps that need
to be taken to ensure the data is in a form that is compatible with
electronic processing. In order to apply statistical procedures (which will
be discussed at great length in other topics within this module) or to use
a spreadsheet and statistical package to produce tables and graphs, it is
important to ensure that the data is in a numeric form.
While you may be dealing with some data which is already in a numeric
form (for example age, salary, frequency, etc.), a typical questionnaire
will also contain much data which is not in pure numeric form (e.g.,
information about the respondent’s country of birth, gender, employment
status, preferences, strength of agreement with various statements,
priorities, etc.). Such data needs to be converted to a series of numbers
by a process known as coding.
To pre-code responses, the researcher assigns each answer to a
particular question a number that can then be used to represent that
statement during analysis. This allows the researcher to numerically
manipulate non-numeric data. For example:
Exams should be abolished from all university courses

Strongly agree 1
Agree 2
Disagree 3
Strongly Disagree 4

The codes are visible to the respondent who may be asked to circle the
relevant number rather than ticking an empty box. The codes have no
scalar meaning – they are simply code numbers which can be input to a
stats package for the purpose of calculating the frequency of the various
responses in the sample.
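
If the answers are typed up before analysis, a code list like the one above can be applied with a simple lookup. The sketch below is a minimal illustration: the function name `code_response` and the sample answers are hypothetical, and returning `None` for a blank or unrecognised answer is our own assumption about how to handle missing data.

```python
# Code numbers as printed on the questionnaire in the example above.
# Keys are lower-cased so that capitalisation differences do not matter.
ATTITUDE_CODES = {
    "strongly agree": 1,
    "agree": 2,
    "disagree": 3,
    "strongly disagree": 4,
}


def code_response(answer):
    """Return the code number for an answer, or None if it is not recognised."""
    return ATTITUDE_CODES.get(answer.strip().lower())


# Hypothetical raw answers, including stray spaces and one blank response.
raw_answers = ["Agree", "strongly agree ", "Disagree", ""]
coded = [code_response(answer) for answer in raw_answers]
print(coded)
```

The codes remain labels only: they are suitable for counting how often each response occurs, not for arithmetic such as averaging.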
However, codes can be confusing for respondents who may assign a
significance to the codes which does not exist. In the above example, a
respondent might be tempted to assume that there was some kind of
scalar significance to the codes, and that “strongly disagree” was
somehow considered to be 4 times as important as the “strongly agree”
option. To avoid any chance of misunderstanding of the significance of
the codes, many researchers prefer to assign codes to answers after the
return of questionnaires. This is called post-coding. With post-coding
the respondent is unaware of any code numbers in the questionnaire and
questions would typically invite the respondent to tick a box:
How frequently on average do you use the library?
(Please Tick)
Daily          [ ]
Once a week    [ ]
Once a month   [ ]
Annually       [ ]
Never          [ ]


Upon return of the completed questionnaire the researcher would
manually go through the responses coding the answers. Codes are
usually written in alongside the answer, in a column left blank for that
purpose (often marked “for office use only”). A coding schedule or
template should be set up to aid the post-coding process (i.e. a master
questionnaire on which the code number assigned to each possible
answer is recorded). For example:

Master Template
Do you get information from any of the following sources? (Tick any
that apply)
                                           For office use
School Policy resource collections               1
Personal buying, borrowing and reading           2
Other teachers                                   3
Schools Library Service                          4
School Librarian(s)                              5
Educational Librarians                           6

Sometimes questionnaires contain a mixture of pre- and post-coding:


Q 18. Please indicate the quality of communication between the
specified groups on a scale of 1-4 (where 1=low and 4=high):

                                                 1   2   3   4   For office use
You and other teachers                           o   o   o   o         1
You and the library                              o   o   o   o         2
You and pupils                                   o   o   o   o         3
You and senior management                        o   o   o   o         4
Your primary and associated secondary schools    o   o   o   o         5

In this example there is a scalar relationship between optional answers -
the respondent is being asked to indicate the quality of communication
on a scale of 1-4 - and this can be made clearly visible to aid the
respondent. However, each sub-category within this question may also
be coded for ease of identification within the spreadsheet or stats
package and the researcher may wish to post-code these using the
right-hand column.


Open questions may also be coded for analysis, but in this case the
procedure follows that for qualitative data, although the themes
emerging could easily be assigned a numeric code if you want to
calculate the frequency with which the various themes arose within the
sample.
When coding questionnaires, it is good practice to code up answers to
one question across the whole sample of returned questionnaires before
moving on to code the next question across the sample. In other words,
do not be tempted to code a complete questionnaire at a time, where you
will have to move between questions and run the risk of errors occurring
as your eye moves between different sets of codes on the template.
As with coding, data is usually input into a spreadsheet or stats package
on a question by question basis. Data is input into a series of cells. Each
column represents the answers to one question, while each row
represents the answers from one respondent. For example:
Identity   Q1 (Gender)   Q2 (Age)   Q3 (Attitude)   etc
1          2             21         4
2          2             20         3
3          1             20         1
4          1             25         1
etc
Table 1: Data Input
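
The same layout (one row per respondent, one column per question) carries over directly to a plain CSV file, which most spreadsheets and stats packages can open. The sketch below uses Python's standard `csv` module with the four rows of Table 1. The column names are our own assumption about how the table headings pair up, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Column names are assumed from the headings of Table 1.
header = ["identity", "q1_gender", "q2_age", "q3_attitude"]

# The four respondents shown in Table 1: one row per respondent.
rows = [
    [1, 2, 21, 4],
    [2, 2, 20, 3],
    [3, 1, 20, 1],
    [4, 1, 25, 1],
]

# Write the data as CSV text (a real study would write to a file on disk).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read it back: each record is a dict keyed by column name.
records = list(csv.DictReader(io.StringIO(csv_text)))

# A column holds every respondent's answer to a single question.
ages = [int(record["q2_age"]) for record in records]
print(ages)
```

Note that values read back from a CSV file are text, so numerical columns need converting (here with `int`) before any calculation.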
The quality and reliability of the analysis depends on accuracy at all
stages - the more stages of manipulation and transcription of data the
more likelihood there is of errors occurring. It is important to carry out
spot checks on the use of codes, especially if you are manually post-
coding the data, to ensure that codes have been assigned accurately.
You might, for example, check every fifth or tenth questionnaire for
accuracy of coding depending on the size of the sample.
Similarly, it is important to conduct spot checks on the accuracy of data
input if you are using a statistical package or spreadsheet. Input the
data and go back and check the accuracy against a sample of the original
questionnaires. As with qualitative data, always make sure that you can
cross-reference to the original data.
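
A spot check of this kind can be sketched as follows, assuming both the original codes and the typed-in data are available as lists keyed by questionnaire number. The function name `spot_check`, the data and the deliberate typing error are all hypothetical.

```python
def spot_check(entered, originals, every=5):
    """Compare every `every`-th entered record with the original coding.

    Both arguments map a questionnaire number to its list of codes.
    Returns the questionnaire numbers in the sample that do not match.
    """
    ids = sorted(entered)
    sample = ids[every - 1::every]  # the 5th, 10th, 15th, ... questionnaire
    return [qid for qid in sample if entered[qid] != originals.get(qid)]


# Hypothetical data: 20 questionnaires with four coded answers each.
originals = {qid: [1, 2, 20 + qid % 5, 3] for qid in range(1, 21)}

# Simulate data entry, with one typing error in questionnaire 10.
entered = {qid: list(codes) for qid, codes in originals.items()}
entered[10][2] = 99

mismatches = spot_check(entered, originals, every=5)
print(mismatches)
```

A check like this only detects errors in the sampled questionnaires, so if it finds mismatches it is worth widening the sample.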

Counting and Categorising


The first level of analysis of quantitative data involves looking at the
range and variation within the data:
• How frequently do the various categories occur in the data set?
• How variable is the data?

Some basic calculations can be done manually at this stage,
provided the data set is not too large. With pencil and paper, simple
tables can be set up to score or tally up the frequency and range of
responses. For example, the following table shows how the frequency of
the various responses to a question on abolishing exams can be
calculated by scoring each response as a tally mark (tick or “1”),
counting up the total number of occurrences of each response category,
and calculating the frequency of each category as a percentage of the
total number of responses to the question as a whole.

Attitudes to abolishing exams
_____________________________________________________
Attitude            Tally/scores               Frequency
_____________________________________________________
strongly agree      11111 11111 1              11 (22.4%)
agree               11111 11111 11111 1111     19 (38.8%)
disagree            11111 11111 11111 11       17 (34.7%)
strongly disagree   11                          2 (4.1%)
_____________________________________________________
Total                                          49

This immediately allows the researcher to describe the data using simple
statistics in a concise way - for example, we can see that most
respondents either agreed (19) or disagreed (17), a smaller group strongly
agreed (11), and very few strongly disagreed (2). While the overall trend
is towards agreeing (30 of 49, about 61%, agreed or strongly agreed), as
many as 35% (17/49) disagree.
The data can be presented in a research report as a table (see Table 2
below) or as a graph (e.g., a simple bar chart).

                      Frequency
Attitude              Number (n = 49)   Percentage
Strongly agree        11                22.4
Agree                 19                38.8
Disagree              17                34.7
Strongly disagree      2                 4.1
Table 2: Attitude towards abolition of exams
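
The counting and percentage steps behind a frequency table can also be sketched with Python's standard `collections.Counter`. The list of responses below is rebuilt from the counts in Table 2 purely for illustration. In a real study it would be the column of coded answers from your data file.

```python
from collections import Counter

# Rebuild the 49 individual responses from the counts in Table 2.
responses = (
    ["strongly agree"] * 11
    + ["agree"] * 19
    + ["disagree"] * 17
    + ["strongly disagree"] * 2
)

# Step 1: count the occurrences of each response category.
counts = Counter(responses)

# Step 2: express each count as a percentage of all responses,
# rounded to one decimal place.
total = sum(counts.values())
table = {
    category: (number, round(100 * number / total, 1))
    for category, number in counts.items()
}

for category, (number, percentage) in table.items():
    print(f"{category:<18} {number:>3} {percentage:>6}")
print(f"{'total':<18} {total:>3}")
```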

This basic level of descriptive statistics can be calculated manually if
necessary, though with a larger data set it will be worth setting up a
spreadsheet or a stats package. Frequencies can then be calculated and
output as tables or different types of graphs in seconds, even for a much
larger data set.
In addition to simple frequencies it might be useful to calculate other
measures of distribution of the data such as means, standard deviations
(to examine how variable the data is around the mean value), and/or
percentile points (e.g. the 25th percentile is the value below which 25% of
the data falls; the 75th percentile is the value below which 75% of the
data falls). These kinds of calculations allow the range and distribution of
the data to be examined and interpretations drawn.
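
These measures are available in most packages. As a minimal sketch, the example below uses Python's standard `statistics` module (the `quantiles` function needs Python 3.8 or later) on an invented list of ages. Packages use slightly different conventions for interpolating percentiles, so the quartiles reported by SPSS or Excel may differ a little from these.

```python
import statistics

# Hypothetical ages of ten respondents (invented for illustration).
ages = [20, 20, 21, 22, 23, 25, 25, 27, 30, 34]

mean_age = statistics.mean(ages)

# Sample standard deviation: how variable the ages are around the mean.
sd_age = statistics.stdev(ages)

# With n=4 the cut points are the 25th, 50th and 75th percentiles.
q1, median, q3 = statistics.quantiles(ages, n=4)

print(f"mean = {mean_age:.1f}, standard deviation = {sd_age:.2f}")
print(f"25th percentile = {q1}, median = {median}, 75th percentile = {q3}")
```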

Relating and Predicting


The research problem may require another level of analysis - you may be
interested in the relationship between variables. This is the kind of
research that seeks not only to describe what is happening but also to
examine how one factor impacts on another, asking questions like:
• Are there any strong relationships between the key variables?
• How significant is the difference between a and b?
• What would happen if we altered one variable in the future?

Here there are a number of approaches that may be useful such as cross-
tabulation, correlations and regressions. See the topics in this module
that cover statistical analysis for more detailed information about the use
and execution of these statistical tests.
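
To give a flavour of two of these approaches before you reach those topics, the sketch below builds a simple cross-tabulation of two coded variables and calculates a Pearson correlation coefficient between two numerical variables. All of the data are invented, and the helper `pearson_r` is our own illustration of the standard formula, not the routine used by any particular package.

```python
from collections import Counter
from math import sqrt

# Hypothetical coded answers: (gender code, attitude code) per respondent.
coded_answers = [
    (1, 1), (1, 1), (1, 2), (2, 3),
    (2, 4), (2, 3), (1, 2), (2, 4),
]

# Cross-tabulation: how many respondents fall into each combination
# of gender code and attitude code.
crosstab = Counter(coded_answers)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numerical variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical numerical data for five respondents.
ages = [21, 25, 30, 35, 40]
salaries = [18000, 22000, 26000, 30000, 41000]

r = pearson_r(ages, salaries)
print(dict(crosstab))
print(f"correlation between age and salary: r = {r:.2f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship, and a value near 0 a weak one. A correlation on its own does not show that one variable causes the other.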

Presenting Quantitative Data


Just as tables and graphs can be useful to the researcher at the analysis
stage they also offer useful methods of presenting and reporting the data
to others. A traditional approach to reporting quantitative research in a
report or dissertation is to present the basic findings in the form of tables
and/or graphs in a “Findings” chapter or section and follow this with a
“Discussion” section where you interpret the findings in relation to the
research problem, and finally a broader “Conclusions” section. Shorter
journal articles may merge the discussion with the findings and/or
condense it into a single discussion/conclusions section.
The text accompanying tables and graphs should be confined to
descriptions of the main features and key points shown in each table or
graph. The text should draw attention to the significant patterns such as
high and low points in the data; trends and tendencies; significant
relationships, etc - it should not simply repeat every figure which already
appears in the table or graph.
Spreadsheets or stats packages will allow the production of different
types of graphs, e.g. line graphs, pie charts, bar charts (simple or
clustered), histograms and scatter graphs. (Stats packages tend to use the
term chart instead of graph.) At the analysis stage you should explore a
number of options to help you spot significant patterns in the data. If
you find a particular graph helps you “see” the trends and patterns
clearly, then it is likely that the graph will also help your reader
understand the findings.


Further Reading

Anderson, V. (2004) Research Methods in Human Resource Management.
CIPD: London (chapters 7 and 8)

References and Bibliography

O’Leary, Z. (2004) The Essential Guide to Doing Research. Sage: London
(chapter 12)

Saunders, M., Lewis, P. and Thornhill, A. (2003) Research Methods for
Business Students, 3rd edition. FT Prentice Hall: Harlow, UK (chapters 11
and 12)

Topic Review
This topic has provided information about quantitative and
qualitative data analysis techniques.
