Quantitative data analysis

The use of descriptive statistics in nursing research
Christine Hallett PhD, BNurs, BA, RGN, HVCert, DNCert, is Lecturer,
School of Nursing Studies, University of Manchester, Manchester.

Descriptive statistics offer nurse researchers valuable options for
analysing and presenting large and complex sets of data, suggests
Christine Hallett.
If the methods used by nurse researchers fall into two categories -
qualitative and quantitative - and are broadly understood in terms of
two epistemologies - the interpretive and the positivist - the use of the
techniques known as ‘descriptive statistics’ may be seen to pose a
dilemma to those working within this field. These methods fulfil some
of the purposes of both quantitative and qualitative research. They
offer both techniques for handling the figures generated in quantitative
work and means for expressing data in a descriptive, diagrammatic or
pictorial manner.
Descriptive statistics represent a series of methods by which the
researcher may adopt a positivist approach enabling a large amount of
empirical data to be handled in an unbiased and objective manner, in
which the researcher distances him- or herself from the processes of
data collection and analysis, and in which relationships or correlations
between data may be represented clearly. If the positivist approach is
part of the tradition of the ‘mathematisation of nature’ (1), descriptive
statistics constitute the clearest and most straightforward means of

NURSE RESEARCHER. VOL 4 NO 4, SUMMER 1997



expressing mathematical relationships between variables.


At the same time, these techniques may be seen to offer an element
of qualification and even interpretation to qualitative work by
providing a set of methods by which complex meanings and
relationships within a set of data can be clarified and understood.
However, some might argue that, in fact, these techniques merely
oversimplify the data.
Descriptive statistics may be used to fulfil any of three purposes:
• To present a set of data in a coherent manner which permits its main
characteristics to be identified
• To summarise the main features of a large and complex data set
• To identify and present coherently the relationships between two or
more variables within the data.
The first purpose can be fulfilled using frequency tables and the
more complex tabular and graphic forms which may be developed
from them, such as cumulative frequencies and ogives. The second
may be achieved by means of measures of central tendency and
dispersion. And the third is possible through the use of correlation
coefficients.
These methods were used by the author and colleagues as part of an
English National Board-funded research study in the early 1990s (2).
The data generated by that study and the means used to analyse them
will be considered later in this paper.
An Australian writer, Sarantakos (3), has identified six main
activities which comprise a quantitative research study:
• Data preparation (coding, categorising, editing and checking)
• Counting (the measurement of the frequency with which variables
occur)
• The grouping or ordering of data
• Identifying relationships within the data
• Predicting
• Statistical testing.
The uses of descriptive statistics fall into the first four of these
categories of activity.
The nurse researcher in the 1990s is liberated from much of the hard
work of formulating and presenting descriptive statistics by a vast
range of computer programs which can now do this work on his or her
behalf. During the last three decades the number of such programs has
increased so spectacularly that the novice researcher is faced by a
bewildering array of choices. Sarantakos (3) compiled a list of these
programs which, although presented with an Australian audience in
mind, is also of value to UK researchers in clarifying some of the
differences between the various options and providing some warnings
about pitfalls.
The most important data processing programs identified by
Sarantakos (3) include Statistical Programs for Social Scientists (also
known as Statistical Package for the Social Sciences or SPSS),
Statistical Analysis System (SAS), Minitab, and Complete Statistical
System. There are many others and the programs listed here do have
many limitations. They provide a fairly low quality of analysis and
include no standards of control or evaluation. They may also be used
by researchers who have very little understanding of statistics,
resulting in poor choice and implementation of statistical techniques.
Nevertheless, they offer such rapidity and ease of data analysis and
presentation that most research studies involving any form of
descriptive statistics are likely to make use of them. The nurse
researcher who wishes to use these approaches, or who wants to read
and understand research reports which employ quantitative methods,
must therefore be aware of the role of computer programs in this kind
of work.

Preparing data for the use of descriptive statistics


Quantitative analysis begins with the preparation of data. The data
must be translated into a form in which they can be analysed either
manually by the researcher or by computer. Usually, data are ‘coded’;
this is a means by which elements of the data - or pieces of
information - are translated into numbers. The data then need to be
edited and checked to ensure that they are clear and relevant. In

Figure 1. The normal distribution

particular, it is important to check the data for reliability, that is, to
ensure that consistency has been maintained. For example, in a
questionnaire survey reliability relates to factors such as the extent to
which respondents interpreted questions in the same way and the
extent to which researchers were consistent in their coding of the data.
Usually, one of the most important stages of data preparation is the
inputting of data into the computer prior to analysis. This may be done
by means of computer cards or, more commonly, directly through a
computer terminal.
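
The coding and checking steps described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in the study: the codebook, the category names, the missing-value code and the sample answers are all invented, and the agreement figure is a simple proportion rather than a formal reliability statistic.

```python
# Hypothetical codebook: the categories and their codes are invented for illustration.
CODEBOOK = {"district nurse": 1, "health visitor": 2, "practice nurse": 3}
MISSING = 9  # assumed code for blank or unrecognised answers


def code_response(answer):
    """Translate one free-text answer into its numeric code."""
    return CODEBOOK.get(answer.strip().lower(), MISSING)


def agreement(codes_a, codes_b):
    """Proportion of items to which two coders assigned the same code."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Invented questionnaire answers.
raw_answers = ["District Nurse", "health visitor ", "", "Practice nurse", "midwife"]
coded = [code_response(a) for a in raw_answers]
print(coded)  # [1, 2, 9, 3, 9]

# Editing and checking: positions of answers that need a second look.
to_check = [i for i, c in enumerate(coded) if c == MISSING]
print(to_check)  # [2, 4]

# Consistency between two coders (the second coder's codes are also invented).
second_coder = [1, 2, 9, 3, 2]
print(agreement(coded, second_coder))  # 0.8
```

Because the codes stand for nominal categories, the numbers serve only as labels; no arithmetic on them would be meaningful.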
Data are divided by convention into four categories or types:
nominal, ordinal, interval and ratio, and are discussed in greater detail
in this issue by Hazel Watson. The type of statistics which can be used
to present and manipulate data depends on the type of data being
considered. Nominal data refers to a set of data which is grouped with
no form of scale or ranking. Where numbers are assigned to the
groups they have no mathematical meaning; they are simply a means
of identification.
In the cases of ordinal and interval data, the information is presented
in sequence from lowest to highest. The numbers used have actual
mathematical meaning and these forms of measurement are essentially
quantitative. Interval data differs from ordinal in having a fixed scale;
the units between the measurements are equal. Where ratio data is
presented, the information is once again sequenced and has a fixed

Table 1. Durations of Project 2000 placements

Number of days   Responses
1                29
2                23
3                13
4                26
5                25
6                10
7                6
8                5
10               12
11               1
12               5
14               2
15               1
16               1
18               1
20               7
22               1
24               1
30               1
31               1

scale. However, it also has a fixed ‘zero’ point. Heights and weights
might be considered good examples of ratio data.
The use of statistical methods and techniques depends on the way in
which data are ‘distributed’. A distribution is the form in which a set
of data is presented. The most common type of distribution is the
frequency distribution, though proportional and cumulative
distributions may also be used. The normal distribution which, if
presented as a frequency graph, produces a bell-shaped curve (Fig. 1),
is useful for certain types of statistics. For example, the standard
deviation is most readily interpreted when the data are approximately
normally distributed.
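
The three kinds of distribution mentioned above (frequency, proportional and cumulative) can be derived from the figures in Table 1. The following is a minimal sketch in Python, which is not one of the packages discussed in this paper; the variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Table 1 as printed: duration of placement in days -> number of responses.
TABLE_1 = {1: 29, 2: 23, 3: 13, 4: 26, 5: 25, 6: 10, 7: 6, 8: 5, 10: 12,
           11: 1, 12: 5, 14: 2, 15: 1, 16: 1, 18: 1, 20: 7, 22: 1, 24: 1,
           30: 1, 31: 1}

days = sorted(TABLE_1)
frequencies = [TABLE_1[d] for d in days]       # frequency distribution
total = sum(frequencies)                        # number of responses
proportions = [f / total for f in frequencies]  # proportional distribution
cumulative = list(accumulate(frequencies))      # cumulative distribution

print("days  freq  share  cumulative")
for d, f, p, c in zip(days, frequencies, proportions, cumulative):
    print(f"{d:>4} {f:>5} {p:>6.1%} {c:>11}")
```

The final cumulative figure equals the total number of responses (171 for the table as printed), which offers a quick check that no row has been lost in data entry.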

Figure 2. Histogram of durations of Project 2000 placements
[Horizontal axis: duration of placement (days), in classes 1-10, 11-20 and 21-31; vertical axis scaled up to 160]

Figure 3. Frequency polygon of data on duration of Project 2000 placements
[Horizontal axis: duration of placement (days)]

Figure 4. Ogive of durations of Project 2000 placements
[Horizontal axis: duration of placement (days); vertical axis scaled up to 160]

Methods used in descriptive statistics


The types of descriptive statistics which can be used in nursing
research will be considered here according to their main purposes: as
means for representing data coherently, as methods for summarising
the main features or characteristics of a data set, and as ways in which
the relationships between two or more variables within a data set may
be clarified.
The presentation of data. Descriptive statistics provide a range of
methods through which data can be presented in a clear, sometimes
pictorial form. One of the simplest ways in which this may be done is
by arranging data from the lowest to the highest in an ‘array’. Once
this has been achieved, data may also be compressed into classes.
Grouped or nominal data may be presented as a bar chart. Frequency
distributions, in which scores or classes are presented alongside the
frequencies with which they occur, allow for clear presentation of a

complete data set. These may then be presented graphically using the
histogram - a bar chart in which there are no spaces between the bars
and in which the number of observations in any particular class is
represented by the area within each rectangle - or the frequency
polygon, which is a depiction of the frequency distribution plotted onto
a line graph. The cumulative frequency distribution may be
represented graphically by means of an ogive.
In a questionnaire survey as part of the ENB-funded study (2), the
author collected data on community nurses’ work with diploma
students in the community setting.
One of the types of data collected as part of the study related to the
durations of Project 2000 placements. Community nurses were asked
to specify the average length of time a diploma student spent with
them in any one placement. There was a range of responses which
were then presented in a frequency table (Table 1). The information
contained in Table 1 can be translated into diagrammatic form as a
histogram (Fig. 2), a frequency polygon (Fig. 3), and an ogive (Fig. 4).
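
Compressing the Table 1 data into classes, as is needed for a histogram, and accumulating the class frequencies, as is needed for an ogive, might be sketched as follows. The class boundaries follow the groupings shown on the axis of Figure 2; the function and variable names are assumptions, and the text bars are only a rough stand-in for a drawn chart.

```python
# Table 1 as printed: duration of placement in days -> number of responses.
TABLE_1 = {1: 29, 2: 23, 3: 13, 4: 26, 5: 25, 6: 10, 7: 6, 8: 5, 10: 12,
           11: 1, 12: 5, 14: 2, 15: 1, 16: 1, 18: 1, 20: 7, 22: 1, 24: 1,
           30: 1, 31: 1}

# Inclusive class intervals, taken from the axis labels of Figure 2.
CLASSES = [(1, 10), (11, 20), (21, 31)]


def class_frequencies(table, classes):
    """Count the responses that fall within each inclusive class interval."""
    return [sum(f for d, f in table.items() if low <= d <= high)
            for low, high in classes]


counts = class_frequencies(TABLE_1, CLASSES)

# Running totals: the values an ogive would plot at each class boundary.
running = []
so_far = 0
for c in counts:
    so_far += c
    running.append(so_far)

for (low, high), c, r in zip(CLASSES, counts, running):
    bar = "#" * (c // 5)  # one '#' per five responses
    print(f"{low:>2}-{high:<2} days  {c:>3} responses  cumulative {r:>3}  {bar}")
```

For the table as printed, the three classes contain 149, 18 and 4 responses, which shows at a glance how strongly the placements cluster at ten days or fewer.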
Summarising the main features of a data set. Measures of location
or central tendency offer the simplest and perhaps the clearest methods
for summarising a set of data, and are discussed in greater detail in this
issue by Hazel Watson. The ‘mode’ is the value which appears most
frequently. The ‘median’ is the mid-point within the distribution, and
the ‘arithmetic mean’ is a more complex measure which takes all the
values in a distribution into consideration and offers a measure which
represents the sum of all the observations in the distribution divided by
their number. Numerous other measures of central tendency may be
used, but these are the most common. In the data set on durations of
Project 2000 placements (Table 1), the mode is 1 day; recalculated from
the 171 responses as tabulated, the median is 4 days and the mean is
approximately 5.8 days.
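
The three measures of central tendency can be recomputed directly from the frequencies in Table 1. The sketch below uses Python's standard statistics module purely as an illustration; it assumes that each response is one whole number of days, exactly as tabulated.

```python
from statistics import mean, median, multimode

# Table 1 as printed: duration of placement in days -> number of responses.
TABLE_1 = {1: 29, 2: 23, 3: 13, 4: 26, 5: 25, 6: 10, 7: 6, 8: 5, 10: 12,
           11: 1, 12: 5, 14: 2, 15: 1, 16: 1, 18: 1, 20: 7, 22: 1, 24: 1,
           30: 1, 31: 1}

# Expand the frequency table into one observation per response, in order.
observations = [d for d, f in sorted(TABLE_1.items()) for _ in range(f)]

modes = multimode(observations)      # most frequent value(s)
mid_point = median(observations)     # middle value of the ordered data
average = mean(observations)         # sum of observations / their number

print(len(observations), modes, mid_point, round(average, 2))
```

Run on the frequencies as printed in Table 1, this gives a mode of 1 day, a median of 4 days and a mean of about 5.85 days. The mean lies above the median because a small number of long placements pull it upwards.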
The most commonly used measures of variation are the ‘range’, the
‘interquartile range’ and the ‘standard deviation’ although again, there
are numerous other possibilities. The range simply represents the
difference between the highest and the lowest values within the
distribution, whilst the interquartile range is the difference between the
first and third quartiles, that is, the values which mark the end of the
first quarter and the end of the third quarter of

the distribution. The standard deviation is more complex and is
calculated by taking every value in a distribution and then
determining the extent to which each of these values deviates from the
mean. An aggregate of these deviations is then arrived at: they are
squared and averaged, and the square root of the result is taken.
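
The three measures of variation can be illustrated with the Table 1 data. The sketch below is hedged in two respects: the standard deviation is worked out in its population form (dividing by the number of observations), whereas many packages divide by one less than that number; and the quartiles use the default method of Python's statistics.quantiles, whereas other conventions can give slightly different values.

```python
from math import sqrt
from statistics import mean, pstdev, quantiles

# Table 1 as printed: duration of placement in days -> number of responses.
TABLE_1 = {1: 29, 2: 23, 3: 13, 4: 26, 5: 25, 6: 10, 7: 6, 8: 5, 10: 12,
           11: 1, 12: 5, 14: 2, 15: 1, 16: 1, 18: 1, 20: 7, 22: 1, 24: 1,
           30: 1, 31: 1}
observations = [d for d, f in sorted(TABLE_1.items()) for _ in range(f)]

# Range: highest value minus lowest value.
value_range = max(observations) - min(observations)

# Interquartile range: third quartile minus first quartile.
q1, _, q3 = quantiles(observations, n=4)
iqr = q3 - q1

# Standard deviation, step by step: deviations from the mean are squared,
# averaged, and the square root of that average is taken.
m = mean(observations)
squared_deviations = [(x - m) ** 2 for x in observations]
sd = sqrt(sum(squared_deviations) / len(observations))

print(value_range, iqr, round(sd, 2))
# The library routine for the population form should agree with the manual steps.
print(abs(sd - pstdev(observations)) < 1e-9)
```

For the table as printed, the range is 30 days, the interquartile range is 5 days and the standard deviation is roughly 5.6 days. The gap between the range and the interquartile range reflects the few very long placements at the upper end.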
Clarifying the relationships between variables. Kidder and Judd (4)
summarised the differences between the statistics used in experimental
work and those employed in survey work; it is the latter which
constitute the descriptive statistics which are being considered here.
Kidder and Judd observed that in experiments the aim is to establish a
causal relationship between two variables with a view either to creating
and establishing universal scientific laws, or clarifying and extending
existing laws. They pointed out that, rather than establishing such
laws, survey research aims at ‘establishing facts and relationships prior
to the elaboration of causal laws’ (4). This means that such work is
designed not to establish whether X caused Y, but simply to ascertain
‘whether X and Y co-vary or under what conditions they co-vary’ (4).
One of the main purposes of descriptive statistics is, therefore, to
ascertain whether relationships exist between variables. Where a
relationship is found to exist it is presented as a correlation coefficient.
A survey which studied the smoking habits of a group within the
population, and which collected data on respiratory conditions within
that group, might use correlation coefficients to determine whether
there was a relationship between smoking and respiratory conditions.
A correlation coefficient could be used to demonstrate a positive
relationship between the two variables (for example, as the number of
cigarettes smoked increases the incidence of respiratory conditions
also increases), in which case the coefficient would have a positive
value. Alternatively, the coefficient could demonstrate a negative
relationship between the variables (for example, as the number of
cigarettes smoked increases, the incidence of respiratory conditions
decreases), in which case the coefficient would have a negative value.
Kidder and Judd (4) warned that correlations may be ‘spurious’, that
is, two variables may co-vary without the existence of any causal

relationship between them. Correlations may be presented graphically
using scattergrams, graphs in which the two variables under
consideration are plotted one on each axis.
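
The paper does not say which correlation coefficient is meant. As an assumption, the sketch below computes Pearson's product-moment coefficient, one common choice for interval or ratio data. The smoking figures are invented solely to show a positive and a negative coefficient and do not come from any study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and the two sums of squares.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Invented data: cigarettes smoked per day and respiratory episodes per year.
cigarettes_per_day = [0, 5, 10, 15, 20, 30]
episodes_per_year = [1, 1, 2, 3, 3, 5]

r_positive = pearson_r(cigarettes_per_day, episodes_per_year)
# Reversing the second variable produces a negative relationship.
r_negative = pearson_r(cigarettes_per_day, list(reversed(episodes_per_year)))

print(round(r_positive, 2), round(r_negative, 2))
```

With these invented figures the first coefficient is close to +1 (approximately 0.98) and the second is negative. Neither value says anything about cause; each simply summarises how closely the two variables co-vary.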
A useful way of demonstrating a correlation between two variables
is to present data relating to the variables in the form of a ‘contingency
table’, in which the variables are compared by being presented in the
form of a grid (5). Another term which is used commonly to refer to
this technique is ‘cross-tabulation’. These tables can be produced by
computer programs such as those mentioned previously, often with the
addition of measures of statistical significance. It is important to note
that contingency tables are best suited to nominal and ordinal data;
continuous (interval or ratio) data must first be grouped into classes.
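
A cross-tabulation of two nominal variables can be sketched as follows. The records are invented for illustration, the variable names are assumptions, and no measure of statistical significance is attempted.

```python
from collections import Counter

# Invented survey records: (smoking status, respiratory condition reported).
records = [
    ("smoker", "yes"), ("smoker", "yes"), ("smoker", "no"), ("smoker", "yes"),
    ("non-smoker", "no"), ("non-smoker", "yes"), ("non-smoker", "no"),
    ("non-smoker", "no"),
]

cells = Counter(records)                       # count of each combination
row_labels = sorted({row for row, _ in records})
col_labels = sorted({col for _, col in records})

# Print the grid with row totals.
print(f"{'':<12}" + "".join(f"{c:>6}" for c in col_labels) + f"{'total':>7}")
for row in row_labels:
    counts = [cells[(row, col)] for col in col_labels]
    print(f"{row:<12}" + "".join(f"{n:>6}" for n in counts) + f"{sum(counts):>7}")
```

Each cell of the grid holds the number of respondents with that combination of categories, so a pattern in which smokers are concentrated in the 'yes' column and non-smokers in the 'no' column is visible without any further calculation.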

Examples of current work


There is a significant bias in nursing research in the UK of the 1990s
towards quantitative work. This does not mean that little qualitative
work is undertaken. On the contrary, there is a healthy interest in
qualitative methods and approaches - from in-depth interview studies
to techniques which use interpretive approaches to the consideration of
the data produced by participant or non-participant observation. It
would appear however, that funding is more readily available for
quantitative and positivist work than for work within a more
interpretive paradigm. This emphasis on quantification is a reasonably
long-term trend. In 1989, Wilson-Barnett and Robinson (6) edited a
collection of research papers entitled Directions in Nursing Research:
Ten Years of Progress at London University, in which several papers -
most of them using quantitative techniques - were presented. These
papers not only provided a set of interesting and informative examples
of the use of descriptive statistics, but also stand as an important
example of the predominance of quantitative methods in nursing at the
end of the 1980s - a predominance which has survived into the 1990s.
The use of descriptive statistics as a method of data analysis is
particularly valuable when the data to be considered take the form of
the large and complex data sets which arise out of large scale
questionnaire surveys. The type of information being sought through

surveys is often material which is intended to give an insight into the
knowledge or opinions held by groups of individuals about certain
health or health service-related topics. The area of HIV and AIDS -
and in particular the knowledge and information held by nurses about
these topics - is one in which descriptive statistics have been used to
good effect.
Tierney (7) offered a review of research current in that area in 1995.
She observed that studies conducted in the 1980s presented nurses as
somewhat lacking in knowledge and very fearful of HIV. Research
also indicated that current provision in nurse education programmes
was inadequate (7). A follow-up of other articles relating to HIV and
AIDS reveals the widespread use of descriptive statistics as methods
both for presenting and analysing data. Papers by Armstrong and
Hewitt (8), Bond et al (9), Breault and Polifrani (10), Brown et al (11),
Chitty (12), and Cole and Slocumb (13), have provided sound
examples of these methods.
Research examining the use of health services has also provided a
fruitful source of examples of the use of descriptive statistics. For
example, a study by van-Teijlingen and Bryar (14) examined the
uptake of midwifery-led care, and a survey by Bjorn (15) of the
nursing care needs of elderly patients found that community health
assessment can offer valuable insights into this subject area. In
particular, Bjorn used correlation to demonstrate that elderly people in
urban areas experienced greater problems with loneliness and lack of
companionship than those in rural areas. Other quantitative work on
the experience of older people has been undertaken by Long et al (16)
who examined the effect of visual impairment on community travel.
Equity in health care in Canada provided the subject for a study by
Newbold et al (17) which considered whether the distribution of
hospital services corresponded to the distribution of need.
One further interesting study which made valuable use of descriptive
statistics was the recent work of Hale and Trumbetta (18) on the
prevention of sexually transmitted diseases among women. The survey
sample consisted of 308 college women; research participants’

responses to a series of questions about their risk of contracting
sexually transmitted disease were analysed using descriptive statistics.
The authors used measures of central tendency and dispersion and
examined the correlations between variables in order to present a
useful picture of the levels of knowledge and of the behaviour of
women within the sample. For example, they found that levels of
knowledge of sexually transmitted diseases were high, with a ‘mean
correct response of 87 per cent’ (18). They also stated that there was
an association between ‘perceived self-efficacy’ and ‘sexual risk
behaviour’.

Problems and pitfalls


Descriptive statistics provide a valuable means by which the researcher
may analyse and present clearly a large and complex set of data.
However, it is important to be aware of the limitations of these
approaches. The level of analysis achieved by descriptive statistics is
only very superficial. Essentially, this type of analysis does little more
than present data clearly; there is little actual manipulation of the data.
One of the most significant pitfalls to be aware of when adopting these
methods of analysis is the tendency to assume that the data mean more
than is actually the case. For example, the presentation of a correlation
coefficient should not be taken to imply a causal relationship between
two variables. Another problem with these methods of analysis is their
tendency to oversimplify data. The presentation of a mean and a
standard deviation can lead the researcher or reviewer to draw false
conclusions about the configuration, size and grouping of a data set.
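
The risk of oversimplification can be shown with two small data sets that share the same mean and standard deviation yet are grouped quite differently. Both sets are invented for illustration, and the population form of the standard deviation is assumed.

```python
from statistics import mean, pstdev

# Two invented data sets of eight observations each.
single_cluster = [2, 4, 4, 4, 5, 5, 7, 9]   # values spread around one centre
two_groups = [3, 3, 3, 3, 7, 7, 7, 7]       # two separate groups, none near 5

for name, data in [("single cluster", single_cluster), ("two groups", two_groups)]:
    print(f"{name:<15} mean={mean(data)} sd={pstdev(data)} values={sorted(data)}")
```

Both sets have a mean of 5 and a standard deviation of 2, but only a frequency table or diagram reveals that the second consists of two distinct groups with no observation near the mean.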

Conclusion
Within their limitations, descriptive statistics offer useful ways of
analysing and presenting the data obtained in quantitative studies.
They exist almost on the boundaries of the conceptual territories
between quantitative and qualitative work, because they offer a set of
methods by which quantitative data may be presented in a qualitatively
more meaningful way. The three main purposes of descriptive
statistics, as identified in this paper, are:
• The clarification of a set of data
• The simplification and summarisation of the data’s main features
• The elucidation of the relationships between variables in the data.
If treated with caution, descriptive statistics may fulfil these purposes
and provide the basis for more complex methods of data analysis and
interpretation.

References
1. Bleicher J. The Hermeneutic Imagination. Outline of a Positive Critique of Scientism and Sociology. London, Routledge and Kegan Paul. 1982.
2. Hallett CK et al. The Provision of Learning Experiences in the Community for Project 2000. Five Papers. Manchester, University of Manchester. 1992.
3. Sarantakos S. Social Research. Basingstoke, Macmillan Education. 1993.
4. Kidder LH, Judd CM. Research Methods in Social Relations. Fifth edition. New York NY, Holt, Rinehart and Winston. 1986.
5. Polit DF, Hungler BP. Study Guide for Nursing Research. Principles and Methods. Fifth edition. Philadelphia PA, JB Lippincott Company. 1995.
6. Wilson-Barnett J, Robinson S. Directions in Nursing Research: Ten Years of Progress at London University. London, Scutari Press. 1989.
7. Tierney AJ. HIV/AIDS - knowledge, attitudes and education of nurses: a review of the research. Journal of Clinical Nursing. 1995. 4, 1, 13-21.
8. Armstrong-Esther C, Hewitt WE. The effect of education on nurses’ perception of AIDS. Journal of Advanced Nursing. 1990. 15, 638-651.
9. Bond et al. HIV infection and AIDS in England: the experience, knowledge and intentions of community nursing staff. Journal of Advanced Nursing. 1990. 15, 249-255.
10. Breault AJ, Polifrani EC. Caring for people with AIDS: nurses’ attitudes and feelings. Journal of Advanced Nursing. 1992. 17, 21-27.
11. Brown Y et al. The effect of knowledge on nursing students’ attitudes towards individuals with AIDS. Journal of Nursing Education. 1990. 29, 367-372.
12. Chitty KK. A national survey of AIDS education in schools of nursing. Journal of Nursing Education. 1989. 28, 150-155.
13. Cole FL, Slocumb EM. Nurses’ attitudes towards patients with AIDS. Journal of Advanced Nursing. 1993. 18, 112-117.
14. van-Teijlingen E, Bryar R. Midwifery led care. Selection guidelines for place of birth. Modern Midwife. 1996. 6, 8, 24-27.
15. Bjorn AM. Community health assessment and nursing care needs of the elderly. Unpublished PhD thesis. Manchester, University of Manchester. 1989.
16. Long RG et al. Older persons and community travel: the effect of visual impairment. Journal of Visual Impairment and Blindness. 1996. 90, 4, 302-313.
17. Newbold KB et al. Equity in health care: methodological contributions to the analysis of hospital utilization within Canada. Social Science and Medicine. 1995. 40, 9, 1181-1192.
18. Hale PJ, Trumbetta SL. Women’s self-efficacy and sexually transmitted disease preventive behaviours. Research in Nursing and Health. 1996. 19, 2, 101-110.
