
Week 7: Finding Patterns in Data: Charts and Tables

Students’ Learning Outcomes


• distinguish between different types of data
• tabulate data into an ungrouped or a grouped frequency table, as
appropriate
• use diagrams to present data, and to know how to draw appropriate
conclusions from these diagrams
Task – Forum
Find at least 2 examples of diagrams in academic publications (ProQuest),
newspapers or company reports and write a paragraph on each chart discussing
the following points:
• What are the conclusions that you can draw from the diagram?
• If you were the author of this source, would you have chosen a different diagram
to explain the data?

Introduction
In week 6, we discussed the differences between primary and secondary data; we
explained the differences between quantitative, qualitative and mixed methods research
designs, how to identify the full variety of available data, and which secondary data to
choose in order to answer our research questions and objectives. We also covered the
main advantages and disadvantages of using secondary data, the range of techniques
for finding it, and how to evaluate it. This week, we will learn
how to find patterns in data and how to explain the data with diagrams and tables. If you think
of data as just a bunch of numbers, you might be surprised to learn that there are several
classifications or levels of measurement. Knowing which category your data belongs to is
critical because it determines the type of statistical analysis you will perform. Data analysis is
the process by which a researcher discovers relationships and gains an understanding of what
the information gathered from the data collection truly means and how it is relevant (Albers,
2017). Designing a valid and reliable study necessitates critically considering why the data
measurements are required (Velleman & Wilkinson, 1993).
As mentioned in the introduction, there are several classifications or levels of data
measurement. The category depends on the kind of data you have.
Four types of data are distinguished here:
1. Continuous Data: Data that is measured on a scale, such as weight or temperature. The scale
can be subdivided into as many intervals as required, depending on the accuracy of the

UU-MBA-711-ZM - Dissertation Page 1


measuring equipment. Time is also continuous as it is measured using a clock, although it can
be treated as discrete. According to McCue (2007), continuous data are extremely beneficial in
inferential statistics; however, they tend to be less useful in data mining and are frequently
recoded into discrete data or sets.
2. Discrete Data: Data that takes on whole values. Counting data, such as the number of
defective items in a batch, is an obvious example of discrete data. The fact that discrete data
cannot be subdivided (for example, you cannot have half of a defective item) is an important
feature (McCue, 2007; Howell, 1992). The price of an item or the size of a shoe are two other
examples of discrete data. Time is really a continuous measure, but for practical purposes, it is
often treated as discrete: people usually work a defined number of hours (or fractions of an
hour), for example, and they give their age as a whole number of years. However, simply
rounding a continuous quantity to a whole number does not transform it into a discrete quantity
(Oakshott, 2016).
3. Ordinal Data: Data that can be arranged in some meaningful order. An example of this type
of data is the assessment consumers might give to a product (Oakshott, 2016; McCue, 2007).
They might be asked to rate the product using a score from 1 to 5, where 5 is probably
“excellent”, and 1 is “poor”. Although 5 is better than 1, it is not necessarily 5 times better or
even 4 points better. One thing to keep in mind is that simply assigning a numerical value to a
category does not make it discrete data. You can't do anything with this number except say
things like "the higher the number, the better" (or "the worse").

4. Nominal Data: Data that does not have a numerical value and can only be placed in a suitable
category (Oakshott, 2016; Cliff, 2014). Nominal data can be qualitative as well as quantitative.
Words, letters, and symbols may be included (Cliff, 2014). People's names, gender, and
nationality are some of the most common examples of nominal data. The only thing we can do
with nominal data is categorize it.
Ordinal and nominal data are usually referred to as categorical data.
Tabulation of data
Let’s look at an example to see how data of this kind can be tabulated.
A small survey was carried out into the mode of travel to work. The information below relates
to a random sample of 20 employed individuals.
Person   Mode of Travel   Person   Mode of Travel
1        Car              11       Car
2        Car              12       Bus
3        Bus              13       Walk
4        Car              14       Car
5        Walk             15       Train
6        Cycle            16       Bus
7        Car              17       Car
8        Cycle            18       Cycle
9        Bus              19       Car
10       Train            20       Car



How would you classify this data?
As the table shows, this data is categorical (nominal) because the mode of travel does not have
a numerical value. It is better presented as a frequency table, as shown below:
Mode of Travel   Frequency   Relative Frequency (%)
Car              9           45
Bus              4           20
Cycle            3           15
Walk             2           10
Train            2           10
Total            20          100

The frequency of each category is simply the number of times it appeared. The relative
frequency has been calculated in addition to the actual frequency. This is the frequency expressed
as a percentage, which is calculated by dividing a frequency by the total frequency and
multiplying by 100. If the multiplication by 100 is omitted, the relative frequencies are
proportions that sum to 1 instead of 100. The order in which you write the categories down is
not essential, although ordering by descending size of frequency makes comparison clearer.
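The tabulation described above can also be sketched in a few lines of Python. This is only an illustration, not part of the original exercise: the list `modes` is typed in from the survey table, and the function name `frequency_table` is a hypothetical helper chosen for this example.

```python
from collections import Counter

# Survey responses in person order (1 to 20), copied from the survey table.
modes = [
    "Car", "Car", "Bus", "Car", "Walk", "Cycle", "Car", "Cycle", "Bus", "Train",
    "Car", "Bus", "Walk", "Car", "Train", "Bus", "Car", "Cycle", "Car", "Car",
]


def frequency_table(values):
    """Return (category, frequency, relative frequency in %) rows, largest first."""
    counts = Counter(values)
    total = sum(counts.values())
    # most_common() orders the categories by descending frequency.
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]


rows = frequency_table(modes)
for category, freq, rel in rows:
    print(f"{category:<6} {freq:>3} {rel:>5.0f}")
```

Dividing by the total without multiplying by 100 would give the same table as proportions.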
Let’s see another example. Below you will see the number of foreign holidays sold by a travel
agent over the past four weeks.
Number of holidays sold per day
Day         Week 1   Week 2   Week 3   Week 4
Monday      10       13       11       11
Tuesday     12       10       18       13
Wednesday   9        12       10       10
Thursday    10       8        10       14
Friday      22       12       11       13
Saturday    14       12       9        12

Can the travel agent sell a fraction of a holiday? Assuming that a holiday is a holiday regardless
of length or cost, this is clearly discrete data that would have been obtained by counting.
Examining the figures, you should notice that 10 sales occur the most frequently, with a range
of 8 to 22 sales. You could aggregate the data into a table to make this information more visible:
Number sold Frequency
8 1
9 2
10 6
11 3
12 5
13 3
14 2
More than 14 2



Because the number sold has not been grouped, this table is known as an ungrouped frequency
table. This kind of table can be used to summarize a small set of discrete data. The two extreme
values, or outliers, of 18 and 22 sales have been included by means of a ‘more than 14’ row.
From the table above, you can see that between 10 and 12 holidays are usually sold each day.
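As a rough sketch of how such an ungrouped table could be built programmatically, the Python below counts each distinct value and pools everything above a chosen cut-off into a single row. The list `sales` is typed in from the travel agent's figures; the helper `ungrouped_table` and its `cap` parameter are hypothetical names used only for this illustration.

```python
from collections import Counter

# Daily sales, one row per week (Monday to Saturday), from the travel agent's table.
sales = [
    10, 12, 9, 10, 22, 14,
    13, 10, 12, 8, 12, 12,
    11, 18, 10, 10, 11, 9,
    11, 13, 10, 14, 13, 12,
]


def ungrouped_table(values, cap):
    """Count each distinct value up to `cap`; pool larger values in one final row."""
    counts = Counter(v for v in values if v <= cap)
    table = [(str(v), counts[v]) for v in sorted(counts)]
    overflow = sum(1 for v in values if v > cap)
    if overflow:
        table.append((f"More than {cap}", overflow))
    return table


table = ungrouped_table(sales, cap=14)
for label, freq in table:
    print(f"{label:<14} {freq}")
```

Choosing the cut-off is a judgment call; here 14 is used because it is the value used in the table above.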

Diagrammatic representation of data


Diagrams have been used in data collection in a variety of fields, including education,
engineering, environmental science, geography, industrial design, psychology, and others
within the social sciences (Umoquit et al., 2013; Wheeldon & Ahlberg, 2012). Mers (2008),
for example, provided a collection of articles that demonstrate various ways in which
diagramming has been used in the health and social sciences, where diagrams are used as a
data collection tool but also play an important role in analysis and argument construction. One
challenge arising from the development of this approach is that, in the absence of clear
boundaries and standard terminology, the development of this data collection approach has
been isolated within disciplines (Umoquit et al., 2013; Umoquit et al., 2011). Although
frequency tables can provide more information than raw data, it can be challenging to absorb
all of the information contained in the data. Diagrams can assist in providing this additional
information while also displaying the data in a more visually appealing manner. You do lose
some detail, but this is a small price to pay for the additional information that diagrams provide.
There are various types of diagrams, and the choice is influenced primarily by the type of data,
and by your intended audience. When creating diagrams these days, most people will use a
spreadsheet. Spreadsheets can generate high-quality charts that are easily updated when the
data changes. However, some experience of drawing diagrams by hand is still helpful.

The sales by department of a high street store over the past three years are shown in the table
below:

Table: Sales by department and year

Department         2012     2013     2014
Clothing           $1.7m    $1.4m    $1.4m
Furniture          $3.4m    $4.9m    $5.6m
Electrical goods   $0.2m    $0.4m    $0.5m
Total              $5.3m    $6.7m    $7.5m

The table above shows that total sales have increased over the last three years, despite a decline
in clothing sales. Diagrams should aid in highlighting these and other differences.



Pie Charts
A pie chart is an excellent choice of diagram for comparing the relative sizes of frequencies. It
is typically used for categorical data, with each category represented by a circle segment. Each
segment's size reflects the frequency of that category and can be represented as an angle. People
rarely draw a pie chart by hand because a protractor is needed to measure the angles, but if you
must, the angle is calculated by expressing the category as a percentage of the total and then
taking that percentage of 360 degrees. For example, for the 2014 sales in the table above, the
angle for clothing would be calculated as follows:

Clothing as a percentage is $\frac{1.4}{7.5} \times 100 = 18.7\%$

The angle is therefore $\frac{18.7}{100} \times 360 = 67^{\circ}$

The complete pie chart for the sales for 2014 is shown in the figure below. This diagram
demonstrates that the furniture department has contributed the bulk of the total sales for this
year. (Note: adjustments have been made to allow for rounding errors.)

[Figure: Pie chart of sales for 2014. Electrical Goods 6%, Clothing 19%, Furniture 75%]
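The angle calculation can be repeated for every department with a short Python sketch. The dictionary `sales_2014` is taken from the 2014 column of the sales table; `pie_angles` is a hypothetical helper name, and the code works from the unrounded shares, so the results can differ slightly from hand calculations that round the percentage first.

```python
# 2014 sales in millions of dollars, from the table of sales by department and year.
sales_2014 = {"Clothing": 1.4, "Furniture": 5.6, "Electrical goods": 0.5}


def pie_angles(values):
    """Return the angle in degrees of each pie segment (share of the total x 360)."""
    total = sum(values.values())
    return {name: 360 * amount / total for name, amount in values.items()}


angles = pie_angles(sales_2014)
for name, angle in angles.items():
    print(f"{name:<17} {angle:6.1f} degrees")
```

The angles always add up to 360 degrees, which is a useful check when drawing the chart by hand.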

Bar Charts
Although pie charts are a popular way to compare the size of categories, they have the
disadvantage of not being suitable for displaying multiple sets of data at the same time. You
would, for instance, need three separate pie charts to represent the data in the table above (see
Table: Sales by department and year). Another effective way to display categorical data or an
ungrouped frequency table is with a simple bar chart. A vertical bar is drawn for each category,
with the height proportional to the frequency. The figure below illustrates total sales (in
millions of dollars) in the form of a simple bar chart.



[Figure: Simple bar chart of total sales by year (2012, 2013, 2014). Vertical axis: total sales in
millions, from 0 to 8.0; horizontal axis: Year]

When a category is subdivided into several subcategories, the simple bar chart is insufficient
because each subcategory requires a different bar chart. A multiple bar chart is used when you
want to see changes in the components but not the totals. The figure below is a multiple bar
chart (in millions of dollars) for the data in the table above (see Table: Sales by department
and year).
[Figure: Multiple bar chart of sales by department (Clothing, Furniture, Electrical Goods) for
2012, 2013 and 2014. Vertical axis: sales in millions of dollars, from 0 to 6.00]

A component bar chart is used if you want to compare totals and see how totals are made up.
The figure below is a component bar chart for the data in the table above (see Table: Sales by
department and year). This graph depicts the variation in total sales from year to year (in
millions of dollars), as well as how each department contributes to total sales.

[Figure: Component bar chart of sales for 2012, 2013 and 2014, with each bar split into Clothing,
Furniture and Electrical Goods. Vertical axis: sales in millions of dollars, from 0 to 8.00]



A percentage bar chart may be more useful if you are more interested in the proportion of
sales in each department. This is depicted in the figure below. This chart is similar to the pie
chart, but it has the advantage of displaying multiple sets of data at the same time.

[Figure: Percentage bar chart of sales for 2012, 2013 and 2014, with each bar split into Clothing,
Furniture and Electrical Goods. Vertical axis: 0% to 100%]
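The heights of the segments in a percentage bar chart are simply each department's share of that year's total. A minimal Python sketch of the calculation follows; the nested dictionary `sales` is typed in from the table of sales by department and year, and `percentage_shares` is a hypothetical helper name.

```python
# Sales in millions of dollars, from the table of sales by department and year.
sales = {
    "Clothing": {2012: 1.7, 2013: 1.4, 2014: 1.4},
    "Furniture": {2012: 3.4, 2013: 4.9, 2014: 5.6},
    "Electrical goods": {2012: 0.2, 2013: 0.4, 2014: 0.5},
}


def percentage_shares(sales_by_dept):
    """Return, for each year, every department's percentage of that year's total."""
    years = sorted(next(iter(sales_by_dept.values())))
    shares = {}
    for year in years:
        total = sum(dept[year] for dept in sales_by_dept.values())
        shares[year] = {
            name: 100 * dept[year] / total for name, dept in sales_by_dept.items()
        }
    return shares


shares = percentage_shares(sales)
for year, by_dept in shares.items():
    print(year, {name: round(pct, 1) for name, pct in by_dept.items()})
```

Within each year the shares sum to 100, which is why every bar in the chart has the same height.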

Line graphs
When data is in the form of a time series, a line graph can be a useful means of showing any
trends in the data.

[Figure: Line graph of total sales in million dollars for 2012 ($5.3), 2013 ($6.7) and 2014 ($7.5).
Vertical axis: 0 to 8.0]

The figure above is a line graph for the total sales given in the table above (see Table: Sales by
department and year); it shows the rise in sales over the three years from 2012 to 2014. When
this type of diagram is shown in company publications, the scale on the y-axis is frequently
broken, so that it does not start at zero. This exaggerates changes in sales or other measures and
can be misleading unless you are aware of what is going on. A broken scale can be justified if
none of the values are close to zero; however, the break in scale should be clearly visible in this case.
From the diagrams in this section, you should be able to conclude that:
1. Total sales have increased over the three years, although the largest increase was
between 2012 and 2013.
2. Most of this increase has been the result of sales of furniture.
3. Clothing has shown a decrease in sales from 2012 to 2013 but has then remained steady.
4. The sales of clothing as a proportion of total sales have declined, while the proportion
of electrical sales has increased.
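The first two conclusions can be checked with simple arithmetic on the table of sales by department and year. The sketch below is illustrative only: the figures are typed in from that table (in millions of dollars) and `year_on_year` is a hypothetical helper name.

```python
# Figures in millions of dollars, from the table of sales by department and year.
totals = {2012: 5.3, 2013: 6.7, 2014: 7.5}
furniture = {2012: 3.4, 2013: 4.9, 2014: 5.6}


def year_on_year(series):
    """Return the change between each pair of consecutive years, to one decimal place."""
    years = sorted(series)
    return {
        (earlier, later): round(series[later] - series[earlier], 1)
        for earlier, later in zip(years, years[1:])
    }


total_changes = year_on_year(totals)
furniture_changes = year_on_year(furniture)
print(total_changes)      # change in total sales between consecutive years
print(furniture_changes)  # change in furniture sales between consecutive years
```

Total sales rise by 1.4 and then 0.8, while furniture rises by 1.5 and then 0.7, so furniture accounts for most of the growth.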



Histograms
Histograms are frequency distribution plots for a set of continuous data that allow inspection
of the underlying distribution, for example whether it is approximately normal, whether there
are outliers, whether it is skewed, and so on. The data is divided into classes known as bins, and
each bin represents an interval of values together with the number of observations that fall within
it. The frequency for each bin is indicated by the area of its bar, which is the product of the
height and the width of the bin. The histogram is then built by tabulating and plotting the
frequencies in each bin against the intervals. There is no single formula for determining the ideal
bin size, but the bins must be neither too small nor too large, or the underlying pattern of the
frequency distribution will become elusive. Histograms represent continuous data sets, so there
are no "gaps" between the bars, though some bars may be missing, indicating that no observations
fall in those bins (Oakshott, 2016).
A histogram is a visual representation of numerical data. Histograms are useful not
only for quickly conveying a large amount of information in the form of charts, but also for
estimating a variable's mean, standard deviation, skewness, and kurtosis, all of which describe
the underlying distribution. Histograms can help you determine whether the outputs of two or
more processes are normally distributed or not, whether a process has changed over time
intervals and, if so, how the shapes of the distributions may vary, and whether processes can
meet specific requirements.
Histograms can reveal several different types of distribution. A normal distribution is symmetric,
with the mean in the center and the probability of points falling on either side of the mean
being equal. A bimodal distribution, on the other hand, has two peaks instead of one, and the
data is analysed as two different normal distributions. In a right-skewed distribution most of the
data values lie on the left, with a long tail extending to the right; in a left-skewed distribution
most values lie on the right, with a long tail extending to the left.
Lastly, a random distribution is one that lacks a pattern and usually exhibits multiple peaks, in
which case the data should be analysed separately.
Example: In a university, there are 20 BBA students whose ages in increasing order are as
follows: 18, 18, 18, 19, 20, 20, 20, 20, 21, 22, 22, 23, 24, 25, 26, 27, 30, 32, 33, 33. This data
can be represented in a frequency distribution table as follows:
Age Frequency
18 3
19 1
20 4
21 1
22 2
23 1
24 1
25 1
26 1
27 1
30 1
32 1
33 2



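Grouping the ages into classes of width 2 gives the grouped frequency table below. The counts
shown correspond to classes that include their upper limit; the lowest age, 18, is placed in the
first class.
Range Start (Exclusive)   Range End (Inclusive)   Count
18                        20                      8
20                        22                      3
22                        24                      2
24                        26                      2
26                        28                      1
28                        30                      1
30                        32                      1
32                        34                      2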
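The grouping step behind a histogram can be sketched in Python as follows. The counts in the grouped table are reproduced when each class is taken to include its upper limit and the lowest age, 18, is placed in the first class, so the sketch assumes that convention. The list `ages` comes from the example; `grouped_frequencies` and its parameters are hypothetical names used for illustration.

```python
import math

# Ages of the 20 students from the example, in increasing order.
ages = [18, 18, 18, 19, 20, 20, 20, 20, 21, 22,
        22, 23, 24, 25, 26, 27, 30, 32, 33, 33]


def grouped_frequencies(values, start, width, n_classes):
    """Count values in classes (start, start+width], (start+width, start+2*width], ...

    Assumption: each class includes its upper limit, and a value equal to
    `start` is placed in the first class.
    """
    counts = [0] * n_classes
    for v in values:
        if v < start:
            raise ValueError(f"{v} lies below the first class")
        # ceil(...) - 1 gives the index of an upper-inclusive class;
        # max(0, ...) puts a value equal to `start` into the first class.
        idx = max(0, math.ceil((v - start) / width) - 1)
        if idx >= n_classes:
            raise ValueError(f"{v} lies above the last class")
        counts[idx] += 1
    return [
        (start + i * width, start + (i + 1) * width, c)
        for i, c in enumerate(counts)
    ]


grouped = grouped_frequencies(ages, start=18, width=2, n_classes=8)
for low, high, count in grouped:
    print(f"{low:>2} to {high:<2} {'#' * count} ({count})")
```

The printed rows of `#` characters give a rough text version of the histogram; a different class convention would move values on a class boundary into the neighbouring class and change the counts.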



Creating charts with Microsoft Excel
Microsoft Excel is a powerful spreadsheet package available for Microsoft Windows and the
Apple Macintosh. Spreadsheet software is used to store information in columns and rows which
can then be organized and/or processed. Spreadsheets are designed to work well with numbers
but often include text. Excel organizes your work into workbooks; each workbook can contain
many worksheets; worksheets are used to list and analyze data. Although Excel was not
designed to be a research data entry tool, it is commonly used because almost every researcher
already knows the basics of how to use it (Elliott et al., 2006).

The main parts of the Excel window are:
• Title Bar
• Menu Bar – used to access and execute commands
• Formula Bar – for entering formulas; the reference area to the left displays the coordinates of
the active cell(s)
• Active cell – can contain a number, text or formula

A new worksheet is a grid of rows and columns. The rows are labeled with numbers, and the
columns are labeled with letters. Each intersection of a row and a column is a cell. Each cell
has an address, which is the column letter and the row number. The arrow on the worksheet to
the right points to cell A1, which is currently highlighted, indicating that it is an active cell. A
cell must be active to enter information into it. To highlight (select) a cell, click on it.

In Excel 2007 and later, one worksheet can have up to 16,384 columns and 1,048,576 rows, so
it'll be a while before you run out of space.

Very few people draw charts by hand these days, as it is much easier to use a spreadsheet such
as Excel. Charts produced by a spreadsheet also look more professional, and they can be
immediately updated if the data changes. When drawing charts in Excel, you can choose
whether to create the chart as an object in the same worksheet as the data or to create the chart
in a new sheet. Excel's commands are arranged in tabs, and within each tab they are grouped
logically, so in the Insert tab there is a Charts group which contains all the charts.



We will use the data from the table Sales by department and year to make our charts.

1. Pie Chart
Highlight cells A6 to A8 and, while holding down the <Ctrl> key on your keyboard,
highlight cells D6 to D8. Click on the Insert tab, then Pie, and choose the one you want.



2. Simple bar chart
Vertical bar charts are called Clustered Column charts in Excel. Highlight cells B10 to D10
(B10:D10) and then click on the Quick Analysis (Ctrl + Q) gallery at the bottom right-hand
corner of cell D10. Then click on Charts, then Clustered Column. If you right-click inside the
chart area, you can move the chart to a new sheet. To add the horizontal axis labels, go to the
Design tab, click Select Data, click Edit under Horizontal (Category) Axis Labels, and highlight
cells B4 to D4.



3. Multiple bar chart
Proceed as before, but this time highlight cells A6 to D8. Add the years to the horizontal axis
and choose a chart layout as before. The final chart can be seen below:

[Figure: Multiple bar chart (Clustered Column) of sales by department (Clothing, Furniture,
Electrical Goods) for 2012, 2013 and 2014. Vertical axis: 0 to 6.00 in millions of dollars]

4. Component bar chart

A component bar chart is called a Stacked Column chart in Excel. It can be found under More
Charts in the Quick Analysis gallery. Proceed exactly as before. The final chart can be seen
below.

[Figure: Component bar chart (Stacked Column) of sales for 2012, 2013 and 2014, with each bar
split into Clothing, Furniture and Electrical Goods. Vertical axis: 0 to 8.00 in millions of dollars]



5. Percentage bar chart
This is exactly the same as before except that 100% stacked column is chosen. This chart can
be seen below:

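[Figure: Percentage bar chart (100% Stacked Column) of sales for 2012, 2013 and 2014, with each
bar split into Clothing, Furniture and Electrical Goods. Vertical axis: 0% to 100%]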

6. Line graph
To display a line graph, highlight the totals as in the simple bar chart. In the Design ribbon,
select the line chart with markers. The chart can be seen in the figure below.

[Figure: Line graph with markers of total sales in million dollars for 2012 ($5.3), 2013 ($6.7)
and 2014 ($7.5). Vertical axis: 0 to 8.0]



Conclusion
A large amount of data is difficult for the human brain to comprehend. However, when data is
organized and presented in a diagrammatic format, we find it much easier to identify patterns.
With technological advancement, very few people draw charts by hand these days, making it
far more convenient to use a spreadsheet program. A spreadsheet-generated chart also looks
more professional, and if the data changes, the chart will change automatically. When drawing
charts, most programs allow you to choose whether to create the chart as an object in the same
worksheet as the data or in a separate sheet. Seeing the chart next to the data can be useful;
however, if you want to print out the chart or copy and paste it into another document, it is
easier to create it in a new sheet. Examples of programs that allow you to create charts
and diagrams are Microsoft Excel, Google Sheets, IBM SPSS Statistics, WPS Office
Spreadsheets, OpenOffice Calc, SSuite Accel, and many others. Finally, using graphs,
diagrams, and charts can help your reader understand your research findings and how they
compare to other data.



References
Albers, M. J. (2017). Quantitative data analysis—In the graduate curriculum. Journal of Technical Writing
and Communication, 47(2), 215-233.

Cliff, N. (2014). Ordinal methods for behavioral data analysis. Psychology Press.
Elliott, A. C., Hynan, L. S., Reisch, J. S., & Smith, J. P. (2006). Preparing data for analysis using Microsoft
Excel. Journal of investigative medicine, 54(6), 334-341.
Howell, D. C. (2012). Statistical methods for psychology. Cengage Learning.

McCue, C. (2014). Data mining and predictive analysis: Intelligence gathering and crime analysis.
Butterworth-Heinemann.
Mers, A. (Ed.). (2008). Useful pictures. Whitewalls Incorporated.
Oakshott, L. (2016). Essential Quantitative Methods (1st ed., pp. 50-87). UK: Palgrave Macmillan.
Umoquit, M. J., Dobrow, M. J., Lemieux-Charles, L., Ritvo, P. G., Urbach, D. R., & Wodchis, W. P.
(2008). The efficiency and effectiveness of utilizing diagrams in interviews: an assessment of
participatory diagramming and graphic elicitation. BMC Medical Research
Methodology, 8(1), 1-12.
Umoquit, M., Tso, P., Varga-Atkins, T., O’Brien, M., & Wheeldon, J. (2013). Diagrammatic
elicitation: Defining the use of diagrams in data collection. The Qualitative Report, 18(30), 1-
12.
Velleman, P. & Wilkinson, L. (2011). Nominal, Ordinal, Interval, and Ratio Typologies are
Misleading. In I. Borg & P. Mohler (Eds.), Trends and Perspectives in Empirical Social
Research (pp. 161-177). Berlin, New York: De Gruyter.
Wheeldon, J. (2011). Is a Picture Worth a Thousand Words? Using Mind Maps to Facilitate
Participant Recall in Qualitative Research. Qualitative Report, 16(2), 509-522.
