You are on page 1of 15

How To Present Research Data?

Tong Seng Fah  and Aznida Firzah Abdul Aziz

Author information Copyright and License information Disclaimer

Go to:

INTRODUCTION
The result section of an original research paper provides answer to this question “What was found?” The
amount of findings generated in a typical research project is often much more than what medical journal can
accommodate in one article. So, the first thing the author needs to do is to make a selection of what is worth
presenting. Having decided that, he/she will need to convey the message effectively using a mixture of text,
tables and graphics. The level of details required depends a great deal on the target audience of the paper. Hence
it is important to check the requirement of journal we intend to send the paper to (e.g. the Uniform
Requirements for Manuscripts Submitted to Medical Journals1). This article condenses some common general
rules on the presentation of research data that we find useful.
Go to:

SOME GENERAL RULES


 Keep it simple. This golden rule seems obvious but authors who have immersed in their data sometime
fail to realise that readers are lost in the mass of data they are a little too keen to present. Present too
much information tends to cloud the most pertinent facts that we wish to convey.
 First general, then specific. Start with response rate and description of research participants (these
information give the readers an idea of the representativeness of the research data), then the key findings
and relevant statistical analyses.
 Data should answer the research questions identified earlier.
 Leave the process of data collection to the methods section. Do not include any discussion. These errors
are surprising quite common.
 Always use past tense in describing results.
 Text, tables or graphics? These complement each other in providing clear reporting of research findings.
Do not repeat the same information in more than one format. Select the best method to convey the
message.
Go to:

TEXT
Consider these two lines:
1. Mean baseline HbA1c of 73 diabetic patients before intervention was 8.9% and mean HbA1c after
intervention was 7.8%.
2. Mean HbA1c of 73 of diabetic patients decreased from 8.9% to 7.8% after an intervention.
In line 1, the author presents only the data (i.e. what exactly was found in a study) but the reader is forced to
analyse and draw their own conclusion (“mean HbA1c decreased”) thus making the result more difficult to read.
In line 2, the preferred way of writing, the data was presented together with its interpretation.
 Data, which often are numbers and figures, are better presented in tables and graphics, while the
interpretation are better stated in text. By doing so, we do not need to repeat the values of HbA1c in the
text (which will be illustrated in tables or graphics), and we can interpret the data for the readers.
However, if there are too few variables, the data can be easily described in a simple sentence including
its interpretation. For example, the majority of diabetic patients enrolled in the study were male (80%)
compare to female (20%).
 Using qualitative words to attract the readers’ attention is not helpful. Such words like “remarkably”
decreased, “extremely” different and “obviously” higher are redundant. The exact values in the data will
show just how remarkable, how extreme and how obvious the findings are.

Example:
“It is clearly evident from Figure 1B that there was significant different (p=0.001) in HbA1c level at 6, 12 and 18
months after diabetic self-management program between 96 patients in intervention group and 101 patients in
control group, but no difference seen from 24 months onwards.” [Too wordy]

Figure 1B.
Changes of HbA1c level after diabetic self-management program.

The above can be rewritten as:


“Statistical significant difference was only observed at 6, 12 and 18 months after diabetic self-management
program between intervention and control group (Fig 1B)”. [The p values and numbers of patients are already
presented in Figure 1B and need not be repeated.]
 Avoid redundant words and information. Do not repeat the result within the text, tables and figures.
Well-constructed tables and graphics should be self-explanatory, thus detailed explanation in the text is
not required. Only important points and results need to be highlighted in the text.
Go to:

TABLES
Tables are useful to highlight precise numerical values; proportions or trends are better illustrated with charts or
graphics. Tables summarise large amounts of related data clearly and allow comparison to be made among
groups of variables. Generally, well-constructed tables should be self explanatory with four main parts: title,
columns, rows and footnotes.
 Title. Keep it brief and relate clearly the content of the table. Words in the title should represent and
summarise variables used in the columns and rows rather than repeating the columns and rows’ titles.
For example, “Comparing full blood count results among different races” is clearer and simpler than
“Comparing haemoglobin, platelet count, and total white cell count among Malays, Chinese and
Indians”.
 Columns and rows. Columns are vertically listed data, and rows are horizontally listed data. Similar
data ought to be presented in columns. Often these are dependant variables and allow clearer comparison
among groups. Compare Table 1A and and1B,1B, the dependant variables in Table 1B are waist
circumference, HbA1c, SBP and etc. Table 1B shows a better comparison of dependant variables among
ethnicity than Table 1A. The first column to the left is usually a list of its independent variables i.e.
Malay, Chinese, Indian and others, as the example of Table 1B. A table with too many dependent
variables would become too wide for a page. There are two alternatives to this problem. We can list the
dependant variables in the first left column and independent variables across the top. However, doing so
should not compromise clarity of the message we want to get across. The second alternative is to cut
down unnecessary columns, which, can be replaced by footnotes explaining their definition. For
example, we can eliminate the columns on p, student-t test and chi-square values (see Table 2).
Significant test result can be marked using * or # with a footnote. Presenting exact values of statistical
data with no significant difference is rarely useful.
Table 1A:

Baseline waist circumference, HbA1c, Blood pressure and LDL-cholesterol level among Malay,
Chinese, Indian and others races

Malay Chines Indian Others


e

Waist 98 102 105 95


circumference
Malay Chines Indian Others
e

HbA1c 8.89 8.66 9.0 8.7

SBP 165.1 164.0 170.34 168

DBP 98.5 101 99.3 97.6

LDL-C 3.8 3.9 3.4 3.1

Table 1B:

Mean (SD) baseline diabetic metabolic control among different races

WC* (SD) HbA1c (SD), % SBP† (SD) DBP‡ (SD) LDL-C§ (SD)

Malay 98 (15) 8.9 (1.5) 165 (21) 98 (13) 3.8 (0.9)

Chines 102 (18) 8.7 (2.1) 164 (28) 101 (15) 3.9 (0.7)
e

Indian 105 (22) 9.0 (1.8) 170 (36) 99 (22) 3.4 (1.2)

Others 95 (28) 8.7 (2.5) 168 (40) 97 (28) 3.1 (1.0)

*WC, waist circumference (in cm)


†SBP, systolic blood pressure (in mmHg)
‡DBP, diastolic blood pressure (in mmHg)
£LDL-cholesterol (in mmol/L)

Table 2:

Comparison of the presenting symptoms among patients with and without thrombocytopaenia

Symptom Platelet count (%) OR* (95% CI)

Normal Thrombocytopaeni
a

Presented at or after day 3 of 26 (65) 31 (93.9) 5.88(1.20-28.8)†


fever

Myalgia 32 (80) 23 (82.1) 1.09 (0.51-2.31)

Headache 25 (64.1) 22 (78.6) 1.56 (0.75-3.26)

Nausea/vomiting 18 (46.2) 23 (76.7) 2.24 (1.12-4.50)‡

Arthralgia 26 (40) 13 (54.2) 1.43 (0.76-2.69)

Retro-orbital pain 9 (23.1) 6 (26.1) 1.11 (0.54-2.29)

Rash 5 (12.5) 8 (24.2) 1.47 (0.88-2.49)

*Odds ratio (95% confidence interval)


†p=0.04
‡p=0.01
 Footnotes. These add clarity to the data presented. They are listed at the bottom of tables. Their use is to
define unconventional abbreviation, symbols, statistical analysis and acknowledgement (if the table is
adapted from a published table). Generally the font size is smaller in the footnotes and follows a
sequence of foot note signs (*, †, ‡, §, ‖, ¶, **, ††, # ).1 These symbols and abbreviation should be
standardised in all tables to avoid confusion and unnecessary long list of footnotes. Proper use of
footnotes will reduce the need for multiple columns (e.g. replacing a list of p values) and the width of
columns (abbreviating waist circumference to WC as in table 1B)
 Body of the table. We can improve the clarity of data presented in body of the tables by the following:
 Consistent use of units and its decimal places. The data on systolic blood pressure in Table 1B is
neater than the similar data in Table 1A.
 Arrange date and timing from left to the right.
 Round off the numbers to fewest decimal places possible to convey meaningful precision. Mean
systolic blood pressure of 165.1mmHg (as in Table 1B) does not add much precision compared
to 165mmHg. Furthermore, 0.1mmHg does not add any clinical importance. Hence blood
pressure is best to round off to nearest 1mmHg.
 Avoid listing numerous zeros, which made comparison incomprehensible. For example total
white cell count is best represented with 11.3 ×106/L rather than 11,300,000/L. This way, we
only need to write 11.3 in the cell of the table.
 Avoid too many lines in a table. Often it is sufficient to just have three horizontal lines in a table;
one below the title; one dividing the column titles and data; one dividing the data and footnotes.
Vertical lines are not necessary. It will only make a table more difficult to read (compare Tables
1A and and1B1B).
 Standard deviation can be added to show precision of the data in our table. Placement of standard
deviation can be difficult to decide. If we place the standard deviation at the side of our data, it
allows clear comparison when we read down (Table 1B). On the other hand, if we place the
standard deviation below our data, it makes comparison across columns easier. Hence, we should
decide what we want the readers to compare.
 It is neater and space-saving if we highlight statistically significant finding with an asterisk (*) or
other symbols instead of listing down all the p values (Table 2). It is not necessary to add an
extra column to report the detail of student-t test or chi-square values.
Go to:

GRAPHICS
Graphics are particularly good for demonstrating a trend in the data that would not be apparent in tables. It
provides visual emphasis and avoids lengthy text description. However, presenting numerical data in the form
of graphs will lose details of its precise values which tables are able to provide. The authors have to decide the
best format of getting the intended message across. Is it for data precision or emphasis on a particular trend and
pattern? Likewise, if the data is easily described in text, than text will be the preferred method, as it is more
costly to print graphics than text. For example, having a nicely drawn age histogram is take up lots of space but
carries little extra information. It is better to summarise it as mean ±SD or median depends on whether the age
is normally distributed or skewed. Since graphics should be self-explanatory, all information provided has to be
clear. Briefly, a well-constructed graphic should have a title, figure legend and footnotes along with the figure.
As with the tables, titles should contain words that describe the data succinctly. Define symbols and lines used
in legends clearly.
Some general guides to graphic presentation are:
 Bar charts, either horizontal or column bars, are used to display categorical data. Strictly speaking, bar
charts with continuous data should be drawn as histograms or line graphs. Usually, data presented in bar
charts are better illustrated in tables unless there are important pattern or trends need to be emphasised.
 Avoid 3-D graphs and charts. Three dimensional graphics are impressive in slide show and easily
capture the attention of the audience. In medical writing, they are not effective because it is difficult to
read the exact value on the Y axis (the height of the bars) accurately (Figure 1A).

Figure 1A.

Changes of HbA1c level after diabetic self-management program.

 Line graphs are most appropriate in tracking changing values between variables over a period of time or
when the changing values are continuous data. Independent variables (e.g. time) are usually on the X-
axis and dependant variables (for example, HbA1c) are usually on the Y-axis. The trend of HbA1c changes
is much more apparent with Figure 1B than Figure 1A, and HbA1c level at any time after intervention
can be accurately read in Figure 1B.
 Pie charts should not be used often as any data in a pie chart is better represented in bar charts (if there
are specific data trend to be emphasised) or simple text description (if there are only a few variables). A
common error is presenting sex distribution of study subjects in a pie chart. It is simpler by just stating
% of male or female in text form.
 Patients’ identity in all illustrations, for example pictures of the patients, x-ray films, and investigation
results should remain confidential. Use patient’s initials instead of their real names. Cover or blackout
the eyes whenever possible. Obtain consent if pictures are used. Highlight and label areas in the
illustration, which need emphasis. Do not let the readers search for details in the illustration, which may
result in misinterpretation. Remember, we write to avoid misunderstanding whilst maintaining clarity of
data.
Go to:

STATISTICS
Papers are often rejected because wrong statistical tests are used or interpreted incorrectly. A simple approach is
to consult the statistician early. Bearing in mind that most readers are not statisticians, the reporting of any
statistical tests should aim to be understandable by the average audience but sufficiently rigorous to withstand
the critique of experts.
 Simple statistic such as mean and standard deviation, median, normality testing is better reported in text.
For example, age of group A subjects was normally distributed with mean of 45.4 years old kg
(SD=5.6). More complicated statistical tests involving many variables are better illustrated in tables or
graphs with their interpretation by text. (See section on Tables).
 We should quote and interpret p value correctly. It is preferable to quote the exact p value, since it is
now easily obtained from standard statistical software. This is more so if the p value is statistically not
significant, rather just quoting p>0.05 or p=ns. It is not necessary to report the exact p value that is
smaller than 0.001 (quoting p<0.001 is sufficient); it is incorrect to report p=0.0000 (as some software
apt to report for very small p value).
 We should refrain from reporting such statement: “mean systolic blood pressure for group A
(135mmHg, SD=12.5) was higher than group B (130mmHg, SD= 9.8) but did not reach statistical
significance (t=4.5, p=0.56).” When p did not show statistical significance (it might be >0.01 or >0.05,
depending on which level you would take), it simply means no difference among groups.
 Confidence intervals. It is now preferable to report the 95% confidence intervals (95%CI) together with
p value, especially if a hypothesis testing has been performed.
Go to:

CONCLUSION
The main core of the result section consists of text, tables and graphics. As a general rule, text provides
narration and interpretation of the data presented. Simple data with few categories is better presented in text
form. Tables are useful in summarising large amounts of data systemically and graphics should be used to
highlight evidence and trends in the data presented. The content of the data presented must match the research
questions and objectives of the study in order to give meaning to the data presented. Keep the data and its
statistical analyses as simple as possible to give the readers maximal clarity.
Go to:

Contributor Information
Tong Seng Fah, 
Aznida Firzah Abdul Aziz, 
Go to:

REFERENCES
1. International Committee for Medical Journal Editors. Uniform requirements for manuscripts submitted to
biomedical journals: Writing and Editing for Biomedical Publication. [PubMed]

What is data presentation and analysis?


Data presentation and analysis forms an integral part of all academic studies, commercial,
industrial and marketing activities as well as professional practices. Presentation of data
requires skills and understanding of data. It is necessary to make use of collected data which
is considered to be raw data which must be processed to put for any application. Data
analysis helps in the interpretation of data and take a decision or answer the research
question. This can be done by using data processing tools and softwares. Data analysis starts
with the collection of data followed by data processing by various data processing
methods and sorting it. Processed data helps in obtaining information from it as the raw data
is non-comprehensive in nature. Presenting the data includes the pictorial representation of
the data by using graphs, charts, maps and other methods. These methods help in adding
the visual aspect to data which makes it much more comfortable and quicker to understand.
Various methods of data presentation can be used to present data and facts. Widely used
format and data presentation techniques are mentioned below:

1. As text – Raw data with proper formatting, categorisation, indentation is most


extensively used and very effective way of presenting data. Such format is widely found
in books, reports, research papers and in this article itself.
2. In tabular form – Tabular form is generally used to differentiate, categorise, relate
different datasets. It can be a simple pros & cons table, or a data with corresponding
value such as annual GDP, a bank statement, monthly expenditure etc.
3. In graphical Form – Data can further be presented in a simpler and even easier form
by means of using graphical form. The input for such graphical data can be another type
of data itself or some raw data. For example, a bar graph & pie chart takes tabular data
as input. The tabular data in such case is processed data itself but provides limited use.
Converting such data or raw data into graphical form directly makes it quick and easier to
interpret.

 
The significance of data presentation and analysis
Data presentation and analysis plays an essential role in every field. An excellent
presentation can be a deal maker or deal breaker. Some people make an incredibly useful
presentation with the same set of facts and figures which are available with others. At times
people who did all the hard work but failed to present it present it properly have lost essential
contracts, the work which they did is unable to impress the decision makers. So to get the job
done especially while dealing with clients or higher authorities presentation matters! No one is
willing to spend hours in understanding what you have to show and this is precisely why
presentation matters! It is thus essential to have clarity on what is data presentation. 
FREQUENCY DISTRIBUTION:

Data can be presented in various forms depending on the type of data collected. A frequency
distribution is a table showing how often each value (or set of values) of the variable in question
occurs in a data set. A frequency table is used to summarize categorical or numerical data.
Frequencies are also presented as relative frequencies, that is, the percentage of the total number in
the sample.

EXAMPLE: Frequency distribution of peptic ulcer according to site of ulcer

Site of ulcer Frequency Percent

Gastric ulcer 24 30.0

Duodenal ulcer 50 62.5

Gastric and duodenal ulcer 6 7.5

TOTAL 80 100

GRAPHICAL METHODS:

Frequency distributions and are usually illustrated graphically by plotting various types of graphs:

Bar graph - A bar graph is a way of summarizing a set


of categorical data. It displays the data using a number
of rectangles, of the same width, each of which
represents a particular category. Bar graphs can be
displayed horizontally or vertically and they are usually
drawn with a gap between the bars (rectangles).

Histogram - A histogram is a way of summarizing data


that are measured on an interval scale (either discrete
or continuous). It is often used in exploratory data
analysis to illustrate the features of the distribution of
the data in a convenient form.
Pie chart - A pie chart is used to display a set of
categorical data. It is a circle, which is divided into
segments. Each segment represents a particular
category. The area of each segment is proportional to
the number of cases in that category.

Line graph - A line graph is particularly useful when we


want to show the trend of a variable over time. Time is
displayed on the horizontal axis (x-axis) and the
variable is displayed on the vertical axis (y- axis).
Univariate means "one variable" (one type of data)

Example: Travel Time (minutes): 15, 29, 8, 42, 35, 21, 18, 42, 26

The variable is Travel Time

Example: Puppy Weights

You weigh the pups and get these results:

2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2.4

The variable is Puppy Weight

We can do lots of things with univariate data:

 Find a central value using mean, median and mode


 Find how spread out it is using range, quartiles and standard deviation
 Make plots like Bar Graphs, Pie Charts and Histograms

Bivariate means "two variables", in other words there are two types of data

With bivariate data we have two sets of related data we want to compare:

Example: Sales vs Temperature

An ice cream shop keeps track of how much ice cream they sell versus the temperature on
that day.

The two variables are Ice Cream Sales and Temperature.

Here are their figures for the last 12 days:

Ice Cream Sales vs Temperature


Temperature °C Ice Cream Sales
14.2° $215
16.4° $325
11.9° $185
15.2° $332
18.5° $406
22.1° $522
19.4° $412
25.1° $614
23.4° $544
18.1° $421
22.6° $445
17.2° $408

And here is the same data as a  Scatter Plot :

Now we can easily see that warmer weather and more ice cream sales are linked, but
the relationship is not perfect.

So with bivariate data we are interested in comparing the two sets of data and finding
any relationships.

We can use Tables,  Scatter Plots ,  Correlation , Line of Best Fit, and plain old common
sense.

Univariate Analysis: Definition, Examples


Statistics Definitions > Univariate Analysis
What is Univariate Analysis?
Univariate analysis is the simplest form of analyzing data. “Uni” means “one”, so in other words your data has only one variable. It
doesn’t deal with causes or relationships (unlike regression) and it’s major purpose is to describe; it takes data, summarizes that data
and finds patterns in the data.

What is a variable in Univariate Analysis?


A variable in univariate analysis is just a condition or subset that your data falls into. You can think of it as a “category.” For example,
the analysis might look at a variable of “age” or it might look at “height” or “weight”. However, it doesn’t look at more than one
variable at a time otherwise it becomes bivariate analysis (or in the case of 3 or more variables it would be called multivariate
analysis).
The following frequency distribution table shows one variable (left column) and the count in the right column.

A frequency chart.

You could have more than one variable in the above chart. For example, you could add the variable “Location”or “Age” or something
else, and make a separate column for location or age. In that case you would have bivariate data because you would then have two
variables.

Univariate Descriptive Statistics


Some ways you can describe patterns found in univariate data include central tendency (mean, mode and median)
and dispersion: range, variance, maximum, minimum, quartiles (including the interquartile range), and standard deviation.
You have several options for describing data with univariate data. Click on the link to find out more about each type of graph or chart:

 Frequency Distribution Tables.


 Bar Charts.
 Histograms.
 Frequency Polygons.
 Pie Charts.
Check out our Statistics YouTube channel for hundreds of videos on elementary statistics, probability and more.
------------------------------------------------------------------------------
Need help with a homewo

You might also like