JOINT REFEREES' REPORT

Michael Kirk-Smith and David Stretch (1)

"A double-blind, randomised clinical trial using aromatherapy to assess the therapeutic differences
of two essential oils of two lavenders (Lavandula angustifolia and Lavandula burnatii)".

INTRODUCTION

We have gone into more detail than would normally be the case for a standard referee's report.
This is partly because of the number of mistakes present and partly because it is important to take
as much care as possible when beginning to publish refereed papers, especially in an area such as
aromatherapy, since it is viewed with great scepticism in certain quarters.

The general conclusions we came to were that it would not be possible to publish this paper in any
refereed journal as it stands. It contains faults of both presentation and content or substance, some
of which may not be remedied by a rewrite, but only by carrying out the research again, with the
faults fixed. Indeed, we do not think that it should be published in any journal, given the
misleading analysis and conclusions. We recommend that the author spends some time learning
the basics of experimental design, statistics, and writing up research from someone who already has
a proven track record in this. We have structured the report in the following way:

1 Summary of main points
2 Issues concerning citations and structure
3 Issues concerning design
4 Issues concerning the results and discussion sections
5 Opinions which appear to arise directly from the research
6 Errors of Referencing
7 References

(1) We found out by accident that we were both refereeing this paper, so we decided to write this joint report. After this report was written and submitted, we found out that the article had already been published as Buckle, J. (1993) Aromatherapy: does it matter which lavender essential oil is used? Nursing Times, 19(89), 32-35. (We subsequently sent this report to Nursing Times. They told us that they had then sacked their own referees and commissioned from us a five-article series on "How to do research".)

1. SUMMARY OF MAIN POINTS

1.1 The paper is badly organised and does not conform to the standard layout of a scientific
research paper. The standard form allows people to read what has been done, why, and what has
been found in a very efficient and clear manner. A rewrite of the submission could correct this.

1.2 There is no clear statement as to the aims of the research described. Again, a rewrite would
correct this.

1.3 There is a lack of discrimination between solid scientifically reputable sources, and informal,
anecdotal, or secondary sources. This needs attention so that the reader can assess the quality of
the evidence being quoted. Ways of rewriting the submission to avoid these confusions are
possible.

1.4 The design has a number of flaws that hinder any interpretation of the results in terms of the assumed aims. In particular, the absence of the instructions given to the participants, the role of the investigator as the experimenter (thus casting doubt on the study's "double-blind" status), and the consequent possibility of an unintentional experimenter expectancy effect make a rerunning of the experiment necessary. To recap, a rewrite cannot correct this error; instead a redesigning and rerunning of the study is required.

1.5 The results and discussion sections are mixed up. A rewrite can correct this.

1.6 The results are incomplete: almost none of the collected data are reported, what is reported appears to be analysed inappropriately, and the way the figures are drawn does not allow us to see clearly what they are supposed to be showing. A rewrite could correct this, but if no results are significant by normally accepted standards, it would be difficult to justify publication without radical reanalysis and restructuring.

1.7 Points are made at the end of the discussion that have no support from the rest of the submission and, as at least one is merely an attempt at professional boundary-drawing, they should not be included in a piece of objective research that tries to make empirically supported points. These errors can be corrected by omitting the offending points in a rewrite.

1.8 There are numerous errors in the references section that suggest that secondary sources were
used but not admitted to, and that too little attention was devoted to giving accurate references. A
substantial rewrite, which may revise the support for doing the research, could correct this error.

2. ISSUES CONCERNING CITATIONS AND STRUCTURE

The following points could be viewed as nitpicking; however, we have gone into them in this detail because a reviewer of submissions to journals should never have to deal with them: papers containing the problems we identify in this section should not be submitted at all. They make the paper difficult to read, and may lead to the contents being disregarded by other scientists.

2.1 Although some of the headings typical in a refereed scientific research paper are included, the
most useful ones are not. It would be preferable to see the paper rewritten so that the desired points
are put across more clearly. That is, the following headings should all have been used: Abstract,
Introduction, Aims, Methods including sub-headings of Subjects (inclusion and exclusion criteria
for them), Materials (or Stimuli or Apparatus), Measures, Design and Procedure (especially
including the script of the instructions or explanations given to the subjects, for reasons we will
explain below), and then Results, followed by the Discussion, References (see points 2.3 and 2.4,
below) and Appendices (see also below). This would allow people to read the paper much more
efficiently, and help to ensure that the study could be replicated. In the case of this research,
replication will be required before the larger scientific community would even begin to believe that
there were any statistically significant results to explain. Although the Nursing Times does not
specify any detailed structure to reports of research, the section titles and contents, as we have
given them above, are used so frequently that we are surprised to see this submission not using
them - especially as no convincing reason is present for not making use of this standard layout.

2.2 We are unsure what the title of the original submission is supposed to be. On the front sheet, it
is given as we have stated at the top of this report. However, on the page numbered 1 in the
submission, we have the two lines: "Aromatherapy - A Clinical Trial" and, "Does it matter which
Lavender is used in hospitals?" which occur before the abstract. This needs to be cleared up, and
we will make some general comments about this and the entire paper at the end of this report.

2.3 Much (indeed, most) of the "evidence" cited appears in unrefereed journals and even in popular magazines ("Good Housekeeping", for example). This evidence is anecdotal, and not to describe it as such in the body of the paper is misleading, as it gives the impression that the
claims in these papers are of equal status to those that have been exposed to rigorous and
knowledgeable peer review. If this submission is to be published as a scientific paper, then the
references to anecdotal or unrefereed sources should be either omitted entirely, or else clearly
stated to be anecdotal or in scientifically unrefereed publications each time they are cited in the
text. In particular, statements such as "It is also safe to use on babies and pregnant women..." (page
32) need checking. Is the British Journal of Phytotherapy a peer-reviewed journal, and does
Blackwell (1991) base this point on scientific research, or informal, anecdotal evidence?
Authoritative sounding statements like this about the safety of a substance, not backed up by
evidence from solid pharmacological testing could lead to difficult situations for authors and
journals if anyone suffers harm after heeding the messages in these statements. Also, see the
"Errors in Referencing" section, later on, for more details of these errors, together with a
description of how extra errors occur which are misleading to the reader.

2.4 The claim, on page 32, that Lavandula angustifolia has been "shown" to be one of the safest
species of lavender has three references attached to it. Of these three, one is a book by Robert
Tisserand. Unless this author has engaged in serious pharmacological and toxicology research
himself, this book will report what other people have said (and this is perfectly all right, given that
he is not claiming to be publishing a serious scientific piece of research). The other reference is
one we do not know, but the title ("The secret of life and youth") does not suggest that it is a piece of scientific research; it is more likely to be merely a popularising book, reporting other sources. The only reference which does have some scientific credibility is that claimed to be by Bilsland; however, this uncovers yet another problem, which will be discussed more completely in the final
section of this report (see later). At this point it suffices to state that this reference seems to be quite
irrelevant to the point that it was intended to support.

In summary, this section has pointed out deficiencies in the structure of the report that make
reading and understanding more difficult than it would otherwise be. We have suggested the
sections that should be used if the paper were to be rewritten. Similarly, we have pointed out the
flaws of using secondary sources, and of mixing up informal or anecdotal evidence with evidence
coming from scientifically sound, refereed papers. We will suggest how these can be remedied in
the "Errors in Referencing" section later on. In that section we will propose the "author-date" method of referring to papers, to help distinguish the quality of the evidence being cited, as well as
making the report read better. The advice follows a modified form of a referencing style widely
adopted by many learned journals, and originating with the APA (American Psychological
Association).

3. ISSUES CONCERNING DESIGN

The crucial point is that this study is not a double-blind study, as is claimed. On page 33, lines 15-
18, we read that "All treatments were carried out by the researcher herself to ensure that exactly the
same number of effleurage strokes were given in exactly the same order". So, the researcher, who
knew the purpose of the study, was the aromatherapist who had direct contact with the patients.
Most books dealing with research design and possible sources of bias in experiments contain something like the following concerning double-blind studies:

"Investigators usually do not want their participants to be aware of the experimental aim.
Deception may well have to be employed to keep them in the dark and the moral implications of
this are discussed in the chapter on ethics. Keeping the participants in the dark is known as the
employment of a "single blind" procedure. But it has been argued here that experimenters may
transmit cues. Hence it makes sense to keep experimenters in the dark too. The employment of a
"double blind" procedure does just that - experimenters, or those who gather results directly from
the participants, are not told the true experimental aims. Where a placebo group is used, for
example, neither the participants, nor the data gatherers may know who has received the real
treatment." (Coolican, 1990, page 48; our emphasis).

In the research described in this paper, it seems that the author, and the initiator of the research,
actually did the aromatherapy with the patients. And although it helps that the particular brand of
lavender was not obviously known to the researcher, the fact that the person doing the therapy was
aware of the reasons for doing the research is sufficient reason for it not to be considered a double-
blind trial. In the Coolican quote, it is clear that there is a problem with the use of the word
"experimenter", which in the context in which it appears, means "data gatherer". From what is
written, this person will not be the initiator or "owner" of the research, as they are "not told the true
experimental aims". This is clear when we consider what Coolican reports with respect to the investigator and experimenter effects that may bias the results of experiments. The single- and double-blind designs are both attempts to remove these biases and unwanted effects. Here is the list that he obtained from Barber (1976):

1. Investigator paradigm effect
2. Investigator experimental design effect
3. Investigator loose procedure effect
4. Investigator data analysis effect
5. Investigator fudging effect
6. Experimenter personal attributes effect
7. Experimenter failure to follow the procedure effect
8. Experimenter misrecording effect
9. Experimenter fudging effect
10. Experimenter unintentional expectancy effect

(Coolican, 1990, page 47)


Since the aim of any well-designed experiment is to eliminate all possible explanations of the results other than those which are directly relevant to the aims of the study (i.e., to eliminate all alternative explanations), it is clear that the list of 10 possible sources of bias and unwanted effects needs attention in any experiment. Note that Coolican uses the term "experimenter" to mean the person who actually deals directly with the participants (direct "hands-on" contact, in fact), whereas the term "investigator" is used to refer to the person whose experiment it is, and who has overall control of it, and so on. In the case of the submitted paper, the investigator and the experimenter were stated to be the same person, and so the list of 10 possible sources of bias collapses together to give 9 potential sources of bias, and hence alternative explanations, of any results obtained from the experiment described in this submission. These must be tackled before we can be satisfied that the proposed explanation for the results is the only one that can reasonably be advanced.

Although it is difficult to judge, given the lack of relevant information in the submission, we would
suggest that the issues that need the most careful consideration here are those to do with the
procedure (the "loose procedure effect", and the "failure to follow the procedure effect") and the
experimenter unintentional expectancy effect (Items 3, 7, and 10, in the list above). Also, given the
lack of detail and other errors, it is possible that other errors on this list could also be advanced to
interpret any results obtained.

We have not had time to consider the possible omissions in very great depth, but here is a list of
important details that should be included in the procedure but which haven't been, or which are
identifiable as deficits in the current design:

3.1 The investigator acting as the experimenter. This is sufficient, given many of the other points
we raise, for us to say that this is not a double-blind study at all. The most it can be is a single-
blind study.

3.2 Although the introduction to the study mentions the names of three species of lavender and of
one hybrid, and something is said of three of these, Lavandula stoechas is only mentioned by name
once. Furthermore, the study talks of Lavender A and Lavender B being chosen in the study, and
yet we are not told exactly which type of lavender, from the four mentioned, these are.

Although one may be tempted to infer that the A stands for angustifolia and the B for burnatii, this
is only supposition on our part (with the awareness that A and B will be commonly used for
unknown samples just by virtue of their being the first two letters of the alphabet). So, what were
the types of lavender used? We do hope that this is in fact known, for if it is not, then the study immediately becomes marginal in its value: any differential effects between the two groups will be the result of differences between two types of lavender, but we won't know which lavender is which. And if any differential effect of the two lavenders is found, we will be in no position to identify, or make any recommendations about, which lavender to choose (if, indeed, we wish to make any such recommendation).

3.3 We are not given details of the gas chromatography, but, even given that, we do not know how
much intra-species (or even intra-plant) variability there is in the chemical components of the kinds
of lavender used. Something needs to be added about this. Furthermore, were samples from the
same batches of lavender oil used in the gas chromatograph, or were they merely samples from the
same species or hybrid of lavender? Brud (1993) reported that there is great intra-species and intra-
plant variability in the levels of chemical compounds found in lavender oil, so much so that this
variability may swamp inter-species variability. Unless this issue is discussed in more detail, it is
difficult to interpret any results from the gas chromatography in any useful way.

3.4 No information is given about the extent to which the two lavenders used were capable of
being distinguished by the experimenter. This could have been either by the use of labelling on the
bottles containing the oil to be used in the massage, or by the experimenter being able to distinguish
between the two oils by their smell. No information is given about whether the oil used in the
massage was put in unlabelled bottles before being given to the experimenter, nor any explanation
of how likely it was (with possible experimental evidence from a pilot study) that the experimenter
could distinguish between the two lavenders used. These points are important, given the extent to
which unintentional experimenter expectancy effects can occur (see the Barber reference, already
given; Rosenthal, 1966; and most introductory textbooks dealing with research design in
psychology).

Unless these can be satisfactorily dealt with, or it is demonstrated that they could not have had a
large effect, then unintentional experimenter expectancy effects can provide an alternative
explanation to the results which does not rely on any therapeutic difference between the two types
of lavender. This point, of course, assumes that there are results that could justifiably be called
significant, and the next section questions this point.

3.5 The failure to include the script of what was said to the patients about what was going to
happen to them is another flaw. This is linked to the previous point, and is similarly important as
we need to be able to rule out any cues (unintentional as well as intentional) that may have led to
any observed differences between the two groups. Additionally, omitting the instructions given to
the subjects means that the study is not capable of being replicated, and one purpose of publishing
experiments is to enable others to replicate and thus to investigate the robustness of any findings.

3.6 Although it is said that massage was carried out in accordance with a report, and that exactly the same number of effleurage strokes were used, we are not given the protocol that was followed. It should be included, if only to enable others to replicate the study. For example, how many strokes were used, and in what order were different areas massaged?

3.7 The rest for 10 minutes to allow the essential oils to reach the brain has one reference attached
to it. How does the investigator know that 10 minutes is enough? Is the Franchomme
reference a book that reports hard experimental evidence supporting this action, or is it another
popularising book? The extent to which informal sources are used in this submission means that
the quality of each reference (i.e., refereed or unrefereed) needs to be stated when each reference is
used.

3.8 What detail we are given of the questionnaires could indicate that "leading questions" were
employed, and that the participants responded to what they saw as being what the experimenter
wanted to hear them say. This needs some discussion in the design section where the measures
should be (but aren't) discussed.

3.9 There are no "placebo" or other control groups in this study. This is potentially a rather
difficult issue, as without any evidence about how well the participants would have progressed
without any intervention by the aromatherapist, it remains a possibility that either or both groups of
participants were impeded in their recovery, compared with doing nothing extra to them. At the
very least, given our re-analysis of the results, there is no evidence to say that aromatherapy had
any effect whatsoever upon the rate of recovery of these patients.

3.10 There is a mention of a "modified O'Brian" on page 4 of the original submission. What is
this? We assume it is a questionnaire, but we are given no reference to it, and so it is difficult to
know how to assess its suitability. We suspect it may be something by the O'Brian who publishes
in the Nursing journals, but a full reference to this is certainly required. There is no mention of it at all in the Nursing Times paper.

Additionally, if it is a modified version of the standard instrument, this certainly needs further explanation of what the items were and how they were modified. For example, depending on how this questionnaire was standardised, and how any reliability measures were obtained, the questionnaire may be suited or unsuited to studying changes in psychological states as opposed to rather stable traits (Harris, 1963).

3.11 It is not said who completed these questionnaires. Did the patients complete them themselves,
or did someone else ask them the questions and record their responses? In the context of
unintentional experimenter effects, these details must be clearly stated.

3.12 In the original submission it is said that "behavioural status was also evaluated by the nurse on
a scale of 1-4" (page 4), yet in the appendices the scale does not range from 1-4, but from 0-4!
Which one is it?

The conclusion to this section is that the lack of important information given in the materials and
methods section leads to any results being uninterpretable. So much is omitted that any differences
observed cannot be linked to the experimental control that was used, such that alternative
explanations cannot be convincingly ruled out. Furthermore, the number of mistakes present makes the whole submission so difficult to follow that, at times, it isn't possible to determine what was done at all. To illustrate how important psychological issues to do with procedure and
design are in research of this nature, we recommend Chapter One ("Placebos") of Skrabanek and
McCormick (1992), which has a very useful and sceptical outlook on all medical research, not just
complementary medicine, and which, therefore, should be read by all medical researchers.

4. ISSUES CONCERNING THE RESULTS AND DISCUSSION SECTIONS

We are forced to consider these two sections together as the submission mixes up their proper
contents in a way which is difficult to disentangle. As a reminder, here is how material should be
allocated between the results and discussion sections:

RESULTS

This should include what was found in the study, including descriptions of data, graphs and tables,
and the details of any statistical tests employed. Only data that have a direct bearing on the aims of
the study should be included, but if interesting aspects, separate to the main aims, are discovered,
these should be given at the end of the results section, with an explanation that these are a result of
an exploratory analysis. In this submission, something will need to be done which makes the aims
of the research far more clear, and we have already suggested that this should take the form of a
final sub-section to the introduction which just lists the main aims.
One point which needs stressing is that a Results section should contain no interpretation of the
results - just graphs, tables, numbers and short written descriptions of these. In particular, the
figures and tables should be approximately positioned within the results section with inserts like
this:
-------- Figure One about here ------------

The actual figures and tables themselves should be placed at the end of the submission.

DISCUSSION

This section tries to offer an explanation of what the results mean in terms of the Aims given at the
end of the Introduction. Results should be interpreted in terms of any previous work introduced in
the Introduction, and, especially important, they should always be interpreted in terms of the
research's aims. If there are any unexpected aspects of the results, useful applications of the results, or future research that should be done, the discussion is the place to present and discuss them. Any improvements to the study, or any further research that is seen as arising from the results, should be included here, usually at the end of the discussion.

However, a mere rewrite which involves shuffling paragraphs or sections of text from one section
to another would not be sufficient. Here are some points that need more attention:

4.1 The statistical test is presented in an overly formal manner that is not used even by professional statisticians, and there is no explanation of the terms PA and PB (we assume these refer to "probability of A" and "probability of B", but not using subscripts and not defining the terms is just confusing). In truth, the PA - PB part refers to a crude measure of what is known as the "size of effect", whereas the jumbled-up calculations that follow (see later) are the means by which we decide whether there is really any justification for saying that there is something there at all. Typically, this is done by carrying out a standard statistical significance test (see later for a comment on this). A better way of writing this down, which includes our reanalysis, would have been:

"To test whether the proportion of people showing improvement is different in the two groups
(lavender A and lavender B) on the anxiety measure for day one, a chi-squared test for two-way
tables was performed (Leach, 1979, pp. 264-292). This showed no significant interaction between
type of lavender administered and whether there was any improvement or not (Chi-squared = 2.8,
df = 1, p = 0.1). Consequently, there is no evidence to suggest that the two lavenders differed in
terms of the proportions of people showing improvement on day one of the anxiety scores."
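For the benefit of the author, here is a minimal sketch of how such a test can be computed by hand. The cell counts below are hypothetical, since the submission does not report the raw frequencies; for a table with 1 degree of freedom, the p-value can be obtained from the complementary error function, because a chi-squared variable with 1 df is the square of a standard normal.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (df = 1, no continuity correction) for the 2x2
    table [[a, b], [c, d]], together with its p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, chi-squared is the square of a standard
    # normal, so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts (improved / not improved, for lavenders A and B):
chi2, p = chi_squared_2x2(9, 5, 4, 10)
print(f"Chi-squared = {chi2:.2f}, df = 1, p = {p:.3f}")
```

Reporting the cell frequencies alongside the statistic, as here, would also let readers check the arithmetic for themselves.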
Note that, properly analysed and interpreted, the data simply do not support the conclusions as stated in the paper, at least not in any way universally accepted as normal scientific practice.

Here are some other points relating to the analysis:

- Note that the value of Chi-squared we have derived is slightly different from that given in the
submission, and it is clear from our calculations that the author has truncated the p value, rather
than rounding it, which explains our value of 0.1 rather than 0.09.

- Furthermore, the derivation as given in the submission makes no mathematical sense. There are
unbalanced brackets, and the apparent mixture of implied operators and explicit ones (the
multiplication signs and their absence) corresponds to no formula we are aware of for deriving the
Chi-squared statistic. This suggests that the formulae were typed in by someone with no
mathematical ability, and that there was no checking of this by the author of the submission (who
should have recognised the derivation as mathematical nonsense). Using subscripts and superscripts (for exponentiation operators, or the "squared" sign) in the submission would help considerably here.

- The issue of small numbers of observed frequencies is not exactly as written here. Consulting
Wickens (1989), pages 27-31, will help. However, any objection about small cell sizes can be
easily circumvented by making use of the Fisher Exact Test, details of which can be readily found
in any elementary statistical textbook describing standard statistical hypothesis testing (e.g., Leach,
1979). The minor extra assumption made by the Fisher Exact Test would not be realistically
criticised by any expert statistician, who would, in any case, be able to derive the test without this
minor extra assumption (see the Wickens reference, already given).
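To illustrate the point, a two-sided Fisher Exact Test for a 2x2 table can be computed directly from the hypergeometric distribution, with no approximation involving expected cell sizes. Again, the counts used here are hypothetical, since the submission does not report the raw frequencies.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher Exact Test p-value for the 2x2 table [[a, b], [c, d]]:
    the sum of the probabilities of all tables with the same margins that are
    no more probable than the observed one."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)

# Hypothetical counts, as before:
print(f"p = {fisher_exact_2x2(9, 5, 4, 10):.3f}")
```

Because the test conditions on the table margins, it remains exact however small the cell counts are, which is precisely why it circumvents the small-frequency objection.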

- Any significance level weaker than p < 0.05 will not be believed by other researchers. In particular, in an area like aromatherapy, where there is great scepticism in the scientific community, even more stringent significance levels, such as p < 0.01, may be advisable.

4.2 These are, at any rate, not complete results at all! After telling us about all the measures that
were taken, we find that a completely unexplained test on some anxiety data for day one is
analysed. What about the other measures and other days? If the results from these proved non-
significant, then it is a serious omission not to include them. They must be included if one of the
aims was to see whether the two lavenders differed in their effects on different aspects of people. If
the results are not significant, then that's the way it goes, but to withhold the results of tests because
a positive result was not found smacks of selectivity and bias, and not of objective scientific
investigation.

4.3 The raw responses or observations have clearly been transformed into some other measures of
outcome (the plus marks). How this was done needs to be explained, and, furthermore, the
justification for doing this when there are other standard ways of analysing the ratings needs to be
spelt out. If the investigator does not know how to do these analyses, then advice should have been
sought at the planning stage of the research.

4.4 The figures are not introduced or explained in the results section, they are not numbered, and
they have no legend clearly explaining what they are showing. Furthermore, the axes of the graphs
are not labelled clearly. What do the numbers along the horizontal axes mean? It may be obvious to the investigator that these refer to numbers of people, but that is merely supposition on our part, and the matter needs clearing up by appropriate labelling.

4.5 It is not usual to make value comments about one's own research as in "This was probably the
world's first double-blind aromatherapy trial, using topical application" (page 35). Far better to
allow others to come to that conclusion, especially when faults in design and so on can occur, like,
for example, that the design here isn't double-blind!

To summarise, the results and discussion sections need to be completely rewritten, and the results that are given represent such a small amount of what was measured that they would have to be greatly expanded. If measures were taken and not used in any analysis, this would indicate another
flaw in the design. It can be considered unethical to take measurements of psychological states like
mood, anxiety, and so on if there is no good reason for doing this, or if the measures are not going
to be used. Medical Ethical Committees (certainly in our parts of the world) are quite definite and
correct in having this view. If the author believes that too much information would be placed in
this paper by including analyses of all results, then a note explaining this, and pointing towards
further papers that cover the currently unanalysed measures should be included in the Methods
section. Finally, we advise the author to seek expert advice from a researcher who has expertise in
design and statistics.

5. OPINIONS WHICH APPEAR TO ARISE DIRECTLY FROM THE RESEARCH

Points made in the "Other recommendations include:" section do not directly arise from the
research as described. They appear to have the role of professional boundary-drawing. In
particular, the recommendation that research in hospitals should be "under the guidance of an
experienced aromatherapist who is also a trained nurse." (page 35) cannot be supported at all -
aromatherapists do not generally know much about formal research design, and neither do trained
nurses as their expertise lies in other areas.

If this submission is from one such person, then if any point needs making at all, it is that this
submission shows that what is required is the input of a well-trained and qualified researcher to
support the clinician's research. Moreover, given that the research fundamentally deals with
psychological factors, someone with research experience in psychology would seem to be a crucial
requirement. This appears especially relevant here, where many of the design flaws indicate a lack
of knowledge or appreciation of the psychological processes at work in situations such as those
described in this submission. All this works against what has been stated in the submission, and we
justify this view by pointing to the obvious deficiencies in the submission itself. Furthermore,
though it is arguable, we would suggest that it points towards collaboration on an equal footing
amongst workers from different disciplines (as is the case in many other multidisciplinary research
areas).

6. ERRORS OF REFERENCING

The References section of the original submission contains inconsistencies in the way articles from
the same source are referenced, as well as actual errors in the references themselves. Such mistakes
would be unacceptable even in a second-year psychology undergraduate, and so we are unsure how
these errors came to be in this submission. Normally, any paper containing such errors would be
rejected out of hand by most serious refereed journals, which in this case would mean that most of
the points made in the earlier sections of this report would never have been raised. We therefore
urgently suggest the author takes time to learn how to structure and write a standard scientific
research paper.

Here are some of the most obvious errors in the References section that have been identified:

6.1 The footnote in the submission given to reference numbers 26 and 33 (that the author has "not
had time to find out the publishers"), would be sufficient for any refereed journal to reject the paper
outright. If the author has referred to the sources given, this should be as a result of consulting the
actual references themselves, and NOT secondary sources. If the original references had been read
then the author would have had sufficient time to find out the publishers, as these would be printed
at the front of the books. This suggests that these sources were not read directly, but that the author
was relying on what someone else had read, interpreted, and reported elsewhere. This casts doubt
on the veracity of the information claimed to have been obtained because secondary sources are
notorious for increasing the chance that distortions of facts and findings creep into accounts of
previous work. More importantly, given that these sources may have been read at secondhand, how
many other sources have been cited in a similar way? This needs to be clarified, and some
evidence of reading all the references is now required. Note that there are two references which are
incomplete in this manner. If it is not possible to read these references directly, then they either
should not be cited, or else they should be referred to via their secondary sources (e.g., two ways of
doing this are "Chaytor, 1932, cited in Smith, 1992"; or "Smith (1992) reports Chaytor (1932) as
claiming that ..."), and this has implications for the way in which references are generally treated
(see point 3 in this section).

6.2 The reference to Bilsland is wrong. In the References section it is given as:

Bilsland, T. Allergic Contact Dermatitis from Essential oil of Marigold. Contact Dermatitis 1990;
7:1, 55-56.

There is a minor point we talk about more generally concerning the formatting of these references,
but the main point here is that the reference as given does not exist. Instead, the correct
reference (as confirmed by consulting the medical research directory, Excerpta Medica) is:

Bilsland, D. & Strong, A. (1990). Allergic contact dermatitis from the essential oil of French
marigold (Tagetes patula) in an aromatherapist. Contact Dermatitis, 23, 55-56.

So, the initial of Bilsland is incorrect, the second author is omitted, and the title as given is
incomplete. However, correcting this reference merely highlights yet another serious problem:
the paper seems to be dealing with dermatitis as a result of contact with marigold, and yet it
was cited as providing evidence that shows Lavandula angustifolia to be one of the safest
species of lavender.

How can this be relevant? Certainly, merely mentioning this reference is not enough; it needs to be
used in order to argue the point. Of the three references given to back up
the claim, two are likely to be informal or anecdotal in nature, and the other does not seem at all
relevant to the argument about the safety of a species of lavender as it is employed currently in the
submission. This last point more properly could be first mentioned in the section where we deal
with the Introduction, but the confused nature of this submission would make it difficult to do so.
These are very serious errors for any scientific paper to contain for the following reasons:

- They hinder people from checking sources and such like, which is an important aspect of any
scientific research.
- They are misleading and even seem to be irrelevant in places.
- They seem to show that the author has not grasped the fundamental issue of scientific research
concerning the quality and nature of evidence required to back up a claim.

6.3 Because of this, we looked at a sample of the other references in a bit more detail. This is what
we found:

Reference 7 (Martin, S.) is incomplete. Is "Here's health" a journal, a book, or what? If a journal,
then more details are required, e.g., whether it is a letter, or an article, and what its title is.

Reference 13 is given as an anonymous item. However, it is not. The reference as given is:

Anon. Stress: Another chimera? British Medical Journal 1991; 302: 191.

However, it should be:

Wilkinson, G. (1991). Stress: Another chimera? British medical journal, 302(6770), 191-192.
[Editorial]

Note that there is an author who can be identified from the journal, as the item is an editorial, and
this is marked in the way we have referenced it. Furthermore, we have spelt out the name of the
journal in full, as submitted papers must do. Once a paper is accepted, the journal's "house" style
may dictate how references are to be listed; however, this is an issue for the editor of the journal
considering whether to publish a paper, not for the person submitting it.

References 10 and 17 are from the same journal (Nursing Times), and yet the form in which they
are referenced differs between the two. They should be made consistent. This can all be solved by
adopting a consistent means of listing references, which is certainly what every research scientist in
our experience attempts to do.

Reference 22 should be Van Toller, S. and Dodd, G.H.D. The full title is not given.

Reference 24 is wrong. It is given as:

Bronough, R.L. In vivo percutaneous absorption of fragrance ingredients in rhesus monkeys and
humans. Food & Chemical Toxicology. 1990; 28: 5, 369-374.

However, Excerpta Medica gives the reference as follows:

Bronaugh, R. L., Wester, R. C., Bucks, D., Maibach, H.I. & Sarason, R. (1990). In vivo
percutaneous absorption of fragrance ingredients in rhesus monkeys and humans. Food and
chemical toxicology, 28(5), 369-373.

Note that the name of Bronaugh is mis-spelt, the additional authors are not listed, and Excerpta
Medica lists it as having one less page than is given in the submission.

Reference 28 does not contain the page numbers of the article. These should be included.

6.4 The way in which the references are given in the References section needs to be sorted out. At
the moment, the Nursing Times guidelines for structuring references appear to be non-standard;
even so, they are not being followed in this paper. The result unfortunately gives the impression of
a reference section put together by possibly several people, with little or no checking before the
paper was sent off to the Nursing Times. We strongly urge that the author-date form of giving
references be adopted in the body of the text. This means that, instead of referring to papers by
numbers, the authors and the date of publication are given (as illustrated at the end of point 1,
above). This allows a simple and clear way of distinguishing between papers giving informal or
anecdotal support for a point, versus those which are from peer-reviewed or refereed scientific
journals. Taking the example we considered in point 4 of the first section of this report, we would
suggest writing this as follows:
"Lavandula angustifolia, common name English, French or Vera, was chosen as there is evidence
suggesting that this is safer than the others (Bilsland, 1990; informally: Maury, 1964; and
Tisserand, 1988)."

Though note that the problem of the relevance of the Bilsland reference to the safety of lavender
would still need to be attended to.

Also note that our suggestion is not as "hard-line" as the position typically taken in peer-reviewed
scientific journals, in which unsupported evidence really should not be included, and where the use
of anecdotal accounts from unrefereed journals is strongly discouraged. However, in recognition
that there is as yet little or no direct scientific evidence concerning aromatherapy, citing references
as we have suggested could be a useful tactic to employ, although it is still likely to attract criticism
from within the wider scientific community.

7. REFERENCES

Barber, T. X. (1976). Pitfalls in human research. Oxford: Pergamon Press.

Brud, W. (1993). Blending and compounding: Where is the true essential oil? Paper given to the
"Aroma '93" International Conference, Brighton, UK, 2nd-4th July, 1993.

Coolican, H. (1990). Research methods and statistics in psychology. London: Hodder and
Stoughton.

Harris, C. (ed) (1963). Problems in measuring change. Madison, Wis.: University of Wisconsin
Press.

Leach, C. (1979). Introduction to statistics: a nonparametric approach for the social sciences.
Chichester: John Wiley.

Rosenthal, R. (1966). Experimenter effects in behavioral research. New York: Appleton-Century-Crofts.

Skrabanek, P., and McCormick, J. (1992). Follies and fallacies in medicine (2nd edition).
Chippenham: The Tarragon Press.

Wickens, T. D. (1989). Multiway contingency tables analysis for the social sciences. Hillsdale,
NJ: Lawrence Erlbaum.
