Method and measurement in pharmacology

Overview

6

BIOASSAY
Methods for measuring drug effects are needed in order that we may compare the properties of different substances, or the same substance under different circumstances, requirements that are met by the techniques of bioassay, defined as the estimation of the concentration or potency of a substance by measurement of the biological response that it produces.

Bioassay
  General principles of bioassay
  Bioassays in humans
Animal models of disease
Clinical trials
Balancing benefit and risk

OVERVIEW
We emphasised in Chapters 2 and 3 that drugs, being molecules, produce their effects by interacting with other molecules. This interaction can lead to effects at all levels of biological organisation, from molecules to human populations (Fig. 6.1).¹ In this chapter, we cover the principles of metrication at the various organisational levels, ranging from laboratory methods to clinical trials. Assessment of drug action at the population level is the concern of pharmacoepidemiology and pharmacoeconomics (see Ch. 1), disciplines that are beyond the scope of this book. We consider first the general principles of bioassay, and its extension to studies in human beings; we describe the development of animal models to bridge the predictive gap between animal physiology and human disease; we next discuss aspects of clinical trials used to evaluate therapeutic efficacy in a clinical setting; finally, we consider the principles of balancing benefit and risk. Experimental design and statistical analysis are central to the interpretation of all types of pharmacological data. Kirkwood & Sterne (2003) provide an excellent introduction.

USES OF BIOASSAY
The uses of bioassay are:

• to measure the pharmacological activity of new or chemically undefined substances
• to investigate the function of endogenous mediators
• to measure drug toxicity and unwanted effects.
▼ Bioassay plays a key role in the development of new drugs, discussed in Chapter 56. In the past, bioassay was often used to measure the concentration of drugs and other active substances in the blood or other body fluids, an application now superseded by analytical chemistry techniques. Bioassay is useful in the study of new hormonal or other chemically mediated control systems. Mediators in such systems are often first recognised by the biological effects that they produce. The first clue may be the finding that a tissue extract or some other biological sample produces an effect on an assay system. For example, the ability of extracts of the posterior lobe of the pituitary to produce a rise in blood pressure and a contraction of the uterus was observed at the beginning of the 20th century. These actions were developed as quantitative assay procedures, and a standard preparation of the extract was established by international agreement in 1935. By use of these assays, it was shown that two distinct peptides (vasopressin and oxytocin) were responsible, and they were eventually identified and synthesised in 1953. Biological assay had already revealed much about the synthesis, storage and release of the hormones, and was essential for their purification and identification. Nowadays, it does not take 50 years of laborious bioassays to identify new hormones before they are chemically characterised,² but bioassay still plays a key role.

1 Consider the effect of cocaine on organised crime, of organophosphate ‘nerve gases’ on the stability of dictatorships, or of anaesthetics on the feasibility of surgical procedures for examples of molecular interactions that affect the behaviour of populations and societies.

2 In 1988, a Japanese group (Yanagisawa et al., 1988) described in a single remarkable paper the bioassay, purification, chemical analysis and synthesis, and DNA cloning of a new vascular peptide, endothelin (see Ch. 19).


[Fig. 6.1 Levels of biological organisation and types of pharmacological measurement. Columns: level of biological organisation; test system (examples); response measures (example relating to analgesia); methods. The levels run from population and society, through the family, the individual patient, the human volunteer, the experimental animal, the physiological system, tissue and organ, and the cell, down to the molecule.]

BIOLOGICAL TEST SYSTEMS
Nowadays, an important use of bioassay is to provide information that will predict the effect of the drug in the clinical situation (where the aim is to improve function in patients suffering from the effects of disease). The choice of laboratory test systems (in vitro and in vivo ‘models’) that provide this predictive link is an important aspect of quantitative pharmacology. As our understanding of drug action at the molecular level advances (Ch. 3), this knowledge, and the technologies underlying it, have greatly extended the range of models that are available for measuring drug effects. By the 1960s, pharmacologists had become adept at using isolated organs and laboratory animals (usually under anaesthesia) for quantitative experiments, and had developed the principles of bioassay to allow reliable measurements to be made with these sometimes difficult and unpredictable test systems.

Bioassays on different test systems may be run in parallel to reveal the profile of activity of an unknown mediator. This was developed to an almost baroque splendour in the work of Vane and his colleagues, who studied the generation and destruction of endogenous active substances such as prostanoids (see Ch. 13) by the technique of cascade superfusion (Fig. 6.2). In this technique, the sample is run sequentially over a series of test preparations chosen to differentiate between different active constituents of the sample. The pattern of responses produced identifies the active material, and the use of such assay systems for ‘on line’ analysis of biological samples has been invaluable in studying the production and fate of short-lived mediators such as prostanoids and the endothelium-derived relaxing factor (Ch. 14).

[Fig. 6.2 Parallel assay by the cascade superfusion technique. A Blood is pumped continuously from the test animal over a succession of test organs, whose responses are measured by a simple transducer system. B The response of these organs (rat stomach, chick rectum, rat colon, rabbit rectum, cat jejunum) to a variety of test substances (at 0.1–5 ng/ml) is shown. Each active substance produces a distinct pattern of responses, enabling unknown materials present in the blood to be identified and assayed. 5-HT, 5-hydroxytryptamine; ADH, antidiuretic hormone; Adr, adrenaline (epinephrine); Ang II, angiotensin II; BK, bradykinin; Nor, noradrenaline (norepinephrine); PG, prostaglandin. (From Vane J R 1969 Br J Pharmacol 35: 209–242.)]

These ‘traditional’ assay systems address drug action at the physiological level (roughly, the mid-range of the organisational hierarchy shown in Fig. 6.1). Subsequent developments have extended the range of available models in both directions, towards the molecular and towards the clinical. The introduction of binding assays (Ch. 3) in the 1970s was a significant step towards analysis at the molecular level. More recently, the use of cell lines engineered to express specific human receptor subtypes has become widespread as a screening tool for drug discovery (see Ch. 56). Indeed, the range of techniques for analysing drug effects at the molecular and cellular levels is now very impressive. Bridging the gap between effects at the physiological and the therapeutic levels has, however, proved much more difficult, because human illness cannot, in many cases, be accurately reproduced in experimental animals. The use of transgenic animals to model human disease represents a real advance, and is discussed in more detail below.

GENERAL PRINCIPLES OF BIOASSAY

THE USE OF STANDARDS
J H Burn wrote in 1950: ‘Pharmacologists today strain at the king’s arm, but they swallow the frog, rat and mouse, not to mention the guinea pig and the pigeon.’ He was referring to the fact that the ‘king’s arm’ had been long since abandoned as a standard measure of length, whereas drug activity continued to be defined in terms of dose needed to cause, say, vomiting of a pigeon or cardiac arrest in a mouse. A plethora of ‘pigeon units’, ‘mouse units’ and the like, which no two laboratories could agree on, contaminated the literature.³ Even if two laboratories cannot agree (because their pigeons differ) on the activity in pigeon units of the same sample of an active substance, they should nonetheless be able to agree that preparation X is, say, 3.5 times as active as standard preparation Y on the pigeon test. Biological assays are therefore designed to measure the relative potency of two preparations, usually a standard and an unknown. The best kind of standard is, of course, the pure substance, but it may be necessary to establish standard preparations of various hormones, natural products and antisera against which laboratory samples can be calibrated, even though the standard preparations are not chemically pure.

3 More picturesque examples of absolute units of the kind that Burn would have frowned on are the PHI and the mHelen. PHI, cited by Colquhoun (1971), stands for ‘purity in heart index’ and measures the ability of a virgin pure-in-heart to transform, under appropriate conditions, a he-goat into a youth of surpassing beauty. The mHelen is a unit of beauty, 1 mHelen being sufficient to launch 1 ship.

THE DESIGN OF BIOASSAYS
Given the aim of comparing the activity of two preparations, a standard (S) and an unknown (U) on a particular preparation, a bioassay must provide an estimate of the dose or concentration of U that will produce the same biological effect as that of a known dose or concentration of S. As Figure 6.3 shows, provided that the log dose–effect curves for S and U are parallel, the ratio, M, of equiactive doses will not depend on the magnitude of response chosen. Thus M provides an estimate of the potency ratio of the two preparations. A comparison of the magnitude of the effects produced by equal doses of S and U does not provide an estimate of M (see Fig. 6.3).

The main problem with all types of bioassay is that of biological variation, and the design of bioassays is aimed at:
• minimising variation
• avoiding systematic errors resulting from variation
• estimation of the limits of error of the assay result.

▼ Many different experimental designs have been proposed to maximise the efficiency and reliability of bioassays (see Laska & Meisner, 1987). Commonly, comparisons are based on analysis of dose–response curves, from which the matching doses of S and U are calculated. This analysis is much simpler if the dose–response curves are linear, which can often be achieved by using a logarithmic dose scale and restricting observations

to the middle region of the log₁₀ dose–effect curve, which is usually close to a straight line (see Ch. 2). The use of a logarithmic dose scale means that the curves for S and U will normally be parallel, and the potency ratio (M) is estimated from the horizontal distance between the two curves (Fig. 6.3). Assays of this type are known as parallel line assays, the minimal design being the 2 + 2 assay, in which two doses of standard (S₁ and S₂) and two of unknown (U₁ and U₂) are used. The doses are chosen to give responses lying on the linear part of the log₁₀ dose–response curve, and are given repeatedly in randomised order, providing an inherent measure of the variability of the test system, which can be used, by means of straightforward statistical analysis, to estimate the confidence limits of the final result. The 2 + 2 assay also detects whether or not the two log dose–effect lines deviate significantly from parallelism, which may be the case if the assay is used to compare two drugs whose mechanism of action is not the same. In practice, most bioassays will give results whose 5% confidence limits lie within ± 20%, and many will do better than this.

If the lines are not parallel, it is not possible to define the relative potencies of S and U unambiguously in terms of a simple ratio. The experimenter must then face up to the fact that there are qualitative as well as quantitative differences between the two, so that comparison requires measurement of more than a single dimension of potency. An example of this kind of difficulty is met when diuretic drugs (Ch. 24) are compared. Some (‘low ceiling’) diuretics are capable of producing only a small diuretic effect, no matter how much is given; others (‘high ceiling’) can produce a very intense diuresis (described as ‘torrential’ by authors with vivid imaginations). A comparison of two such drugs requires not only a measure of the doses needed to produce an equal low-level diuretic effect, but also a measure of the relative heights of the ceilings. More generally, full and partial agonists at the same receptor (see Ch. 2) will generate non-parallel log dose–response curves, so the difference between them cannot be expressed simply in terms of a potency ratio.

[Fig. 6.3 Comparison of the potency of unknown and standard by bioassay. Axes: response (% maximal) against log₁₀ volume administered (μl). Note that comparing the magnitude of responses produced by the same dose (i.e. volume) of standard and unknown gives no quantitative estimate of their relative potency. (The differences, A₁ and A₂, depend on the dose chosen.) Comparison of equieffective doses of standard and unknown gives a valid measure of their relative potencies. Because the lines are parallel, the magnitude of the effect chosen for the comparison is immaterial, i.e. log M is the same at all points on the curves.]

QUANTAL AND GRADED RESPONSES
▼ An assay may be based on a graded response (e.g. change in blood glucose concentration, contraction of a strip of smooth muscle, change in the time taken for a rat to run a maze) or on all-or-nothing responses (e.g. death, loss of righting reflex, success in maze running within a stipulated time). With the latter, sometimes known as a discontinuous or quantal response, the proportion of animals responding will increase with dose. The shape and slope of such a curve is governed by the individual variation between animals: the more uniform the population, the steeper the curve and the more precise the assay. With graded responses, the steepness of the dose–response curve is a property of the test system and has nothing to do with biological variation. Quantal responses can be used in essentially the same way as graded responses for the purposes of bioassay, although the appropriate statistical procedures are slightly different.

BIOASSAYS IN HUMANS
Studies involving human subjects fall into two distinct categories. The first, human pharmacology, focuses on using human subjects (either healthy volunteers or patients) essentially as experimental animals, for example to check whether mechanisms that operate in other species also apply to humans, or to take advantage of the much broader response capabilities of a person compared with a rat. The scientific principles underlying such measurements are the same, but the ethical and safety issues are paramount, and ethical committees associated with all medical research centres tightly control the type of experiment that can be done.

▼ An example of an experiment to compare two analgesic drugs (see Ch. 41) in humans is shown in Figure 6.4. Although many animal tests have been devised (e.g. measuring the effect of an analgesic drug on the mean time taken for groups of mice to jump off a surface heated to a mildly painful temperature), they often fail to predict accurately the subjective relief of pain in humans. Figure 6.4 shows a comparison of morphine and codeine in humans, based on a modified 2 + 2 design. Each of the four doses was given on different occasions to each of the four subjects, the order being randomised and both subject and observer being unaware of the dose given. Subjective pain relief was assessed by a trained observer, and the results showed morphine to be 13 times as potent as codeine. This, of course, does not prove its superiority, but merely shows that a smaller dose is needed to produce the same effect. Such a measurement is, however, an essential preliminary to assessing the relative therapeutic merits of the two drugs, for any comparison of other factors, such as side effects, duration of action, tolerance or dependence, needs to be done on the basis of doses that are equiactive as analgesics.

[Fig. 6.4 Assay of morphine and codeine as analgesics in humans. Axes: pain relief score against dose (mg). Each of four patients (numbered 1–4) was given, on successive occasions in random order, four different treatments (high and low morphine, and high and low codeine) by intramuscular injection, and the subjective pain relief score calculated for each. The calculated regression lines gave a potency ratio estimate of 13 for the two drugs. (After Houde R W et al. 1965 In: Analgetics. Academic Press, New York.)]

The second type of human assay, the clinical trial, an important and highly specialised form of biological assay, is designed to measure therapeutic effectiveness. The need to use patients for experimental purposes imposes many restrictions. Below, we discuss some of the basic principles involved in clinical trials; the role of such trials in the course of drug development is described in Chapter 56.

ANIMAL MODELS OF DISEASE
There are many examples where simple intuitive models predict with fair accuracy therapeutic efficacy in humans. Ferrets, when housed in swaying cages, respond by vomiting, and drugs that prevent this are also found to relieve motion sickness and other types of nausea in humans. Irritant chemicals injected into rats’ paws cause them to become swollen and tender, and this test predicts very well the efficacy of drugs used for symptomatic relief in inflammatory conditions such as rheumatoid arthritis in humans. As discussed elsewhere in this book, models for many important disorders, such as epilepsy, diabetes, hypertension and gastric ulceration, based on knowledge of the physiology of the condition, are available, and have been used successfully to produce new drugs, even though their success in predicting therapeutic efficacy is far from perfect.⁴

Generalising, we can say that an animal model should ideally resemble the human disease in the following ways:
1. similar pathophysiological phenotype (sometimes called face validity)
2. similar causation (sometimes called construct validity)
3. similar response to treatment (sometimes called predictive validity).

In practice, there are many difficulties, and the shortcomings of animal models are one of the main roadblocks on the route from basic medical science to improvements in therapy. The difficulties include the following.
• Many diseases, particularly in psychiatry, are defined by phenomena in humans that are difficult or impossible to observe in animals, which rules out criterion 1. As far as we know, mania or delusions have no counterpart in rats, nor can we recognise in them anything resembling a migraine attack or suicidal behaviour. Pathophysiological similarity is also inapplicable to conditions such as depression or anxiety disorders, where no clear brain pathology has been defined.
• The ‘cause’ (criterion 2) of many human diseases is complex or unknown. For many degenerative diseases (e.g. Alzheimer’s disease, osteoarthritis, Parkinson’s disease), we need to model the upstream (causative) factors rather than the downstream (symptomatic) features of the disease, although the latter are the basis of most of the simple physiological models used hitherto.
• Relying on response to treatment (criterion 3) as a test of validity carries the risk that drugs acting by novel mechanisms could be missed, because the model will have been selected on the basis of its responsiveness to known drugs. With schizophrenia (Ch. 38), for example, it is clear that dopamine antagonists are effective, and many of the models used are designed to reflect dopamine function in the brain, rather than other potential mechanisms that need to be identified if drug discovery is to move on.

4 There have been many examples of drugs that were highly effective in experimental animals (e.g. in reducing brain damage following cerebral ischaemia) but ineffective in humans (stroke victims). Similarly, recent work on substance P antagonists (Ch. 16) showed them to be very effective in animal tests for analgesia, but they proved inactive in humans. How many errors in the opposite direction may have occurred we shall never know, because such drugs will not have been tested in humans.

Bioassay
• Bioassay is the measurement of potency of a drug or unknown mediator from the magnitude of the biological effect that it produces.
• Bioassay normally involves comparison of the unknown preparation with a standard. Estimates that are not based on comparison with standards are usually unreliable and vary from laboratory to laboratory.
• Comparisons are best made on the basis of dose–response curves, which allow estimates of the equiactive concentrations of unknown and standard to be used as a basis for the potency comparison. Parallel line assays follow this principle.
• The biological response may be quantal (the proportion of tests in which a given all-or-nothing effect is produced) or graded. Different statistical procedures are appropriate in each case.
• Different approaches to metrication apply according to the level of biological organisation at which the drug effect needs to be measured. Approaches range through molecular and chemical techniques, in vitro and in vivo animal studies, and clinical studies on volunteers and patients, to measurement of effects at the socioeconomic level.
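
The parallel line principle described under ‘The design of bioassays’ can be illustrated with a short calculation. The Python sketch below is only an illustration under stated assumptions, not a procedure taken from the text: it fits a common slope to the log₁₀ dose–response points of a standard and an unknown by least squares, and reads the potency ratio M off the horizontal distance between the two parallel lines. The function name `potency_ratio` and the example doses and responses are hypothetical. The sketch neither tests for deviation from parallelism nor estimates confidence limits, both of which a real assay analysis would need.

```python
import math
from statistics import mean


def potency_ratio(standard, unknown):
    """Estimate the potency ratio M of unknown relative to standard.

    Each argument is a list of (dose, response) pairs. A single common
    slope is fitted to both preparations on a log10 dose scale
    (assumption: the two log dose-response lines are parallel).
    M > 1 means the unknown is more potent than the standard.
    """
    def summarise(points):
        xs = [math.log10(dose) for dose, _ in points]
        ys = [response for _, response in points]
        return xs, ys, mean(xs), mean(ys)

    xs_s, ys_s, mx_s, my_s = summarise(standard)
    xs_u, ys_u, mx_u, my_u = summarise(unknown)

    # Pooled within-preparation sums of products and squares.
    sxy = sum((x - mx_s) * (y - my_s) for x, y in zip(xs_s, ys_s))
    sxy += sum((x - mx_u) * (y - my_u) for x, y in zip(xs_u, ys_u))
    sxx = sum((x - mx_s) ** 2 for x in xs_s)
    sxx += sum((x - mx_u) ** 2 for x in xs_u)
    if sxx == 0 or sxy == 0:
        raise ValueError("need at least two dose levels and a non-zero slope")
    slope = sxy / sxx

    # Horizontal distance between the parallel lines is log10(M).
    log_m = (mx_s - mx_u) + (my_u - my_s) / slope
    return 10 ** log_m


if __name__ == "__main__":
    # Hypothetical 2 + 2 assay: two doses of S and two doses of U.
    s_points = [(2.0, 16.0), (8.0, 28.0)]
    u_points = [(1.0, 22.0), (4.0, 34.0)]
    print(f"Estimated potency ratio M = {potency_ratio(s_points, u_points):.2f}")
```

Because the comparison is made between equiactive doses along fitted lines, the estimate does not depend on which response level is chosen, which is the point made in Figure 6.3.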

GENETIC AND TRANSGENIC ANIMAL MODELS
Nowadays, genetic approaches are increasingly used as an adjunct to conventional physiological and pharmacological approaches to disease modelling.

By selective breeding, it is possible to obtain pure animal strains with characteristics closely resembling certain human diseases. Genetic models of this kind include spontaneously hypertensive rats, genetically obese mice, epilepsy-prone dogs and mice, rats with deficient vasopressin secretion, and many other examples. In most cases, the genes responsible have not been identified.

More recently, deliberate genetic manipulation of the germline is increasingly used to generate transgenic animals as a means of replicating human disease states in experimental animals, and thereby providing animal models that are expected to be more predictive of therapeutic drug effects in humans (see reviews by Rudolph & Moehler, 1999; Törnell & Snaith, 2002). This versatile technology, first reported in 1980, can be used in many different ways, for example:
• to inactivate individual genes, or mutate them to pathological forms
• to introduce new (e.g. human) genes
• to overexpress genes by inserting additional copies
• to allow gene expression to be controlled by the experimenter.⁵

Currently, most transgenic technologies are applicable in mice but much more difficult in other mammals.⁶ Examples of such models include transgenic mice that overexpress mutated forms of the amyloid precursor protein or presenilins (see Yamada & Nabeshima, 2000), which are important in the pathogenesis of Alzheimer’s disease (see Ch. 35). When they are a few months old, these mice develop pathological lesions and cognitive changes resembling Alzheimer’s disease, and provide very useful models with which to test possible new therapeutic approaches to the disease. Another neurodegenerative condition, Parkinson’s disease (Ch. 35), has been modelled in transgenic mice that overexpress synuclein, a protein found in the brain inclusions that are characteristic of the disease (see Beal, 2001). Transgenic mice with mutations in tumour suppressor genes and oncogenes (see Ch. 5) are widely used as models for human cancers. Mice in which the gene for a particular adenosine receptor subtype has been inactivated show distinct behavioural and cardiovascular abnormalities, such as increased aggression, reduced response to noxious stimuli, and raised blood pressure (Ledent et al., 1997). These findings serve to pinpoint the physiological role of this receptor, whose function was hitherto unknown, and to suggest new ways in which agonists or antagonists for these receptors might be developed for therapeutic use (e.g. to reduce aggressive behaviour or to treat hypertension). Transgenic mice can, however, be misleading in relation to human disease. For example, the gene defect responsible for causing cystic fibrosis (a disease affecting mainly the lungs in humans), when reproduced in mice, causes a disorder that mainly affects the intestine.

The use of transgenic animals in pharmacological research is increasing rapidly as the technology improves. For more detailed information, see Offermanns & Hein (2004).

5 With conventional transgenic technology, the genetic abnormality is expressed throughout development, sometimes proving lethal or causing major developmental abnormalities. Conditional transgenesis is now possible, allowing the mutation to remain silent until triggered by the administration of a chemical promoter (e.g. the tetracycline analogue, doxycycline, in the most widely used Cre-Lox conditional system). This avoids the complications of developmental effects and long-term adaptations, and may allow adult disease to be modelled more accurately.

6 On the other hand, nematodes, fruit flies and zebra fish (fast-multiplying species whose genetics has been extensively studied) are very amenable to transgenic approaches and, unlike mice, can be used in automated high-throughput drug-screening assays (see Ch. 56).

CLINICAL TRIALS
A clinical trial is a method for comparing objectively, by a prospective study, the results of two or more therapeutic procedures. For new drugs, this is carried out during phase III of clinical development (Ch. 56). It is important to realise that, until about 30 years ago, methods of treatment were chosen on the basis of clinical impression and personal experience rather than objective testing.⁷ Although many drugs, with undoubted effectiveness, remain in use without ever having been subjected to a controlled clinical trial, any new drug is now required to have been tested in this way before being licensed for general clinical use.⁸ On the other hand, digitalis (see Ch. 18) was used for 200 years to treat cardiac failure before a controlled trial showed it to be of very limited value except in a particular type of patient.

7 Not exclusively. James Lind conducted a controlled trial in 1753 on 12 mariners, which showed that oranges and lemons offered protection against scurvy. However, 40 years passed before the British Navy acted on his advice, and a further century before the US Navy did.

8 It is fashionable in some quarters to argue that to require evidence of efficacy of therapeutic procedures in the form of a controlled trial runs counter to the doctrines of ‘holistic’ medicine. This is a fundamentally antiscientific view, for science advances only by generating predictions from hypotheses and by subjecting the predictions to experimental test. Very few ‘alternative’ or ‘complementary’ medical procedures, such as homeopathy, aromatherapy, acupuncture or ‘detox’, have been so tested. Standing up for the scientific approach is the evidence-based medicine movement (see Sackett et al., 1996), which sets out strict criteria for assessing therapeutic efficacy, based on randomised, controlled clinical trials, and urges scepticism about therapeutic doctrines whose efficacy has not been so demonstrated.

A good account of the principles and organisation of clinical trials is given by Friedman et al. (1996). A clinical trial aims to compare the response of a test group of patients receiving a new treatment (A) with that of a control group receiving an existing ‘standard’ treatment (B). Treatment A might be a new drug or a new combination of existing drugs, or any other kind of therapeutic intervention, such as a surgical operation, a diet, physiotherapy

and so on. The standard against which it is judged (treatment B) might be a currently used drug treatment or (if there is no currently available effective treatment) a placebo or no treatment at all.

The use of controls is crucial in clinical trials. Claims of therapeutic efficacy based on reports that, for example, 16 out of 20 patients receiving drug X got better within 2 weeks are of no value without a knowledge of how 20 patients receiving no treatment, or a different treatment, would have fared. Usually, the controls are provided by a separate group of patients from those receiving the test treatment, but sometimes a cross-over design is possible in which the same patients are switched from test to control treatment or vice versa, and the results compared. Randomisation is essential to avoid bias in assigning individual patients to test or control groups. Hence, the randomised controlled clinical trial is now regarded as the essential tool for assessing clinical efficacy of new drugs.

Concern inevitably arises over the ethics of assigning patients at random to an untreated control group when the doctor in charge believes the test treatment to have advantages. However, the reason for setting up a trial is that doubt exists in the minds of many doctors that the treatment is efficacious, so for these doctors there is no ethical dilemma. If individual doctors are personally convinced that the treatment is beneficial, they should clearly avoid participating in a controlled trial. All would agree on the principle of informed consent,⁹ whereby each patient must be told the nature and risks of the trial, and agree to participate on the basis that he or she will be randomly and unknowingly assigned to either the test or the control group.

9 Even this can be contentious, because patients who are unconscious, demented or mentally ill are unable to give such consent, yet no one would want to preclude trials that might offer improved therapies to these needy patients. Clinical trials in children are particularly problematic but are necessary if the treatment of childhood diseases is to be placed on the same evidence base as is judged appropriate for adults. There are many examples where experience has shown that children respond differently from adults, and there is now increasing pressure on pharmaceutical companies to perform trials in children, despite the difficulties of carrying out such studies. The same concerns apply to trials in elderly patients.

Unlike the kind of bioassay discussed earlier, the clinical trial does not normally give any information about potency or the form of the dose–response curve, but merely compares the response produced by two stipulated therapeutic regimens. Additional questions may be posed, such as the prevalence and severity of side effects, or whether the treatment works better or worse in particular classes of patient, but only at the expense of added complexity and numbers of patients, and most trials are kept as simple as possible. The investigator must decide in advance what dose to use and how often to give it, and the trial will reveal only whether the chosen regimen performed better or worse than the control treatment. It will not say whether increasing or decreasing the dose would have improved the response; another trial would be needed to ascertain that. The basic question posed by a clinical trial is thus simpler than that addressed by most conventional bioassays. However, the organisation of clinical trials, with the problem of avoiding bias, is immeasurably more complicated, time-consuming and expensive than that of any laboratory-based assay.

AVOIDANCE OF BIAS
There are two main strategies that aim to minimise bias in clinical trials, namely:
• randomisation
• the double-blind technique.

If two treatments, A and B, are being compared on a series of selected patients, the simplest form of randomisation is to allocate each patient to A or B by reference to a series of random numbers. One difficulty with simple randomisation, particularly if the groups are small, is that the two groups may turn out to be ill-matched with respect to characteristics such as age, sex, or disease severity. Stratified randomisation is often used to avoid the difficulty. Thus the subjects might be divided into age categories, random allocation to A or B being used within each category. It is possible to treat two or more characteristics of the trial population in this way, but the number of strata can quickly become large, and the process is self-defeating when the number of subjects in each becomes too small. As well as avoiding error resulting from imbalance of groups assigned to A and B, stratification can also allow more sophisticated conclusions to be reached. B might, for example, prove to be better than A in a particular group of patients even if it is not significantly better overall.

The double-blind technique, which means that neither subject nor investigator is aware at the time of the assessment which treatment is being used, is intended to minimise subjective bias. It has been repeatedly shown that, with the best will in the world, subjects and investigators both contribute to bias if they know which treatment is which, so the use of a double-blind technique

Animal models
• Animal models of disease are important for the discovery of new therapeutic agents. Animal models generally reproduce imperfectly only certain aspects of human disease states. Models of psychiatric illness are particularly problematic.
• Transgenic animals are produced by introducing mutations into the germ cells of animals (usually mice), which allow new genes to be introduced (‘knock-ins’) or existing genes to be inactivated (‘knockouts’) or mutated in the animals in a breeding colony.
• Insertion or deletion of certain genes sometimes results in phenotypic changes resembling human disease, and is an approach increasingly used to develop disease models for drug testing. Many such models are now available.
• The induced mutation operates throughout the development and lifetime of the animal, and may be lethal. The new technique of conditional mutagenesis is an advance that allows the abnormal gene to be switched on or off at a chosen time.

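The stratified randomisation described under ‘Avoidance of bias’ can be sketched in a few lines of Python. This is a minimal illustration, not a trial-grade allocation procedure. The patient records, the three age bands in `age_band` and the function `stratified_randomise` are all hypothetical. The choice to shuffle each stratum and then alternate A and B, so that the two arms stay balanced within every stratum, is also an assumption; the text specifies only that allocation is random within each category.

```python
import random
from collections import defaultdict


def age_band(patient):
    """Hypothetical stratifying rule: three age categories."""
    age = patient["age"]
    if age < 40:
        return "under 40"
    if age < 65:
        return "40 to 64"
    return "65 and over"


def stratified_randomise(patients, stratum_of, seed=None):
    """Allocate patients to treatment A or B at random within each stratum.

    Each stratum is shuffled and then split alternately, so the numbers
    assigned to A and B within a stratum differ by at most one.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[stratum_of(patient)].append(patient)

    allocation = {}
    for members in strata.values():
        rng.shuffle(members)  # random order within the stratum
        for position, patient in enumerate(members):
            allocation[patient["id"]] = "A" if position % 2 == 0 else "B"
    return allocation


# Invented patient list, for illustration only.
patients = [{"id": i, "age": 20 + (i * 7) % 60} for i in range(30)]
allocation = stratified_randomise(patients, age_band, seed=1)
```

Adding a second stratifying characteristic (sex, say) only requires `stratum_of` to return a pair such as `(age_band(p), p["sex"])`. This shows directly how quickly the number of strata grows and how few subjects each one then contains.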
Double-blinding is not always possible, however. A dietary regimen or a surgical operation, for example, can seldom be disguised, and even with drugs, pharmacological effects may reveal to patients what they are taking and predispose them to report accordingly.10 In general, however, the use of a double-blind procedure, with precautions if necessary to disguise such clues as the taste or appearance of the two drugs, is an important principle.

Maintaining the blind can be problematic. In an attempt to determine whether melatonin is effective in countering jet lag, a pharmacologist selected a group of fellow pharmacologists attending a congress in Australia, providing them with unlabelled capsules of melatonin or placebo, with a jet lag questionnaire to fill in when they arrived. Many of them (one of the authors included), with analytical resources easily to hand, opened the capsules and consigned them to the bin on finding that they contained placebo. Pharmacologists are only human.

THE SIZE OF THE SAMPLE
Both ethical and financial considerations dictate that the trial should involve the minimum number of subjects, and much statistical thought has gone into the problem of deciding in advance how many subjects will be required to produce a useful result. The results of a trial cannot, by their nature, be absolutely conclusive. This is because it is based on a sample of patients, and there is always a chance that the sample was atypical of the population from which it came. Two types of erroneous conclusion are possible, referred to as type I and type II errors. A type I error occurs if a difference is found between A and B when none actually exists (false positive). A type II error occurs if no difference is found although A and B do actually differ (false negative). A major factor that determines the size of sample needed is the degree of certainty the investigator seeks in avoiding either type of error. The probability of incurring a type I error is expressed as the significance of the result. To say that A and B are different at the 0.05 level of significance means that the probability of obtaining a false positive result (i.e. incurring a type I error) is less than 1 in 20. For most purposes, this level of significance is considered acceptable as a basis for drawing conclusions.

The probability of avoiding a type II error (i.e. failing to detect a real difference between A and B) is termed the power of the trial. We tend to regard type II errors more leniently than type I errors, and trials are often designed with a power of 0.8–0.9. To increase the significance and the power of a trial requires more patients. The second factor that determines the sample size required is the magnitude of difference between A and B that is regarded as clinically significant. For example, to detect that a given treatment reduces the mortality in a certain condition by at least 10 percentage points, say from 50% (in the control group) to 40% (in the treated group), would require 850 subjects, assuming that we wanted to achieve a 0.05 level of significance and a power of 0.9. If we were content only to reveal a reduction by 20 percentage points (and very likely miss a reduction by 10 points), only 210 subjects would be needed. In this example, missing a real 10-point reduction in mortality could result in abandonment of a treatment that would save 100 lives for every 1000 patients treated, an extremely serious mistake from society’s point of view. This simple example emphasises the need to assess clinical benefit (which is often difficult to quantify) in parallel with statistical considerations (which are fairly straightforward) in planning trials.

▼ A trial may give a significant result before the planned number of patients have been enrolled, so it is common for interim analyses to be carried out (by an independent team so that the trial team remains unaware of the results). If this analysis gives a conclusive result, or if it shows that continuation is unlikely to give a conclusive result, the trial can be terminated, thus reducing the number of subjects tested. In one such large-scale trial (Beta-blocker Heart Attack Trial Research Group, 1982) of the value of long-term treatment with the β-adrenoceptor-blocking drug propranolol (Ch. 18) following heart attacks, the interim results showed a significant reduction in mortality, which led to the early termination of the trial. In sequential trials, the results are computed case by case (each case being paired with a control) as the trial proceeds, and the trial stopped as soon as a result (at a predetermined level of significance) is achieved. Various ‘hybrid’ trial designs, which have the advantage of sequential trials in minimising the number of patients needed but do not require strict pairing of subjects, have been devised (see Friedman et al., 1996).

CLINICAL OUTCOME MEASURES
The measurement of clinical outcome can be a complicated business, and is becoming increasingly so as society becomes more preoccupied with assessing the efficacy of therapeutic procedures in terms of improved quality of life, societal and economic benefit, rather than in terms of objective clinical effects, such as lowering of blood pressure, improved airways conductance or increased life expectancy. Various scales for assessing ‘health-related quality of life’ have been devised and tested (see Drummond et al., 1997; Walley & Haycocks, 1997), and the tendency is to combine these with measures of life expectancy to arrive at the measure ‘quality-adjusted life years’ (QALYs) as an overall measure of therapeutic efficacy, which attempts to combine both survival time and relief from suffering in assessing overall benefit.11

10 The distinction between a true pharmacological response and a beneficial clinical effect produced by the knowledge (based on the pharmacological effects that the drug produces) that an active drug is being administered is not easy to draw, and we should not expect a mere clinical trial to resolve such a fine semantic issue.

11 As may be imagined, trading off duration and quality of life raises issues about which many of us feel decidedly squeamish. Not so economists, however. They approach the problem by asking such questions as: ‘How many years of life would you be prepared to sacrifice in order to live the rest of your life free of the disability you are currently experiencing?’ Or, even more disturbingly: ‘If you could gamble on surviving free of disability for your normal lifespan, or (if you lose the gamble) dying immediately, what odds would you accept?’ Imagine being asked this by your doctor. ‘But I only wanted something for my sore throat’ you protest weakly.

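The dependence of sample size on significance, power and the size of difference sought (see ‘The size of the sample’) can be made concrete with a short calculation. The sketch below assumes the standard normal-approximation formula for comparing two proportions with equal group sizes; the text does not say which formula lies behind its figures of 850 and 210 subjects, so the function `subjects_per_group` and the default of a one-sided test are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(p_control, p_treated, alpha=0.05, power=0.9,
                       two_sided=False):
    """Approximate number of subjects needed in EACH group.

    Normal approximation for two independent proportions:
        n = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2
    """
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    difference = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


# Mortality reduced from 50% to 40% (10 percentage points).
total_10_points = 2 * subjects_per_group(0.50, 0.40)
# Mortality reduced from 50% to 30% (20 percentage points).
total_20_points = 2 * subjects_per_group(0.50, 0.30)
# The same 10-point difference under a two-sided test needs more subjects.
total_10_two_sided = 2 * subjects_per_group(0.50, 0.40, two_sided=True)
```

Under these assumptions the totals come out at roughly 840 and 200 subjects, the same order as the figures quoted in the text, and a two-sided test raises the first figure to a little over 1000. The point is the scaling rather than the exact numbers: halving the difference to be detected roughly quadruples the number of subjects required.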
In planning clinical trials, it is necessary to decide the purpose of the trial in advance, and to define the outcome measures accordingly.

FREQUENTIST AND BAYESIAN APPROACHES
▼ The conventional approach to analysis of scientific data (including clinical trials data) is known as ‘frequentist’ and is based on a null hypothesis, for example of the form treatment A is no more effective than treatment B. Rejection of the hypothesis implies that A is more effective than B. Suppose that a trial shows, on average, that patients treated with A live longer than patients treated with B. Conventional frequentist statistics addresses the question: If A were actually no more effective than B, what is the probability (P) of obtaining the results that were actually obtained in the trial? In other words, had we repeated the trial many times, how often, given that treatment A is no better than B, would we have obtained results suggesting that A is better? If this probability is low (say less than 0.05), we reject the null hypothesis and conclude that A is most likely better. If P is larger, the results could quite easily have been obtained without there being any true difference between A and B, and we cannot reject the null hypothesis. If we have no prior reason for thinking that A will be better than B, the frequentist approach is perfectly appropriate, and it is the usual principle on which trials of unknown drugs are based. But often, in real life, there will be good reason, based on previous trials or clinical experience, to believe that A is actually better than B. Using a Bayesian approach allows this to be taken into account formally and explicitly by defining a prior probability for the effect of A. The data from the new trial, which can be smaller than a conventional trial, are then statistically superimposed on the prior probability curve to produce a posterior probability curve, in effect an update of the prior probability curve that takes account of the new data. The Bayesian approach is controversial, depending as it does on expressing the often subjective prior assumption in explicit mathematical terms, and the statistical analysis is complex. Nevertheless, it can be argued that to ignore altogether prior knowledge and experience when interpreting new data is unjustified, and even unethical, and the Bayesian approach is consequently gaining acceptance, although most trials are still based on frequentist principles. The principles underlying Bayesian approaches, which are being increasingly applied to clinical trials, are described by Spiegelhalter et al. (1999) and Lilford & Braunholtz (2000).

PLACEBOS
▼ A placebo is a dummy medicine containing no active ingredient (or alternatively, a dummy surgical procedure, diet, or other kind of therapeutic intervention), which the patient believes is (or could be, in the context of a controlled trial) the real thing. The ‘placebo response’ is widely believed to be a powerful therapeutic effect, producing a significant beneficial effect in about one-third of patients. While many clinical trials include a placebo group that shows improvement, only a small minority have compared this group with untreated controls. A recent survey of these trial results (Hróbjartsson & Gøtzsche, 2001) showed that the placebo effect was generally insignificant, except in the case of pain relief, where it was small but significant. They concluded that the popular belief in the strength of the placebo effect is misplaced, and probably reflects in part the tendency of many symptoms to improve spontaneously and in part the reporting bias of patients who want to please their doctors. The ethical case for using placebos as therapy, which has been the subject of much public discussion, may therefore be weaker than has been argued. The risks of placebo therapies should not be underestimated. The use of active medicines may be delayed. The necessary element of deception risks undermining the confidence of patients in the integrity of doctors. A state of ‘therapy dependence’ may be produced in people who are not ill, because there is no way of assessing whether a patient still ‘needs’ the placebo.

META-ANALYSIS
▼ It is possible, by the use of statistical techniques, to combine the data obtained in several individual trials (provided each has been conducted according to a randomised design) in order to gain greater power and significance. This procedure, known as meta-analysis or overview analysis, can be very useful in arriving at a conclusion on the basis of several published trials, of which some claimed superiority of the test treatment over the control while others did not. As an objective procedure, it is certainly preferable to the ‘take your pick’ approach to conclusion forming adopted by most human beings when confronted with contradictory data. It has several drawbacks, however (see Naylor, 1997), the main one being ‘publication bias’, because negative studies are generally considered less interesting, and are therefore less likely to be published, than positive studies. Double counting, caused by the same data being incorporated into more than one trial report, is another problem.

THE ORGANISATION OF CLINICAL TRIALS
The organisation of large-scale clinical trials involving hundreds or thousands of patients at many different centres is a massive and expensive undertaking that makes up one of the major costs of developing a new drug, and can easily go wrong. One large trial (Anturane Reinfarction Trial Research Group, 1978) involved 1620 patients at 26 research centres in the USA and Canada, 98 collaborating researchers, and a formidable list of organising committees, including two independent audit committees to check that the work was being carried out in conformity with the strict protocols established. The conclusion was that the drug under test (sulfinpyrazone) reduced by almost one-half the mortality from repeat heart attacks in the 8-month period after a first attack, and could save many lives. The US Food and Drug Administration, however, refused to grant a licence for the use of the drug, criticising the trial as unreliable and biased in several respects. Their independent analysis of the data showed the beneficial effect of the drug to be slight and insignificant. Further analysis and further trials, however, supported the original conclusion, but by then the efficacy of aspirin in this condition had been established, so the use of sulfinpyrazone never found favour.

BALANCING BENEFIT AND RISK

THERAPEUTIC INDEX
Ehrlich recognised that a drug must be judged not only by its useful properties, but also by its toxic effects, and he expressed the therapeutic index of a drug in terms of the ratio between the average minimum effective dose and the average maximum tolerated dose in a group of subjects.

Therapeutic index = Maximum non-toxic dose / Minimum effective dose

Unfortunately, the variability between individuals is not taken into account in this definition. Even if for a particular subject there is a large margin between the maximum tolerated dose and minimum effective dose, individuals may vary widely in their sensitivity, so it is quite possible that the effective dose in some individuals will be toxic to others.

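A small numerical sketch may help to show why a ratio of averages hides individual variation. All the doses below are invented for illustration, and the helper names `therapeutic_index` and `doses_overlap` are hypothetical. Every subject has at least a threefold margin between minimum effective and maximum tolerated dose, and the index computed from the averages looks comfortable, yet the dose needed by the least sensitive subject would be toxic to the most sensitive one.

```python
from statistics import mean

# Hypothetical per-subject doses in mg, invented purely for illustration.
min_effective = [10, 12, 15, 20, 40]    # lowest dose giving the desired effect
max_tolerated = [30, 60, 80, 100, 120]  # highest dose without toxicity


def therapeutic_index(max_tol, min_eff):
    """Average maximum tolerated dose divided by average minimum effective dose."""
    return mean(max_tol) / mean(min_eff)


def doses_overlap(max_tol, min_eff):
    """True if the dose some subject needs exceeds what another can tolerate."""
    return max(min_eff) > min(max_tol)


index = therapeutic_index(max_tolerated, min_effective)
individual_margins = [t / e for t, e in zip(max_tolerated, min_effective)]
overlap = doses_overlap(max_tolerated, min_effective)
```

Here `index` is about 4 and no individual margin is below 3, but `overlap` is true: 40 mg is the minimum effective dose for one subject and exceeds the 30 mg tolerated by another.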
An often-quoted definition that aims to take into account individual variation is:

Therapeutic index = LD50/ED50

where LD50 is the dose that is lethal in 50% of the population, and ED50 is the dose that is ‘effective’ in 50%. Thus defined, therapeutic index is intended to indicate the margin of safety in use of a drug, by drawing attention to the relationship between the effective and toxic doses. For many reasons, though, it is not a useful guide to the safety of a drug in clinical use.
• LD50 does not reflect toxicity in the therapeutic setting, which produces unwanted effects but rarely death.
• ED50 is often not definable, because it depends on what measure of effectiveness is used. For example, the ED50 for aspirin used for a mild headache is much lower than for aspirin as an antirheumatic drug.
• Some very important forms of toxicity are idiosyncratic (i.e. only a small proportion of individuals are susceptible, see Ch. 53). In other cases, toxicity depends greatly on the clinical state of the patient. Thus propranolol is dangerous to an asthmatic patient in doses that are harmless to a normal individual. More generally, we can say that wide individual variation (see Ch. 52) in either the effective dose or the toxic dose of a drug makes it inherently less predictable, and therefore less safe, although this is not reflected in the therapeutic index.

In conclusion, therapeutic index is of little value as a measure of the clinical usefulness of a drug. It may have some relevance as a measure of the impunity with which an overdose may be given, but it has obvious limitations and is therefore very rarely quoted as a number. Thus one reason why the benzodiazepines replaced barbiturates as hypnotic drugs (see Ch. 37) is that their therapeutic index is much greater, and they are much less likely to kill when taken in accidental or deliberate overdose. Although therapeutic index expresses a valid general concept by emphasising the balance between risk and benefit, its pseudoquantitative precision is misleading, and it provides no measure of the usefulness of a drug. Ironically, thalidomide (probably the most harmful drug ever marketed) was promoted specifically on the basis of its exceptionally high therapeutic index.

Determination of risk and benefit
• Therapeutic index (lethal dose for 50% of the population divided by effective dose for 50%) provides a very crude measure of the safety of any drug as used in practice. Its main limitations are:
– it is based on animal toxicity data, which may not reflect forms of toxicity or adverse reactions that are important clinically
– it takes no account of idiosyncratic toxic reactions.
• More sophisticated measures of risk–benefit analysis for drugs in clinical use are coming into use, and include the number needed to treat (NNT) principle.

Clinical trials
• A clinical trial is a special type of bioassay done to compare the clinical efficacy of a new drug or procedure with that of a known drug or procedure (or a placebo).
• Generally, the aim is a straight comparison of unknown (A) with standard (B) at a single dose level. The result may be: ‘B better than A’, ‘B worse than A’, or ‘No difference detected’. Efficacy, not potency, is compared.
• To avoid bias, clinical trials should be:
– controlled (comparison of A with B, rather than study of A alone)
– randomised (assignment of subjects to A or B on a random basis)
– double-blind (neither subject nor assessor knows whether A or B is being used)
• Type I errors (concluding that A is better than B when the difference is actually due to chance) and type II errors (concluding that A is no better than B because a real difference has escaped detection) can occur; the likelihood of either kind of error gets less as the sample size and number of end-point events is increased.
• Interim analysis of data, carried out by an independent group, may be used as a basis for terminating a trial prematurely if the data are already conclusive, or if a clear result is unlikely to be reached.
• All experiments on human subjects require approval by an independent ethical committee.
• Clinical trials require very careful planning and execution, and are inevitably expensive.
• Clinical outcome measures may comprise:
– physiological measures (e.g. blood pressure, liver function tests)
– subjective assessments (e.g. pain relief)
– long-term outcome (e.g. survival)
– overall ‘quality of life’ measures
– ‘quality-adjusted life years’ (QALYs), which combine survival with quality of life
• Meta-analysis is a statistical technique used to pool the data from several independent trials.

OTHER MEASURES OF BENEFIT AND RISK
Alternative ways of quantifying the benefits and risks of drugs in clinical use have received much attention. One useful approach is to estimate from clinical trial data the proportion of test and control patients who will experience (a) a defined level of clinical benefit (e.g. survival beyond 2 years, pain relief to a certain predetermined level, slowing of cognitive decline by a given amount), and (b) adverse effects of defined degree.

These estimates of proportions of patients showing beneficial or harmful reactions can be expressed as number needed to treat (NNT, i.e. the number of patients who need to be treated in order for one to show the given effect, whether beneficial or adverse). For example, in a recent study of pain relief by antidepressant drugs compared with placebo, the findings were: for benefit (a defined level of pain relief), NNT = 3; for minor unwanted effects, NNT = 3; for major adverse effects, NNT = 22. Thus of 100 patients treated with the drug, on average 33 will experience pain relief, 33 will experience minor unwanted effects, and 4 or 5 will experience major adverse effects, information that is helpful in guiding therapeutic choices. One advantage of this type of analysis is that it can take into account the underlying disease severity in quantifying benefit. Thus if drug A halves the mortality of an often fatal disease (reducing it from 50% to 25%, say), the NNT to save one life is 4; if drug B halves the mortality of a rarely fatal disease (reducing it from 5% to 2.5%, say), the NNT to save one life is 40. Notwithstanding other considerations, drug A is judged to be more valuable than drug B, even though both reduce mortality by a half. Furthermore, the clinician must realise that to save one life with drug B, 40 patients must be exposed to a risk of adverse effects, whereas only 4 are exposed for each life saved with drug A.

REFERENCES AND FURTHER READING

General references
Colquhoun D 1971 Lectures on biostatistics. Oxford University Press, Oxford (Standard textbook)
Drummond M F, O’Brien B, Stoddart G I, Torrance G W 1997 Methods for the economic evaluation of health care programmes. Oxford University Press, Oxford (Includes good explanation of the principles of pharmacoeconomics)
Kirkwood B R, Sterne J A C 2003 Medical statistics, 2nd edn. Blackwell, Malden (Clear introductory textbook covering statistical principles and methods)
Lilford R J, Braunholtz D 2000 Who’s afraid of Thomas Bayes? J Epidemiol Community Health 54: 731–739 (Explains the principles of Bayesian analysis in a non-mathematical way)
Walley T, Haycocks A 1997 Pharmacoeconomics: basic concepts and terminology. Br J Clin Pharmacol 43: 343–348 (Useful introduction to analytical principles that are becoming increasingly important for therapeutic policy makers)
Yanagisawa M, Kurihara H, Kimura S et al. 1988 A novel potent vasoconstrictor peptide produced by vascular endothelial cells. Nature 332: 411–415 (The first paper describing endothelin: a remarkably full characterisation of an important new mediator)

Bioassay
Laska E M, Meisner M J 1987 Statistical methods and the applications of bioassay. Annu Rev Pharmacol 27: 385–397 (Useful references for those concerned with statistical principles of assay design and analysis)

Animal models
Beal M F 2001 Experimental models of Parkinson’s disease. Nat Rev Neurosci 2: 325–332 (Review of the various approaches to producing valid models for Parkinson’s disease, including transgenics; although the focus is on one disorder, the principles apply generally)
Ledent C, Vaugeois J-M, Schiffmann S N et al. 1997 Aggressiveness, hypoalgesia and high blood pressure in mice lacking the adenosine A2a receptor. Nature 388: 674–676 (Examples of the use of a transgenic model to study receptor function)
Maerki U, Haerri A 1996 Transgenic technology: principles. Int J Exp Pathol 77: 247–250 (Short review article)
Offermanns S, Hein L (eds) 2004 Transgenic models in pharmacology. Handb Exp Pharmacol 159 (A comprehensive series of review articles describing transgenic mouse models used to study different pharmacological mechanisms and disease states)
Plueck A 1996 Conditional mutagenesis in mice: the Cre/loxP recombination system. Int J Exp Pathol 77: 269–278 (An emerging technology for allowing genes to be switched on or off during the lifetime of an animal)
Polites H G 1996 Transgenic model applications to drug discovery. Int J Exp Pathol 77: 257–262 (Useful general review)
Rudolph U, Moehler H 1999 Genetically modified animals in pharmacological research: future trends. Eur J Pharmacol 375: 327–337 (Good review of uses of transgenic animals in pharmacological research, including application to disease models)
Törnell J, Snaith M 2002 Transgenic systems in drug discovery: from target identification to humanized mice. Drug Discov Today 7: 463–470
Yamada K, Nabeshima T 2000 Animal models of Alzheimer’s disease and evaluation of anti-dementia drugs. Pharmacol Ther 88: 93–113 (Good review of models of Alzheimer’s disease, including transgenics)

Clinical trials
Anturane Reinfarction Trial Research Group 1978 Sulfinpyrazone in the prevention of cardiac death after myocardial infarction. N Engl J Med 298: 289–295 (Example of a large-scale clinical trial)
Beta-blocker Heart Attack Trial Research Group 1982 A randomised trial of propranolol in patients with acute myocardial infarction. 1. Mortality results. JAMA 247: 1707–1714 (A trial that was terminated early when clear evidence of benefit emerged)
Friedman L M, Furberg C D, DeMets D L 1996 Fundamentals of clinical trials, 3rd edn. Mosby, St Louis (Standard textbook)
Hróbjartsson A, Gøtzsche P C 2001 Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. N Engl J Med 344: 1594–1601 (An important survey of clinical trial data, which shows, contrary to common belief, that placebos in general have no significant effect on clinical outcome, except, to a small degree, in pain relief trials)
Naylor C D 1997 Meta-analysis and the metaepidemiology of clinical research. Br Med J 315: 617–619 (Thoughtful review on the strengths and weaknesses of meta-analysis)
Sackett D L, Rosenberg W M C, Muir-Gray J A et al. 1996 Evidence-based medicine: what it is and what it isn’t. Br Med J 312: 71–72 (Balanced account of the value of evidence-based medicine, an important recent trend in medical thinking)
Spiegelhalter D J, Myles J P, Jones D R, Abrams K R 1999 An introduction to Bayesian methods in health technology assessment. Br Med J 319: 508–512 (Short non-mathematical explanation of the Bayesian approach to data analysis)

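As a worked check on the number-needed-to-treat figures given under ‘Other measures of benefit and risk’, the arithmetic can be written out as follows. NNT is the reciprocal of the absolute difference in event rate between control and treated groups. The function names are hypothetical, and the sketch covers only the examples quoted in the text.

```python
def number_needed_to_treat(rate_control, rate_treated):
    """NNT = 1 / absolute difference in event rate (control minus treated)."""
    difference = rate_control - rate_treated
    if difference <= 0:
        raise ValueError("treated rate must be lower than control rate")
    return 1 / difference


def affected_per_100(nnt):
    """Expected number showing the effect among 100 treated patients."""
    return 100 / nnt


# Drug A: mortality of an often fatal disease halved, 50% to 25%.
nnt_drug_a = number_needed_to_treat(0.50, 0.25)
# Drug B: mortality of a rarely fatal disease halved, 5% to 2.5%.
nnt_drug_b = number_needed_to_treat(0.05, 0.025)

# Pain relief study: NNT of 3 for benefit, 22 for major adverse effects.
relieved_per_100 = affected_per_100(3)
major_adverse_per_100 = affected_per_100(22)
```

Both drugs halve mortality, yet the NNT values differ tenfold (about 4 against about 40) because the absolute reduction differs tenfold. The per-100 figures reproduce the ‘33’ and ‘4 or 5’ patients quoted for the pain relief study.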