
Paper PO06

Randomization in Clinical Trial Studies


David Shen, WCI, Inc.
Zaizai Lu, AstraZeneca Pharmaceuticals

ABSTRACT

Randomization is of central importance in clinical trials. It prevents selection bias and
insures against accidental bias. It produces comparable groups and eliminates the source
of bias in treatment assignments. Finally, it permits the use of probability theory to
express the likelihood that chance is the source of the difference between outcomes. This
paper discusses four common randomization methods. SAS implementation of
randomization is provided with the RANUNI and RANNOR functions, PROC
SURVEYSELECT and PROC PLAN.

INTRODUCTION

A good clinical trial minimizes variability of the evaluation and provides an unbiased
evaluation of the intervention by avoiding confounding from other factors.
Randomization ensures that each patient has an equal chance of receiving any of the
treatments under study and generates comparable intervention groups that are alike in all
important aspects except for the intervention each group receives. It also provides a basis
for the statistical methods used in analyzing the data.

WHY RANDOMIZATION

The basic benefits of randomization are that it:

1. Eliminates selection bias.
2. Balances arms with respect to prognostic variables (known and unknown).
3. Forms the basis for statistical tests, in particular an assumption-free statistical test of the
equality of treatments.

In general, a randomized trial is an essential tool for testing the efficacy of a treatment.

CRITERIA FOR RANDOMIZATION

1. Unpredictability
• Each participant has the same chance of receiving any of the interventions.
• Allocation is carried out using a chance mechanism so that neither the participant
nor the investigator knows in advance which intervention will be assigned.
2. Balance
• Treatment groups are of similar size and constitution: they are alike in all
important aspects and differ only in the intervention each group receives.
3. Simplicity
• Easy for investigator/staff to implement

METHODS OF RANDOMIZATION

The common types of randomization include (1) simple, (2) block, (3) stratified and (4)
unequal randomization. Some other methods such as biased coin, minimization and
response-adaptive methods may be applied for specific purposes.

1. Simple Randomization
This method is equivalent to tossing a coin for each subject who enters a trial, e.g.
Heads = Active, Tails = Placebo. In practice a random number generator is used. The method is
simple and easy to implement, and treatment assignment is completely unpredictable.
However, it can produce imbalanced treatment assignment, especially in smaller trials.
Imbalanced randomization reduces statistical power. In a trial of 10 participants, the treatment
effect variance for a 5-5 split relative to a 7-3 split is (1/5+1/5)/(1/7+1/3) = 0.84, so a 7-3 split is
only 84% as efficient as a 5-5 split. Even if treatment is balanced at the end of a trial, it
may not be balanced at some time during the trial. For example, the trial may be balanced
at the end with 100 participants, but the first 10 assignments might be AAAATATATA. If the trial is
monitored during the process, we would like to have balance in the number of subjects on
each treatment over time.
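
To give a feel for how often simple randomization goes out of balance, the following sketch computes the exact probability that a two-arm trial with 1:1 coin-toss allocation ends with a 70:30 split or worse. The data set name IMBALANCE, the 70:30 threshold and the list of trial sizes are illustrative assumptions, not part of the original examples. For 10 subjects the probability works out to 352/1024, about 34%.

```sas
/* Probability of a 70:30 or more extreme split under simple (coin-toss) */
/* randomization. Threshold and trial sizes are illustrative assumptions. */
data imbalance;
  do n = 10, 20, 50, 100;
    /* largest size of the smaller arm that still counts as 70:30 or worse */
    k = floor(3 * n / 10);
    /* two-sided: either arm may be the small one */
    p_imbalance = 2 * cdf('binomial', k, 0.5, n);
    output;
  end;
run;

proc print data=imbalance noobs;
run;
```

The probability shrinks quickly as the number of subjects grows, which is why imbalance is mainly a concern in small trials and at early looks in larger ones.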

2. Block Randomization
Simple randomization does not guarantee balance in numbers during the trial. In particular, if
patient characteristics change with time (e.g. early patients are sicker than later ones), early
imbalances cannot be corrected. Block randomization is often used to address this issue.
The basic idea of block randomization is to divide potential patients into m blocks of size
2n and randomize each block such that n patients are allocated to A and n to B,
then choose the blocks randomly. This method ensures equal treatment allocation within
each block if the complete block is used.
Example: Two treatments, A and B, and a block size of 2 x 2 = 4
Possible treatment allocations within each block are
(1) AABB, (2) BBAA, (3) ABAB, (4) BABA, (5) ABBA, (6) BAAB
The block size depends on the number of treatments. It should be short enough to prevent
imbalance and long enough to prevent guessing of the allocation. The block size
should be at least 2 x the number of treatments (ref ICH E9). The block size is not stated in
the protocol, so the clinical staff and investigators are blind to the block size.
If blocking is not masked in open-label trials, the sequence becomes somewhat
predictable (e.g. 2n = 4):
B A B ?   The next assignment must be A.
A A ? ?   The next two assignments must be B B.
This could lead to selection bias. Two ways to avoid it are: (1) do not reveal the
blocking mechanism; (2) use random block sizes.
If treatment is double blinded, selection bias is not likely. Note that if only one block is
requested, the result is a single sequence of random assignments, i.e. simple
randomization.
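
The idea of random block sizes can be sketched in a DATA step. This is a minimal illustration and not the authors' program: the data set names BLOCKS and SCHEDULE, the seed, the number of blocks (5) and the candidate block sizes (2, 4 or 6) are all assumptions chosen for the example.

```sas
/* Permuted blocks with randomly chosen block sizes for treatments A and B. */
/* Seed, number of blocks and candidate sizes are illustrative assumptions. */
data blocks;
  seed = 2468;
  do block = 1 to 5;
    size = 2 * ceil(3 * ranuni(seed));    /* 2, 4 or 6, equally likely */
    do pos = 1 to size;
      if pos <= size / 2 then trt = 'A';  /* first half of the block: A */
      else trt = 'B';                     /* second half of the block: B */
      u = ranuni(seed);                   /* random key for shuffling */
      output;
    end;
  end;
  drop pos seed;
run;

proc sort data=blocks;
  by block u;                             /* random order within each block */
run;

data schedule;
  set blocks;
  subject = _n_;                          /* sequential enrollment number */
  drop u;
run;

proc freq data=schedule;
  tables block*trt / norow nocol nopercent;
run;
```

The PROC FREQ table should show equal numbers of A and B in every block, while the varying block sizes make the end of a block harder to guess.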

3. Stratified Randomization
Imbalance in the numbers of subjects reduces statistical power, but imbalance
in prognostic factors also makes the estimate of the treatment effect less efficient. A trial
may not be valid if it is not well balanced across prognostic factors.
For example, with 6 diabetics, there is a 22% chance of a 5-1 or 6-0 split if block
randomization alone is used. Stratified randomization achieves balance within
subgroups: use block randomization separately for diabetics and non-diabetics.
With several stratification factors, the strata are all combinations of their levels.
For example, Age Group: <=40, 41-60, >60; Sex: M, F
Total number of strata = 3 x 2 = 6
Stratification balances subjects on baseline covariates and tends to produce comparable
groups with regard to certain characteristics (e.g., gender, age, race, disease severity),
and thus produces valid statistical tests.
The block size should be relatively small to maintain balance in small strata. Increasing the
number of stratification variables or the number of levels within a variable leads to
fewer patients per stratum. Subjects should have baseline measurements taken before
randomization. Large clinical trials often do not need stratification, because a substantial imbalance
in subject characteristics is unlikely in a large randomized trial.
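
The 22% figure for 6 diabetics can be reproduced with a short calculation. The derivation below assumes that, without stratification, each diabetic is independently equally likely to be assigned to either arm. That binomial approximation is an assumption made here, since the paper does not say how the figure was obtained.

```latex
\[
P(\text{5-1 or 6-0 split})
  = 2\left[\binom{6}{5} + \binom{6}{6}\right]\left(\frac{1}{2}\right)^{6}
  = \frac{2\,(6 + 1)}{64}
  = \frac{14}{64}
  \approx 0.22
\]
```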

4. Unequal Randomization
Most randomized trials allocate equal numbers of patients to experimental and control
groups. This is the most statistically efficient randomization ratio, as it maximizes
statistical power for a given total sample size. However, it may not be the most
economically efficient choice, or the most feasible one ethically or practically. When two or more treatments
under evaluation differ in cost, it may be more economically efficient to
randomize fewer patients to the expensive treatment and more to the cheaper one.
Substantial cost savings can be achieved by adopting a modest unequal randomization ratio such
as 2:1, with only a modest loss in statistical power. Ethical considerations arise, for example in
oncology trials, when one treatment arm may save lives and the other, such as placebo or medical
care only, does little to save them, so that subject survival time depends on which treatment
is received. A more extreme allocation may be used in these trials to allocate fewer
patients to the placebo group. In general, a randomization ratio of 3:1 loses
considerable statistical power, and ratios more extreme than 3:1 are not very useful because they require a
much larger sample size.
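
The loss of efficiency for a given ratio can be worked out with the same variance argument used for the 5-5 versus 7-3 comparison under simple randomization. The sketch below assumes equal outcome variance in both arms and a fixed total sample size N split k:1, so that the arm sizes are n_1 = kN/(k+1) and n_2 = N/(k+1). The symbol RE(k) for relative efficiency is notation introduced here.

```latex
\[
\mathrm{RE}(k)
  = \frac{\dfrac{1}{N/2} + \dfrac{1}{N/2}}{\dfrac{1}{n_1} + \dfrac{1}{n_2}}
  = \frac{4/N}{(k+1)^2/(kN)}
  = \frac{4k}{(k+1)^2}
\]
\[
\mathrm{RE}(1) = 1, \qquad
\mathrm{RE}(2) = \frac{8}{9} \approx 0.89, \qquad
\mathrm{RE}(3) = \frac{12}{16} = 0.75, \qquad
\mathrm{RE}(4) = \frac{16}{25} = 0.64
\]
```

Under these assumptions a 2:1 ratio keeps about 89% of the efficiency of equal allocation, while 3:1 keeps only 75%.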

SAS IMPLEMENTATION

1. SAS Random Number Generators


SAS provides several functions that work as random number generators:
• RANUNI: generates random numbers between 0 and 1 which have a uniform
distribution.
• RANNOR: generates random numbers with a standard normal ~N(0, 1)
distribution.
• RANBIN: generates random numbers with a binomial distribution.
Random number generators are used in producing randomization schedules for clinical
trials or in carrying out simulation studies. Suppose subjects are to receive either a drug or a
placebo with equal probability. Let the variable GROUP represent the assignment:
Group = 'A' or Group = 'P'. RANUNI generates a random number R between 0 and 1. If R
is less than .5, the subject is assigned to Group = 'A'. If R is greater than or equal to .5, the
subject is assigned to Group = 'P'. The code that does this is the following:

data ONE;
  seed=123;
  do i=1 to 100;
    r = ranuni(seed);
    if r<.5 then group = 'A';
    else group = 'P';
    output;
  end;
run;

proc freq data=one;
  tables group;
run;

The SEED for the random number generator determines the starting value. The same
positive SEED in the program always generates the same results. However, if SEED is 0
or a negative number, SAS uses the current computer clock time as the seed, so the result
is different each time. The result is then
impossible to predict, but this is not generally recommended. You should select
a seed value so that you can reproduce the results with the same seed at a
later date. Otherwise you may have to wait for thousands of years to get the same result.
Note that in this example the treatment assignments are unbalanced according to
PROC FREQ: there are 56 assignments to placebo P and only 44 assignments to active
treatment A. This is not an unusual imbalance. The following code puts the same number of
subjects into each group by sorting on the random number and then assigning drug and placebo
along the random sequence.
data ONE;
  seed=123;
  do i=1 to 100;
    r = ranuni(seed);
    output;
  end;
run;

proc sort data=ONE;
  by r;
run;

data TWO;
  set ONE;
  if _n_ <=50 then group='A';
  else group='P';
run;

What if we want to split 100 subjects into more than 2 treatment groups? PROC RANK
can easily accomplish this.

proc rank data=ONE groups = 5 out=THREE;
  var r;
  ranks group;
run;

PROC RANK divides the values of the numeric variable R in data set ONE into groups
and creates the new data set THREE. The new variable GROUP created by PROC RANK
indicates the group to which each observation belongs. With the option GROUPS=N,
where N is the number of groups to create, GROUP takes the values 0 to N-1, so the
five treatment groups here are numbered 0 to 4.

RANNOR is another SAS random number generator. It produces random numbers which
have a normal distribution with mean 0 and standard deviation 1. RANNOR is used in
much the same way as RANUNI.
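
As a minimal sketch of that parallel, the standard normal deviate can be split at its median of 0 in the same way RANUNI was split at .5. The data set name FOUR and the variable Z are assumptions introduced for this illustration.

```sas
/* Two-group assignment with RANNOR: a standard normal deviate is below 0 */
/* with probability 0.5. Data set and variable names are illustrative.   */
data FOUR;
  seed = 123;
  do i = 1 to 100;
    z = rannor(seed);
    if z < 0 then group = 'A';
    else group = 'P';
    output;
  end;
run;

proc freq data=FOUR;
  tables group;
run;
```

As with RANUNI, the two groups are not guaranteed to be of equal size. Sorting on Z and assigning the first 50 subjects to one group would force balance.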

2. PROC SURVEYSELECT.
This procedure was originally designed for situations in which the data are very large but
the analysis is to be carried out on a relatively small random sample. The SURVEYSELECT procedure provides a variety of
methods for selecting probability-based random samples. It can select a simple random
sample or can sample according to a complex multistage sample design that includes
stratification and unequal randomization.
The following example performs simple randomization.
data ONE;
  do i=1 to 100;
    output;
  end;
run;

proc surveyselect data=ONE method=srs n=50 out=TWO;
run;

METHOD=SRS specifies simple random sampling, in which each subject has an
equal probability of selection and sampling is without replacement. The N=50 option specifies
the sample size. The OUT= option stores the sample data. If we define the subjects in TWO as
receiving active treatment, then the rest of the subjects in ONE will be treated with placebo. The
following information is displayed in the OUTPUT window and summarizes the sample selection.

The SURVEYSELECT Procedure

Selection Method          Simple Random Sampling
Input Data Set            ONE
Random Number Seed        56895
Sample Size               50
Selection Probability     0.5
Sampling Weight           2
Output Data Set           TWO

The random number seed is 56895. Since the SEED= option is not specified in the PROC
statement, the seed value is obtained from the time on the computer's clock. You can
specify SEED=56895 to reproduce this sample. It is recommended that a seed
be specified so that the sample can be replicated.
In the next example, dataset ONE has 100 subjects, of which 20 are male. We would like to
randomly split them into two treatment groups and also ensure that each group has an equal
number of males, i.e. 10 males in each group.
data ONE;
  do n=1 to 100;
    if n<=20 then sex='M';
    else sex='F';
    output;
  end;
run;

proc sort data=one;
  by sex;
run;

proc surveyselect data=one method=srs n=(40 10) out=TWO;
  strata sex;
run;

Stratification is added to the sampling. Random samples are selected independently
within the strata. N=(40 10) requests 40 subjects from the Female stratum and 10 subjects from
the Male stratum. PROC SURVEYSELECT requires that the input dataset be sorted by the STRATA
variables. PROC FREQ with TABLES SEX displays the sampling result, which is as
expected.

sex    Frequency    Percent
F         40         80.00
M         10         20.00

Alternatively, the N= option can be replaced by RATE=(0.5, 0.5). RATE is the proportion
of observations to select from each stratum, 50% from Female and 50% from Male in this
example. The rate can be adjusted for unequal randomization. The following
randomization selects 25 subjects. If they are put into the placebo group, the rest
of the subjects will be in the active treatment group. The randomization ratio is 1:3, and it is
also stratified by SEX.

proc surveyselect data=one method=srs rate=(0.25, 0.25) out=THREE;
  strata sex;
run;
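
Because OUT= holds only the selected subjects, the treatment label for everyone else has to be derived separately. One possible way, sketched here, is the OUTALL option, which keeps all input observations and adds a variable named Selected (1 for sampled, 0 otherwise). The seed value, the data set name ALL and the GROUP labels are assumptions made for this illustration.

```sas
/* Stratified 1:3 allocation with every subject kept in one data set.  */
/* Seed, data set name ALL and the GROUP labels are illustrative.      */
data one;
  do n = 1 to 100;
    if n <= 20 then sex = 'M';
    else sex = 'F';
    output;
  end;
run;

proc sort data=one;
  by sex;
run;

proc surveyselect data=one method=srs rate=(0.25, 0.25)
                  seed=13579 outall out=all;
  strata sex;
run;

data all;
  set all;
  length group $7;
  if Selected = 1 then group = 'Placebo';  /* sampled subjects */
  else group = 'Active';                   /* everyone else */
run;

proc freq data=all;
  tables sex*group / norow nocol nopercent;
run;
```

The cross-tabulation should show a quarter of each sex in the placebo group, in line with the 25 selected subjects described above.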

3. PROC PLAN.
The PLAN procedure is designed specifically for more complex designs and
randomization plans such as factorial, nested and crossed experiments, and Latin square
designs. It can also be used in many basic randomization designs. The syntax is
somewhat tricky, so care should be taken when using the procedure.
The first example is the simple randomization to divide 12 subjects into 3 treatments.
proc plan;
  factors Subject=12;
  treatments Group=12 cyclic (1 1 1 1 2 2 2 2 3 3 3 3);
  output out=ONE;
run; quit;

Simple Randomization with 3 Levels of Treatments

Subject Group
1 3
2 2
3 3
4 1
5 1
6 3
7 2
8 3
9 2
10 1
11 2
12 1
Once again, a SEED should be specified. Otherwise SAS generates its own seed and
displays it in the LOG, for example: At the start of processing, random number seed=37786.

Our next example is about the block randomization design for 12 subjects: 2 treatments
of A & B, block size 2x2=4 and 12/4 =3 blocks.
PROC PLAN SEED=12345678;
  FACTORS Block=3 random Size=4 random;
  OUTPUT out=C Size cvals=('A' 'A' 'B' 'B');
RUN;

It can be seen that the two treatments are always balanced within each block.

Block Randomization Design With 3 Blocks of Size 4, Treatments of A & B


Obs Block Size
1 1 B
2 1 A
3 1 B
4 1 A
5 2 A
6 2 B
7 2 B
8 2 A
9 3 B
10 3 B
11 3 A
12 3 A

CONCLUSION

Randomization in clinical trials is convenient with the power of SAS. The randomization
numbers generated will be stored in the central computer center (CORE) or put into
sealed envelopes (opaque, not resealable). Each subject must have a unique identification
number and keep that number throughout the study. Subjects should be determined to be
eligible by uniform and clear eligibility criteria and should have signed the informed consent form (ICF) before
randomization. The subject's randomization number can be obtained by calling the
randomization center through IVRS or by accessing the web-based central randomization
system.

CONTACT INFORMATION

Zaizai Lu
zz_lu@hotmail.com
AstraZeneca Pharmaceuticals
Wilmington, Delaware

SAS and all other SAS Institute Inc. product or service names are registered trademarks
or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA
registration. Other brand and product names are registered trademarks or trademarks of
their respective companies.
