Paper 265-25
Sample Size Computations and Power Analysis with the SAS® System
John M. Castelloe, SAS Institute Inc., Cary, NC
Abstract
Statistical power analysis characterizes the ability of a study to detect a meaningful effect size, for example the difference between two population means. It also determines the sample size required to provide a desired power for an effect of scientific interest. Proper planning reduces the risk of conducting a study that will not produce useful results and determines the most sensitive design for the resources available. Power analysis is now integral to the health and behavioral sciences, and its use is steadily increasing wherever empirical studies are performed. SAS Institute is working to implement power analysis for common situations such as t-tests, ANOVA, comparison of binomial proportions, equivalence testing, survival analysis, contingency tables and linear models, and eventually for a wide range of models and designs. An effective graphical user interface reveals the contribution to power of factors such as effect size, sample size, inherent variability, type I error rate, and choice of design and analysis. This presentation demonstrates issues involved in power analysis, summarizes the current state of methodology and software, and outlines future directions.
Introduction
Suppose you have performed a small study and are disappointed to find that the results are unexpectedly insignificant. Where did you go wrong? You may need to do a larger study to detect the effect you suspect, but how much larger?

Alternatively, suppose you have performed a large study and found a hugely significant effect. In follow-up studies, can you make more efficient use of resources by using smaller sample sizes?

Power analysis can optimize the resource usage and design of a study, improving chances of conclusive results with maximum efficiency. Power analysis is most effective when performed at the study planning stage, and as such it encourages early collaboration between researcher and statistician. It also focuses attention on effect sizes and variability in the underlying scientific process, concepts that both researcher and statistician should consider carefully at this stage. Muller and Benignus (1992) and O'Brien and Muller (1993) provide cogent discussions of these and related concepts. These references also provide a good general introduction to power analysis.

There are many factors involved in a power analysis, such as the research objective, design, data analysis method, power, sample size, type I error, variability, and effect size. By performing a power analysis, you can learn about the relationships between these factors, optimizing those that are under your control and exploring the implications of those that are fixed.

For the purposes of statistical testing, the research objective is usually to use a feasible sample of data to assess a given hypothesis, H1, that some effect exists in a much larger population of potential data. If the sample data lead you to conclude that H1 is true, but the opposite is really the case (that is, if the null hypothesis H0 is true and there really is no effect), this is called a type I error. The probability of a type I error is usually designated "alpha" or α, and statistical tests are designed to ensure that α is suitably small (for example, less than 0.05). But it is also important to control the probability β of making the opposite (type II) error, that is, concluding H0, that there is no effect, when there really is one. The probability 1 − β of correctly rejecting H0 when it is false is traditionally called the power of the test. (Note, however, that another more technical definition of power is the probability of rejecting H0 for any given set of circumstances, even those corresponding to H0 being true.)

Power analysis is often problematic in practice, being performed infrequently or improperly. There are several reasons for this: it is technically complicated, usually under-represented in statistical curricula, and often not performed early enough to be effective (that is, in the study planning stage). Good software tools for power analysis can alleviate these difficulties and help you to benefit from these techniques.
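The relationships among effect size, sample size, α, and power described above can also be explored by direct simulation. The following sketch (in Python with NumPy and SciPy; it is not part of the paper, which uses SAS throughout) estimates power as the rejection rate across many simulated two-sample studies; all parameter values are illustrative:

```python
# Monte Carlo illustration of statistical power for a two-sided,
# pooled-variance two-sample t-test with alpha = 0.05.
import numpy as np
from scipy import stats

def simulated_power(n_per_group, effect_size, sd, alpha=0.05, reps=2000, seed=1):
    """Estimate power as the fraction of simulated studies that reject H0."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        a = rng.normal(0.0, sd, n_per_group)
        b = rng.normal(effect_size, sd, n_per_group)
        _, p = stats.ttest_ind(a, b)   # pooled-variance two-sample t-test
        rejections += (p < alpha)
    return rejections / reps

# A larger effect or a larger sample raises power:
print(simulated_power(25, 12, 15))    # roughly 0.8
print(simulated_power(25, 6, 15))     # much lower for a halved effect size
print(simulated_power(100, 6, 15))    # restored by a larger sample
```

The estimates fluctuate by a percentage point or two from run to run, which is exactly the distinction between the long-run probability called power and any single study's outcome.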
Some Power Analysis Scenarios
There are several different varieties of power analysis. Here are a few simple scenarios:

• A statistician in a manufacturing company is reviewing a proposed experiment designed to assess the effect of various operating conditions on the quality of a product. He would like to conduct a power analysis to see if the planned number of replications and experimental units will be sufficient to detect the postulated effects.

• An advertising executive is interested in studying several alternative marketing strategies, with the aim of deciding how many and which strategies to implement. She would like to get a ballpark idea of how many mailings are necessary to detect differences in response rates.

• A study performed by a behavioral scientist (without a prior power analysis) did not detect a significant difference between success rates of two alternative therapies. He is considering repeating the study, but would first like to assess the power of the first study to detect the minimal effect size in which he is interested. A finding of low power would encourage him to repeat the study with a larger sample size or more efficient design.

Perhaps the most basic distinction in power analysis is that between prospective and retrospective analyses. In the examples above, the first two are prospective, while the third is retrospective. A prospective power analysis looks ahead to a future study, while a retrospective power analysis attempts to characterize a completed study. Sometimes the distinction is a bit fuzzy: for example, a retrospective analysis for a recently completed study can become a prospective analysis if it leads to the planning of a new study to address the same research objectives with improved resource allocation.

Although a retrospective analysis is the most convenient kind of power analysis to perform, it is often uninformative or misleading, especially when power is computed for the observed effect size. See the section "Effective Power Analysis" for more details.

Power analyses can also be characterized by the factor(s) of primary interest. For example, you might want to estimate power, determine required sample size, or assess detectable effect sizes. Sometimes the research goal involves the largest acceptable confidence interval width instead of the significance of a hypothesis test; in this case, rather than considering the criterion of power, you would focus on the probability of achieving the desired confidence interval precision. There are also Bayesian approaches to sample size determination for estimating parameters or maximizing utility functions.
Example: Prospective Analysis for a Clinical Trial
The purpose of this example is to introduce some of the issues involved in power analysis and to demonstrate the use of some simple SAS® software tools for addressing them. Let's say you are a clinical researcher wanting to compare the effect of two drugs, A and B, on systolic blood pressure (SBP). You have enough resources to recruit 25 subjects for each drug. Will this be enough to ensure a reasonable chance of establishing a significant result if the mean SBPs of patients on each drug really differ? In other words, will your study have good power? The answer depends on many factors:

• How big is the underlying effect size that you are trying to detect? That is, what is the population difference in mean SBP between patients using drug A and patients using drug B? Of course, this is unknown; that is why you are doing the study! But you can make an educated guess or set a goal for the detectable effect size. Then the power analysis determines the chance of detecting this conjectured effect size. For example, suppose you have some results from a previous study involving drug A, and you believe that the mean SBP for drug B differs by about 10% from the mean SBP for drug A. If the mean SBP for drug A is 120, you thus posit an effect size of 12.

• What is the inherent variability in SBP? Suppose previous studies involving drug A have shown the standard deviation of SBP to be between 11 and 15, and that the standard deviations are expected to be roughly the same for the two drugs. You want to consider this range of variability in your power analysis.

• What data analysis method and level of type I error should you use? You decide to use the simple approach of a two-sample t-test (assuming equal variances) with α = 0.05. To be conservative you use a two-sided test, although you suspect the mean SBP for drug B is higher.

With these specifications, the power can be computed using the noncentral F distribution. The following SAS statements compute this power for the standard deviation of 15:
data twosample;
   Mu1=120; Mu2=132; StDev=15;
   N1=25; N2=25; Alpha=0.05;
   NCP = (Mu2-Mu1)**2 / ((StDev**2)*(1/N1 + 1/N2));
   CriticalValue = FINV(1-Alpha, 1, N1+N2-2, 0);
   Power = SDF('F', CriticalValue, 1, N1+N2-2, NCP);
run;

proc print data=twosample;
run;
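For readers without SAS at hand, the same computation can be reproduced in Python (a sketch assuming SciPy is available; scipy.stats.f.ppf and scipy.stats.ncf.sf play the roles of FINV and SDF):

```python
# Cross-check of the two-sample power computation: critical value from the
# central F distribution, power from the noncentral F survival function.
from scipy.stats import f, ncf

mu1, mu2, sd = 120.0, 132.0, 15.0
n1 = n2 = 25
alpha = 0.05

# Noncentrality parameter: (12**2) / (225 * (2/25)) = 8
ncp = (mu2 - mu1) ** 2 / (sd ** 2 * (1 / n1 + 1 / n2))
crit = f.ppf(1 - alpha, 1, n1 + n2 - 2)     # critical F value
power = ncf.sf(crit, 1, n1 + n2 - 2, ncp)   # P(noncentral F > crit)

print(round(power, 2))   # about 0.79
```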
The noncentrality parameter NCP is calculated from the conjectured means Mu1 and Mu2, sample sizes N1 and N2, and common standard deviation StDev. The critical value of the test statistic is then computed, and the power is the probability of a noncentral-F random variable with noncentrality parameter NCP, one numerator degree of freedom, and N1 + N2 − 2 denominator degrees of freedom exceeding this critical value. This probability is computed using the DATA step function SDF, which calculates survival distribution function values. In general, SDF = 1 − CDF; the SDF form is more accurate for values in the upper tail. The CDF and SDF functions, introduced in Release 6.11 and 6.12 of the SAS System, respectively, are documented in SAS Institute Inc. (1999b). Their use is recommended for applications requiring their enhanced numerical accuracy.

The resulting power is about 79%. If you would really like a power of 85% or more when the standard deviation is 15, then you will need more subjects. How many? One way to investigate required sample size is to construct a power curve, as shown in Figure 1. This curve was generated using the Sample Size task in the Analyst Application. Note that a sample size of 30 for each group would be sufficient to achieve 85% power.

Figure 1. Power Curve for Two-Sample t-test

Now suppose that a colleague brings to your attention the possibility of using a simple AB/BA cross-over design. Half of the subjects would get 6 weeks on drug A, a 4-week washout period, and then 6 weeks on drug B; the other half would follow the same pattern but with drug order switched. Assuming there are no period or carry-over effects, you can use a paired t-test to assess the difference between the two drugs. Each pair consists of the SBP for a patient while using drug A and the SBP for that same patient while using drug B. Suppose previous studies have shown that there is correlation of roughly ρ = 0.8 between pairs of SBP measurements for each subject. What would the power for the study be if you use this cross-over design with 25 subjects? You simply need to calculate the standard deviation of a pair difference, which is given by

   σ_D = sqrt(σ_1² + σ_2² − 2ρσ_1σ_2)

where σ_1 and σ_2 are the standard deviations for the two drug types (assumed to be equal in this case). The resulting values are σ_D = 6.96 when σ_1 = σ_2 = 11, and σ_D = 9.49 when σ_1 = σ_2 = 15. The following SAS statements compute the power for the larger standard deviation:
data paired;
   Mu1=120; Mu2=132; StDev1=15; StDev2=15;
   Corr=0.8; N=25; Alpha=0.05;
   StDevDiff = sqrt(StDev1**2 + StDev2**2 - 2*Corr*StDev1*StDev2);
   NCP = (Mu2-Mu1)**2 / (StDevDiff**2/N);
   CriticalValue = FINV(1-Alpha, 1, N-1, 0);
   Power = SDF('F', CriticalValue, 1, N-1, NCP);
run;

proc print data=paired;
run;
The resulting power is over 99% with 25 subjects. A power curve generated using the Analyst Application, displayed in Figure 2, reveals that 85% power
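The pair-difference formula and the paired power result can be cross-checked the same way (again a Python sketch assuming SciPy; it is not part of the original paper):

```python
# Cross-check of the cross-over (paired) power computation for the
# larger standard deviation, sd = 15, with correlation 0.8.
from math import sqrt
from scipy.stats import f, ncf

mu1, mu2 = 120.0, 132.0
sd1 = sd2 = 15.0
corr, n, alpha = 0.8, 25, 0.05

# Standard deviation of a within-subject pair difference:
# sqrt(225 + 225 - 2*0.8*225) = sqrt(90)
sd_diff = sqrt(sd1**2 + sd2**2 - 2 * corr * sd1 * sd2)

ncp = (mu2 - mu1) ** 2 / (sd_diff ** 2 / n)   # 144 / 3.6 = 40
crit = f.ppf(1 - alpha, 1, n - 1)
power = ncf.sf(crit, 1, n - 1, ncp)

print(round(sd_diff, 2))   # 9.49, matching the value in the text
print(power > 0.99)        # True: over 99%, matching the SAS result
```

The high correlation between the paired measurements shrinks the standard deviation of the difference, which is why the cross-over design achieves far greater power than the two-sample design with the same 25 subjects.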
