Example 1: Clinical Trial (Confirmatory Study)

Introduction

Objective
The study was conducted to determine the effectiveness of (new treatment regime) compared to (the current regime) in treating oral cancer patients.

Methods

Design
A randomized, active-controlled clinical trial was conducted in which the outcome, i.e. survival, was compared between two study groups. The experimental group was given (proposed new treatment regime) and the control group was given (current treatment regime).
…………….
…………….

Sample size determination
Sample size was determined using PS software version 2.1.31 (Dupont and Plummer, 1997). According to the software, we needed 154 oral cancer patients in each study group (experimental and control) to detect a hazard ratio of 1.5 (hazard ratio of control relative to experimental group) with power 80% and alpha 0.05. For this calculation, we estimated the median survival time for the control group as 50 months (reference?) and the recruitment and follow-up periods as 12 and 36 months respectively. Finally, we decided to take 133 patients for each group, anticipating 10% loss to follow-up.

Statistical analysis
The study sample was described for demographic characteristics using descriptive statistics. The comparison of survival between the two study groups was done using the Kaplan-Meier method and Cox regression analysis. The Kaplan-Meier method was used to obtain the median survival time and the three- and five-year survival rates for each study group, and to compare the survival of the two groups using the univariable log-rank test.

Although randomization was done, there were differences in age, sex, and cancer stage distribution between the two study groups. Therefore, to control for these possible confounding effects, we used multiple Cox regression analysis as follows.

Firstly, a preliminary main effect model was fitted with the grouping variable (study group) and the controlled variables (age, sex, and stage of cancer) as independent variables. At this stage, the linearity of the age variable with the log relative hazard was checked by replacing the numerical covariate with design variables (Hosmer and Lemeshow, 1999, p. 160). As it was not linear in multiple Cox regression, the categorical form of the age variable was used.

Secondly, the model was evaluated for a possible multicollinearity (MC) problem by obtaining variance inflation factors (VIF) for the independent variables using multiple linear regression. As all VIFs were small (maximum 1.5), we considered there was no MC problem.

Thirdly, the possible two-way interactions between the grouping variable and each controlled variable were tested by adding one interaction term at a time in Cox regression. If an interaction term was statistically significant in the model (P<0.05), the model was continued with the interaction term.

Finally, the assumption of proportional hazards (PH) for each independent variable was checked using a residual plot (Schoenfeld residuals of a particular independent variable versus the survival time variable). If the residuals were randomly scattered along the zero line without any trend or pattern, the PH assumption was considered to be met. The possible outliers were identified by residual plots (each DfBeta versus survival time). If a data point was outside the main stream of the majority of data points, it was considered an outlier. Then the effect of the outlier on the result (regression coefficients) was evaluated by calculating the percentage change in the coefficients when the outlier was included. If the change was more than 20%, we addressed the issue of the outlier in the interpretation of results.

We used SPSS version 12.0 (SPSS Inc., 2003) for all analyses. Statistical significance was considered if P<0.05. Two-sided tests were used in all hypothesis testing.

Reference
Dupont WD, Plummer WD (1997). PS power and sample size program available for free on the Internet. Controlled Clin Trials, 18:274.
Hosmer DW, Lemeshow S (1999). Applied Survival Analysis: Regression Modeling of Time to Event Data. John Wiley & Sons, Inc.: New York.
SPSS Inc. (2003). SPSS for Windows, version 12.0.1. SPSS Inc.: Chicago.
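The outlier check described in Example 1 (percentage change in a regression coefficient, with a 20% cut-off) can be sketched in a few lines of Python. This is only an illustration: the function names and the coefficient values are hypothetical, and it assumes that the coefficient estimated without the outlier is taken as the baseline for the percentage.

```python
def percent_change(coef_without, coef_with):
    """Percentage change in a coefficient when the outlier is included.

    Assumption: the estimate obtained WITHOUT the outlier is the baseline.
    """
    if coef_without == 0:
        raise ValueError("baseline coefficient must be non-zero")
    return abs(coef_with - coef_without) / abs(coef_without) * 100.0


def is_influential(coef_without, coef_with, threshold=20.0):
    """True when the change exceeds the cut-off (20% in Example 1)."""
    return percent_change(coef_without, coef_with) > threshold


# Hypothetical Cox regression coefficients (log hazard ratios), for illustration only.
b_without_outlier = 0.40
b_with_outlier = 0.50

change = percent_change(b_without_outlier, b_with_outlier)
print(f"Change in coefficient: {change:.1f}%")
print("Address in interpretation:", is_influential(b_without_outlier, b_with_outlier))
```

In practice the two coefficients would come from fitting the same Cox model twice, once with and once without the suspected data point.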

How to write in Methods

Example 2: Exploratory Study

Introduction

Objective
The study was conducted to identify prognostic factors of oral cancer in our setting.

Methods

Design
A prospective study was conducted for a period of five years. Patients were recruited during the first year and were followed up for the later four years.
…………….
…………….

Sample size determination
Sample size was determined using PS software version 2.1.31 (Dupont and Plummer, 1997). Among several potential prognostic factors, calculating the sample size with the smoking variable gave the biggest sample size. We needed 138 smokers and 207 non-smokers (a total of 345) to detect a hazard ratio of 1.5 (hazard ratio of smokers relative to non-smokers) with power 80% and alpha 0.05. For this calculation, we estimated the median survival time for the control group as 50 months (reference?) and the recruitment and follow-up periods as 1 year and 4 years respectively. The ratio of sample sizes between non-smokers and smokers was estimated as 1.5:1 (40% smoking prevalence, reference?). Finally, considering 10% loss to follow-up, we decided to recruit a minimum of 384 oral cancer patients.

Statistical analysis
The study sample was described for demographic characteristics using descriptive statistics. The median survival time, the three- and five-year survival rates, and the overall survival curve for the study sample were described using the Kaplan-Meier method. The prognostic factors were identified using multiple Cox regression analysis.

Firstly, simple Cox regression analysis was done for each independent variable (prognostic factors in question).

Secondly, a preliminary main effect model was determined by using forward and backward LR stepwise variable selection methods, using P<0.05 as entry criteria and P>0.1 as removal criteria. Two methods were used as they might select different sets of variables. Variables selected by each method were tested using the LR test, i.e. comparing the model with all selected variables (full model) and the model with all selected variables except the variable being tested (reduced model). Variables excluded by the two selection methods were also confirmed by adding them into the model and testing with the LR test. If the LR test was not significant, they were considered truly non-significant variables. In this way, the preliminary main effect model was determined. At this stage, the linearity of each numerical independent variable with the log relative hazard was checked using the quartile design method (Hosmer and Lemeshow, 1999). If it was not linear, the categorical form of the variable was used. Next, the model was evaluated for a possible multicollinearity (MC) problem by obtaining variance inflation factors (VIF) for the independent variables using multiple linear regression analysis. As all VIFs were small (maximum 1.5), we considered there was no MC problem.

Thirdly, the possible two-way interactions between the independent variables were tested by adding one interaction term at a time in Cox regression using the LR test. If an interaction term was statistically significant in the model (P<0.05), the procedure was continued with the interaction term.

Finally, the assumption of proportional hazards (PH) for each independent variable was checked using a residual plot (Schoenfeld residuals of a particular independent variable versus the survival time variable). If the residuals were randomly scattered along the zero line without any trend or pattern, the PH assumption was considered to be met. The possible outliers were identified by residual plots (each DfBeta versus survival time). If a data point was outside the main stream of the majority of data points, it was considered an outlier. Then the effect of the outlier on the result (regression coefficients) was evaluated by calculating the percentage change in the coefficients when the outlier was included. If the change was more than 20%, we addressed the issue of the outlier in the interpretation of results.

We used SPSS version 12.0 (SPSS Inc., 2003) for all analyses. Statistical significance was considered if P<0.05. Two-sided tests were used in all hypothesis testing.

Reference
As in Example 1.
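The adjustment for loss to follow-up in Example 2 (345 required, 384 recruited) can be reproduced with a short Python sketch. The function name is hypothetical, and the formula shown, dividing the required size by one minus the expected loss rate and rounding up, is an assumption that happens to reproduce the figure quoted in the example.

```python
import math


def inflate_for_loss(n_required, loss_rate):
    """Number to recruit so that n_required remain after the expected loss.

    Assumption: recruit n_required / (1 - loss_rate), rounded up.
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    return math.ceil(n_required / (1.0 - loss_rate))


# Figures quoted in Example 2.
n_smokers = 138
n_nonsmokers = 207
n_required = n_smokers + n_nonsmokers  # total before allowing for loss

n_to_recruit = inflate_for_loss(n_required, 0.10)
print(f"Required: {n_required}, to recruit: {n_to_recruit}")
```

Multiplying by 1.1 instead would give a smaller number (380 after rounding up), so the choice of formula should be stated when reporting the calculation.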

CAUTION: In PS software, 'R' and 'm' need special attention.

'm' is the ratio of the sample size of control to experimental. In our case, we consider non-smokers as control and smokers as experimental (although it is not truly experimental). As the prevalence of smokers was, e.g., 40% in the literature, the ratio of non-smokers versus smokers, 60:40, gives 1.5:1. Therefore, we set (m = 1.5). The sample size shown by the software (138) is the size for the experimental group (smokers). Therefore, the size for the control group (non-smokers) will be 1.5 times 138 (207), and therefore a total of 345 was required.

'R' also needs proper attention. It is the hazard ratio of control relative to experimental (you can click on 'R' and see how PS defines 'R'). Normally, in a true experimental study, it is expected that the experimental group has the lesser hazard, and therefore 'R' is expected to be >1. However, in practice, it can go the other way. In our example, smokers (experimental group) are at higher risk of dying (higher hazard) and a hazard ratio (HR) of 1.5 is aimed to be detected. This HR of 1.5 is the hazard/risk in the experimental group relative to the hazard/risk in the control group. This is the inverse of 'R' used in PS software. Therefore, we have to invert it (1/1.5 = 0.67) to suit the definition used in PS software.

http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
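The conversions described in the CAUTION note can be checked with a small Python sketch. It does not call PS software; it only reproduces the arithmetic for 'm' and 'R' from the figures in Example 2, and the function names are hypothetical.

```python
def ps_m(prevalence_experimental):
    """'m' as described in the note: sample size of control / experimental."""
    return (1.0 - prevalence_experimental) / prevalence_experimental


def ps_r(hr_experimental_vs_control):
    """'R' as described in the note: hazard of control relative to experimental.

    The study's HR is experimental relative to control, so it is inverted.
    """
    return 1.0 / hr_experimental_vs_control


m = ps_m(0.40)  # 40% smokers (treated as the "experimental" group)
R = ps_r(1.5)   # HR of smokers relative to non-smokers is 1.5

n_experimental = 138                   # size reported by PS for smokers (from the example)
n_control = round(m * n_experimental)  # non-smokers
total = n_experimental + n_control

print(f"m = {m:.2f}, R = {R:.2f}")
print(f"smokers = {n_experimental}, non-smokers = {n_control}, total = {total}")
```

Writing the two conversions as separate functions makes it harder to enter the hazard ratio in the wrong direction by accident.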

