N, Type of Independent Variables |
Types of Analyses/Test(s) |
Power Analysis Example |
|
One dependent variable |
|||
0 IVs (1 group/population) |
interval & normal |
one-sample t-test |
|
ordinal or interval |
one-sample median |
||
categorical (2 categories) |
binomial test |
||
categorical |
chi-square goodness-of-fit |
||
1 IV with 2 levels (independent groups) |
interval & normal |
2 independent sample t-test |
|
ordinal or interval |
2 independent sample t-test |
||
Wilcoxon-Mann-Whitney test |
|||
categorical |
chi-square test |
||
Fisher's exact test |
|||
1 IV with 2 or more levels (independent groups) |
interval & normal |
one-way ANOVA |
|
ordinal or interval |
Kruskal-Wallis |
||
categorical |
chi-square test |
||
1 IV with 2 levels (dependent / matched groups) |
interval & normal |
paired t-test |
|
ordinal or interval |
Wilcoxon signed ranks test |
||
categorical |
McNemar |
||
1 IV with 2 or more levels (dependent / matched groups) |
interval & normal |
repeated measures ANOVA |
|
ordinal or interval |
Friedman test |
||
categorical |
GEE (e.g. repeated measures logistic regression) |
||
2 or more IVs (independent groups) |
interval & normal |
multi-factor ANOVA, general linear model (GLM), multilevel/hierarchical model (MLM/HLM), structural equation model (SEM) |
|
ordinal or interval |
(treat DV as interval/normal or create categories, then use analyses above or below) |
||
categorical |
logistic regression |
||
2 or more IVs (dependent groups / repeated measures) |
interval & normal |
GLM, MLM/HLM, growth models, growth mixture models |
|
categorical |
generalized estimating equations (GEE), MLM/HLM, growth/mixture models |
||
1 interval IV |
interval & normal |
correlation |
|
1 or more interval IVs and/or |
ordinal or interval |
linear/non-linear regression |
|
non-parametric correlation |
|||
categorical |
logistic regression |
||
interval & normal |
multiple regression |
||
1 or more interval IVs and/or |
categorical |
analysis of covariance, GLM, MLM/HLM |
|
logistic regression, GEE |
|||
categorical, dichotomized (yes vs. no) |
discriminant analysis |
||
Survival (time to event) analysis |
|||
1 IV with 2 or more levels |
interval & normal |
MANOVA, GLM, SEM |
|
More than one dependent variable |
|||
2 or more |
interval & normal |
regression, SEM |
|
0 |
interval & normal |
canonical correlation |
|
0 |
categorical |
latent variable models for categorical data |
|
Two sets of two or more dependent variables |
|||
0 |
interval & normal |
factor analysis, latent variable models |
|
IV: independent variable; GEE: generalized estimating equations; ANOVA: analysis of variance; MANOVA: multivariate analysis of variance; SEM: structural equation model; GLM: general linear model; MLM: multilevel model; HLM: hierarchical linear model.
*This chart was adapted from http://www.ats.ucla.edu/stat/mult_pkg/whatstat/choosestat.html which, in turn, was adapted from 'Choosing the Correct Statistic', http://bama.ua.edu/~jleeper/627/choosestat.html by James D. Leeper, Ph.D.
Each example must be specialized to your research design/analysis, e.g. with relevant changes to alpha and power, sample and group sizes, detectable effect size, references and/or software used for calculations, and references to supporting research.
Example 1: A sample size of (e.g., 150) should allow description of ASI drug use severity with a 95% confidence interval of ±0.10 (e.g., mean = 0.85 ± 0.10), assuming a standard deviation of about 0.65 as found in [provide a reference related to your study here].
Example 2: The sample size of (e.g., 1,200) will be sufficient to precisely estimate ASI score differences as small as (e.g., 0.08) standard deviations.
Example 3: The mean of a continuous variable, such as drug use severity, will be precisely estimated within a range of (e.g., 0.16) standard deviations at an alpha of 0.05 and power of 0.80 for a sample size of (e.g., 240).
Example 4: From previous studies, the standard deviation of the mean ASI score may range from 0.1 to 0.2. Sample sizes for each subgroup by county can detect treatment effects of small to medium size. For example, a sample size of (e.g., 240 patients) per county will allow detection of a significant treatment effect when the mean change in ASI score is as small as (e.g., 0.16) standard deviations, given an alpha level of 0.05 and power of 80%.
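The arithmetic behind statements like these can be sketched with a normal approximation. The following is a minimal illustration, not the method used to produce the figures above; the function names `ci_half_width` and `detectable_effect` are hypothetical, and exact t-based software will give slightly different values.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)


def detectable_effect(n, alpha=0.05, power=0.80, two_tailed=True):
    """Smallest standardized one-sample effect detectable with n observations
    (normal approximation; an assumption, not an exact t-test calculation)."""
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = STD_NORMAL.inv_cdf(1 - tail)
    z_beta = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_beta) / sqrt(n)


# Inputs taken from the examples above: sd = 0.65 with n = 150, and n = 1,200.
print(round(ci_half_width(0.65, 150), 3))
print(round(detectable_effect(1200), 3))
```

With these inputs the half-width comes out near 0.10 and the detectable effect near 0.08 standard deviations, in line with Examples 1 and 2.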
Example 1: A sample size of 100 should allow description with a 95% confidence interval of ±8% on questions with a response split of about 50% yes/50% no. The confidence interval will be narrower for questions with a more extreme response split.
Example 2: Rates of categorical variables (e.g., cessation rate) will be precisely estimated within a range of (e.g., 8.4%) for a sample size of (e.g., 276), given an alpha level of 0.05 and power of 80%.
Example 3: The sample (e.g., 215) will also be sufficient to precisely estimate rates (e.g., morbidity, mortality). Given an alpha level of 0.05 and power of 80%, a low rate of 10% will be estimated within a range of ±6%, a moderate rate of 30% will be estimated within a range of ±9%, and an extreme rate of 50% will be estimated within a range of ±10%.
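The ranges quoted in Examples 2 and 3 appear consistent with multiplying the standard error of a proportion by the sum of the normal quantiles for alpha and power. That reading is an assumption, since the text does not state how the figures were derived; the sketch below simply shows the calculation under that assumption, with a hypothetical helper name `rate_margin`.

```python
from math import sqrt
from statistics import NormalDist


def rate_margin(p, n, alpha=0.05, power=0.80):
    """Range around a rate p for sample size n, using
    (z_{1-alpha/2} + z_{power}) * sqrt(p * (1 - p) / n).
    Assumed formula: a normal approximation, not an exact binomial method."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return z * sqrt(p * (1 - p) / n)


# Rates and sample sizes taken from Examples 2 and 3 above.
for p in (0.10, 0.30, 0.50):
    print(f"n=215, rate={p:.0%}: +/- {rate_margin(p, 215):.1%}")
print(f"n=276, rate=50%: +/- {rate_margin(0.50, 276):.1%}")
```

Because p(1 - p) peaks at p = 0.5, the 50% split is the widest case, which is why proposals often quote it as the conservative figure.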
Example 1: A sample size of (e.g., 75 per group) should allow detection of medium effects (of about d=0.46) with power of 0.80, when comparing the experimental and control groups on the CESD scale using a t-test with 2-tailed alpha=0.05. This effect size translates to a difference of about 1.75 points on the CESD assuming a standard deviation of 3.80 (as was found in a related intervention by [provide a reference related to your study here]).
Example 2: The sample of (e.g., 296) per group will enable us to detect small group differences (d = 0.17) with 2-tailed alpha of 0.05 and power of 0.80.
Example 3: The sample of (e.g., 215 African-American and 84 white participants) will provide moderate power for estimating differences in means of continuous variables (e.g., severity index) or differences in rates (e.g., morbidity). The detectable difference in means will be (e.g., 0.36) standard deviations or larger with an alpha level of 0.05 and power of 80%. The detectable difference in rates will be 18%, 17%, and 13% when rates in the overall population are 50%, 30%, and 10%, respectively.
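For two independent groups, the detectable standardized difference can be approximated from the two group sizes. The sketch below uses a normal approximation to the two-sample t-test (so it slightly understates the exact value for small groups); `detectable_d` is a hypothetical helper, not a function from any particular package.

```python
from math import sqrt
from statistics import NormalDist


def detectable_d(n1, n2, alpha=0.05, power=0.80):
    """Approximate smallest Cohen's d detectable when comparing two
    independent group means with a 2-tailed test (normal approximation)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return z * sqrt(1 / n1 + 1 / n2)


# Group sizes taken from Examples 1 and 3 above.
d_equal = detectable_d(75, 75)
d_unequal = detectable_d(215, 84)
print(round(d_equal, 2), round(d_unequal, 2))

# Converting d back to raw scale units, using the CESD standard deviation of 3.80.
print(round(d_equal * 3.80, 2))
```

The unequal-group case shows why the smaller group dominates: the term 1/84 contributes far more to the standard error than 1/215.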
Example 1: A sample size of 100 per group should allow detection of medium effects (e.g., h=0.4) with power of 0.80 when comparing groups on percent return to work, using chi-square with 2-tailed alpha=0.05; we should thus be able to detect as significant a 50% return to work in the intervention group if the control group shows 31%, as found in [provide a reference related to your study here].
Example 2: The sample of (e.g., 296) per group will enable us to detect percentage differences between groups of 8% with 2-tailed alpha of 0.05 and power of 0.80.
Example 3: With a sample of (e.g., 215 and 84 in the respective groups), the detectable differences in rates between groups will be 18%, 17%, and 13% when rates in the overall population are 50%, 30%, and 10%, respectively, with 2-tailed alpha of 0.05 and power of 0.80.
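The effect size h in Example 1 is Cohen's arcsine-transformed difference between two proportions. The following sketch shows how 50% versus 31% maps to h and how many participants per group a given h requires under a normal approximation; the helper names are hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def n_per_group_for_h(h, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect effect size h
    with a 2-tailed test and equal group sizes (normal approximation)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return ceil(2 * (z / h) ** 2)


# Proportions taken from Example 1 above: 50% vs. 31% return to work.
print(round(cohens_h(0.50, 0.31), 2))
print(n_per_group_for_h(0.4))
```

Under these assumptions 50% versus 31% gives h of roughly 0.39, and h = 0.4 needs just under 100 per group, which agrees with the example.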
Example 1: A sample size of 75 after expected attrition should allow detection of a 1.75-point decrease in CESD scores (medium effect) with power >= 0.99, when comparing post-test to pre-test using a paired t-test with 2-tailed alpha=0.05 (assuming a within-group standard deviation of 3.8 as found in [provide a reference related to your study here], with a correlation of 0.5 between pre-test and post-test scores).
Example 1: The target sample size of 150 should allow detection of small-to-medium correlation effects of about r=.24 with power 0.80 and 2-tailed alpha=0.05. Correlations of this magnitude were found in studies of males by [provide a reference related to your study here].
Example 2: In a simple regression analysis, detecting a correlation of 0.2 or larger requires a sample size of 195 with 2-tailed alpha of 0.05 and power of 0.80.
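Power for a correlation is commonly approximated with Fisher's z transformation. The sketch below assumes that approximation; it gives values close to, but not identical with, the figures quoted above (roughly r = 0.23 for n = 150 and n = 194 for r = 0.2), since software using exact methods differs slightly. The function names are hypothetical.

```python
from math import atanh, ceil, sqrt, tanh
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the normal quantiles for a 2-tailed alpha and the target power."""
    std_normal = NormalDist()
    return std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)


def detectable_r(n, alpha=0.05, power=0.80):
    """Smallest correlation detectable with n cases (Fisher z approximation)."""
    return tanh(_z_sum(alpha, power) / sqrt(n - 3))


def n_for_r(r, alpha=0.05, power=0.80):
    """Sample size needed to detect correlation r (Fisher z approximation)."""
    return ceil((_z_sum(alpha, power) / atanh(r)) ** 2 + 3)


# Inputs taken from Examples 1 and 2 above.
print(round(detectable_r(150), 3))
print(n_for_r(0.2))
```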
Example 1: A sample size of (e.g., 92) should allow detection of medium effects with power of 0.80 when testing the model predicting depression from 5 independent variables at alpha=0.05; a medium effect translates to R²=.13, or 13% of variance accounted for. The simpler model examined by [provide a reference related to your study here] accounted for 11% of the variance in depression scores.
Example 2: The required sample size for a multiple regression with p covariates will increase, depending on the partial multiple correlation of the p-1 covariates. A sample size of (e.g., 240 per county) will allow detection of a correlation of (e.g., 0.19) given a partial multiple correlation less than (e.g., 0.03) with 2-tailed alpha of 0.05 and power of 0.80.
Example 3: The sample of (e.g., 1184) will allow detection of small effects (e.g., R²=.02) in multiple regression analysis with up to (e.g., 25) covariates with 2-tailed alpha of 0.05 and power of 0.80.
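The "small" and "medium" labels for regression correspond to Cohen's f², which is a simple function of the proportion of variance explained. A short sketch of the conversion follows; the benchmark values 0.02, 0.15, and 0.35 are Cohen's conventional small, medium, and large thresholds, and the function names are hypothetical.

```python
def f_squared(r_squared):
    """Cohen's f-squared from the proportion of variance explained."""
    return r_squared / (1 - r_squared)


def r_squared_from_f2(f2):
    """Inverse conversion: proportion of variance explained from f-squared."""
    return f2 / (1 + f2)


# Variance-explained values taken from Examples 1 and 3 above.
print(round(f_squared(0.13), 3))   # close to the medium benchmark of 0.15
print(round(f_squared(0.02), 3))   # close to the small benchmark of 0.02
print(round(r_squared_from_f2(0.15), 3))
```

Power programs for multiple regression typically ask for f² rather than the variance explained, so this conversion is usually the first step when adapting the examples.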
Example 1: A sample size of 200 should allow detection of an odds ratio for a single predictor of approximately 1.6 with power of 0.80, assuming an outcome of [your description here, e.g. “lab values less than xxx”] at 33%, and with alpha=0.05. The detectable effect size would increase somewhat for assessing a single predictor within a multivariate model, depending on the correlation of the target predictor with the other covariates in the model; e.g. if the target predictor was modestly correlated (about 0.20) with other covariates, then the detectable odds ratio would increase slightly to about 1.7.
Example 2: Using logistic regression to assess predictors of drug use status (e.g., abstinence), the sample of (e.g., 1184) will allow detection of odds ratios of (e.g., 1.46-1.54) with alpha of 0.05 and power of 0.80 for a predictor controlling for other predictors, assuming moderate correlations of 0.1-0.5 among predictors and assuming a small proportion (e.g., 5%) of respondents who become abstinent. Smaller odds ratios of (e.g., 1.32-1.37) will be detectable with a higher abstinence rate (e.g., 25%).
Example 1: For bodyweight, the analysis sample size will allow detection of an average 5% decrease in weight, a target level of clinical significance proposed by the Institute of Medicine (Committee to Develop Criteria for Evaluating the Outcomes of Approaches to Prevent and Treat Obesity, Food and Nutrition Board, 1995). Our calculation was based on the following: an expectation that about 60% of likely weight loss in the intervention group would occur during the intervention, the remainder during the next 6 months, with maintenance through the 12-month follow-up; a small, consistent weight increase (about 1 lb. per year) in the control group as commonly seen in adult women (Sternfeld, Wang, Quesenberry et al., 2004); likely baseline weight from pilot data (n=45) from the target community (mean weight=158.54, standard deviation 27.94); and correlation of .50 of weight over time as found in the pilot data. For analysis with a general linear model with repeated measures, using alpha=.05, the proposed sample size will achieve power=.80 for detecting a significant difference between groups in the weight patterns described above that include the 5% weight decrease for the intervention group (Cohen, 1988; Woodward et al., 1990).
Example 1: The target sample size of 120 per group will allow the detection of small-to-medium effects of about d=.32 in detecting a difference in patterns over time between the maintenance group and each of the other two groups, with power=.80 and one-tailed alpha=.05 assuming a moderate correlation of .50 over time and approximately 15% attrition (Hedeker et al., 1999). A slightly larger effect size (d=.34) would be detectable when comparing outcomes at a single point in time (e.g. percentage who present for treatment post-release) allowing for attrition. This effect size translates to a difference in percent between groups of about 15% (e.g. 25% vs. 40%).
Example 2: The proposed sample size of n=75 per group will allow detection of medium effects of about d=.40 in group differences over time, assuming a moderate correlation of .50 over time and allowing attrition of up to 20% (one-tailed alpha=.05, power=.80) (Hedeker et al., 1999). For the exploratory hypothesis on ART adherence in Aim 3 including only the subsample of HIV+ subjects (approximately 50% of the sample), somewhat larger effects (about d=.58) will be detectable.
Example 3: For exploratory hypothesis testing in this pilot study comparing the electronic monitoring/feedback group to each of the other two treatment arms, the sample size of n=33 per group (total n=99) will allow detection of medium effects (approximately d=.61) with power=.80 and one-tailed alpha of .05, using random effects regression for repeated measures and assuming a moderate correlation of .5 and up to 10% attrition (Hedeker et al., 1999). [from a proposal by D. Koniak-Griffin (PI)]
Example 1: For assessing outcomes measured at the school level or available in aggregate school-level form, the proposed sample size of 24 schools (clusters) will allow the detection of large effects (approximately d=1.0) in testing program differences in change over time, with power=0.80 and 1-tailed alpha=0.05 assuming moderate correlation over time (Hedeker et al., 1999). Other studies have also found a large effect with similar interventions [insert ref. here]; we expect that the longer duration of the proposed intervention will produce an effect of even greater magnitude. For assessing program differences in student-level outcomes, the sample size of approximately 300 students per school will allow detection of small effects (approximately d=0.25 to 0.35) in repeated measures analyses, accounting for cluster randomization and assuming intraclass correlations of 0.05 to 0.10 (Raudenbush, 1997; Spybrook, Raudenbush et al., 2009). With this large sample, there should be little impact of student attrition on detectable effect size.
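The effect of cluster randomization on student-level power can be illustrated with the design effect, 1 + (m - 1) × ICC, where m is the cluster size. The sketch below is a rough single-time-point approximation, not the repeated-measures calculation cited above. It assumes the 24 schools are split evenly into two arms of 12 (an assumption not stated in the text) and uses the 1-tailed alpha of 0.05 mentioned in the example; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def detectable_d_clustered(clusters_per_arm, cluster_size, icc,
                           alpha=0.05, power=0.80):
    """Approximate detectable d for a two-arm cluster-randomized comparison,
    1-tailed test, using the effective sample size per arm."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(power)
    n_effective = clusters_per_arm * cluster_size / design_effect(cluster_size, icc)
    return z * sqrt(2 / n_effective)


# Assumed split: 12 schools per arm, 300 students per school.
for icc in (0.05, 0.10):
    print(icc, round(design_effect(300, icc), 2),
          round(detectable_d_clustered(12, 300, icc), 2))
```

Under these assumptions the detectable effects fall roughly in the 0.25 to 0.35 range quoted above, and the design effect makes clear why adding schools helps far more than adding students per school.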
Example 1: Assuming moderate partial correlations of 0.1 to 0.5 among predictors, the sample of (e.g., 5,419) will allow detection of hazard ratios of (e.g., 1.21 to 1.28) in a survival analysis with 2-tailed alpha of 0.05 and power of 0.80.
Example 2: For the event history analysis on outcome variables (e.g. eventual cessation), the sample of (e.g., 192) would enable us to detect hazard ratios of (e.g., 1.89 to 2.84) in comparing survival curves between two groups (e.g. Whites and Hispanics), with 2-tailed alpha of 0.05, power of 0.80 and cessation rate of (e.g., 15 to 40%).
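For a two-group survival comparison, the detectable hazard ratio depends mainly on the number of events rather than the number of subjects. The sketch below uses Schoenfeld's approximation and assumes two equal-sized groups (an assumption not stated in the text); with the sample size and cessation rates from Example 2 it gives hazard ratios close to the quoted range. The function name is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def detectable_hazard_ratio(n, event_rate, alpha=0.05, power=0.80,
                            proportion_group1=0.5):
    """Smallest hazard ratio (> 1) detectable with a 2-tailed test,
    using Schoenfeld's formula: events = (z_a + z_b)^2 / (p1 * p2 * ln(HR)^2)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    events = n * event_rate  # expected number of events in the full sample
    p1, p2 = proportion_group1, 1 - proportion_group1
    return exp(z / sqrt(events * p1 * p2))


# Sample size and cessation rates taken from Example 2 above.
print(round(detectable_hazard_ratio(192, 0.15), 2))
print(round(detectable_hazard_ratio(192, 0.40), 2))
```

The lower event rate yields the larger detectable hazard ratio, since fewer events carry less information about the difference between survival curves.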