Aligning power or
sample size analysis with planned data analysis helps to avoid the problems of
1) a sample size too small to detect important alternative hypotheses and 2) a
sample size so large that the design squanders precious resources.
What do you need for sample size justification?
Discuss your science and study design before calculating your sample size. The calculation should produce a table of sample size choices and a written paragraph justifying the chosen sample size.
Data you need to provide from historical literature, pilot data, or other clinical information:
- Measures of variation (standard deviations) of the outcome measures.
- Estimate of a clinically meaningful difference between groups or an estimate of the size of association that is clinically meaningful (e.g., correlation).
- Estimates of correlation within individuals (or clusters) in studies with repeatedly collected measurements on an individual (or cluster).
- Variation of the variable of interest, for studies where that variable is continuous (e.g., age).
Sample size is a function of:
- Variation in outcome measures (and sometimes primary predictors)
- Size of a clinically meaningful difference between groups
- Level of significance, α (adjust if multiple tests are being performed)
- Desired type II error, β (the probability that you will not find a difference when a difference, in truth, exists)
Based on the data, we make
a decision to reject the null hypothesis (H0) or fail to reject H0. We quantify the evidence against the H0 in the form of a p-value. Remember, we want α and β small. Note: β increases as α decreases.
                  | H0 is true                 | H0 is false
Fail to reject H0 | 1 - α: Level of confidence | β: P(Type II Error)
Reject H0         | α: P(Type I Error)         | 1 - β: Power (probability of finding a difference if it exists)
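The note that β increases as α decreases can be checked numerically. The sketch below assumes a two-sided one-sample Z-test; the specific numbers (δ = 0.5, σ = 1, n = 20) are illustrative choices, not values from these notes.

```python
from statistics import NormalDist

def type_ii_error(delta, sigma, n, alpha):
    """Approximate beta for a two-sided one-sample Z-test
    (the negligible far-tail rejection region is ignored)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Power ~= Phi(|delta| * sqrt(n) / sigma - z_crit); beta = 1 - power.
    return 1 - NormalDist().cdf(abs(delta) * n**0.5 / sigma - z_crit)

# Shrinking alpha (0.10 -> 0.05 -> 0.01) raises beta for fixed n and delta.
betas = [type_ii_error(delta=0.5, sigma=1.0, n=20, alpha=a)
         for a in (0.10, 0.05, 0.01)]
print([round(b, 3) for b in betas])  # beta grows as alpha shrinks
```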
Evaluating the performance of a hypothesis test
There are 4 important quantities that vary together.
- Level of significance of a test = α (usually 0.05)
- Power of a test = 1 - β (usually 0.8 or 0.9)
- Sample size = n
- Detectable difference (sometimes called the effect size) = | μ0 - μ1 |
  - Based on prior knowledge
  - Based on a preliminary study
Derivation of power for two-sided, one-sample Z-test
We usually set α (typically at 0.05); then, by fixing any two of the remaining three quantities, we can calculate the last:
- Power (the probability of detecting the difference)
- Sample size n
- Detectable difference (the difference in the means that can be detected)
Note: For a one-sided test, replace α/2 with α.
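Under the usual normal approximation, the three calculations for a two-sided one-sample Z-test are: power ≈ Φ(|μ0 - μ1|√n/σ - z(1-α/2)), n = ((z(1-α/2) + z(1-β))·σ/|μ0 - μ1|)², and detectable difference = (z(1-α/2) + z(1-β))·σ/√n. A minimal sketch (the function names are mine, not from these notes):

```python
import math
from statistics import NormalDist

_z = NormalDist().inv_cdf  # standard normal quantile function

def power(delta, sigma, n, alpha=0.05):
    """Power ~= Phi(|delta|*sqrt(n)/sigma - z_{1-alpha/2}),
    two-sided test, far-tail term ignored."""
    return NormalDist().cdf(abs(delta) * math.sqrt(n) / sigma - _z(1 - alpha / 2))

def sample_size(delta, sigma, target_power=0.80, alpha=0.05):
    """Smallest n achieving at least the requested power."""
    return math.ceil(((_z(1 - alpha / 2) + _z(target_power)) * sigma / abs(delta)) ** 2)

def detectable_difference(sigma, n, target_power=0.80, alpha=0.05):
    """Smallest |mu0 - mu1| detectable with the requested power."""
    return (_z(1 - alpha / 2) + _z(target_power)) * sigma / math.sqrt(n)

# Example: detect a half-standard-deviation shift with 80% power at alpha = 0.05.
n = sample_size(delta=0.5, sigma=1.0)
print(n, round(power(0.5, 1.0, n), 3))
```

For a one-sided test, replace `1 - alpha / 2` with `1 - alpha`, as the note above says.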
How power, detectable difference, and sample size relate to each other.
As sample size n increases:
- Power: ⇧ Increases
- Detectable Difference | μ0 - μ1 |: ⇩ Decreases
As the difference to be detected, | μ0 - μ1 |, increases:
- Power: ⇧ Increases
- Required Sample Size: ⇩ Decreases
As desired power increases:
- Required Sample Size: ⇧ Increases
- Detectable Difference | μ0 - μ1 |: ⇧ Increases
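The qualitative relationships above can be verified numerically. This sketch reuses the one-sample two-sided Z-test approximation; all numbers are illustrative.

```python
import math
from statistics import NormalDist

_z = NormalDist().inv_cdf

def power(delta, sigma, n, alpha=0.05):
    # Phi(|delta|*sqrt(n)/sigma - z_{1-alpha/2}); two-sided, far tail ignored.
    return NormalDist().cdf(abs(delta) * math.sqrt(n) / sigma - _z(1 - alpha / 2))

def sample_size(delta, sigma, target_power=0.80, alpha=0.05):
    return math.ceil(((_z(1 - alpha / 2) + _z(target_power)) * sigma / abs(delta)) ** 2)

# As n increases, power increases (delta and sigma fixed).
powers = [power(0.5, 1.0, n) for n in (10, 20, 40)]
# As the difference to detect increases, the required n decreases.
sizes = [sample_size(d, 1.0) for d in (0.25, 0.5, 1.0)]
# As desired power increases, the required n increases.
sizes_by_power = [sample_size(0.5, 1.0, p) for p in (0.80, 0.90, 0.95)]
print(powers, sizes, sizes_by_power)
```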
- Never adjust the sample size ad hoc because you did not get the p-value you were hoping for. Remember, even if your study does not show significance, you are still providing the scientific community with useful and pertinent information while upholding the integrity of scientific inquiry.
- Having the minimum calculated sample size is not the most conservative approach. Having a larger sample size than the calculated value usually leads to a more robust and precise conclusion.
- The equations for calculating power, sample size, and detectable difference change depending on the proposed statistical test. Make sure to check the assumptions of the equations prior to calculating.
- Sample size calculations: basic principles and common pitfalls. Nephrol Dial Transplant 2010. Topics addressed in this article: continuous vs. binary vs. other types of outcomes, different types of study designs, common pitfalls, reporting of sample size calculations
- PASS (Power Analysis and Sample Size Software): http://www.ncss.com/software/pass/
- Free sample size and power calculator: http://powerandsamplesize.com/Calculators/