To begin, we must first identify the differences between what statistics defines as population data and sample data. A population is the entire set of people or things in a specified group. Characteristics of a population are called parameters. A sample is a subset of a population. Characteristics of a sample are called statistics.
Biostatistics is the use of statistics for public health, biological, or medical applications, and it is applied to a variety of research topics and fields. The main goal is to use appropriate statistical methods to understand the factors that affect human health.
- Ordinal: Ordered categorical variables (ex. never, sometimes, frequently, always)
- Nominal: Unordered categorical variables (ex. hair color, gender)
- Continuous: Numerical variables that can take any value within a range (ex. height)
- Discrete: Numerical variables that take countable values (ex. number of bacteria)
- Tables: Numerical summaries of frequencies, percentages, summary statistics, etc.
- Graphs: Visual representations of data:
- Histogram: Bar graph of frequencies
- Scatterplot: Plot of two numerical variables
- Boxplot: Visual representation of the median, quartiles, and range (some versions also mark the mean)
The researcher observes an existing situation and makes inferences.
- Case-control: Study of existing groups that differ in outcome (ex. patients with a disease vs. patients without it)
- Cross-sectional (prevalence): Study that observes patients at a single point in time
- Cohort: Study that follows a group of similar individuals who differ with respect to certain factors, to determine how those factors affect an outcome of interest
Researcher randomly assigns individuals to treatment groups.
- Randomization: Technique that assigns individuals to groups by chance, so that other variables are balanced across groups on average and the true effect of the treatment can be observed
- Placebo: Treatment with no therapeutic effect, given to a comparison group
- Blinding: Treatment assignment is unknown to patient, doctor, or both
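As a rough illustration of randomization, the sketch below shuffles a list of participants and deals them into two equally sized groups. The function name `randomize`, the group labels, and the participant IDs are all hypothetical and chosen only for this example; real trials typically use more elaborate schemes (such as block or stratified randomization).

```python
import random


def randomize(participants, groups=("treatment", "placebo"), seed=None):
    """Assign each participant to a group by chance, keeping group sizes balanced."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    # Deal the shuffled participants into the groups in turn.
    return {person: groups[i % len(groups)] for i, person in enumerate(shuffled)}


# Hypothetical participant IDs, for illustration only.
participant_ids = [f"P{n:02d}" for n in range(1, 11)]
allocation = randomize(participant_ids, seed=42)

for pid in participant_ids:
    print(pid, allocation[pid])
```

Because the assignment depends only on the shuffle, neither the researcher nor the participant characteristics influence who ends up in which group.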
Detailed prediction of a scientific question that can be tested.
- Null hypothesis: There is no relationship among the groups
- Alternative hypothesis: There is a relationship among groups
- P-value: Probability of observing a difference at least as extreme as the one seen in the data, assuming the null hypothesis is true
Sample Size Justification
Method to ensure there are enough observations to find a statistical difference between groups when they are, in fact, different.
Significance level (α): Threshold at which the null hypothesis is rejected. Standard values for α include 0.05, 0.01, and 0.001
- If the p-value is less than or equal to α, the null hypothesis is rejected
- If the p-value is greater than α, we fail to reject the null hypothesis
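The decision rule can be sketched with a simple two-sided permutation test, which estimates a p-value by repeatedly shuffling the group labels. This is only one of many ways to obtain a p-value and is used here because it needs no external libraries; the function name `permutation_p_value` and the measurements are made up for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the observations at random (null is true)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point error
            extreme += 1
    # Adding 1 to both counts avoids reporting a p-value of exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements, for illustration only.
treated = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
control = [4.2, 4.5, 4.1, 4.8, 4.4, 4.6]

alpha = 0.05
p_value = permutation_p_value(treated, control)
decision = "reject the null hypothesis" if p_value <= alpha else "fail to reject the null hypothesis"
print(f"p-value = {p_value:.4f}: {decision}")
```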
Power: Ability to detect a difference when a difference truly exists
Effect size: Clinically meaningful difference between comparisons
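To show how significance level, power, and effect size combine into a sample size, the sketch below uses a common normal-approximation formula for comparing two means with equal group sizes: n per group = 2(z₁₋α/₂ + z_power)²σ²/δ². The formula choice, the function name `sample_size_per_group`, and the example values are assumptions for illustration; the source does not prescribe a specific method, and other designs need other formulas.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    effect_size: clinically meaningful difference between group means
    sd: assumed common standard deviation of the outcome
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect_size) ** 2
    return math.ceil(n)                            # round up to whole participants


# Hypothetical inputs: detect a 5-unit difference when the SD is 10 units.
n_needed = sample_size_per_group(effect_size=5, sd=10)
print(n_needed)  # 63 per group under these assumptions
```

Smaller effect sizes, stricter significance levels, or higher power all increase the required number of observations.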
Any systematic error, which can occur in multiple areas of a study (e.g., study design, measurement technique, or analysis), that over- or underestimates a parameter and leads to false conclusions.
Characterizing data using graphs, tables, and numerical summaries.
- Mean: Average of the data
- Median: Middle point of the data
- Mode: Most frequently occurring data point
Measure of Spread
- Standard deviation: Typical distance of the data points in a sample from their mean
- Interquartile Range: Difference between the 75th percentile and the 25th percentile
- Range: Difference between the largest and smallest values
Outliers: Very extreme data points
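These summaries can be computed with Python's standard `statistics` module, as in the sketch below. The data are made up, and flagging outliers as points more than 1.5 × IQR beyond the quartiles is a common convention assumed here, not a rule stated in this text.

```python
from statistics import mean, median, mode, quantiles, stdev

# Made-up data with one very extreme value.
data = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]

# Measures of center
print("mean:", mean(data))
print("median:", median(data))
print("mode:", mode(data))

# Measures of spread
q1, q2, q3 = quantiles(data, n=4)  # 25th, 50th, and 75th percentiles
iqr = q3 - q1
data_range = max(data) - min(data)
print("standard deviation:", stdev(data))  # sample standard deviation
print("interquartile range:", iqr)
print("range:", data_range)

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print("outliers:", outliers)
```

Note how the single extreme value pulls the mean well above the median, which is one reason the median and interquartile range are often preferred for skewed data.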
Frequency: The proportions of values within a single
Drawing conclusions about populations based on samples.
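- Confidence Intervals: Ranges that combine sample statistics and standard errors to estimate population parameters
- Standard error: Standard deviation of the sampling distribution of a statistic (ex. the sample mean), which quantifies the uncertainty of the estimate
- Statistical Tests: Tests used to assess whether the differences observed between comparisons are larger than expected by chance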
- Statistical test performed depends on variable type, number of comparisons, and underlying distribution of population
- Number of comparisons can be between two or more groups, independent or paired
- Tests can be parametric (assume a specific population distribution, commonly the normal distribution) or non-parametric (no assumed distribution)
- Types of statistical tests: t-tests, z-tests, F-tests, chi-squared, ANOVA, regression, correlation, etc.
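The relationship between the standard error and a confidence interval for a mean can be sketched as follows. This example assumes a normal critical value (about 1.96 for 95% confidence); for small samples a t critical value would give a somewhat wider and more appropriate interval. The function name `mean_confidence_interval` and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper, standard_error) for the sample mean.

    Uses a normal approximation, which is an assumption of this sketch.
    """
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)   # SE = s / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin, standard_error


# Made-up measurements, for illustration only.
heights_cm = [162.0, 170.5, 158.2, 175.1, 168.4, 181.0, 166.3, 172.8, 160.9, 169.7]

lower, upper, se = mean_confidence_interval(heights_cm)
print(f"mean = {mean(heights_cm):.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

A larger sample shrinks the standard error and therefore narrows the interval, while a higher confidence level widens it.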
More information on statistical tests:
- Baron, Anna. Biostatistical Methods. Lecture 1 Overview. Fall 2015.
- Rosner. Fundamentals of Biostatistics. 7th ed.
- Samuels & Witmer. Statistics for the Life Sciences. 3rd ed. Pearson Education. 2003.