General Statistics Terminology


To begin, we must first distinguish between population data and sample data. A population is the entire set of people or things in a specified group. Characteristics of a population are called parameters. A sample is a subset of a population. Characteristics of a sample are called statistics.
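The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population below (systolic blood pressure readings with an assumed mean of 120 and standard deviation of 15) is invented for illustration, and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

# Hypothetical population: systolic blood pressure (mmHg) for 10,000 people.
rng = random.Random(42)
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Parameter: a characteristic computed from the entire population.
population_mean = statistics.fmean(population)

# Statistic: the same characteristic computed from a sample (a subset).
sample = rng.sample(population, k=100)
sample_mean = statistics.fmean(sample)

print(f"Population mean (parameter): {population_mean:.1f}")
print(f"Sample mean (statistic):     {sample_mean:.1f}")
```

In practice the population is rarely fully observed, so the sample statistic is used to estimate the unknown parameter.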

Biostatistics is the application of statistics to public health, biological, and medical problems across a variety of research topics and fields. Its main goal is to use appropriate statistical methods to understand the factors that affect human health.

Variable Types

Qualitative (Categorical)
  • Ordinal: Ordered categorical variables (ex. never, sometimes, frequently, always)
  • Nominal: Unordered categorical variables (ex. hair color, gender)
Quantitative
  • Continuous: Numerical variables that can take any value within a range (ex. height)
  • Discrete: Numerical variables that take distinct, countable values (ex. number of bacteria)

Displaying Data

  • Tables: Numerical summary of frequencies, %, summary statistics, etc.
  • Graphs: Visual representations of data:
    • Histogram: Bar graph of the frequencies of a numerical variable grouped into intervals (bins)
    • Scatterplot: Plot of two numerical variables
    • Boxplot: Visual representation of the median, quartiles, and range; some versions also mark the mean and individual outliers

Study Design

Study Type

Observational study

Researchers observe an existing situation and make inferences from it.

  • Case-control: Study comparing existing groups that differ on the outcome (ex. patients with a disease vs. patients without it)
  • Cross-sectional: Study observing patients at a single point in time, often used to estimate prevalence
  • Cohort: Study that follows a group of similar individuals who differ with respect to certain factors, to determine how those factors affect an outcome of interest

Experimental study

Researchers assign individuals to treatment groups, typically at random.

  • Randomization: Technique that assigns participants to treatment groups by chance, so that known and unknown characteristics are balanced across groups on average and the true treatment effect can be observed
  • Placebo: Inactive treatment with no therapeutic effect, given to a comparison (control) group
  • Blinding: Treatment assignment is unknown to the patient, the doctor, or both
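
As a rough illustration of randomization, the sketch below shuffles a list of participants and deals them out to two arms. The function name `randomize`, the arm labels, and the participant IDs are all hypothetical, and real trials often use more elaborate schemes (for example, blocked or stratified randomization) that are not shown here.

```python
import random

def randomize(participants, arms=("treatment", "placebo"), seed=None):
    """Assign participants to arms by chance, keeping arm sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants out to the arms in turn.
    return {pid: arms[i % len(arms)] for i, pid in enumerate(shuffled)}

# Hypothetical participant IDs P01..P10.
participants = [f"P{i:02d}" for i in range(1, 11)]
assignment = randomize(participants, seed=1)

for pid in participants:
    print(pid, assignment[pid])
```

Because the assignment depends only on the shuffle, no characteristic of a participant can influence which arm they end up in.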
Hypothesis

A detailed, testable prediction about a scientific question.

  • Null hypothesis: There is no relationship among the groups
  • Alternative hypothesis: There is a relationship among groups
  • P-value: Probability of observing a result at least as extreme as the one actually observed, assuming the null hypothesis is true
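
One way to make the p-value concrete is a permutation test, sketched below. It is an illustrative technique chosen here, not one named in the text above: group labels are shuffled many times to mimic the null hypothesis, and the p-value is estimated as the share of shuffles whose difference in means is at least as large as the observed one. The measurements are invented, and the "+ 1" correction (which keeps the estimate above zero) is an assumed convention.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Count the observed arrangement itself so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements for two groups.
treated = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2]
control = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]

p_value = permutation_p_value(treated, control)
print(f"Estimated p-value: {p_value:.4f}")
```

A small estimate means that a difference this large would rarely arise if the two groups truly did not differ.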
Sample Size Justification

Method to ensure there are enough observations to find a statistical difference between groups when they are, in fact, biologically different.

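Significance level (α): Threshold at or below which a p-value leads to rejecting the null hypothesis; equivalently, the accepted probability of rejecting a null hypothesis that is actually true. Common values for α include 0.05, 0.01, and 0.001.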

  • If the p-value is less than or equal to α, the null hypothesis is rejected
  • If the p-value is greater than α, we fail to reject the null hypothesis

Power: Probability of detecting a difference when a difference truly exists

Effect size: Magnitude of the difference between the groups being compared; for sample size justification, the smallest difference considered clinically meaningful
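
Significance level, power, and effect size combine into a sample size calculation. The sketch below uses a common normal-approximation formula for comparing the means of two equal-sized groups, n = 2σ²(z₁₋α/₂ + z_power)² / δ², under the assumptions of a two-sided test and a shared, known standard deviation σ. The function name and the example numbers are hypothetical, and dedicated software may give slightly larger values because it uses the t distribution.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided comparison of two means."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = standard_normal.inv_cdf(power)          # quantile for desired power
    n = 2 * (sd * (z_alpha + z_power) / effect_size) ** 2
    # Round up: a fraction of a participant cannot be enrolled.
    return math.ceil(n)

# Hypothetical example: detect a 5 mmHg difference when the SD is 10 mmHg.
print(sample_size_per_group(effect_size=5, sd=10))  # prints 63
```

Smaller effect sizes, smaller α, or higher power all increase the required number of observations.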

Study Analysis

Bias

Any systematic error, which can arise in multiple areas of a study (e.g., study design, measurement technique, or analysis), that over- or underestimates a parameter and leads to false conclusions.

Descriptive Statistics

Characterizing data using graphs, tables, and numerical summaries.

Measure of Location
  • Mean: Average of the data
  • Median: Middle point of the ordered data
  • Mode: Most frequently occurring data point
Measure of Spread
  • Standard deviation: Typical distance of the data points from the mean
  • Interquartile Range: Difference between the 75th percentile and the 25th percentile
  • Range: Difference between the largest and smallest values

Outliers: Data points that lie far from the rest of the data

Frequency: The count (or proportion) of observations taking each value of a single variable
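
The descriptive measures above can be computed with Python's standard library, as in the following sketch. The data are invented, the quartiles use the "inclusive" method of `statistics.quantiles` (other software may use slightly different quartile definitions), and flagging outliers with the 1.5 × IQR rule is an assumed convention rather than something the definitions above prescribe.

```python
import statistics
from collections import Counter

# Hypothetical data: length of hospital stay in days for 10 patients.
stays = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]

# Measures of location
mean = statistics.fmean(stays)
median = statistics.median(stays)
mode = statistics.mode(stays)

# Measures of spread
sd = statistics.stdev(stays)  # sample standard deviation (n - 1 denominator)
q1, _, q3 = statistics.quantiles(stays, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(stays) - min(stays)

# Outliers by the 1.5 x IQR rule (an assumed convention).
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in stays if x < low_fence or x > high_fence]

# Frequency of each value, as counts and as proportions.
counts = Counter(stays)
proportions = {value: count / len(stays) for value, count in counts.items()}

print(f"mean={mean}, median={median}, mode={mode}")
print(f"sd={sd:.2f}, IQR={iqr}, range={value_range}")
print(f"outliers={outliers}")
print(f"counts={dict(counts)}")
```

Here the single long stay pulls the mean well above the median, which is why the median and interquartile range are often preferred for skewed data.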

Inferential Statistics

Drawing conclusions about populations based on samples

  • Confidence Intervals: Combine sample statistics and standard errors to give a range of plausible values for a population parameter
  • Standard error: Standard deviation of the sampling distribution of a statistic; for the sample mean, it quantifies the uncertainty of that mean as an estimate of the population mean
  • Statistical Tests: Procedures used to assess whether the differences observed between comparisons are larger than expected by chance
  • The statistical test performed depends on variable type, number of comparisons, and the underlying distribution of the population
  • Comparisons can be between two or more groups, independent or paired
  • Tests can be parametric (assuming a specific population distribution, commonly the normal distribution) or non-parametric (no assumed distribution)
  • Types of statistical tests and related methods: t-tests, z-tests, F-tests, chi-squared tests, ANOVA, regression, correlation, etc.
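
The relationship between the standard error and a confidence interval can be sketched as follows. This example assumes a reasonably large sample so that a normal critical value (about 1.96 for 95% confidence) is adequate; for small samples a t critical value is typically used instead. The function name and the simulated heart-rate data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) via a normal approximation."""
    n = len(sample)
    mean = fmean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return mean, standard_error, (mean - margin, mean + margin)

# Hypothetical sample of 50 resting heart rates (beats per minute).
rng = random.Random(7)
heart_rates = [rng.gauss(70, 8) for _ in range(50)]

mean, se, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"Mean {mean:.1f}, SE {se:.2f}, 95% CI ({lower:.1f}, {upper:.1f})")
```

Larger samples shrink the standard error and therefore narrow the interval, while a higher confidence level widens it.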
References
  • Baron, Anna. Biostatistical Methods. Lecture 1 Overview. Fall 2015.
  • Rosner. Fundamentals of Biostatistics. 7th ed. Brooks/Cole. 2011.
  • Samuels & Witmer. Statistics for the Life Sciences. 3rd ed. Pearson Education. 2003.

Center for Innovative Design & Analysis (CIDA)

Formerly known as the Colorado Biostatistics Consortium (CBC)
13001 17th Place | Mail Stop B119 | Room 100, Building 406 | Aurora, CO 80045
303-724-4370 | cbc.admin@ucdenver.edu
