This page contains links to individual videos on Statistics, Statistical Tests, Machine Learning and Live Streams, organized, roughly, by category. Generally speaking, the videos are organized from basic concepts to complicated concepts, so, in theory, you should be able to start at the top and work your way down and everything will make sense.

**NOTE:** I also have a bunch of playlists for a variety of interests, like Statistics Basics, Linear Regression and Machine Learning. Check them out!!!

**Basics:**

- Histograms, Clearly Explained
- What is a statistical distribution?
- The Normal Distribution
- Statistics Fundamentals: Population Parameters
- Statistics Fundamentals: Estimating the Mean, Variance and Standard Deviation
- Covariance and Correlation Part 1: Covariance
- Covariance and Correlation Part 2: Pearson’s Correlation
- What is a statistical model?
- What does it mean to “sample from a distribution”?
- p-values: What they are and how to interpret them
- How to Calculate p-values
- p-hacking: What it is and how to avoid it
- Statistical Power, Clearly Explained
- Power Analysis, Clearly Explained
- Expected Values
- Conditional Probability
- The Binomial Distribution
- The Central Limit Theorem (or “How I Learned to Stop Worrying and Love the t-test”)
- The Difference between Technical and Biological Replicates
- The sample size and the effective sample size
- Standard Deviation vs Standard Error
- The Standard Error
- Bar Charts Are Better Than Pie Charts
- Boxplots, Clearly Explained
- Logs (logarithms), clearly explained
- Confidence Intervals
- R-squared explained
- Linear Models Part 0: Fitting a line to data, aka Least Squares, aka Linear Regression
- Fitting a curve to data, aka Lowess, aka Loess
- Linear Models Part 1: Linear Regression
- Linear Models: Linear Regression in R
- Linear Models Part 1.5: Multiple Regression
- Linear Models: Multiple Regression in R
- Linear Models Part 2: t-tests and ANOVA
- Linear Models Part 3: Design Matrices
- Linear Models: Design Matrix Examples in R
- Quantiles and Percentiles
- Quantile-Quantile Plots (QQ Plots)
- Quantile Normalization
- Probability vs Likelihood
- Maximum Likelihood
- Maximum Likelihood: A worked out example for the exponential distribution
- Maximum Likelihood: A worked out example for the binomial distribution
- Maximum Likelihood: A worked out example for the normal distribution
- Odds and Log(Odds)
- Odds Ratios and Log(Odds Ratios)

**Statistical Tests:**

- Enrichment Analysis using Fisher’s Exact Test and the Hypergeometric Distribution
- Which t-test to use
- p-values: What they are and how to interpret them
- How to Calculate p-values
- Thresholds for Significance
- FDR and the Benjamini-Hochberg Method clearly explained
- p-hacking and power calculations

**Machine Learning and Dealing with large datasets that have lots and lots of measurements per sample:**

(NOTE: All of the linear model and curve fitting stuff in the “Basics” section is also considered to be Machine Learning, so make sure you check out those videos).

- A Gentle Introduction to Machine Learning
- Machine Learning Fundamentals: Cross Validation
- Machine Learning Fundamentals: The Confusion Matrix
- Machine Learning Fundamentals: Sensitivity and Specificity
- Machine Learning Fundamentals: Bias and Variance
- ROC and AUC
- ROC and AUC in R
- Regularization Part 1: L2, Ridge Regression
- Regularization Part 2: L1, Lasso Regression
- Regularization Part 2.5: Ridge vs Lasso Visualized (or why Lasso can set parameters to 0 and Ridge can’t)
- Regularization Part 3: Elastic-Net Regression
- Regularization Part 4: Ridge, Lasso and Elastic-Net Regression in R
- Linear Discriminant Analysis (LDA) clearly explained
- Principal Component Analysis (PCA) Step-by-Step
- Principal Component Analysis (PCA) explained in less than 5 minutes
- PCA – Practical Tips
- DEPRECATED: Principal Component Analysis (PCA) clearly explained (more details)
- PCA in R
- PCA in Python
- Multi-Dimensional Scaling (MDS) and Principal Coordinate Analysis (PCoA) clearly explained
- MDS and PCoA in R
- t-SNE, clearly explained
- Heatmaps – considerations for drawing and interpreting them
- Hierarchical Clustering
- K-Means Clustering
- K-Nearest Neighbors
- Naive Bayes
- CART – Classification and Regression Trees are explained in the following three videos:
- Random Forests Part 1: Building, using and evaluating
- Random Forests Part 2: Missing data and clustering
- Random Forests in R
- AdaBoost
- Gradient Boost Part 1: Regression Main Ideas
- Gradient Boost Part 2: Regression Details
- Gradient Boost Part 3: Classification Main Ideas
- Gradient Boost Part 4: Classification Details
- BAM!!! Clearly Explained
- XGBoost Part 1: XGBoost Trees for Regression
- XGBoost Part 2: XGBoost Trees for Classification
- XGBoost Part 3: Mathematical Details
- XGBoost Part 4: Crazy Cool Optimizations
- Gradient Descent
- Stochastic Gradient Descent
- Support Vector Machines (SVM)
- Logistic Regression
- Logistic Regression, Details Part 1: Coefficients
- Logistic Regression, Details Part 2: Maximum Likelihood
- Logistic Regression, Details Part 3: R-squared and its p-value
- Saturated Models and Deviance Statistics
- Deviance Residuals
- Logistic Regression in R

**High-throughput Sequencing Analysis:**

- A Gentle Introduction to RNA-seq
- A Gentle Introduction to ChIP-seq
- edgeR, part1: Library Normalization
- DESeq2, part1: Library Normalization
- edgeR and DESeq2, part2: Independent Filtering (removing genes with low read counts)
- RNA-seq – The Problem with Technical Replicates
- RPKM, FPKM, and TPM

**Live Streams:**

- 2020-01-06
- 2020-01-20
    - 0:00 Introduction
    - 1:04 Comment #1 – What is your favorite machine learning algorithm?
    - 4:40 Comment #2 – What is data leakage in machine learning?
    - 8:39 Comment #3 – Where do you learn these nitty gritty details?
    - 13:37 Live Question #1 – R-squared and Adjusted R-squared
    - 17:23 Live Question #2 – How are the videos arranged on https://statquest.org/video-index/ (simple to complex)?
    - 18:26 Live Question #3 – Is it important to learn all of the formulas and equations even though we have advanced software that does the work?

- 2020-02-03
    - 0:00 Silly Song and Introduction
    - 0:18 A big huge announcement
    - 3:14 Question #1 – Do we use statistical models to predict or explain stuff?
    - 8:31 Question #2 – Can you show the effects of regularization?
    - 9:42 My cat, Poe
    - 15:04 Question #3 – How do I choose the best machine learning algorithm for my data?
    - 21:17 Live Questions

- 2020-02-17
- 2020-03-02
- 2020-03-16 – Naive Bayes
- 2020-04-06 – Gaussian Naive Bayes
- 2020-04-20 – Expected Values
- 2020-05-04 – Conditional Probability
- 2020-05-18 – Bayes’ Theorem