This page contains links to playlists and individual videos on Statistics, Statistical Tests, Machine Learning, Webinars and Live Streams, organized roughly by category. Generally speaking, the videos are ordered from basic concepts to complicated ones, so, in theory, you should be able to start at the top and work your way down and everything will make sense.

**Playlists:**

- Statistics Fundamentals – These videos give you a general overview of statistics and also serve as a reference for statistical concepts. Topics include:
  - Histograms
  - What is a statistical distribution?
  - And many more!!!

- Linear Regression and Linear Models – These videos teach the basics of one of statistics' most powerful tools. Linear Regression and Linear Models allow us to use continuous values, like weight or height, and categorical values, like favorite color or favorite movie, to predict a continuous value, like age.
- Logistic Regression – These videos pick up where Linear Regression and Linear Models leave off. Now, instead of predicting something continuous, like age, we can predict something discrete, like whether or not someone will enjoy the 1990 theatrical bust Troll 2.
- Machine Learning – Linear Models and Logistic Regression are just the tip of the machine learning iceberg. There’s tons more to learn, and this playlist will help you through it all, one step at a time.
- High Throughput Sequence Analysis – If you do high-throughput sequence analysis, this playlist is for you!
- Statistics in R – If you want to do any of this stuff in R, this playlist is for you, and you only. No one else is allowed to watch it.

**Basics:**

- Histograms, Clearly Explained
- What is a statistical distribution?
- The Normal Distribution
- Statistics Fundamentals: Population Parameters
- Statistics Fundamentals: Estimating the Mean, Variance and Standard Deviation
- Covariance and Correlation Part 1: Covariance
- Covariance and Correlation Part 2: Pearson’s Correlation
- What is a statistical model?
- What does it mean to “sample from a distribution”?
- Hypothesis Testing and the Null Hypothesis
- Alternative Hypothesis: Main Ideas
- p-values: What they are and how to interpret them
- How to Calculate p-values
- p-hacking: What it is and how to avoid it
- Statistical Power, Clearly Explained
- Power Analysis, Clearly Explained
- Expected Values
- Conditional Probability
- The Binomial Distribution and Test
- The Central Limit Theorem (or “How I Learned to Stop Worrying and Love the t-test”)
- The Difference between Technical and Biological Replicates
- The sample size and the effective sample size
- Standard Deviation vs Standard Error
- The Standard Error
- Bar Charts Are Better Than Pie Charts
- Boxplots, Clearly Explained
- Logs (logarithms), clearly explained
- Confidence Intervals
- R-squared explained
- Linear Models Part 0: Fitting a line to data, aka Least Squares, aka Linear Regression
- Fitting a curve to data, aka Lowess, aka Loess
- Linear Models Part 1: Linear Regression
- Linear Models: Linear Regression in R
- Linear Models Part 1.5: Multiple Regression
- Linear Models: Multiple Regression in R
- Linear Models Part 2: t-tests and ANOVA
- Linear Models Part 3: Design Matrices
- Linear Models: Design Matrix Examples in R
- Quantiles and Percentiles
- Quantile-Quantile Plots (QQ Plots)
- Quantile Normalization
- Probability vs Likelihood
- Maximum Likelihood
- Maximum Likelihood: A worked out example for the exponential distribution
- Maximum Likelihood: A worked out example for the binomial distribution
- Maximum Likelihood: A worked out example for the normal distribution
- Odds and Log(Odds)
- Odds Ratios and Log(Odds Ratios)

**Statistical Tests:**

- Enrichment Analysis using Fisher’s Exact Test and the Hypergeometric Distribution
- Which t-test to use
- p-values: What they are and how to interpret them
- How to Calculate p-values
- Thresholds for Significance
- FDR and the Benjamini-Hochberg Method clearly explained
- p-hacking and power calculations

**Machine Learning and Dealing with large datasets that have lots and lots of measurements per sample:**

(NOTE: All of the linear model and curve fitting videos in the “Basics” section also count as Machine Learning, so make sure you check them out.)

- A Gentle Introduction to Machine Learning
- Machine Learning Fundamentals: Cross Validation
- Machine Learning Fundamentals: The Confusion Matrix
- Machine Learning Fundamentals: Sensitivity and Specificity
- Machine Learning Fundamentals: Bias and Variance
- ROC and AUC
- ROC and AUC in R
- Regularization Part 1: L2, Ridge Regression
- Regularization Part 2: L1, Lasso Regression
- Regularization Part 2.5: Ridge vs Lasso Visualized (or why Lasso can set parameters to 0 and Ridge can’t)
- Regularization Part 3: Elastic-Net Regression
- Regularization Part 4: Ridge, Lasso and Elastic-Net Regression in R
- Linear Discriminant Analysis (LDA) clearly explained
- Principal Component Analysis (PCA) Step-by-Step
- Principal Component Analysis (PCA) explained in less than 5 minutes
- PCA – Practical Tips
- DEPRECATED: Principal Component Analysis (PCA) clearly explained (more details)
- PCA in R
- PCA in Python
- Multi-Dimensional Scaling (MDS) and Principal Coordinate Analysis (PCoA) clearly explained
- MDS and PCoA in R
- t-SNE, clearly explained
- Heatmaps – considerations for drawing and interpreting them
- Hierarchical Clustering
- K-Means Clustering
- K-Nearest Neighbors
- Naive Bayes
- Gaussian Naive Bayes
- The Chain Rule
- Gradient Descent
- Stochastic Gradient Descent
- CART – Classification and Regression Trees are explained in the following three videos:
- Random Forests Part 1: Building, using and evaluating
- Random Forests Part 2: Missing data and clustering
- Random Forests in R
- AdaBoost
- Gradient Boost Part 1: Regression Main Ideas
- Gradient Boost Part 2: Regression Details
- Gradient Boost Part 3: Classification Main Ideas
- Gradient Boost Part 4: Classification Details
- BAM!!! Clearly Explained
- XGBoost Part 1: Regression
- XGBoost Part 2: Classification
- XGBoost Part 3: Mathematical Details
- XGBoost Part 4: Crazy Cool Optimizations
- Support Vector Machines (SVM)
- Logistic Regression
- Logistic Regression, Details Part 1: Coefficients
- Logistic Regression, Details Part 2: Maximum Likelihood
- Logistic Regression, Details Part 3: R-squared and its p-value
- Saturated Models and Deviance Statistics
- Deviance Residuals
- Logistic Regression in R

**Webinars:**

- Classification Trees in Python, from Start-to-Finish
- Support Vector Machines in Python, from Start-to-Finish
- XGBoost in Python, from Start-to-Finish

**High-throughput Sequencing Analysis:**

- A Gentle Introduction to RNA-seq
- A Gentle Introduction to ChIP-seq
- edgeR, part1: Library Normalization
- DESeq2, part1: Library Normalization
- edgeR and DESeq2, part2: Independent Filtering (removing genes with low read counts)
- RNA-seq – The Problem with Technical Replicates
- RPKM, FPKM, and TPM

**Live Streams:**

- 2020-01-06
- 2020-01-20
  - 0:00 Introduction
  - 1:04 Comment #1 – What is your favorite machine learning algorithm?
  - 4:40 Comment #2 – What is data leakage in machine learning?
  - 8:39 Comment #3 – Where do you learn these nitty gritty details?
  - 13:37 Live Question #1 – R-squared and Adjusted R-squared
  - 17:23 Live Question #2 – How are the videos arranged on https://statquest.org/video-index/ (simple to complex)?
  - 18:26 Live Question #3 – Is it important to learn all of the formulas and equations even though we have advanced software that does the work?

- 2020-02-03
  - 0:00 Silly Song and Introduction
  - 0:18 A big huge announcement
  - 3:14 Question #1 – Do we use statistical models to predict or explain stuff?
  - 8:31 Question #2 – Can you show the effects of regularization?
  - 9:42 My cat, Poe
  - 15:04 Question #3 – How do I choose the best machine learning algorithm for my data?
  - 21:17 Live Questions

- 2020-02-17
- 2020-03-02
- 2020-03-16 – Naive Bayes
- 2020-04-06 – Gaussian Naive Bayes
- 2020-04-20 – Expected Values
- 2020-05-04 – Conditional Probability
- 2020-05-18 – Bayes’ Theorem
- 2020-06-01 – Hypothesis Testing
- 2020-06-15 – Bootstrapping Main Ideas