This page contains links to playlists and individual videos, organized roughly by category. Generally speaking, the videos run from basic concepts to complicated ones, so, in theory, you should be able to start at the top and work your way down, and everything will make sense.

**Playlists:**

- Statistics Fundamentals
- Linear Models
- Logistic Regression
- Machine Learning
- High Throughput Sequence Analysis
- Statistics in R

**Individual Videos are Below**

**Statistics Fundamentals:**

- Histograms, Clearly Explained
- What is a statistical distribution?
- The Normal Distribution
- Statistics Fundamentals: Population Parameters
- Statistics Fundamentals: Estimating the Mean, Variance and Standard Deviation
- Covariance and Correlation Part 1: Covariance
- Covariance and Correlation Part 2: Pearson’s Correlation
- The Binomial Distribution
- What is a statistical model?
- What does it mean to “sample from a distribution”?
- The Central Limit Theorem (or “How I Learned to Stop Worrying and Love the t-test”)
- The Difference between Technical and Biological Replicates
- The sample size and the effective sample size
- Standard Deviation vs Standard Error
- The Standard Error
- Bar Charts Are Better Than Pie Charts
- Boxplots, Clearly Explained
- Logs (logarithms), clearly explained
- Confidence Intervals
- R-squared explained
- Linear Models Part 0: Fitting a line to data, aka Least Squares, aka Linear Regression
- Fitting a curve to data, aka Lowess, aka Loess
- Linear Models Part 1: Linear Regression
- Linear Models: Linear Regression in R
- Linear Models Part 1.5: Multiple Regression
- Linear Models: Multiple Regression in R
- Linear Models Part 2: t-tests and ANOVA
- Linear Models Part 3: Design Matrices
- Linear Models: Design Matrix Examples in R
- Quantiles and Percentiles
- Quantile-Quantile Plots (QQ Plots)
- Quantile Normalization
- Probability vs Likelihood
- Maximum Likelihood
- Maximum Likelihood: A worked out example for the exponential distribution
- Maximum Likelihood: A worked out example for the binomial distribution
- Maximum Likelihood: A worked out example for the normal distribution
- Odds and Log(Odds)
- Odds Ratios and Log(Odds Ratios)

**Statistical Tests:**

- Enrichment Analysis using Fisher’s Exact Test and the Hypergeometric Distribution
- Which t-test to use
- p-values, clearly explained
- One or Two Tailed p-values
- Thresholds for Significance
- FDR and the Benjamini-Hochberg Method clearly explained
- p-hacking and power calculations

**Machine Learning and Dealing with large datasets that have lots and lots of measurements per sample:**

(NOTE: All of the linear model and curve fitting stuff in the “Statistics Fundamentals” section is also considered to be Machine Learning, so make sure you check out those videos.)

- A Gentle Introduction to Machine Learning
- Machine Learning Fundamentals: Cross Validation
- Machine Learning Fundamentals: The Confusion Matrix
- Machine Learning Fundamentals: Sensitivity and Specificity
- Machine Learning Fundamentals: Bias and Variance
- ROC and AUC
- ROC and AUC in R
- Regularization Part 1: L2, Ridge Regression
- Regularization Part 2: L1, Lasso Regression
- Regularization Part 3: Elastic-Net Regression
- Regularization Part 4: Ridge, Lasso and Elastic-Net Regression in R
- Linear Discriminant Analysis (LDA) clearly explained
- Principal Component Analysis (PCA) Step-by-Step
- Principal Component Analysis (PCA) explained in less than 5 minutes
- PCA – Practical Tips
- DEPRECATED: Principal Component Analysis (PCA) clearly explained (more details)
- PCA in R
- PCA in Python
- Multi-Dimensional Scaling (MDS) and Principal Coordinate Analysis (PCoA) clearly explained
- MDS and PCoA in R
- t-SNE, clearly explained
- Heatmaps – considerations for drawing and interpreting them
- Hierarchical Clustering
- K-Means Clustering
- K-Nearest Neighbors
- CART – Classification and Regression Trees are explained in the following three videos:
- Random Forests Part 1: Building, using and evaluating
- Random Forests Part 2: Missing data and clustering
- Random Forests in R
- AdaBoost
- Gradient Boost Part 1: Regression Main Ideas
- Gradient Boost Part 2: Regression Details
- Gradient Boost Part 3: Classification Main Ideas
- Gradient Boost Part 4: Classification Details
- XGBoost Part 1: XGBoost Trees for Regression
- XGBoost Part 2: XGBoost Trees for Classification
- Gradient Descent
- Stochastic Gradient Descent
- Support Vector Machines (SVM)
- Logistic Regression
- Logistic Regression, Details Part 1: Coefficients
- Logistic Regression, Details Part 2: Maximum Likelihood
- Logistic Regression, Details Part 3: R-squared and its p-value
- Saturated Models and Deviance Statistics
- Deviance Residuals
- Logistic Regression in R

**High-throughput Sequencing Analysis:**

- A Gentle Introduction to RNA-seq
- A Gentle Introduction to ChIP-seq
- edgeR, part1: Library Normalization
- DESeq2, part1: Library Normalization
- edgeR and DESeq2, part2: Independent Filtering (removing genes with low read counts)
- RNA-seq – The Problem with Technical Replicates
- RPKM, FPKM, and TPM

**Live Streams:**

- 2020-01-06
  - The difference between one large sample of 20 measurements and 4 smaller samples of 5 measurements each (1:52)
  - Is machine learning a subset of statistics? (6:20)
  - A poem!!! (9:59)
  - Viewer Questions/Comments (10:51)