There are two ways to see all of my videos and navigate them. Probably the best way is to use this Learney Flow Chart, which was created by my friends at Learney.me. What makes it so awesome is that you can easily pick the general topic you are interested in and then see all of the relevant videos and their dependencies. Alternatively, you can find everything right here, just not as well organized.
This page contains links to playlists and individual videos on Statistics, Statistical Tests, Machine Learning, Webinars and Live Streams, organized, roughly, by category. Generally speaking, the videos are organized from basic concepts to complicated concepts, so, in theory, you should be able to start at the top and work your way down and everything will make sense.
Playlists:
- Statistics Fundamentals – These videos give you a general overview of statistics and also serve as a reference for statistical concepts. Topics include:
- Histograms
- What is a statistical distribution?
- And many more!!!
- Linear Regression and Linear Models – These videos teach the basics of one of statistics’ most powerful tools. Linear Regression and Linear Models allow us to use continuous values, like weight or height, and categorical values, like favorite color or favorite movie, to predict a continuous value, like age (there’s a short code sketch of this idea right after this list of playlists).
- Logistic Regression – These videos pick up where Linear Regression and Linear Models leave off. Now, instead of predicting something continuous, like age, we can predict something discrete, like whether or not someone will enjoy the 1990 theatrical bust Troll 2.
- Machine Learning – Linear Models and Logistic Regression are just the tip of the machine learning iceberg. There’s tons more to learn, and this playlist will help you through it all, one step at a time.
- Neural Networks – Everything you need to know, from the basics, all the way to image classification with Convolutional Neural Networks, presented one step at a time so that it is easily understood.
- High Throughput Sequence Analysis – If you do high-throughput sequence analysis, this playlist is for you!
- Statistics in R – If you want to do any of this stuff in R, this playlist is for you, and you only. No one else is allowed to watch it.
- #66DaysOfData – If you want to participate in Ken Jee’s #66DaysOfData and are having trouble thinking of new stuff to learn, here’s a playlist that covers everything from the basics to the fancy stuff.
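To make the two playlist blurbs above a little more concrete, here is a minimal, hypothetical Python sketch (it is not code from any of the videos, and all of the data and column names are made up): a linear model predicts a continuous value (age) from continuous and categorical inputs, and logistic regression predicts a discrete outcome (whether or not someone liked Troll 2) from those same inputs.

```python
# Hypothetical example data: continuous inputs (weight, height), a categorical
# input (favorite color), a continuous outcome (age) and a yes/no outcome
# (liked Troll 2). All values are made up for illustration.
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression

data = pd.DataFrame({
    "weight":       [62, 80, 77, 70, 85, 90, 66, 72],
    "height":       [165, 180, 175, 170, 182, 178, 168, 160],
    "fav_color":    ["blue", "green", "blue", "red", "green", "red", "blue", "red"],
    "age":          [23, 35, 41, 29, 52, 47, 31, 60],
    "liked_troll2": [1, 0, 1, 0, 1, 1, 0, 0],
})

# One-hot encode the categorical column so it can go into a linear model.
X = pd.get_dummies(data[["weight", "height", "fav_color"]], columns=["fav_color"])

# Linear regression: predict a continuous value (age).
linear_model = LinearRegression().fit(X, data["age"])
print("predicted ages:", linear_model.predict(X).round(1))

# Logistic regression: predict a discrete outcome (liked Troll 2 or not).
logistic_model = LogisticRegression(max_iter=1000).fit(X, data["liked_troll2"])
print("P(liked Troll 2):", logistic_model.predict_proba(X)[:, 1].round(2))
```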
Basics:
- Histograms, Clearly Explained
- How to tell a story with US Census Data
- The Main Ideas behind Probability Distributions
- The Normal Distribution
- Population and Estimated Parameters, Clearly Explained
- Estimating the Mean, Variance and Standard Deviation
- Covariance, Clearly Explained
- Pearson’s Correlation, Clearly Explained
- What is a (mathematical) model?
- What does it mean to “sample from a distribution”?
- Hypothesis Testing and the Null Hypothesis
- Alternative Hypothesis: Main Ideas
- p-values: What they are and how to interpret them
- How to Calculate p-values
- p-hacking: What it is and how to avoid it
- False Discovery Rate (FDR), Clearly Explained
- Statistical Power, Clearly Explained
- Power Analysis, Clearly Explained
- Conditional Probability, Clearly Explained
- Bayes’ Theorem, Clearly Explained
- Expected Values Part 1, Main Ideas!!! (Expected Values for Discrete Variables)
- Expected Values Part 2, Continuous Variables
- The Binomial Distribution and Test
- The Central Limit Theorem (or “How I Learned to Stop Worrying and Love the t-test”).
- The Difference between Technical and Biological Replicates
- The sample size and the effective sample size
- Standard Deviation vs Standard Error
- The Standard Error
- Bootstrapping Part 1: Main Ideas
- Bootstrapping Part 2: Calculating p-values
- Bar Charts Are Better Than Pie Charts
- Boxplots, Clearly Explained
- Logs (logarithms), clearly explained
- How to make your own StatQuest!!!
- Confidence Intervals
- R-squared explained
- Linear Models Part 0: Fitting a line to data, aka Least Squares, aka Linear Regression
- Fitting a curve to data, aka Lowess, aka Loess
- Linear Models Part 1: Linear Regression
- Linear Models: Linear Regression in R
- Linear Models Part 1.5: Multiple Regression
- Linear Models: Multiple Regression in R
- Linear Models Part 2: t-tests and ANOVA
- Linear Models Part 3: Design Matrices
- Linear Models: Design Matrix Examples in R
- Quantiles and Percentiles
- Quantile-Quantile Plots (QQ Plots)
- Quantile Normalization
- Probability vs Likelihood
- Maximum Likelihood
- Maximum Likelihood: A worked out example for the exponential distribution
- Maximum Likelihood: A worked out example for the binomial distribution
- Maximum Likelihood: A worked out example for the normal distribution
- Odds and Log(Odds)
- Odds Ratios and Log(Odds Ratios)
- Enrichment Analysis using Fisher’s Exact Test and the Hypergeometric Distribution
- Which t-test to use
- p-values: What they are and how to interpret them
- How to Calculate p-values
- Thresholds for Significance
- FDR and the Benjamini-Hochberg Method clearly explained
- p-hacking and power calculations
Machine Learning and Dealing with large datasets that have lots and lots of measurements per sample:
(NOTE: All of the linear model and curve fitting stuff in the “Basics” section is also considered Machine Learning, so make sure you check out those videos).
- A Gentle Introduction to Machine Learning
- Machine Learning Fundamentals: Cross Validation
- Machine Learning Fundamentals: The Confusion Matrix
- Machine Learning Fundamentals: Sensitivity and Specificity
- The Sensitivity, Specificity, Precision, Recall Sing-a-Long!!!
- Machine Learning Fundamentals: Bias and Variance
- ROC and AUC
- ROC and AUC in R
- Entropy, Clearly Explained!!!
- Regularization Part 1: L2, Ridge Regression
- Regularization Part 2: L1, Lasso Regression
- Regularization Part 2.5: Ridge vs Lasso Visualized (or why Lasso can set parameters to 0 and Ridge can’t)
- Regularization Part 3: Elastic-Net Regression
- Regularization Part 4: Ridge, Lasso and Elastic-Net Regression in R
- Linear Discriminant Analysis (LDA) clearly explained
- Principal Component Analysis (PCA) Step-by-Step
- Principal Component Analysis (PCA) explained in less than 5 minutes
- PCA – Practical Tips
- DEPRECATED: Principal Component Analysis (PCA) clearly explained (more details)
- PCA in R
- PCA in Python
- BAM!!! Clearly Explained
- Multi-Dimensional Scaling (MDS) and Principal Coordinate Analysis (PCoA) clearly explained
- MDS and PCoA in R
- t-SNE, clearly explained
- UMAP Dimension Reduction: Part 1 – Main Ideas
- UMAP Dimension Reduction: Part 2 – Mathematical Details
- Heatmaps – considerations for drawing and interpreting them
- Hierarchical Clustering
- K-Means Clustering
- Clustering with DBSCAN
- K-Nearest Neighbors
- Naive Bayes
- Study Guide
- NOTE: This topic is covered in The StatQuest Illustrated Guide to Machine Learning
- Gaussian Naive Bayes
- Study Guide
- NOTE: This topic is covered in The StatQuest Illustrated Guide to Machine Learning
- The Chain Rule
- Gradient Descent
- Study Guide
- NOTE: This topic is covered in The StatQuest Illustrated Guide to Machine Learning
- Stochastic Gradient Descent
- CART – Classification and Regression Trees are explained in the following three videos:
- Decision and Classification Trees, Clearly Explained!!!
- Study Guide
- NOTE: This topic is covered in The StatQuest Illustrated Guide to Machine Learning
- Decision Trees Part 2: Feature Selection and Missing Data
- Regression Trees
- How to Prune Trees (Cost Complexity Pruning)
- Classification Trees in Python, from Start-to-Finish
- Random Forests Part 1: Building, using and evaluating
- Random Forests Part 2: Missing data and clustering
- Random Forests in R
- AdaBoost
- Three (3) things to do when starting out in Data Science
- Gradient Boost Part 1: Regression Main Ideas
- Gradient Boost Part 2: Regression Details
- Gradient Boost Part 3: Classification Main Ideas
- Gradient Boost Part 4: Classification Details
- Troll 2, Clearly Explained!!!
- XGBoost Part 1: Regression
- XGBoost Part 2: Classification
- XGBoost Part 3: Mathematical Details
- XGBoost Part 4: Crazy Cool Optimizations
- XGBoost in Python, from Start-to-Finish
- Support Vector Machines (SVM)
- Logistic Regression
- Logistic Regression, Details Part 1: Coefficients
- Logistic Regression, Details Part 2: Maximum Likelihood
- Logistic Regression, Details Part 3: R-squared and its p-value
- Saturated Models and Deviance Statistics
- Deviance Residuals
- Logistic Regression in R
- Neural Networks Part 1: Inside the black box
- Neural Networks Part 2: Backpropagation Main Ideas
- Neural Networks Part 3: ReLU in Action!!!
- Neural Networks Part 4: Multiple Inputs and Outputs
- Neural Networks Part 5: ArgMax and SoftMax
- Neural Networks Part 6: Cross Entropy
- Neural Networks Part 7: Cross Entropy Derivatives and Backpropagation
- Neural Networks Part 8: Image Classification with Convolutional Neural Networks
- Tensors for Neural Networks, Clearly Explained!!!
- The StatQuest Introduction to PyTorch
- Silly Songs, Clearly Explained!!!
- How my pop influenced StatQuest
- Classification Trees in Python, from Start-to-Finish
- Support Vector Machines in Python, from Start-to-Finish
- XGBoost in Python, from Start-to-Finish
High-throughput Sequencing Analysis:
- A Gentle Introduction to RNA-seq
- A Gentle Introduction to ChIP-seq
- edgeR, part 1: Library Normalization
- DESeq2, part 1: Library Normalization
- edgeR and DESeq2, part 2: Independent Filtering (removing genes with low read counts)
- RNA-seq – The Problem with Technical Replicates
- RPKM, FPKM, and TPM
Live Streams:
- 2020-01-06
- 2020-01-20
- 0:00 Introduction
- 1:04 Comment #1 – What is your favorite machine learning algorithm
- 4:40 Comment #2 – What is data leakage in machine learning?
- 8:39 Comment #3 – Where do you learn these nitty gritty details?
- 13:37 Live Question #1 – R-squared and Adjusted R-squared
- 17:23 Live Question #2 – How are the videos arranged on https://statquest.org/video-index/ (simple to complex)
- 18:26 Live Question #3 – Is it important to learn all of the formulas and equations even though we have advanced software that does the work?
- 2020-02-03
- 0:00 Silly Song and Introduction
- 0:18 A big huge announcement
- 3:14 Question #1 – Do we use statistical models to predict or explain stuff?
- 8:31 Question #2 – Can you show the effects of regularization?
- 9:42 My cat, Poe
- 15:04 Question #3 – How do I choose the best machine learning algorithm for my data?
- 21:17 Live Questions
- 2020-02-17
- 2020-03-02
- 2020-03-16 – Naive Bayes
- 2020-04-06 – Gaussian Naive Bayes
- 2020-04-20 – Expected Values (NOTE: There is now a full StatQuest video on Expected Values that revises and updates this material).
- 2020-05-04 – Conditional Probability
- 2020-05-18 – Bayes’ Theorem
- 2020-06-01 – Hypothesis Testing
- 2020-06-15 – Bootstrapping Main Ideas