**The Basics:**

- Histograms, Clearly Explained
- What is a statistical distribution?
- The Normal Distribution
- What is a statistical model?
- What does it mean to “sample from a distribution”?
- The sample size and the effective sample size
- The Difference between Technical and Biological Replicates
- Standard Deviation vs Standard Error
- Pie vs. Bar Charts
- Logs (logarithms), clearly explained
- Confidence Intervals
- The Standard Error
- R-squared explained
- Fitting a line to data, aka Least Squares, aka Linear Regression
- Fitting a curve to data, aka Lowess, aka Loess
- General Linear Models Part 1: Linear Regression
- General Linear Models: Linear Regression in R
- General Linear Models Part 1.5: Multiple Regression
- General Linear Models: Multiple Regression in R
- General Linear Models Part 2: t-tests and ANOVA
- General Linear Models Part 3: Design Matrices
- General Linear Models: Design Matrix Examples in R
- Quantiles and Percentiles
- Quantile-Quantile Plots (QQ Plots)
- Quantile Normalization
- Maximum Likelihood
- Maximum Likelihood: A worked out example for the exponential distribution

**Statistical Tests:**

- Fisher’s Exact Test and Enrichment Analysis
- Which t-test to use
- p-values, clearly explained
- One or Two Tailed p-values
- Thresholds for Significance
- FDR and the Benjamini-Hochberg Method clearly explained
- p-hacking and power calculations

**Machine Learning and Dealing with large datasets that have lots and lots of measurements per sample:**

- Linear Discriminant Analysis (LDA) clearly explained
- Principal Component Analysis (PCA) explained in less than 5 minutes
- Principal Component Analysis (PCA) clearly explained (more details)
- PCA in R
- PCA in Python
- Multi-Dimensional Scaling (MDS) and Principal Coordinate Analysis (PCoA) clearly explained
- MDS and PCoA in R
- t-SNE, clearly explained
- Heatmaps – considerations for drawing and interpreting them
- Hierarchical Clustering
- K-Nearest Neighbors
- Decision Trees Part 1: Building and Using
- Decision Trees Part 2: Feature Selection and Missing Data
- Random Forests Part 1: Building, using and evaluating
- Random Forests Part 2: Missing data and clustering

**High-throughput Sequencing Analysis:**

- A gentle introduction to RNA-seq
- edgeR, part1: Library Normalization
- DESeq2, part1: Library Normalization
- edgeR and DESeq2, part2: Independent Filtering (removing genes with low read counts)
- RNA-seq – the problem with technical replicates
- RPKM, FPKM, and TPM
