PCA clearly explained

Update: People have asked for PowerPoint slides for this presentation. Here they are! If you use them, please cite me. Thank you so much for your support.


Update: A lot of people ask to see example code that shows how to actually do PCA. Here’s how to do it in R

## In this example, the data is in a matrix called
## data.matrix
## columns are individual samples (i.e. cells)
## rows are measurements taken for all the samples (i.e. genes)
pca <- prcomp(t(data.matrix), scale=TRUE)

## plot pc1 and pc2
plot(pca$x[,1], pca$x[,2])

## get the name of the sample (cell) with the highest pc1 value
rownames(pca$x)[pca$x[(order(pca$x[,1], decreasing=TRUE))[1]]]

## get the name of the top 10 measurements (genes) that contribute
## most to pc1.
loading_scores <- pca$rotation[,1]
gene_scores <- abs(loading_scores) ## get the magnitudes
gene_score_ranked <- sort(gene_scores, decreasing=TRUE)
top_10_genes <- names(gene_score_ranked[1:10])

top_10_genes ## show the names of the top 10 genes

## get the scree information
pca.var <- pca$sdev^2
scree <- pca.var/sum(pca.var)
plot((scree[1:10]*100), main="Skree Plot", xlab="Principal Component", ylab="Percent Variation")

RNA-seq results often contain a PCA (Principal Component Analysis) or MDS plot. Usually we use these graphs to verify that the control samples cluster together. However, there’s a lot more going on, and if you are willing to dive in, you can extract a lot more information from these plots. The good news is that PCA only sounds complicated. Conceptually, it’s actually quite simple.

In this video I clearly explain how PCA graphs are generated, how to interpret them, and how to determine if the plot is informative or not. I then show how you can extract extra information out of the data used to draw the graph.


2 thoughts on “PCA clearly explained

  1. Thanks Joshua for such a wonderful explanation of PCA! I was wondering if you would be willing to share your slide deck. Your slides are very clear and well put together. I was hoping to use them for educational purposes. I will of course acknowledge you appropriately for the content. Thanks!


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s