Tuesday, November 23, 2010

Principal Component Analysis: Which variables contribute most to principal components?

Principal component analysis (PCA) is a mathematical transformation of a set of possibly correlated variables into a number of uncorrelated variables called principal components. The components are defined in such a way that the first principal component has the highest variance, that is, it accounts for as much of the variability in the data as possible, and each following component accounts for as much of the remaining variability as it can (see http://wapedia.mobi/en/Scree_plot).
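As a rough illustration of the two properties described above (decreasing variance and uncorrelated components), here is a minimal sketch in R. It assumes princomp() from base R and uses the built-in USArrests dataset purely as example data; neither choice comes from the original post.

```r
# Minimal sketch: USArrests is a built-in R dataset, used here only
# as illustrative data (an assumption, not part of the original post).
pca <- princomp(USArrests, cor = TRUE)  # PCA on the correlation matrix

# Variance of each principal component, in decreasing order
variances <- pca$sdev^2

# Proportion of the total variability explained by each component
prop_var <- variances / sum(variances)
print(round(prop_var, 3))

# The component scores are uncorrelated: the off-diagonal
# entries of their correlation matrix are numerically zero
score_cor <- cor(pca$scores)
print(round(score_cor, 3))

# Scree plot: variance per component
screeplot(pca, type = "lines")
```

The scree plot at the end is the kind of figure the link above refers to: it shows how quickly the variance drops from one component to the next.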

For each principal component you can see which variables contribute
most to that component. Depending on what you used to do PCA in R, you
can use the loadings() function. It returns a matrix that shows how
each variable contributes to each principal component. You can draw a
barplot for each principal component to visualize which variables
contribute to it. Draw the barplots from the absolute values in the
loadings matrix, since the sign of a loading only indicates direction.
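The barplot idea above could look roughly like the following sketch. It assumes the PCA was done with princomp(), whose result works with loadings(); if you used prcomp() instead, the corresponding matrix is stored in the rotation element of the result. The USArrests dataset and the variable names (load_mat, abs_load, top_var) are only illustrative.

```r
# Minimal sketch, assuming princomp() and the built-in USArrests data
pca <- princomp(USArrests, cor = TRUE)

# loadings() returns an object of class "loadings";
# unclass() turns it into a plain matrix (variables x components)
load_mat <- unclass(loadings(pca))
abs_load <- abs(load_mat)

# One barplot of absolute loadings per principal component
old_par <- par(mfrow = c(2, 2))
for (j in seq_len(ncol(abs_load))) {
  barplot(abs_load[, j],
          main = colnames(abs_load)[j],
          ylab = "|loading|",
          las = 2)
}
par(old_par)

# Variable with the largest absolute loading on each component
top_var <- rownames(abs_load)[apply(abs_load, 2, which.max)]
names(top_var) <- colnames(abs_load)
print(top_var)
```

The taller a bar, the more that variable contributes to the component shown in the panel title.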

Take a look at the sources below, especially the second one.

Check here for basic usage of the PCA function in R.

Check here for a detailed explanation of loadings(),
especially the section: "How do we know which species
contribute to which axes? We look at the component loadings (or
"factor loadings"): "

1 comment:

  1. Great explanation. Other awesome resources on PCA: http://www.sthda.com/english/wiki/factominer-and-factoextra-principal-component-analysis-visualization-r-software-and-data-mining