Excuse me for being daft, but how do you transform back into 'what does this mean'?
For instance, in ex 3, we see that N. Ireland is an outlier. It wasn't obvious to me that the cause was potatoes and fruit.
How does PCA help you with the fundamental meaning?
In a toy example, imagine we had a 5D case where we have beer, cereal, fruit, beef, and chicken, and we found out that the first principal axis is {0.3, 0.1, -0.5, 0.0, 0.2} (in the same order). Then the change would be driven primarily by fruit and beer consumption, since those two have the largest coefficients in absolute value.
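To make that reading mechanical, here is a minimal sketch that ranks features by the size of their coefficient. It assumes the five coefficients line up with beer, cereal, fruit, beef, and chicken in that order; the variable names are just for illustration.

```python
import numpy as np

# Toy values from the comment above; the feature order is an assumption.
features = ["beer", "cereal", "fruit", "beef", "chicken"]
pc1 = np.array([0.3, 0.1, -0.5, 0.0, 0.2])

# Rank features by the magnitude of their coefficient. The sign only says
# which way a feature pushes a sample along the axis, not how much it matters.
order = np.argsort(-np.abs(pc1))
ranked = [(features[i], float(pc1[i])) for i in order]

for name, weight in ranked:
    print(f"{name:8s} {weight:+.1f}")
```

Fruit and beer come out on top, and they have opposite signs, so the axis roughly contrasts fruit eaters with beer drinkers.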
From Figure 4, after rescaling and rounding the coefficients, the PC1 that separates N. Ireland is approximately:
PC1 = FreshPotatoes + SoftDrinks/2 - FreshFruits/2 - AlcoholicDrinks/2
The PC2 that separates the other three countries is approximately:
PC2 = FreshPotatoes - SoftDrinks
An attempt to remedy this is called Sparse PCA (you can look it up on Google Scholar), in which the principal components are combinations of only a few features. This allows you to figure out which features are not important.
Linear regression has a notion of input and output. You want to model how the output depends on the input.
PCA does not. You have a pointcloud and want to find a compressed representation of it.
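A small numerical sketch of the difference, on made-up 2D data (the slope of 0.5 and the noise level are arbitrary choices for illustration). Regression gives a different line depending on which variable you call the output, while PCA gives one direction for the whole point cloud.

```python
import numpy as np

# Synthetic point cloud; all numbers here are assumptions for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(scale=0.5, size=500)

# Regression of y on x: x is the input, y the output (vertical errors).
slope_yx = np.polyfit(x, y, 1)[0]
# Regression of x on y, expressed as a slope in the same (x, y) plane.
slope_xy = 1.0 / np.polyfit(y, x, 1)[0]

# PCA: no input or output, just the direction of largest variance
# of the centered cloud (first right singular vector).
cloud = np.column_stack([x, y])
cloud = cloud - cloud.mean(axis=0)
_, _, vt = np.linalg.svd(cloud, full_matrices=False)
pc1 = vt[0]
slope_pca = pc1[1] / pc1[0]

print(slope_yx, slope_pca, slope_xy)
```

The three slopes differ, with the PCA direction falling between the two regression lines.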
I did a singular value decomposition on a data set similar to the one Richardson used (except with international data). The original post here looks at the projection to country-coordinates, looking at what axes describe primary differences between countries. My students had no problem with that -- Wales and Northern Ireland are most different, in your example, and 'give' the first principal axis. But then I continued to do it with the foods, as Richardson did (look at Figure 4 in the linked file). Students concluded in large numbers that people just don't like fresh fruit and do like fresh potatoes. Hm. They didn't conclude that people don't like Wales and do like Northern Ireland; they accurately saw it as an axis. But once we were talking about food instead of countries, students saw projection to the eigenspace as being indicative of some percentage of approval.
How could we visually display both parts of this principal component analysis to combat this prejudice that sometimes leads us to read left to right as worse to better?
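One thing that might help make the point before any plotting: both the country coordinates and the food coordinates come out of the same SVD, and flipping the sign of an axis on both sides changes nothing about the data. The sketch below uses a random matrix as a stand-in (4 rows playing the role of countries, 6 columns playing the role of foods); it is not the real data set.

```python
import numpy as np

# Made-up stand-in data: 4 "countries" (rows) by 6 "foods" (columns).
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 6))
Xc = X - X.mean(axis=0)  # center each column

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
row_scores = U * s     # where each row sits along each principal axis
col_loadings = Vt.T    # how each column contributes to each axis

# Flip the first axis on both sides: left becomes right for rows and columns.
flip = np.ones_like(s)
flip[0] = -1.0
recon = row_scores @ col_loadings.T
recon_flipped = (row_scores * flip) @ (col_loadings * flip).T

print(np.allclose(recon, Xc), np.allclose(recon_flipped, Xc))
```

Since the mirrored axis reproduces the same centered data, left versus right on the plot carries no notion of worse versus better; only the contrast between the two ends does.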
Ladder of Abstraction Essay: http://worrydream.com/#!2/LadderOfAbstraction
Stop Drawing Dead Fish Video: https://vimeo.com/64895205
This is awesome, thanks for sharing!