I certainly hope this is sarcasm. Matrix factorization is, like, the go-to tool for people who don't know anything about what they're studying (i.e. it's the phrenologist's favorite weapon). Factor a matrix, throw the results out there, slap on some perfunctory "discussion" that has no real mechanistic insight. Boom. Published.
But maybe I'm describing "the stuff that separates true scientists from data scientists".
Data science manifesto: The purpose of computing is numbers.
Here are some examples:
Principal Component Analysis - SVD used for dimensionality reduction, keeping enough components to account for some n% of the variance.
One-layer Autoencoder - SVD done by a neural network
Latent Semantic Analysis - SVD on a tf-idf matrix, where we interpret the lower dimensions as having semantic importance
Matrix Factorization - SVD, only now we interpret the lower dimensions as representing latent variables
Collaborative Filtering - SVD where we interpret the lower dimensions as representing latent variables AND we use a distance measure to determine similarity.
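The family resemblance is easy to see in code. Here's a minimal sketch of the first item, PCA as a truncated SVD (the data, shapes, and 90% variance threshold are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))        # 100 samples, 10 features (toy data)
Xc = X - X.mean(axis=0)               # PCA requires centering

# SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Keep enough components to account for ~90% of the variance
explained = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(explained, 0.90)) + 1

Z = Xc @ Vt[:k].T                     # reduced representation, shape (100, k)
```

Every other item on the list is the same three lines of linear algebra with a different matrix plugged in and a different story told about the factors.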
Not necessarily. Any serious user of autoencoders would apply some kind of L1 regularization or other sparsity constraint to the coefficients learned, so that the autoencoder does not learn the principal components of the data but instead learns an analogous sparse decomposition of the data (with the assumption that sparse representations have better generalization power).
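To make the distinction concrete, here's a toy numpy sketch of a linear autoencoder with an L1 penalty on the hidden codes; the learning rate, penalty weight, and shapes are all hypothetical, and a real implementation would use a proper framework:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))             # toy data
d, k = 8, 4
lam, lr = 0.01, 0.01                      # hypothetical L1 weight and step size

We = rng.normal(scale=0.1, size=(d, k))   # encoder weights
Wd = rng.normal(scale=0.1, size=(k, d))   # decoder weights

mse0 = np.mean((X @ We @ Wd - X) ** 2)    # initial reconstruction error

for _ in range(2000):
    H = X @ We                            # linear hidden codes
    R = H @ Wd - X                        # reconstruction residual
    dWd = H.T @ R / len(X)
    dH = (R @ Wd.T + lam * np.sign(H)) / len(X)   # includes L1 subgradient
    dWe = X.T @ dH
    We -= lr * dWe
    Wd -= lr * dWd

mse = np.mean((X @ We @ Wd - X) ** 2)
```

With lam = 0 this recovers the same subspace as a rank-4 SVD; the L1 term is exactly the "bells and whistles" that pushes the codes away from the principal components and toward a sparse decomposition.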
Also, I don't think any of the techniques you mentioned is being passed off as "not SVD" by its practitioners. People know they're SVD. These names are just used as labels for use cases of SVD, each with their specific (and crucial) bells and whistles. And yes, these labels are useful.
Cognition is fundamentally dimensionality reduction over a space of information, so clearly most ML algorithms are going to be isomorphic to SVD in some way. More interesting to me are the really non-obvious ways in which that is happening (eg. RNNs learning word embeddings with skip-gram are actually factorizing a matrix of pairwise mutual information of words over a local context...)
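That word-embedding result can be sketched directly: build a positive-PMI co-occurrence matrix from a toy corpus and factorize it with SVD. The corpus, window size, and embedding dimension below are all illustrative:

```python
import numpy as np
from collections import Counter

# Toy corpus; context window of 1 word on each side
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

pair_counts = Counter()
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            pair_counts[(idx[w], idx[corpus[j]])] += 1

V = len(vocab)
C = np.zeros((V, V))
for (a, b), c in pair_counts.items():
    C[a, b] = c

# Positive pointwise mutual information matrix
total = C.sum()
pw = C.sum(axis=1, keepdims=True) / total   # word marginals
pc = C.sum(axis=0, keepdims=True) / total   # context marginals
with np.errstate(divide="ignore"):
    pmi = np.log((C / total) / (pw * pc))
ppmi = np.maximum(pmi, 0)                   # clip -inf (zero counts) to 0

# Factorize: rows of U * sqrt(s) act as word embeddings
U, s, Vt = np.linalg.svd(ppmi)
k = 2
embeddings = U[:, :k] * np.sqrt(s[:k])
```

The matrix being factorized is explicit here; in skip-gram training it's implicit in the objective, which is what makes the equivalence non-obvious.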
That doesn't make these algorithms any less valuable.
That would probably be the second thing people learn right after learning how to do some basic regression analysis.
Still, great walk through, it was frank and I loved it!
It makes sense. You can change the desirability of the same pizza by either 1) cooking it on the BBQ, or 2) calling it a vegetarian pizza, even if the end product is virtually identical.
Obviously this is reflective of the subset of society that gets recipes from AllRecipes.com, but I'll wager that you'd find similar categories if you did a larger analysis.
[1] www.Gastrograph.com [2] JasonCEC [at] ^above url
And a very interesting application of NNMF; I didn't know it took so long to process the data (but then again, Matlab is usually slow).
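For anyone curious what the NNMF step looks like, here's a minimal numpy sketch using the classic Lee & Seung multiplicative updates; the recipe-by-ingredient matrix, number of categories, and iteration count are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical nonnegative recipe-by-ingredient matrix: 50 recipes, 20 ingredients
X = rng.random((50, 20))

k = 5                          # number of latent "recipe categories" (illustrative)
W = rng.random((50, k))        # recipe loadings
H = rng.random((k, 20))        # ingredient profiles per category

eps = 1e-9                     # avoid division by zero
for _ in range(200):
    # Lee & Seung multiplicative updates for the Frobenius objective;
    # they keep W and H nonnegative by construction
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(X - W @ H)
```

The nonnegativity constraint is what makes the factors read as additive "categories" rather than the signed components a plain SVD would return, though it also makes the optimization iterative and much slower than one SVD call.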