
0 points · PaulHoule · 3y ago

I’ve thought about satisfaction and mood tracking (I am sure these are linked) but haven’t built anything that I really use other than my memory.

The embedding system uses a probability-calibrated SVM. My average AUC is 0.77; I hear TikTok gets in the low 0.80s, and they are using collaborative filtering. I got 0.72 with a bag-of-words and logistic regression model.
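For readers comparing those AUC figures: AUC is the probability that a randomly chosen liked item gets a higher score than a randomly chosen disliked one, so 0.5 is chance and 1.0 is a perfect ranking. Here is a minimal sketch of that definition in plain Python. The function name `auc` and the labels and scores are made up for illustration; this is not the system's evaluation code.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical judgements (1 = liked, 0 = not) and model scores.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.65]
example_auc = auc(labels, scores)  # 6 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic, which is fine for a sanity check; a sort-based version is the usual choice for large evaluation sets.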

From a product standpoint it has the disadvantage that it takes about 1000 judgements to really get good. Right now I am training on the last 40 days of data because it doesn’t really get better with more than that, which is good news because the compute and storage are nicely bounded.
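A trailing training window like that could be sketched as follows. The `(timestamp, item_id, liked)` record layout and the name `recent_judgements` are assumptions for illustration, not the actual schema.

```python
from datetime import datetime, timedelta


def recent_judgements(judgements, now, days=40):
    """Keep only judgements whose timestamp falls within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [j for j in judgements if j[0] >= cutoff]


# Hypothetical records: (timestamp, item_id, liked)
now = datetime(2023, 6, 30)
judgements = [
    (datetime(2023, 3, 1), "a", True),    # older than 40 days, dropped
    (datetime(2023, 5, 25), "b", False),  # 36 days old, kept
    (datetime(2023, 6, 29), "c", True),   # 1 day old, kept
]
training_set = recent_judgements(judgements, now)
```

Because anything older than the cutoff is never read, both the training cost and the data that must be retained stay bounded by the window size.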


2 comments · 1 top-level

extasia · 3y ago · 1 in thread

Are you looking to make this a product?

On an unrelated note I realized recently that the 'bag' in bag-of-words is another name for the multiset data structure... Which makes sense when you think about the text as being a _set_ of tokens which can appear _multiple_ times.
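In Python, for instance, `collections.Counter` behaves as a multiset, so a bag-of-words can be sketched in a few lines. The whitespace tokenizer and the name `bag_of_words` are simplifications chosen for illustration.

```python
from collections import Counter


def bag_of_words(text):
    """Map each token to the number of times it appears (a multiset of tokens)."""
    return Counter(text.lower().split())


bag = bag_of_words("the cat saw the other cat")
vocabulary = set(bag)  # a plain set keeps the tokens but loses the counts
combined = bag + bag_of_words("the dog")  # multiset sum adds the counts
```

Here `bag["the"]` is 2 while `vocabulary` only records that "the" occurred, which is exactly the set versus multiset distinction.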

PaulHoule (OP) · 3y ago

It's got potential as a product but I haven't committed to it yet. I have two ideas in mind: (1) a consumer product which is a bit like social media without the social (basically what it is now) and (2) a "pro" product that is capable of handling a network of classifications and more complex workflows.

I was talking to somebody about the potential as an open source project and came to the conclusion that it's a research project right now but my research projects are more solid than average. I know I'm not afraid to demo it because I run it every day and it spins like a top.

If you want to chat about it look up my profile and send me an email.
