The embedding system uses a probability-calibrated SVM. My average AUC is 0.77, I hear TikTok gets in the low 80’s and they are using collaborative filtering. I got 0.72 with a bag-of-words and logistic regression model.
From a product standpoint it’s got the disadvantage that it takes about 1000 judgements to really get good, right now I am training over the last 40 days of data because it doesn’t really get better with more than that which is good news because the compute and storage are nicely bounded.