Let's have an open discussion on this topic and share the steps you took to grow!
How do you keep making your users happy at this stage?
My system is quite a bit different from others because, like TikTok but even more so, it demands a thumbs up/thumbs down judgement for every article so I get a set of negative samples that are really reliable.
There are numerous frontiers for improving this system. One of them is that there are certain things, like "roundup" articles that cover a wide range of topics (say https://www.theguardian.com/world/live/2024/jun/03/russia-uk...), that the embedding doesn't capture well. Adding some new features could clear out maybe 10% of articles I'd rather not see, but I am not in such a rush because overall the system is very satisfying to me, and I am already blending in more random articles than that to get samples to keep it calibrated and also to sometimes discover new topic areas I find interesting.
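To make the "blending in random articles" idea concrete, here is a minimal sketch of one way it could work. It is not the actual system: the function name `blend_batch`, the `explore_frac` knob and its 0.15 default, and the `(article_id, score)` input shape are all assumptions for illustration.

```python
import random

def blend_batch(scored, batch_size, explore_frac=0.15, seed=None):
    """Pick a batch that is mostly top-scored articles plus a random slice.

    scored: list of (article_id, score) pairs; higher score = more likely liked.
    explore_frac: hypothetical knob for the share of the batch drawn at random.
    Returns a shuffled list of (article_id, origin), origin in {"model", "random"}.
    """
    rng = random.Random(seed)
    n_explore = min(round(batch_size * explore_frac), batch_size)
    n_exploit = batch_size - n_explore

    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    top = ranked[:n_exploit]
    rest = ranked[n_exploit:]

    # Uniform sample from everything the model did NOT pick, so the
    # thumbs up/down judgments on these are not filtered by the model.
    explore = rng.sample(rest, min(n_explore, len(rest)))

    batch = [(aid, "model") for aid, _ in top]
    batch += [(aid, "random") for aid, _ in explore]
    rng.shuffle(batch)  # avoid showing all the random picks at the end
    return batch
```

Keeping the `origin` tag on each item would let you compare the thumbs-up rate of the random slice against the model's picks, which is one way to check that the scores stay calibrated.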
Another interesting frontier is sequential recommendation
https://paperswithcode.com/task/sequential-recommendation
but I'm not sure if I really want to take an ML approach to this because I'm not sure there is enough training data for one person's content-based recommendation. I'm not sure exactly how I want to do it, but when I post to a place like HN I do not want to post a stick of five articles from phys.org; rather, I want there to be some diversity in my feed, not just over the course of a 300-article batch but on the scale of individual posts. Items can be hung up in queues for several days in this process. Most "news" on HN is fairly evergreen and it doesn't matter if it is delayed a week, but articles about sports have a short shelf life: you look like a fool if you post an article about what happened in week 3 of the NFL after the games of week 4 have been played. So I need some way for sports articles to "jump the line" ahead of other articles, but I don't want that to privilege sports over everything else.
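One non-ML way to get both properties (no runs from the same source, and perishable items jumping the line only when they are about to go stale) is a simple rule-based picker. This is a rough sketch of one possible heuristic, not a settled design: `Item`, `pick_next`, `expires_at`, `recent_sources` and the 24-hour `urgent_window` are all hypothetical names and values.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Item:
    title: str
    source: str                         # e.g. the site the article came from
    score: float                        # model's relevance score
    expires_at: Optional[float] = None  # epoch seconds; None = evergreen

def pick_next(queue: List[Item], now: float, recent_sources: List[str],
              urgent_window: float = 24 * 3600.0) -> Optional[Item]:
    """Choose the next item to post, or None if nothing is postable."""
    # 1. Drop anything past its shelf life (week 3 after week 4 was played).
    live = [it for it in queue if it.expires_at is None or it.expires_at > now]
    if not live:
        return None

    # 2. Diversity: prefer sources not used in the last few posts,
    #    falling back to everything if that leaves nothing.
    fresh = [it for it in live if it.source not in recent_sources] or live

    # 3. Items close to expiry jump the line, soonest deadline first.
    urgent = [it for it in fresh
              if it.expires_at is not None and it.expires_at - now <= urgent_window]
    if urgent:
        return min(urgent, key=lambda it: it.expires_at)

    # 4. Otherwise everything competes on score alone.
    return max(fresh, key=lambda it: it.score)
```

Because the deadline only matters inside `urgent_window`, a sports article far from expiry competes on score like anything else. That limits the privilege but does not remove it: a queue full of nearly expired items would still crowd out evergreen ones, so a cap on consecutive urgent picks might be needed.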
Similarly there is "the probability that article A is relevant" but there is also "Is A or B a higher quality article?" One Google innovation was using a document quality score (PageRank) alongside a normal document-query ranking, which is tricky because now you're not optimizing for one thing but trying to optimize for two things that could compete with each other. I am thinking about switching that system from a batch to a streaming mode and need some answers for that.
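As an illustration of the two-objective problem, one simple option is to fold relevance and quality into a single per-item score with an explicit trade-off weight. This is only a sketch under assumptions: both scores are taken to be in [0, 1], `blended_score`, `rank` and `quality_weight` are made-up names, and this is not a claim about how Google combined PageRank with query relevance.

```python
def blended_score(p_relevant, quality, quality_weight=0.3):
    """Weighted geometric mean of two scores, each assumed to be in [0, 1].

    quality_weight=0 ranks purely by relevance, 1 purely by quality.
    An item that scores near zero on either axis is pulled down hard.
    """
    if not (0.0 <= p_relevant <= 1.0 and 0.0 <= quality <= 1.0):
        raise ValueError("both scores must be in [0, 1]")
    if not (0.0 <= quality_weight <= 1.0):
        raise ValueError("quality_weight must be in [0, 1]")
    return (p_relevant ** (1.0 - quality_weight)) * (quality ** quality_weight)

def rank(articles, quality_weight=0.3):
    """articles: list of (article_id, p_relevant, quality) tuples."""
    return sorted(
        articles,
        key=lambda a: blended_score(a[1], a[2], quality_weight),
        reverse=True,
    )
```

Since the score depends only on the item itself, with no batch-wide normalization, it could be computed as each article arrives, which may fit a streaming mode. The weight still has to be picked by hand or tuned against the thumbs up/down data.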
- blog posts selling coaching/master class
- yt videos promoting “how I made $10K/mo using this simple trick!”
- Reddit posts linking to yt video series and blog posts
- Show HN and use a guerrilla tactic to promote your blog and video series
- switch coaching to “AI guru”