I am working on a smart RSS reader that collects about 1000 articles on a good day from various sources, including CS papers from arXiv. Most days it selects about 300 articles (summary only) that I browse through a TikTok-like interface (I judge one article at a time, so I get valid negatives, unlike the typical “learning to rank” problem). I can favorite an article to retrieve it later, say I like it to see more like it in the future without saving it, or say I dislike it.
It is powered by transformer models and sbert.net, which are used to assign articles to 20 clusters generated daily; I see the top 15 from each cluster. This does a reasonable job of handling a diverse feed that includes CS abstracts, trade publication articles, sports news, etc. My satisfaction is high on days when the system gets a lot of articles (it peaks on Thursday) but lower on the weekends, so I sometimes backfill high-scoring articles from last week then.
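To make the "top 15 from each cluster" step concrete, here is a minimal sketch in plain Python. It assumes each article already has a cluster id and a relevance score from some upstream model; the function name `top_per_cluster`, the tuple layout, and the toy data are all hypothetical and not taken from the actual reader.

```python
from collections import defaultdict


def top_per_cluster(articles, per_cluster=15):
    """Pick the highest-scoring articles in each cluster.

    articles: iterable of (article_id, cluster_id, score) tuples.
    The tuple layout is an assumption made for this sketch.
    Returns {cluster_id: [article_id, ...]}, best score first.
    """
    by_cluster = defaultdict(list)
    for article_id, cluster_id, score in articles:
        by_cluster[cluster_id].append((score, article_id))

    selected = {}
    for cluster_id, items in by_cluster.items():
        # Sort by score descending; break ties by article id so the
        # result is deterministic.
        items.sort(key=lambda pair: (-pair[0], pair[1]))
        selected[cluster_id] = [aid for _, aid in items[:per_cluster]]
    return selected


# Toy data (made up): 1000 articles spread evenly over 20 clusters.
articles = [(i, i % 20, (i * 37 % 101) / 100) for i in range(1000)]
feed = top_per_cluster(articles)
total = sum(len(ids) for ids in feed.values())
print(len(feed), "clusters,", total, "articles selected")
```

With 20 clusters and 15 picks each, this yields the roughly 300 articles a day mentioned above. Capping per cluster rather than taking a global top 300 is what keeps one high-volume topic from crowding out the rest of the feed.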
I tried using fine-tuned BERT-like models for classification and got them to match the performance of the embedding-based system, but only after a huge amount of work and a much longer training time. My problem is pretty noisy, and there is some limit to how high I can get the AUC.
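For reference, AUC here can be read as the probability that a randomly chosen liked article gets a higher score than a randomly chosen disliked one. The sketch below computes it by direct pairwise comparison. It assumes favorites and likes are both mapped to label 1 and dislikes to label 0, which is my assumption for illustration; the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """ROC AUC by pairwise comparison; ties count as half a win.

    labels: 1 for a positive judgment, 0 for a negative one.
    scores: model scores, higher meaning "more likely to be liked".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up judgments: 1 = liked or favorited, 0 = disliked.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic, which is fine for a few hundred judgments a day. Noisy labels cap this number: if the same kind of article gets liked one day and disliked the next, no scoring function can order those pairs consistently.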