First of all, "the algorithm" is probably hundreds of thousands of lines of code, including all the tedious boilerplate like cache policies and multi-AZ logic.
And second of all, doesn't the algorithm include machine learning components, which are trained on terabytes of data? That data will likely be impossible to open source. And open sourcing the neural nets without the training data is mostly meaningless from a transparency perspective?
That's an interesting point. A practical description of the algorithm from the perspective of someone trying to game it may be more useful than anything Twitter or Google would release.
The gesture carries weight with users too.
Not sure any big company has tried this before. I could be wrong, but either way looking forward to it / FWIW hope it catches on.
The point of releasing it is to let people know exactly why they see the tweets they do in the order they do. I hope Elon just goes back to time-based ordering of tweets.
Even the people who build these systems barely know what the algorithm is going to do, much less why. It will be a herculean task to try and convey that to an average user.
And developers will be able to train a model on a subset of Twitter data using it. It's just that the quality of the outcome won't be the same as with the full set of Twitter data.
At the minimum, I would make a private GitHub repo first, add all the relevant commits, and then make it public once there's actually content.
Either this is a mistake, or this is a really, really misguided attempt at a joke from Twitter.
https://twitter.com/willnorris/status/1518694675909013504
Which seems like a promise they intend to actually open source something there.
So if it was a WIP, it'd be a private repo until it's ready to release publicly.
That's an algorithm.
I mean even if timelines were totally random, or based on some external facts, there is an algorithm that is being used to order them.
This isn't just an academic distinction. Claiming 'there is no algorithm' because the algorithm is intentionally or unintentionally obfuscated or complicated has implications if that claim is accepted. If my algorithm for approving mortgage applications is explicitly racist, can I just spread its functionality across myriad services owned by lots of teams, make it almost impossible to figure out how it works, and then avoid any responsibility by saying 'there is no algorithm to decide loan approvals'? That would be bullshit!
And does every user have their own algorithm?
And could it be made readable to a human?
- Search results
- Comment Order
- Timeline Order
- Trends
- Human vs code
Personalization in general. Big gigantic “why” when it happens to you
You know what? This does demonstrate the internal problems inside Twitter and shows the need for shakeup.
There is no "no algorithm."
It's an interchangeable function; it would only be published if it were clear to leadership that revenue wouldn't suffer once people started gaming toward the published algorithm.
But there definitely is a relationship algo that could be considered theirs, like all social media platforms inflating the bubbles their users feel.
Even if we were to open source all associated code and publish all related documents it would be very difficult to make sense of the entire system. That is precisely why companies such as Twitter A/B test the hell out of everything. What most people think of as "the algorithm" is a complex system that receives many inputs (maybe hundreds) and has dependencies on many other internal Twitter services. Tweets likely pass through multiple filtering steps as well as scoring before you ever see them. Each of these steps is highly contextual, depending on: location, past tweets, verification status, etc. You can attempt to predict the effect of a certain change, but you never know the actual outcome until you test it.
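A toy sketch of that multi-stage shape, filtering then contextual scoring then ordering. All field names, signals, and weights here are invented for illustration; the real pipeline has hundreds of inputs and many more stages:

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    # Hypothetical request context; a real system carries hundreds of signals.
    location: str
    verified: bool
    past_topics: set = field(default_factory=set)

def candidate_filter(tweets, ctx):
    # One of potentially many filtering stages (language, blocks, abuse, ...).
    return [t for t in tweets if t["lang"] == "en" or ctx.location != "US"]

def score(tweet, ctx):
    # Toy scoring: topic affinity plus a small verification boost.
    s = 1.0 if tweet["topic"] in ctx.past_topics else 0.1
    if tweet["verified_author"] and ctx.verified:
        s += 0.2
    return s

def rank(tweets, ctx):
    # Filter, then score, then order -- the shape, not the substance.
    return sorted(candidate_filter(tweets, ctx),
                  key=lambda t: score(t, ctx), reverse=True)
```

Even in a sketch this small, the output depends on context; changing any one weight reshuffles results, which is why only an A/B test reveals the actual effect.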
I think what will ultimately happen is that _some_ details will be published. Elon will parade that around as a victory for free speech as Twitter is now more "open". In reality, nothing of value will be gained as "the algorithm" isn't a simple function.
There is typically a clear objective function for a recommendation system.
What Twitter is optimizing for is what’s of interest here. And some of the hidden business rules. It’s likely these are specified in the code in an obvious way.
How exactly they achieve that is the part that is complex and relatively indecipherable.
It’s possible that it’s designed in such a way the optimization objectives are also unclear, but that would indicate a bad design and be to the detriment of the company and users.
Many complicated research papers have had no issues describing their models at a high level. This should be no different.
My point is that the devil is in the details and implementation. These details are likely something that no one person understands and no one person is able to fix. The concept of being able to extract "the algorithm", factor it out from the codebase and share it with the public doesn't make sense to me. It won't be possible to fully understand how Twitter serves recommendations and ranks posts without understanding how all the different services at Twitter interact. Are they planning on open sourcing all of Twitter? Highly doubtful.
All these feed rankings are complex combinations of features and models, coupled with weights and filters. On top of this, abuse-detection layers are added.
Unless Musk is planning to open source user data to show what all the "scores" and "features" for all the entities are and how they were arrived at, this will make no sense. The whole argument from some people who were downranked has been: why me? Just writing a whitepaper describing the general methodology is not going to make that go away.
On top of that, exposing every vector through which you measure and stop abuse, will just allow for more sophisticated abuse.
Twitter's been pretty transparent about how it "deranks" certain accounts [1]. What more would come from opening code that would certainly not include the actual database of "no no terms" (if you believe that exists)?
[1] https://blog.twitter.com/en_us/topics/company/2022/our-ongoi...
[0]: https://mashable.com/article/eu-digital-services-act-big-tec...
No one said that. You created a straw man and are arguing with it.
This comment says more about you than you think.
Open sourcing the algorithm or code is not about everyone going and analyzing it; rather, when controversy or issues arise, it will be readily available for independent experts to review.
We can think of the main interaction as being a query which is an RPC payload. The contents contain the user request and a wide amount of other context (either referenced by a collection of keys like cookies, or materialized like fields that specify the user's age) and the response is a web page which contains sections (the web search response to the query, as well as the ads; either these could be rendered to two different frames, or interspersed, by the result presentation engine).
That query -> frontend step translates into a tree or a graph of requests that collect up various bits of contextual data required to satisfy the query. For example, the query terms might be rewritten slightly and then sent to a web search backend that searches/ranks documents and returns the top matching documents on the organic web, or sent to an ads backend that returns the top matching bidders for those query terms. Again, just RPC/response, although the actual context that the frontend and backend systems are dealing with, and use to modify the result, is truly enormous.
Each of those backend systems itself was produced with an enormous amount of data processing and contextual data that is available at serving time. All of this is implemented using various algorithms; everything from the TCP algorithms that manage bandwidth to the neural networks doing inference on the joint product of the user context and the query context and the ad context, and the logging system that writes the queries and their clicks to centralized storage for more ML training.
In theory, though, you could set up a system that compiled the full web stack and ran a user query end to end, dumping all the intermediate RPCs, etc., from a modestly sized instantiation of the production system. People could then sit down and inspect which terms affected query result order, which pages were omitted at which part of the filtering, or what data was logged.
It would be hell for a team to maintain and keep up to date with the production system, but many folks do this anyway to have a simple version of the system around, so they can make quick changes and see if they break part of the complex system without doing a full deployment.
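The fan-out shape described above can be sketched in a few lines. Everything here is hypothetical (the backend names, the payload fields); the point is only that a frontend turns one query into a tree of parallel backend requests and assembles the responses:

```python
import concurrent.futures

# Hypothetical backends; a real frontend issues RPCs to other services,
# not local function calls.
def web_search_backend(query):
    return [{"kind": "organic", "doc": f"result for {query!r}"}]

def ads_backend(query):
    return [{"kind": "ad", "doc": f"ad for {query!r}"}]

def frontend(query, user_context):
    # Fan the query out to the backends in parallel, then let a
    # presentation step combine the responses into one page's sections.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        organic = pool.submit(web_search_backend, query)
        ads = pool.submit(ads_backend, query)
        sections = organic.result() + ads.result()
    return {"query": query, "context": user_context, "sections": sections}
```

A "replay" harness for debugging would run exactly this shape against a small instantiation of the stack and log every intermediate request and response.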
It can be pseudo-code or a diagram or whatever else can be used to understand what logic lies behind the decision making.
There are ways of translating trained ML models and associated systems into understandable hierarchical rules.
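One such technique is extracting if/then rules from a tree-structured model (real pipelines usually go through a surrogate decision tree or tooling like scikit-learn's `export_text`). A hand-rolled sketch, with an entirely made-up toy tree and feature names:

```python
def tree_to_rules(node, conditions=()):
    # Walk a decision tree (nested dicts) and emit one readable rule per leaf.
    if "leaf" in node:
        clause = " AND ".join(conditions) or "always"
        return [f"IF {clause} THEN {node['leaf']}"]
    feature, threshold = node["feature"], node["threshold"]
    rules = tree_to_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    rules += tree_to_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return rules

# Invented toy tree, purely for illustration.
toy_tree = {
    "feature": "follower_overlap", "threshold": 0.5,
    "left": {"leaf": "downrank"},
    "right": {"feature": "recency_hours", "threshold": 24,
              "left": {"leaf": "show"},
              "right": {"leaf": "downrank"}},
}
```

`tree_to_rules(toy_tree)` yields three human-readable rules, one per leaf, which is the kind of artifact a lay reader could actually inspect.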
Twitter's timeline is NOT AGI.
In fact, a lot of people here really think what people are talking about is the equivalent of what is handled in a subroutine.
No, what people are talking about when they talk about "the algorithm" is anything affecting the result set they're reading. Concepts like eventual consistency and edge computing are... well... a part of a model which laypeople, and even reasonably technical people call an "algorithm."
Being pedantic about whether or not this happens in an SQL query, or across multiple codebases, or by region, doesn't escape the question.
> Being pedantic about whether or not this happens in an SQL query, or across multiple codebases, or by region, doesn't escape the question.
Actually, epistemic ~"muddying of the waters" is a well-proven technique for controlling perceptions and public discourse. If it works on HN folks, I expect it would work much more easily on amateurs.
I say this as someone whose political views, if you force them onto the left-right spectrum, probably end up about 80% toward the left. E.g. I've spent millions over the past several elections supporting the Democrats.
It used to be that censorship was something the right did, and free speech was something the left were in favor of. But over the last few decades, banning "problematic" ideas has become a huge component of left culture (http://paulgraham.com/heresy.html).
Plus tech companies in general, and especially Twitter, lean to the left. Imagine walking around Twitter pre-Covid. You'd find plenty of openly far-left employees. How many openly far-right employees would you find? I don't think you'd find any.
The combination of (a) the left's recent focus on banning heretical ideas, (b) the leftward lean of tech companies generally, and (c) the leftward lean of Twitter even among tech companies, means that right-wing speech is much more likely to get banned on Twitter than left.
That's why people on the far right keep starting lame Twitter alternatives. You don't see people on the far left doing that. They don't need to. They have Twitter.
Even if that's precisely true, is it not good to be creating a more trusted space for everyone? The grievances, regardless of merit, are mostly coming from the right. If you want to create a service that caters to all you're going to have to address their concerns. If he can do that in a way that is fair to all, it sounds like a win to me.
https://amp.usatoday.com/amp/1248099002
https://www.vox.com/2017/6/27/15878980/europe-fine-google-an...
https://i.imgur.com/MVlshAT.png
You don't have to be conservative to see there's a pretty significant bias, just in the headlines. I'm a Pacific Green and I can still see it.
It may not have been algorithmic, but it definitely happened.
Whatever. He paid for it. Private company. Do what it wants.
> it would be very difficult to make sense of the entire system
No. Not buying that. Difficult isn't the same as impossible, and, if only to game the system (harder), people will figure it out. And even if it isn't 100% possible to reproduce the results based on what is released, significant insights will still emerge.
Further, there is some ceiling on the complexity. Twitter operates at scale and that means they can't actually burn 52kWh of power for every tweet or store TiBs of metadata for every user to do the analysis or take 30 minutes to publish. Likely it's a pretty efficient system and, therefore, limited in complexity.
What could a rogue employee do?
I was actually wondering whether some people may want to remove traces of what they have been doing.
I wish that someday we can see the internal communications that led to the Hunter Biden laptop story ban.
And having been in a company that was taken over, it's a mixture of emotions - is my job safe, will this be the same culture I joined for etc. etc.
This is an interesting question, since RSUs are a big part of total comp: how are unvested RSUs dealt with when the stock is retired? Are they put on a future cash comp schedule? And if so, at what conversion rate?
error forking repo: HTTP 403: The repository exists, but it contains no Git content. Empty repositories cannot be forked. (https://api.github.com/repos/twitter/the-algorithm/forks)
My thoughts:
- Explicit rules for temporary and permanent bans
- Edit button
- More fun and thoughtful conversations like HN
- Less thought bubble Brooklyn based reporters, less VC and side grind hustle snake oil, maybe more comedians and memes?
I think there is a place for a smarter algorithm than "ORDER BY date DESC", but one that is not designed to manipulate users into addiction.
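One shape such an algorithm could take, purely as a hypothetical: score almost entirely on recency, with a mild boost for authors the reader genuinely interacts with, and no engagement-bait terms at all. The field names and constants below are invented:

```python
import math
import time

def simple_score(tweet, now=None):
    # A hypothetical middle ground between "ORDER BY date DESC" and a full
    # engagement-maximizing ranker: mostly recency, gently boosted by how
    # often the reader interacts with the author.
    now = time.time() if now is None else now
    age_hours = (now - tweet["created_at"]) / 3600
    recency = math.exp(-age_hours / 12)              # ~12h decay constant
    affinity = 1.0 + 0.5 * tweet["author_affinity"]  # affinity in [0, 1]
    return recency * affinity
```

With a bounded affinity term, no amount of gaming can keep a stale tweet on top, which is the property that distinguishes this from addiction-optimized feeds.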
even when following too many to read everything, i preferred chrono because it would yield a coherent slice of what was happening. an unbiased sample.
twitter is basically a medium for conversation.
imagine there's a large party. would you rather listen to an out-of-order "most important" set parts of the conversation, or just a slice of conversation from a particular time?
well, actually, both can be interesting, but generally the slice is more coherent. :-)
1. Insertion of tweet to tweets table.
2. Insertion of that tweet-id to the home timelines of all that user's followers.
3. Insertion of that tweet-id to the user-timeline of that user.
On the read path, if I'm not mistaken, the only join that happens is between the requested timeline and the tweets table (which is replicated across a cluster of machines but not partitioned, or at least I remember reading that was the case not many years ago).
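The three write-path steps and the read-path join above can be sketched with in-memory structures (the real store is a replicated cluster, not Python dicts):

```python
from collections import defaultdict

tweets = {}                        # tweet_id -> tweet body
followers = defaultdict(set)       # user -> set of followers
home_timeline = defaultdict(list)  # user -> [tweet_id, ...]
user_timeline = defaultdict(list)  # user -> [tweet_id, ...]

def post_tweet(tweet_id, author, text):
    # Fan-out-on-write: do the expensive work once, at posting time.
    tweets[tweet_id] = {"author": author, "text": text}  # 1. tweets table
    for follower in followers[author]:
        home_timeline[follower].append(tweet_id)         # 2. followers' home timelines
    user_timeline[author].append(tweet_id)               # 3. author's own timeline

def read_home(user):
    # Read path: the only "join" maps stored timeline ids to tweet bodies.
    return [tweets[tid] for tid in home_timeline[user]]
```

The trade-off is classic: writes touch every follower's timeline, but reads become a cheap id lookup rather than a fan-in query across everyone the user follows.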
For about a week they made a change that prevented the chronological timeline from being the default, but they reasonably quickly rolled that back. https://www.theverge.com/2022/3/14/22977782/twitter-default-...
I then use tweet deck which shows a column of tweets per list.
As these are separated by subject and are chronological, it makes it far easier to follow.
Twitter's EU user base is probably [3] above the 45 million threshold that triggers the strictest transparency requirements under the Act. So perhaps they figure if they're going to be forced to disclose anyway, they might as well do it proactively.
[1] If it's even coherent to talk about their feed ranking system as a single algorithm — see the other comments in this thread.
[2] https://www.theverge.com/2022/4/23/23036976/eu-digital-servi...
[3] https://www.statista.com/statistics/242606/number-of-active-...
So not a troll, but yes it is odd to put up an empty repo, and announce the repo before there is anything in it.
That doesn’t mean it’s a joke, I see it as a show of goodwill — that there are a handful of people inside Twitter that are excited for transparency and for a revenue model that isn’t entirely based on ads, that are excited to get to work on this right away.
So I'm not sure what the ultimate point of this exercise is other than producing faux-transparency.
I think only if you offer Twitter users the level of First Amendment protection they'd expect from a government body. Otherwise reporting to Congress would be a bald-faced circumvention of the First Amendment. Twitter is a privately held company with no need to report to Congress.
There is great opportunity to abuse this by Twitter, yes. There is also a lot of money to be made. But in defense of some of that being secret, is the fact that any publicly known ruleset (with no hidden exceptions) _will_ be exploited by bad actors. Imagine if search engines told spam sites exactly why their site dropped in page rankings.
Elon polled Twitter users about this and the response was overwhelmingly in favor of open source and transparency. Everyone on Twitter got a vote.
If you oppose transparency, as many now are, you lose your credibility. So it’s another one of Elon’s people hacks, and look at all the morons falling for it.
Like, there's no public admission right now of whether "shadow banning" or "ghost banning" is even officially a thing!
Some transparency seems unquestionably more powerful than none, and we can work from there.
Maybe that is where it is going.
That seems... bizarre to me?
* Chronological - reverse sort by date
* Home - for all of the followed topics, recommended topics, retweets, and tweets from the past day, estimate the level of engagement, include the highest-scoring ones, and reverse sort by date. This is likely to be a fairly basic ML model.
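The two modes above could be sketched like this, treating the engagement model as an opaque callable (the threshold and field names are invented for illustration):

```python
def chronological(tweets):
    # Reverse sort by date.
    return sorted(tweets, key=lambda t: t["created_at"], reverse=True)

def home(tweets, predicted_engagement, threshold=0.5):
    # Hypothetical Home mode: keep tweets whose predicted engagement
    # (from some basic ML model) clears a bar, then reverse sort the
    # survivors by date.
    picked = [t for t in tweets if predicted_engagement(t) >= threshold]
    return chronological(picked)
```

All the actual interest lives inside `predicted_engagement`, which is exactly the part a high-level release would not meaningfully expose.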
It will be uncontroversial, technically unsophisticated and of no practical use to anyone - users, developers or researchers.
This is not going to be PageRank where some genuine new insight was discovered.
I've built hundreds of models and run an ML company, and I don't believe it's technically possible for this not to be the case.
I imagine they'd probably start with documentation and white-papers that communicate "here's how we intend for it to work".
It's seriously unlikely anyone at Twitter actually knows how any non-trivial algorithm in the company works. To figure THAT out, they could decide to do a company-wide documentation and instrumentation push like they probably would've had to do for GDPR anyway, which is painful and boring and going to take a very long time.
Failing that, they could just say 'the algorithm as it stands is no longer fit for purpose, given part of its core requirement has become that it needs to be transparent and publishable, and presumably legible. We need to make a new one. Publish the core algorithm. We probably won't deploy it in that exact state, it's going to span multi-services and so on, you obviously don't get the data we used to train the models, but we will work backwards from it and here's an open mechanism to measure how true-to-form it actually is'
if twitter is a game, sinking $43bn into it is kinda like winning or losing the grand final boss level. (unclear which)
wish elon would get back to facilitating the building of useful things. we still don't have a great clean energy generation story.