Elon Musk stayed true to his word. As the clock struck noon on Friday, the renegade CEO revealed portions of Twitter's proprietary algorithm.
The algorithm, which is basically the formula that decides which tweets a user sees on their timeline, is a valuable asset; many internet companies treat their algorithms as something akin to state secrets. Musk framed his decision to make Twitter's algorithm open source as an effort to improve it by enlisting the help of volunteers, and as an act of radical transparency: if you suspected Twitter was "shadowbanning" certain people, the conspiracy would finally be exposed.
As a stampede of developers, reporters, and curiosity-seekers rushed to the code-sharing site GitHub to have a look at the algorithm, the actual significance or usefulness of Musk's move only became hazier.
Many people quickly fixated on controversial-sounding bits of the algorithm, such as "author_is_republican." But as several observers noted, the uploaded code doesn't indicate how the company is using any of it, and leaves out vital bits of information that would paint the whole picture. Even people with bona fide technical chops said the lack of necessary context made it impossible to make much sense of the published algorithm, let alone try to make any actual contributions to the open-source code.
"They released a lot, which is neat, but also like, wtf is the point of this? Nobody's going to make heads or tails of this, let alone the Q-brained guys he's trying to impress," one senior software engineer who wished to remain anonymous said in a direct message to Fortune.
A former Twitter executive told Fortune that the social media service uses a user's data, as well as an algorithm, to choose the best set of tweets to display to them. Seeing one without the other doesn't tell an accurate story, and is mostly "smoke and mirrors," they said.
"In order to open source the algorithm you need to open source the training set which is impossible for Twitter to do," the former Twitter executive said, adding that you can open source anything but it's not effective without that essential backdrop. "Every effort in open sourcing the algorithm without the data is completely dishonest."
"This is a political declaration in the form of a GitHub repo"
Musk, who bought Twitter for $44 billion in October 2022, has reveled in breaking industry norms and taunting perceived enemies, from journalists to adherents of "woke-ism." He has long vowed to expose the inner workings of Twitter's algorithm after a series of selectively released internal documents, dubbed the "Twitter Files," revealed that the algorithm tended to favor the political right.
Many of Musk's fans cheered his latest gambit. One user replied to Musk's announcement that this is a "step in the right direction for the future of humanity," and an investor who led product teams at Facebook and Snapchat said it was "pretty incredible."
Comments within the code explain that labels such as "author_is_democrat" and "author_is_republican" are "used purely for metrics collection" and to ensure that the company doesn't implement changes that negatively impact "one group over others." Musk, who took to Twitter Spaces shortly after the code release, claimed he wasn't aware of those labels and said "it definitely shouldn't be dividing people into Republicans and Democrats, that makes no sense."
At least one prominent tech executive suspected ulterior motives in Musk's move to open source the algorithm, likening it to other incidents in which Musk selectively disclosed internal Twitter information.
"It's a necessarily incomplete thing which is the way all misinformation works, you start with the seed of truth and then you build a false narrative around it," Glitch CEO Anil Dash told Fortune, pointing towards the labels that sent some media and coders into a tizzy, such as "author_is_elon," "author_is_power_user," "author_is_democrat," and "author_is_republican." While these might seem nefarious to the layman reviewing this code, the reality is probably pretty boring, Dash said.
"They're trying to shape the conversation," said Dash. "This is a political declaration in the form of a GitHub repo, and it's not intellectually honest and ignores the history of the work they've been doing. It is not designed to enable the developer to build a better experience on Twitter."
In addition to the code release on Friday, Twitter published a blog post explaining that the service used to suggest tweets, which the company refers to as "Home Mixer," starts by gathering tweets from various sources through a technique called "candidate sourcing." The tweets are then assessed by a machine learning model and filtered based on factors like whether you've blocked the user or whether the content is not safe for work (NSFW).
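For readers who want a concrete picture, the steps the blog post describes can be sketched in a few lines of Python. This is an illustration only, not Twitter's code: the `Tweet` class, the function names, and above all `placeholder_score` are hypothetical, and the scoring function merely stands in for the trained model, which was not part of the release.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical, simplified tweet record used only for this sketch.
    tweet_id: int
    author: str
    text: str
    is_nsfw: bool = False


def source_candidates(sources):
    """Candidate sourcing: merge tweets from several sources, dropping duplicates."""
    seen, merged = set(), []
    for source in sources:
        for tweet in source:
            if tweet.tweet_id not in seen:
                seen.add(tweet.tweet_id)
                merged.append(tweet)
    return merged


def placeholder_score(tweet):
    """Assumption: a stand-in for the trained model. It just uses text length."""
    return float(len(tweet.text))


def build_timeline(sources, blocked_authors, score=placeholder_score):
    """Gather candidates, drop blocked authors and NSFW tweets, then rank by score."""
    candidates = source_candidates(sources)
    visible = [
        t for t in candidates
        if t.author not in blocked_authors and not t.is_nsfw
    ]
    return sorted(visible, key=score, reverse=True)


# Two made-up sources: accounts the user follows and accounts they do not.
in_network = [
    Tweet(1, "alice", "short"),
    Tweet(2, "bob", "a tweet from a blocked author"),
]
out_of_network = [
    Tweet(2, "bob", "a tweet from a blocked author"),
    Tweet(3, "carol", "a somewhat longer tweet"),
    Tweet(4, "dave", "a very long tweet that is flagged as unsafe", is_nsfw=True),
]

timeline = build_timeline([in_network, out_of_network], blocked_authors={"bob"})
```

Everything that determines the order of the final list lives in the scoring function. Swap in a different one and the same candidates come out in a different order, which is why critics say the published code reveals little without the models and data behind it.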
Those undefined machine learning models also seemed to be crucial to other capabilities hinted at in the open-sourced algorithm, including the ability to analyze sentiments expressed in people's tweets, such as anger, humor, and sadness.
"The actual magic is in some machine learning models," the anonymous longtime software coder told Fortune.
"I dug into it a bunch and there’s zero of the good trained model data anywhere, and without that this whole algorithm show is all hat and no cattle."