Last week, Meta announced LLaMA, its latest stab at making a GPT-style “large language model”*. If AI is the future of tech, then big tech companies need to control their own models or be left behind by the competition. LLaMA joins OpenAI’s GPT (licensed by Microsoft for Bing and underpinning OpenAI’s own ChatGPT) and Google’s LaMDA (which will power Bard, its ChatGPT rival) in the upper echelons of the field.
Meta’s goal wasn’t simply to replicate GPT. It says that LLaMA is a “smaller, more performant model” than its peers, built to achieve the same feats of comprehension and articulation with a smaller footprint in terms of compute*, and so has a correspondingly smaller environmental impact. (The fact that it’s cheaper to run doesn’t hurt, either.)
But the company also sought to differentiate itself in another way, by making LLaMA “open”, implicitly pointing out that despite its branding, “OpenAI” is anything but. From its announcement:
Even with all the recent advancements in large language models, full research access to them remains limited because of the resources that are required to train and run such large models. This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation.
By sharing the code for LLaMA, other researchers can more easily test new approaches to limiting or eliminating these problems in large language models.
By releasing LLaMA for researchers to use, Meta has cut out one of the key limits on academic AI research: the vast cost of training an LLM*. Three years ago, each training run of GPT-3 was estimated to cost between $10m and $12m. (OpenAI didn’t disclose the actual cost, only the amount of compute used for an individual run; it also didn’t disclose how many runs it took to get it right, given the trial-and-error nature of the field.) The price tag has only increased since then, so Meta’s release isn’t just saving researchers millions – more realistically, it’s opening up the prospect of foundational research to groups that could never have afforded it at all.
By focusing on efficiency, the company has similarly made the system cheaper to run. The most advanced LLaMA model has 65bn “parameters” (sort of but not quite the number of connecting lines on the vast neural network* at its heart), barely one-third the size of GPT-3’s chunkiest boy, yet Meta says the two are roughly equivalent in capability. That slimmed-down size means LLaMA can run on much cheaper systems – even a desktop computer, if you can tolerate glacial processing times.
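For the curious, here’s a rough back-of-the-envelope sketch (in Python) of what those parameter counts mean in memory terms. The 65bn figure is Meta’s; GPT-3’s largest model is widely reported at 175bn parameters; the bytes-per-parameter values assume the 16-bit weights models usually ship with and the kind of 4-bit quantisation hobbyists use to squeeze them onto consumer hardware – treat the output as orders of magnitude, not precise requirements.

```python
# Back-of-the-envelope memory needed just to hold a model's weights.
# Real deployments also need room for activations, caches and so on.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Gigabytes required to store n_params weights at a given precision."""
    return n_params * bytes_per_param / 1e9

for name, n_params in [("LLaMA-65B", 65e9), ("GPT-3, 175B", 175e9)]:
    fp16 = weight_memory_gb(n_params, 2.0)   # 16-bit floats: 2 bytes per weight
    int4 = weight_memory_gb(n_params, 0.5)   # 4-bit quantised: half a byte each
    print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{int4:.0f} GB at 4-bit")
```

At 16-bit precision that works out to roughly 130GB of weights for LLaMA-65B against roughly 350GB for GPT-3 – which is why “much cheaper systems” is relative, and why running it on a desktop means heavy quantisation and those glacial processing times.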
But Meta’s generosity wasn’t limitless. “To maintain integrity and prevent misuse … access to the model will be granted on a case-by-case basis,” the company said. Initially, it was criticised for how it adjudicated those cases, with accusations of a western bias in who was deemed eligible.
*Are any of these terms still confusing? Check out last week’s AI glossary.
Leaky LLaMA
But those criticisms were rendered moot over the weekend, when the entire model was leaked for anyone to download. Initially posted to 4chan, a link to the BitTorrent mirror of LLaMA eventually made it to GitHub, where a cheeky user added an official-looking note encouraging others to use that link “to save our bandwidth”.
It’s too early to say what effect the leak will have. The model as it stands is unusable to anyone without serious technical chops and either an extremely beefy computer or the willingness to burn a few hundred pounds on cloud computing bills. Also unclear is what Meta’s response will be. “It’s Meta’s goal to share state-of-the-art AI models with members of the research community to help us evaluate and improve those models,” a Meta spokesperson said. “LLaMA was shared for research purposes, consistent with how we have shared previous large language models. While the model is not accessible to all, and some have tried to circumvent the approval process, we believe the current release strategy allows us to balance responsibility and openness.”
That leaves a lot unsaid. Will it throw lawyers at the problem and try to jam the genie back in the bottle, or will it embrace its accidental role as the developer of what is likely to rapidly become the most widely deployed AI in the world? If the latter, we could shortly see the same revolution in LLMs that hit image generators last summer. Dall-E 2 was released last May, showing a step-change in the quality of AI image generation. (Rereading the TechScape issue about the release is eye-opening for how far we’ve come in such a short time.)
But Dall-E was controlled by OpenAI, just like ChatGPT, with access carefully rationed. People knew something big was happening but were limited in their ability to experiment with the technology, while OpenAI looked like a gatekeeper that would harvest all the commercial upside of the creation.
Then, in August, Stability AI released Stable Diffusion. Basically funded entirely by the savings of ex-hedge fund trader Emad Mostaque, Stable Diffusion was open source from day one. What Meta did accidentally, Stability AI did on purpose, banking that it would have a better shot at success in the field if it sold services on top of the free-to-use model, rather than controlling access at all costs.
OpenAI v Open AI
We’re at the crossroads of two very different AI futures. In one, the companies that invest billions in training and improving these models act as gatekeepers, creaming off a portion of the economic activity they enable. If you want to build a business on top of ChatGPT, for instance, you can – for a price. It’s not extortionate: a mere $2 for every 700,000 words processed. But it’s easy to see how that could one day result in OpenAI being paid a tiny sliver of a cent for every single word typed into a computer.
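If you want to check that “sliver of a cent” claim yourself, the arithmetic is short. This sketch assumes nothing beyond the $2 per 700,000 words figure quoted above:

```python
# Cost per word at the quoted rate of $2 for every 700,000 words processed.
price_dollars = 2.0
words_covered = 700_000

dollars_per_word = price_dollars / words_covered
cents_per_word = dollars_per_word * 100

print(f"${dollars_per_word:.7f} per word, or {cents_per_word:.4f} cents")
# ~$0.0000029 per word – roughly three ten-thousandths of a cent
```

Tiny per word, enormous if even a fraction of the world’s typing ends up flowing through the API.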
You might think that no company would give up such an advantage, but there’s a weakness to that world: it’s an unstable one. Being a gatekeeper only works while there is a fence around your product, and it only takes one company deciding (willingly or not) to make something almost as good available for free to blow a hole in that fence for good.
The other world is one where the AI models that define the next decade of the technology sector are available for anyone to build on top of. In that world, some of the benefit still accrues to the models’ developers, who are in a position to sell their expertise and services, while some more gets creamed off by the infrastructure providers. But with fewer gatekeepers in play, the economic benefits of the upheaval are spread much further.
There is, of course, a downside. Gatekeepers don’t just extract a toll – they also keep guard. OpenAI’s API fees aren’t a pure profit centre, because the company has committed to ensuring its tools are used responsibly. It says it will do the work required to ensure spammers and hackers are kicked off promptly, and it has the ability to impose restrictions on ChatGPT that aren’t purely part of the model itself – to filter queries and responses, for instance.
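To make that “filter queries and responses” point concrete, here’s a deliberately toy sketch of a guardrail that lives in the hosting service rather than in the model itself. It is purely illustrative – not how OpenAI actually does it – and every name in it is made up:

```python
# Toy illustration of a guardrail imposed by the service hosting a model,
# rather than baked into the model's weights: both the incoming query and
# the outgoing reply are checked before anything reaches the user.

BLOCKED_TOPICS = {"phishing", "malware"}  # stand-in for a real moderation policy
REFUSAL = "Sorry, I can't help with that."

def violates_policy(text: str) -> bool:
    """Crude keyword check standing in for a proper moderation classifier."""
    return any(topic in text.lower() for topic in BLOCKED_TOPICS)

def guarded_answer(query: str, model) -> str:
    if violates_policy(query):   # filter the query on the way in
        return REFUSAL
    reply = model(query)         # call the underlying language model
    if violates_policy(reply):   # filter the response on the way out
        return REFUSAL
    return reply
```

Guardrails like this only exist because the gatekeeper sits between you and the model – run leaked weights on your own machine and the wrapper simply isn’t there.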
No such limits exist for Stable Diffusion, nor will they for the pirate instances of LLaMA spinning up around the world this week. In the world of image generation, that has so far meant little more than a flood of AI-generated porn that the sanitised world of Dall-E would never permit. But it won’t be long, I think, before we see the value of those guardrails in practice. And then it might not just be Meta trying to jam the genie back in the bottle.
If you want to read the complete version of the newsletter, please subscribe to receive TechScape in your inbox every Tuesday.