
Profluent, a biotechnology company that is using AI to design proteins, says it has demonstrated “scaling laws” in the performance of AI models for biology that are similar to those AI researchers have previously claimed for AI models built to handle language.
The finding means that building larger AI models for protein design, and feeding them more data, will produce predictably better results—including the ability to accomplish tasks that smaller models cannot.
The discovery will give Profluent and its investors further confidence that the company can eventually realize its vision of building AI models that allow scientists to specify in natural language exactly what properties they want a protein to have, and then output a DNA recipe for creating that protein.
“We are barreling forward toward that future,” Ali Madani, founder and CEO of the Berkeley, Calif.-based Profluent, told Fortune.
Profluent has raised $44 million to date in two venture capital funding rounds. Its investors include Spark Capital, Insight Partners, and Air Street Capital. The company's current valuation has not been disclosed.
The news will likely cheer competitors too. Profluent is just one of a clutch of startups applying the same techniques behind large language models to proteins. Among the best known of these is Isomorphic Labs, a drug discovery startup owned by Google parent Alphabet and spun out of Google DeepMind. DeepMind pioneered the use of AI models to predict protein structures from their amino acid sequences. Others include EvolutionaryScale, founded by former Meta AI researchers who worked on protein language models, as well as Ginkgo Bioworks, Cradle Bio, Evozyne, and Protai.
Startups have been touting AI as a way to speed the discovery of new drugs and lower the cost of drug development for years now. But it is worth noting that so far, no AI-discovered therapies have made it all the way through human clinical trials to market, although there are an increasing number of candidates in the first two phases of the three-phase clinical trial process. In addition, AI has been successfully used to repurpose existing drugs to treat diseases different from their original targets.
For now, Profluent has shown that its latest protein design AI model, which it calls ProGen3, can, from a single input prompt, produce novel antibodies that are as effective as—and sometimes significantly more effective than—commercially available ones at binding to target proteins, while being structurally different enough not to infringe on any existing patents. It has called these “OpenAntibodies,” in a nod to open source software, and plans to make the DNA recipes for 20 of them publicly available, either completely royalty-free or through a single upfront licensing fee, according to Madani.
The company says it has also used ProGen3 to design a number of gene editing proteins that are more compact and potentially easier to use than the Nobel Prize-winning CRISPR-Cas9 system, which is the foundation of most contemporary gene editing therapies. While potent, the Cas9 protein, the “molecular scissors” that snip DNA, is fairly large. This makes it difficult to package inside an engineered virus—which is how these gene therapies are delivered to the patient—alongside other components that are important for the therapy to work, such as RNA sequences that guide Cas9 to the right location for editing. It also limits the scope of the edits Cas9 can make. The ProGen3-designed gene editors overcome many of these limitations, the company said.
Profluent last year released one gene editing protein, which it called OpenCRISPR-1 and made freely available to researchers and for commercial applications; it has already been adopted by many biology researchers.
The company said in a paper published online alongside its announcement today that larger “protein language models”—which share a similar underlying architecture with large language models but are trained on protein data instead of text—are better than smaller ones at producing diverse sets of proteins that still function when tested in a lab. The larger models also learn more quickly to adapt their outputs to researchers’ preferences for proteins with certain properties, such as how stable they are or how quickly and tightly they bind to a target, improving much faster on feedback from laboratory data than smaller models built with the same basic design.
The idea of scaling laws for large language models (or LLMs)—the kind of AI system behind OpenAI’s ChatGPT and other generative AI chatbots—was first proposed by researchers at OpenAI in 2020. The scaling laws are not absolute laws, like the laws of physics, but rather a pattern suggested by experimental data. They held that when an LLM was made larger in terms of its number of parameters—the adjustable internal values, or weights, the model tunes during training—and fed significantly more data during its initial training, its performance improved in a smooth, predictable way, with the model’s error falling roughly as a power law of its size and the amount of training data.
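For readers who want the mathematical shape of that claim, the 2020 result can be sketched roughly as follows. The constants and exponents here are illustrative placeholders from the original LLM work, not Profluent’s measurements:

    % Illustrative form of the 2020 LLM scaling laws:
    % test loss L falls as a power law in parameter count N and dataset size D.
    % N_c, D_c, alpha_N, alpha_D are fitted constants, not Profluent's figures.
    L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}
    L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}

In plain terms: double the model or the data, and the error is expected to shrink by a predictable fraction.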
These original scaling laws seemed to hold for LLMs until last year, when many leading AI labs began to acknowledge that beyond a certain size, the gains from building larger models and feeding them more data during initial training petered out. No longer were the much larger models significantly better than their smaller predecessors. And some researchers, such as OpenAI’s former chief scientist Ilya Sutskever, said the culprit was a lack of data: having already scraped the entire public internet and fed the models additional massive datasets besides, AI companies simply did not have enough human-generated data left to further boost the models’ capabilities.
In response, AI companies have turned to a different method for boosting LLM performance known as “test time compute.” This involves having an AI model generate many more possible outputs at the moment of inference—when a trained model is given a prompt—and then using some sort of process to select the best answer from those possibilities. The approach uses more computing power during inference, but does not require models to be larger or trained initially on more data. Researchers at OpenAI and elsewhere say test-time compute also follows scaling laws, in which the more time or computing power a model uses during inference, the better its performance.
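As a rough illustration of the idea—not Profluent’s or OpenAI’s actual code—a simple “best-of-N” version of test-time compute can be sketched in a few lines of Python. The generation and scoring functions below are hypothetical placeholders standing in for a real model and a real selection process:

    import random

    def generate_candidate(prompt: str) -> str:
        # Placeholder for one sampled output from a trained model.
        # A real system would call the model with some randomness (temperature > 0) here.
        return f"candidate answer to '{prompt}' #{random.randint(0, 10_000)}"

    def score(answer: str) -> float:
        # Placeholder for a verifier, reward model, or other selection process.
        return random.random()

    def best_of_n(prompt: str, n: int = 16) -> str:
        # Larger n means more compute spent at inference time; the test-time-compute
        # scaling claim is that, on average, the selected answer keeps getting better.
        candidates = [generate_candidate(prompt) for _ in range(n)]
        return max(candidates, key=score)

    if __name__ == "__main__":
        print(best_of_n("Design a stable antibody that binds target X", n=8))

The key design choice is that all the extra work happens after training, at the moment the model is asked a question, rather than by making the model itself bigger.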
The scaling laws Profluent has found for biological AI models, Madani says, are driven mostly by the data used in training. And he said that, unlike with LLMs, when it comes to protein language models, companies are not close to exhausting the supply of available data.
Whereas Google DeepMind’s AlphaFold 3 model—which can predict protein structures, protein-protein binding, and small molecule-protein binding—was trained on 214 million complete protein structures, Profluent trained its largest model on about 3.4 billion protein sequences. This is an order of magnitude more data, Madani said. And, unlike with LLMs, he says there is plenty more data to potentially use. He said Profluent currently had access to databases of some 80 billion protein sequences and that he expected to soon double that amount. He said these sequences are drawn from a mixture of public and proprietary datasets.
“We are at the beginning of a race for us and others too for continued scaling for biology,” Madani said. The company’s current work, he said, essentially “fired the starting gun” on that race and he compared it to OpenAI’s launch of GPT-2, one of its early LLMs, in 2019. “We’re basically on the precipice of GPT-2 right now,” he said. “Extrapolate what will happen. We are going to come out with a lot of exciting things.”
Correction, April 16: An earlier version of this story misspelled the last name of Profluent founder and CEO Ali Madani.