Get all your news in one place.
100’s of premium titles.
One app.
Start reading
Fortune
Fortune
Jeremy Kahn

AI has struggled with spreadsheets and tables. A German startup's breakthrough could change that

Prior Labs cofounders, from left: Sauraj Gambhir, Frank Hutter, and Noah Hollman. (Credit: Photo courtesy of Prior Labs)

A lot of information inside companies is what’s known as “tabular data,” or data that is presented in rows and columns. Think spreadsheets and database entries and lots of figures in reports.

Well, it turns out that artificial intelligence models have difficulty working with tabular data, for several reasons. It's often a confusing jumble—sometimes text and sometimes numbers, as well as numbers in different units of measurement. What’s more, the relationship between different cells in the table is sometimes unclear. Knowing which cells influence which other cells in a table often requires domain expertise.

For years, machine learning researchers have been trying to crack this tabular data problem. Now, a group of researchers has found what they claim is an elegant solution: A large foundation model—similar to the large language models that underpin products like OpenAI’s ChatGPT—but specifically trained on tabular data. This pre-trained model can then be applied to any tabular data set, and with just a few examples, make accurate inferences about the relationship between data in various cells and also predict missing data better than any prior machine learning method.

Frank Hutter and Noah Hollman, two Germany-based computer scientists who helped pioneer this technique and recently published a paper on it in the prestigious scientific journal Nature, have teamed with Sauraj Gambhir, who has experience in finance, on a startup called Prior Labs dedicated to commercializing this technology.

Today Prior Labs, which is based in Freiburg, Germany, announced it has received 9 million euros ($9.3 million) in pre-seed funding. The round is led by London-based venture capital firm Balderton Capital along with XTX Ventures, SAP founder Hans Werner-Hector’s Hector Foundation, Atlantic Labs, and Galion.exe. A number of prominent angel investors, including Hugging Face cofounder and chief scientist Thomas Wolf, Guy Podjarny, who founded Snyk and Tessl, and Ed Grefenstette, a well-known DeepMind researcher, also participated in the funding.

“Tabular data is the backbone of science and business, yet the AI revolution transforming text, images and video has had only a marginal impact on tabular data–until now,” James Wise, a partner at Balderton Capital, said in a statement, explaining why the firm decided to invest in Prior Labs.

The model Prior Labs used for its Nature study is called a Tabular Prior-Fitted Network (TabPFN for short.) But TabPFN is trained only on the numerical data in tables, not the text. Hutter, a well-known AI researcher formerly at the University of Freiburg and the Ellis Institute Tubingen, said Prior Labs wants to take this model and make it multimodal, so that it can understand both numbers and text. Then the model will be able to understand column headings and reason about them, and users will be able to interact with the AI system using natural language prompts, just like an LLM-based chatbot.

Today’s LLM’s, even the more advanced reasoning models, such as OpenAI's o3 model, can answer some questions about what a table says, but they can't make accurate predictions based on an analysis of the data in the table. “LLMs are just horrible at that,” Hutter said. “It’s like, it's nowhere close. It's not only that, it's also super slow.” As a result, most people who needed to analyze this kind of data used older statistical methods that were fast, but not always the most accurate.

But Prior Labs’ TabPFN can make accurate predictions, including on what are called time series, where past data is used to predict the next most likely data point based on complex patterns. In a new paper the Prior Labs team published in January on the non-peer reviewed research repository arxiv.org, the team found that TabPFN outperformed existing time series prediction models. It beat the best previous small AI model for such predictions by 7.7% and beat a model that is 65 times larger than TabPFN by 3%.

Time series prediction has many applications across industries, but especially in medical and financial domains. “Hedge funds love us,” Hutter said. (One of Prior Labs’ initial customers is, in fact, a hedge fund, but Hutter said he was contractually barred from saying which one. Another initial customer with which Hutter is doing a proof of concept is software giant SAP.)

Prior Labs is offering TabPFN as an open source model—with the only license requirement being that if people use the model, they must publicly say so. So far, it has been downloaded about one million times, according to Hutter. Like most open source AI companies, Prior Labs plans to make money by working with specific customers to help them tailor the models to their use case and also by building tools and applications for specific market segments.

Prior Labs is not the only company working to crack AI's limits when it comes to tabular data. Startups Ikigai Labs, which was founded by MIT data scientist Devarat Shah, and French startup Neuralk AI are among others working on applying deep learning methods, including generative AI, to tabular data. Researchers at Google and Microsoft have also been working on this problem. Google Cloud's tabular data solutions are built in part on AutoML, a process that uses machine learning to automate the steps needed to create effective AI models, an area that Hutter helped pioneer.

Hutter said Prior intends to keep improving its models, working more on relational databases, time series, and building the ability to do what is called “causal discovery”—where a user asks which data points in a table have a causal relationship with other data in the table. Then there’s the chat feature that will let users ask questions of the tables using a chat-like interface. “All of this we will build in the first year,” he said.

Sign up to read this article
Read news from 100’s of titles, curated specifically for you.
Already a member? Sign in here
Related Stories
Top stories on inkl right now
One subscription that gives you access to news from hundreds of sites
Already a member? Sign in here
Our Picks
Fourteen days free
Download the app
One app. One membership.
100+ trusted global sources.