TechRadar
Efosa Udinmwen

BitTorrent for LLM? Exo software is a distributed LLM solution that can run even on old smartphones and computers

  • Exo supports LLaMA, Mistral, LLaVA, Qwen, and DeepSeek
  • Can run on Linux, macOS, Android, and iOS, but not Windows
  • AI models needing 16GB RAM can run on two 8GB laptops

Running large language models (LLMs) typically requires expensive, high-performance hardware with substantial memory and GPU power. However, Exo software now looks to offer an alternative by enabling distributed artificial intelligence (AI) inference across a network of devices.

The software allows users to combine the computing power of multiple computers, smartphones, and even single-board computers (SBCs) like Raspberry Pis to run models that would otherwise be inaccessible.

This decentralized approach shares similarities with the SETI@home project, which distributed computing tasks across volunteer machines. By leveraging a peer-to-peer (P2P) network, Exo eliminates the need for a single, powerful system, making AI inference more accessible to individuals and organizations.

How Exo distributes AI workloads

Exo aims to challenge the dominance of large technology companies in AI development. By decentralizing inference, it seeks to give individuals and smaller organizations more control over AI models, similar to initiatives focused on expanding access to GPU resources.

"The fundamental constraint with AI is compute," argues Alex Cheema, co-founder of EXO Labs. "If you don’t have the compute, you can’t compete. But if you create this distributed network, maybe we can."

The software dynamically partitions LLMs across the available devices in a network, assigning model layers based on each machine's available memory and processing power. Supported LLMs include LLaMA, Mistral, LLaVA, Qwen, and DeepSeek.
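The idea of memory-weighted partitioning can be illustrated with a short sketch. This is a hypothetical simplification, not Exo's actual implementation: each device receives a contiguous range of layers proportional to its share of the cluster's total memory.

```python
# Hypothetical sketch of memory-weighted layer partitioning
# (illustrative only; not Exo's actual algorithm).

def partition_layers(total_layers, device_memory_gb):
    """Assign each device a contiguous layer range proportional to its RAM."""
    total_mem = sum(device_memory_gb)
    shards, start = [], 0
    for i, mem in enumerate(device_memory_gb):
        if i == len(device_memory_gb) - 1:
            end = total_layers  # last device takes the remainder
        else:
            end = start + round(total_layers * mem / total_mem)
        shards.append((start, end))
        start = end
    return shards

# Two identical 8GB laptops split a 32-layer model evenly:
print(partition_layers(32, [8, 8]))   # [(0, 16), (16, 32)]
# A 16GB machine paired with an 8GB one takes roughly two thirds:
print(partition_layers(32, [16, 8]))  # [(0, 21), (21, 32)]
```

In practice a scheduler would also weigh compute speed and interconnect bandwidth, but the memory-proportional split captures the core idea described above.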

Users can install Exo on Linux, macOS, Android, or iOS, though Windows support is not currently available. Python 3.12.0 or later is required, along with additional dependencies for Linux systems with NVIDIA GPUs.

One of Exo’s key strengths is that, unlike traditional setups that rely on high-end GPUs, it enables collaboration between different hardware configurations.

For example, an AI model requiring 16GB of RAM can run on two 8GB laptops working together. A more demanding model like DeepSeek R1, requiring approximately 1.3TB of RAM, could theoretically operate on a cluster of 170 Raspberry Pi 5 devices with 8GB RAM each.
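The back-of-the-envelope arithmetic behind those cluster sizes is straightforward. Treating 1.3TB as roughly 1,300GB, a simple ceiling division gives the minimum device count; the article's figure of ~170 leaves some headroom for per-device runtime overhead.

```python
import math

# Rough capacity math for the examples above (illustrative estimate only).
model_ram_gb = 1300   # ~1.3TB, the article's figure for DeepSeek R1
device_ram_gb = 8     # Raspberry Pi 5 with 8GB of RAM

min_devices = math.ceil(model_ram_gb / device_ram_gb)
print(min_devices)  # 163 -- a bare minimum; ~170 allows for OS and
                    # runtime overhead on each device
```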

Network speed and latency are critical concerns, and Exo's developers acknowledge that adding lower-performance devices may increase inference latency, but they insist that overall throughput improves with each device added to the network.
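That latency-versus-throughput tradeoff can be shown with a toy pipeline model (illustrative numbers, not a benchmark of Exo): splitting the layers across devices adds a network hop to every request, but because the stages work concurrently, the slowest stage, not the total, limits steady-state throughput.

```python
# Toy model of pipelined inference across devices (not a real benchmark).

def single_request_latency(stage_ms, hop_ms):
    """One request traverses every stage plus a network hop between stages."""
    return sum(stage_ms) + hop_ms * (len(stage_ms) - 1)

def pipeline_throughput(stage_ms):
    """Steady-state requests/sec, bounded by the slowest stage."""
    return 1000 / max(stage_ms)

one_device  = [100.0]        # all layers on one machine, 100ms of compute
two_devices = [50.0, 50.0]   # layers split evenly, plus a 20ms hop

print(single_request_latency(one_device, 20))   # 100.0 ms
print(single_request_latency(two_devices, 20))  # 120.0 ms -- worse latency
print(pipeline_throughput(one_device))          # 10.0 req/s
print(pipeline_throughput(two_devices))         # 20.0 req/s -- better throughput
```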

Security risks also arise when multiple machines share workloads, requiring safeguards to prevent data leaks and unauthorized access.

Adoption is another hurdle, as developers of AI tools currently rely on large-scale data centers. Exo's low-cost approach may appeal, but it simply won't match the speed of those high-end AI clusters.

Via CNX Software
