Stanford AI Designs 16 Functional Viruses from 302 Synthetic Genomes

Image Credit: Jacky Lee

Researchers at Stanford University and the nonprofit Arc Institute have used artificial intelligence to design fully functional bacteriophages — viruses that infect and kill bacteria — in a proof-of-concept that could eventually sharpen the fight against antibiotic-resistant infections.

The work, described in a September 2025 preprint on bioRxiv, is widely regarded as the first demonstration of AI-generated, fully viable viral genomes designed from scratch and then built and tested in the lab. The team combined large "genome language models" with DNA synthesis and automated screening to produce 302 synthetic phage genomes. Of these, 16 proved to be functional viruses that can infect E. coli and, in some cases, outperform the natural PhiX174 phage on measures such as replication and speed of killing.

Evo 1 and Evo 2: Genome-Scale Language Models

The breakthrough builds on two Arc-developed genome models, Evo 1 and Evo 2.

  • Evo 1, introduced in 2024, was a 7-billion parameter model trained on roughly 300 billion nucleotides from about 2.7 million microbial and phage genomes. It demonstrated early utility in predicting mutation effects and designing CRISPR components.

  • Evo 2, unveiled in February 2025, scales this approach dramatically. It is trained on approximately 9.3 trillion nucleotides from more than 128,000 genomes spanning bacteria, archaea, plants, animals, and humans. Built on the StripedHyena-2 architecture, Evo 2 can handle sequence contexts of up to one million nucleotides, allowing it to capture long-range genomic patterns that conventional Transformer models struggle with. Its training data (the OpenGenome2 corpus) excludes viruses that infect animals and humans as a critical safety measure.

For this project, the team fine-tuned Evo models on a curated set of about 14,000 Microviridae family phage genomes — a group of small single-stranded DNA viruses — while strictly maintaining the exclusion of eukaryotic viruses.

The Method: AI as Genome Architect

The researchers focused on PhiX174, a compact, ~5.4 kilobase bacteriophage whose DNA genome was famously the first to be sequenced in 1977. PhiX174 infects non-pathogenic E. coli C strains and cannot infect humans, making it a standard workhorse in molecular biology and a safe "chassis" for synthetic design.

Using the fine-tuned Evo models, the team generated thousands of candidate DNA sequences inspired by PhiX174 but distinct from naturally occurring variants. From this pool, they selected 302 designs for chemical DNA synthesis.

These synthetic genomes were introduced into E. coli host cells. Sixteen of the designs successfully "booted up", forming plaques — clear zones on bacterial lawns indicating that fully functional phages were replicating and lysing cells. Electron microscopy confirmed that these AI-designed viruses assembled capsids indistinguishable from natural Microviridae.
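
The screening yield described above (16 viable phages out of 302 synthesized designs) can be put in perspective with a quick calculation. The sketch below is illustrative and is not taken from the preprint: it computes the fraction of designs that formed plaques and, under the simplifying assumption that each design behaves as an independent trial, a 95% Wilson score interval around that fraction. The function names are hypothetical.

```python
from math import sqrt


def hit_rate(successes: int, trials: int) -> float:
    """Fraction of synthesized designs that produced viable phage."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    p = hit_rate(successes, trials)
    z2 = z * z
    denom = 1 + z2 / trials
    centre = (p + z2 / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z2 / (4 * trials**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    viable, synthesized = 16, 302  # figures quoted in the article
    low, high = wilson_interval(viable, synthesized)
    print(f"hit rate: {hit_rate(viable, synthesized):.1%}")
    print(f"95% Wilson interval: {low:.1%} to {high:.1%}")
```

The point estimate is a little over 5%, which is why the synthesis and automated screening steps matter: most generated genomes did not yield a working virus.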

Three Standout Artificial Phages

Among the 16 viable designs, three specific variants highlighted the model's ability to optimize different biological traits:

  1. Evo-69 (Super-Replicator): This variant replicated far more strongly than wild-type PhiX174. In head-to-head tests, it achieved roughly 16- to 65-fold amplification over a six-hour window, compared with just 1.3- to 4-fold for the natural virus under the same conditions.

  2. Evo-2483 (Rapid Killer): This design exhibited the fastest lytic kinetics, destroying the bacterial population in 135 minutes, compared to 180 minutes for the natural PhiX174.

  3. Evo-36 (Structural Innovator): Perhaps the most scientifically surprising design, Evo-36 incorporated a "foreign" gene, a DNA packaging protein from the distantly related phage G4, into the PhiX174 scaffold. This demonstrated that the AI could swap and integrate functional modules from different evolutionary lineages, a complex feat in protein engineering.
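
The figures quoted for Evo-69 and Evo-2483 can be turned into rough relative comparisons. The sketch below is a back-of-the-envelope illustration, not an analysis from the preprint: because the article gives only ranges, the ratio is bounded by dividing the low end of one range by the high end of the other and vice versa, which assumes nothing about how individual measurements pair up. The helper names are hypothetical.

```python
def fold_ratio_bounds(
    design: tuple[float, float], reference: tuple[float, float]
) -> tuple[float, float]:
    """Most conservative and most generous ratio between two (low, high) ranges."""
    d_low, d_high = design
    r_low, r_high = reference
    return d_low / r_high, d_high / r_low


def relative_reduction(baseline: float, value: float) -> float:
    """Fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


if __name__ == "__main__":
    # Six-hour fold amplification ranges quoted in the article.
    evo69 = (16.0, 65.0)
    wild_type = (1.3, 4.0)
    low, high = fold_ratio_bounds(evo69, wild_type)
    print(f"Evo-69 vs wild type: between {low:.0f}x and {high:.0f}x")

    # Time to lyse the bacterial population, in minutes.
    faster = relative_reduction(baseline=180, value=135)
    print(f"Evo-2483 lysis time reduction: {faster:.0%}")
```

Even the conservative bound puts Evo-69 several-fold ahead of the wild type, while Evo-2483 cuts the time to lysis by a quarter.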

Overcoming Resistance

To probe robustness, the team challenged the AI-designed phages with three lab-evolved, PhiX174-resistant E. coli strains. These bacteria carry mutations in the waa operon that alter their surface lipopolysaccharides (LPS) and block the usual entry route for PhiX174.

A cocktail of different Evo-designed phages, when propagated for a few passages on these resistant strains, successfully evolved mutations that allowed them to bypass the block. In contrast, the original PhiX174 struggled or failed to adapt under the same conditions. The result suggests that the genetic diversity provided by the AI-generated pool offers a richer starting point for overcoming bacterial resistance than natural isolates alone.

Roots in a Century-Old Idea

Bacteriophage therapy is not new. In 1917, Félix d’Herelle at the Pasteur Institute identified viruses that prey on bacteria and tested them against dysentery. However, the rise of mass-produced antibiotics in the 1940s pushed phage therapy to the margins of Western medicine.

Interest rebounded in the 2010s as antimicrobial resistance (AMR) became a global health crisis. An analysis of 2019 data estimated that 1.27 million deaths worldwide that year were directly attributable to drug-resistant infections. While synthetic biology has allowed researchers to tweak natural phages for years, Evo represents a shift from "editing" to "generative design".

Potential Gains and Major Hurdles

The Evo phages’ performance hints at how AI could reshape therapy:

  • Rapid Personalization: AI could potentially generate bespoke phage designs for a patient's specific infection in days, rather than the months currently required to hunt for natural phages in sewage or soil samples.

  • "One Health" Applications: Beyond humans, these tools could design phages to control agricultural pathogens like Salmonella or treat wastewater.

Significant hurdles remain:

  • Scaling Complexity: PhiX174 is tiny (~5.4 kb). Designing larger phages (often 40–150 kb) or eukaryotic viruses involves far more complex gene regulation and host interactions.

  • Biosafety and Dual-Use: Although the team deliberately restricted training data to non-human viruses, the ability to program viral genomes raises "dual-use" concerns. Biosecurity experts warn that as these tools become more accessible, the barrier to creating harmful agents could lower, necessitating strict oversight on DNA synthesis and model access.

  • Regulatory Path: The study is a preprint and has not yet completed peer review. Translating AI viruses into approved drugs will require entirely new regulatory frameworks.

Peers in the AI Biology Race

Evo stands out for generating whole genomes that function in vivo.

  • ProGen (Salesforce): Generates functional proteins/enzymes but not whole genomes.

  • AlphaFold (Google DeepMind): Predicts protein structure (3D shape) but does not generate genetic sequence (DNA) or assemble genomes.

  • HyenaDNA: An earlier long-context DNA model, but one that primarily focused on prediction tasks rather than generation of viable organisms.

For now, the "Evo phages" serve as a vivid demonstration of a new reality: AI systems no longer just "read" biological sequences; they can also write entirely new genomes that work in living cells.

TheDayAfterAI News

