Metagenomi Slashes AI Costs by 56% Using AWS Custom Chips for Gene-Editing Research


AI-Powered Gene Editing Breakthrough

Gene editing startup Metagenomi has achieved substantial cost savings in its artificial intelligence operations by switching to Amazon Web Services’ custom silicon, according to company statements. Sources indicate the biotech firm reduced its AI computing expenses by 56 percent compared to its previous Nvidia GPU-based systems while accelerating the discovery of potentially life-saving therapies.


CRISPR Technology and Therapeutic Potential

Founded in 2018, Metagenomi utilizes the Nobel Prize-winning CRISPR technology developed by Jennifer Doudna and Emmanuelle Charpentier, which enables precise editing of gene sequences. According to Chris Brown, VP of discovery at Metagenomi, this approach represents “a new therapeutic modality aimed at treating disease by addressing the cause of disease at the genetic level.”

Analysts suggest this methodology represents a significant shift from treating symptoms to potentially curing diseases at their genetic roots. The company’s research focuses on identifying specialized enzymes that can target specific DNA sequences, cut them at precise locations, and fit within delivery mechanisms for therapeutic applications.

Protein Language Models Accelerate Discovery

To identify these rare enzymes, Metagenomi employs generative AI systems known as protein language models (PLMs), including Progen2. These models rapidly generate millions of potential enzyme candidates, dramatically increasing the odds of discovering viable therapeutic options.

“It’s about finding that one thing in a million. So if you’ve got access to twice as many, you’re doubling your chances of potentially getting a product at the end,” Brown explained in the report.
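Brown’s “doubling your chances” intuition holds as long as hits are rare relative to the pool size. A minimal sketch of the underlying probability (the one-in-a-million hit rate is taken from his quote; the candidate counts are illustrative):

```python
def p_at_least_one_hit(p_hit, n_candidates):
    """Probability that at least one of n independent candidates is a hit.

    Equal to 1 - (1 - p)^n, which is approximately n * p while n * p << 1 --
    so doubling the candidate pool roughly doubles the odds of a product.
    """
    return 1 - (1 - p_hit) ** n_candidates

P_HIT = 1e-6  # "one thing in a million"

small_pool = p_at_least_one_hit(P_HIT, 1_000)   # ~0.0010
double_pool = p_at_least_one_hit(P_HIT, 2_000)  # ~0.0020, roughly twice the odds
print(f"1,000 candidates: {small_pool:.4%}")
print(f"2,000 candidates: {double_pool:.4%}")
```

The approximation breaks down once the expected number of hits approaches one, but at the scales where a viable enzyme is genuinely rare, generating twice as many candidates buys close to twice the chance of success.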

Developed in 2022 by researchers at Salesforce, Johns Hopkins University, and Columbia University, Progen2 functions similarly to text-generating AI models but synthesizes novel protein sequences instead of text. With approximately 800 million parameters, the model is considerably smaller than contemporary large language models, making it suitable for running on a variety of accelerator types.

AWS Inferentia 2 Versus Nvidia L40S

For its comparative analysis, Metagenomi tested AWS’s Inferentia 2 accelerators against Nvidia’s L40S GPUs, which the company had previously used for running Progen2. The report states that while Nvidia’s L40S offers superior specifications on paper with 48GB of GDDR6 memory and 362 teraFLOPS of 16-bit performance, AWS’s Inferentia 2 delivered significantly better cost efficiency.

Amazon representatives reportedly attribute these savings to an optimized batch-processing pipeline built on AWS Batch and to strategic use of Spot Instances. According to Kamran Khan, head of business development for AWS’s Annapurna Labs machine learning team, “Spot Instances are generally 70-ish percent lower cost than on demand,” and the company’s workflow optimization allowed around-the-clock experimentation scheduling.

Improved Reliability and Research Output

Additional savings reportedly stemmed from greater instance availability. Sources indicate the interruption rate for AWS’s custom chips is approximately five percent, compared to 20 percent for Nvidia-based spot instances. This reliability improvement means fewer disrupted research batches, allowing for more consistent experimentation.
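The combined effect of the Spot discount and the differing interruption rates can be sketched with back-of-envelope arithmetic. The ~70 percent discount and the 5 percent versus 20 percent interruption figures come from the article; the $10-per-batch on-demand baseline is a hypothetical placeholder, and the model assumes an interrupted batch is lost entirely and must be rerun:

```python
def effective_cost_per_batch(on_demand_price, spot_discount, interruption_rate):
    """Expected cost to complete one batch on Spot capacity.

    Assumes an interrupted batch is a total loss that must be rerun, so the
    expected number of attempts per completed batch is 1 / (1 - interruption_rate).
    """
    spot_price = on_demand_price * (1 - spot_discount)
    expected_attempts = 1 / (1 - interruption_rate)
    return spot_price * expected_attempts

# Hypothetical $10/batch on-demand baseline applied to both chip types.
inf2 = effective_cost_per_batch(10.0, 0.70, 0.05)  # AWS Inferentia 2: ~5% interruptions
l40s = effective_cost_per_batch(10.0, 0.70, 0.20)  # Nvidia L40S spot: ~20% interruptions

print(f"Inferentia 2: ${inf2:.2f} per completed batch")
print(f"L40S:         ${l40s:.2f} per completed batch")
```

Under these assumptions the lower interruption rate alone makes each completed batch roughly 16 percent cheaper on the custom chips, before any difference in per-hour pricing or throughput is considered.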

For Metagenomi’s research team, the reduced operating costs have translated directly into increased scientific output. Brown noted that projects that would have been annual undertakings can now be run “multiple times a day or a week,” significantly accelerating the discovery process for enzymes targeting various diseases.

Broader Implications for AI Workloads

The collaboration highlights an important trend in AI computing: for non-interactive workloads, the latest hardware isn’t necessarily the most cost-effective solution. According to analysts, older or specialized accelerators available at discounted rates may offer superior value for specific applications, particularly in research environments where immediate real-time processing isn’t required.

This case study suggests that cloud providers’ custom silicon solutions are becoming increasingly competitive for specialized AI workloads, potentially challenging Nvidia’s dominance in certain market segments. The reported 56 percent cost reduction could influence other research organizations and startups to evaluate alternative accelerator options for their AI inference workloads.



