USPTO Patent Grant: Evolutionary thought caching for multi-stage language models
Summary
The USPTO has granted patent US12585882B1, "Evolutionary thought caching for multi-stage language model systems," to ATOBEAM TECHNOLOGIES INC. The patent describes a method for efficient natural language processing that combines large and small language models with a reasoning cache architecture.
What changed
The United States Patent and Trademark Office (USPTO) has issued patent US12585882B1, titled "Evolutionary thought caching for multi-stage language model systems," to ATOBEAM TECHNOLOGIES INC. The patent describes a system and method for natural language processing built on a reasoning cache architecture that combines large and small language models. A large language model generates structured thoughts with associated latent representations, which are cached and then collaboratively evolved by specialized agents using genetic algorithms. When new input arrives, the system retrieves similar cached or evolved thoughts by latent representation similarity and routes them, together with the input, to a smaller language model for response generation, reducing computational overhead and extending effective context.
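For intuition about the claimed flow, here is a minimal Python sketch of the cache-hit/cache-miss routing described above. It is an assumption-laden illustration, not ATOBEAM's implementation: the names (ThoughtCache, large_lm, small_lm, embed), the cosine-similarity retrieval, and the 0.85 threshold are all hypothetical stand-ins.

```python
# Hypothetical sketch of the patent's cache-and-route flow. All names
# (ThoughtCache, large_lm, small_lm, embed) and the similarity threshold
# are illustrative assumptions; the patent does not publish code.
from dataclasses import dataclass

import numpy as np


@dataclass
class Thought:
    text: str           # structured thought produced by the large model
    latent: np.ndarray  # associated latent representation


class ThoughtCache:
    def __init__(self, similarity_threshold: float = 0.85):
        self.thoughts: list[Thought] = []
        self.similarity_threshold = similarity_threshold

    def add(self, thought: Thought) -> None:
        self.thoughts.append(thought)

    def retrieve(self, query_latent: np.ndarray) -> Thought | None:
        """Return the most similar cached thought, if any clears the threshold."""
        best, best_sim = None, self.similarity_threshold
        for t in self.thoughts:
            sim = float(
                np.dot(t.latent, query_latent)
                / (np.linalg.norm(t.latent) * np.linalg.norm(query_latent))
            )
            if sim >= best_sim:
                best, best_sim = t, sim
        return best


def respond(user_input: str, cache: ThoughtCache, large_lm, small_lm, embed) -> str:
    latent = embed(user_input)
    cached = cache.retrieve(latent)
    if cached is None:
        # Cache miss: invoke the large model once, then cache its reasoning.
        cached = Thought(text=large_lm(user_input), latent=latent)
        cache.add(cached)
    # Route the input plus retrieved reasoning to the second, smaller model.
    return small_lm(f"Reasoning: {cached.text}\nInput: {user_input}")
```

The efficiency claim rests on the cache-hit path: once a similar thought is cached, only the cheaper embedding lookup and the small model run, while the expensive large-model call is amortized across sessions.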
This patent grant is primarily of interest to technology companies involved in AI and language model development. It imposes no new regulatory obligations or compliance deadlines, but it is a notable development in the intellectual property landscape for AI. Compliance officers in the technology sector should be aware of it, since it covers techniques for efficient natural language processing and may influence product development and intellectual property strategies.
Source document (simplified)
Evolutionary thought caching for multi-stage language model systems
Grant US12585882B1 (Kind: B1), granted Mar 24, 2026
Assignee
ATOBEAM TECHNOLOGIES INC.
Inventors
Brian Galvin, Alan McCord
Abstract
A system and method for efficient natural language processing combines large and small language models with a reasoning cache architecture. Input data is processed by a first large language model to generate structured thoughts with associated latent representations, which are cached for future use. Specialized agents perform domain-specific operations on cached thoughts and collaboratively evolve them using genetic algorithms. When new input is received, similar cached or evolved thoughts are retrieved based on latent representation similarity. The input and retrieved thoughts are then routed to a second, smaller language model to generate a response. This architecture reduces computational overhead while preserving response quality, enables reuse of reasoning across sessions and devices, and extends effective context beyond traditional sequence limits. By leveraging prior reasoning, the system minimizes redundant computation and supports scalable deployment across diverse hardware environments.
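The abstract states that specialized agents collaboratively evolve cached thoughts using genetic algorithms, but it does not specify the operators. The sketch below assumes one plausible encoding for illustration: thoughts are evolved in latent space with uniform crossover, Gaussian mutation, and elitist selection. All of these operator choices, and the fitness callable, are assumptions rather than the patented method.

```python
# Illustrative genetic-algorithm pass over cached thought latents. The
# crossover, mutation, and selection operators below are assumptions;
# the patent abstract names genetic algorithms but not the operators.
import random

import numpy as np


def crossover(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Uniform crossover: take each latent dimension from one parent at random."""
    mask = np.random.rand(a.shape[0]) < 0.5
    return np.where(mask, a, b)


def mutate(latent: np.ndarray, scale: float = 0.05) -> np.ndarray:
    """Perturb the latent with small Gaussian noise."""
    return latent + np.random.randn(*latent.shape) * scale


def evolve(population: list[np.ndarray], fitness, generations: int = 10,
           elite: int = 2) -> list[np.ndarray]:
    """Evolve thought latents toward higher fitness (population of 4+ assumed)."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:elite]  # elitism: carry the best forward unchanged
        while len(next_gen) < len(population):
            # Breed from the fitter half of the ranked population.
            a, b = random.sample(ranked[: len(ranked) // 2], 2)
            next_gen.append(mutate(crossover(a, b)))
        population = next_gen
    return population
```

In this reading, fitness might score how well a thought's latent matches recurring query clusters, so the cache gradually specializes toward the reasoning that is actually reused; the abstract leaves that objective unspecified.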
CPC Classifications
G06F 40/211; G06F 40/253; G06F 40/268; G06F 40/284; G06F 40/30; G06N 3/08
Filing Date
2025-09-05
Application No.
19321168
Claims
18