
Bridging LLMs of Differing Sizes to Reduce Latency

USPTO Patent Applications - AI & Computing (G06N)

Summary

The USPTO published patent application US20260099528A1, naming Brett Barros as inventor, which describes methods for reducing LLM latency. A smaller LLM generates content responsive to a user query, and a portion of that content is rendered as an immediate response. A larger LLM then generates content that starts with that portion and adds a refined portion, which is rendered after it. In alternative implementations, the immediate response comes from a default text string or from a predefined template selected via natural language understanding of the user query.

Published by USPTO on changeflow.com. Detected, standardized, and enriched by GovPing.

What changed

USPTO published patent application US20260099528A1 for LLM latency reduction technology. The application discloses methods where a smaller LLM generates initial content responsive to user queries, allowing immediate rendering of a portion as a response. A larger LLM then generates refined content beginning with that portion and including additional refined content. Alternative embodiments describe using default text strings or templates selected via natural language understanding instead of a smaller LLM.
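The flow described above can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: `small_llm`, `large_llm`, `first_sentence`, and `bridged_response` are hypothetical names, both models are stubs that return canned text, and cutting the draft at the first sentence is an assumption, since the summary does not say how the portion is chosen.

```python
from typing import Callable, Iterator


def small_llm(query: str) -> str:
    """Hypothetical fast model (stub): returns a quick draft answer."""
    return "To reset your router, start by unplugging it. Then turn it on."


def large_llm(query: str, prefix: str) -> str:
    """Hypothetical slow model (stub): returns content that starts with `prefix`."""
    return prefix + " Wait 30 seconds, then plug it back in."


def first_sentence(text: str) -> str:
    """Assumed portion rule: keep everything up to and including the first period."""
    end = text.find(".")
    return text if end == -1 else text[: end + 1]


def bridged_response(
    query: str,
    small: Callable[[str], str] = small_llm,
    large: Callable[[str, str], str] = large_llm,
) -> Iterator[str]:
    """Yield the immediate portion first, then the larger model's refined portion."""
    portion = first_sentence(small(query))
    yield portion  # can be rendered right away

    full = large(query, portion)
    if full.startswith(portion):
        # Emit only the text that succeeds the already-rendered portion.
        yield full[len(portion):]
    else:
        # Assumed fallback: the larger model ignored the prefix, so emit its
        # whole output after a separating space.
        yield " " + full


if __name__ == "__main__":
    for chunk in bridged_response("How do I reset my router?"):
        print(chunk, end="", flush=True)
    print()
```

A caller would render each yielded chunk as it arrives; joining the chunks gives the complete response.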

Technology companies developing LLM-based applications or chatbots may benefit from reviewing this filing to understand the potential claims around latency reduction techniques. Because this is a published application rather than a granted patent, it has no immediate compliance implications.

Archived snapshot

Apr 18, 2026

GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.


LLM LATENCY REDUCTION VIA BRIDGING MULTIPLE LLMS OF DIFFERING SIZES

Application US20260099528A1 Kind: A1 Apr 09, 2026

Inventors

Brett Barros

Abstract

Implementations utilize a smaller LLM to generate content responsive to a user query and cause a portion of the generated content to be rendered as an immediate response to the user query. Implementations further utilize a larger LLM to generate content that starts with the portion of the generated content and that includes a refined portion succeeding the portion of the generated content. The refined portion can be rendered succeeding the portion of the generated content. In some implementations, instead of using the smaller LLM, alternatively, the portion of the generated content rendered as the immediate response can be generated based on a default text string or a template, where the template can be determined/selected from a plurality of predefined templates based on a natural language understanding of the user query.
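The template alternative in the abstract might look like the sketch below. The regular-expression matching is only a stand-in for natural language understanding, and the intent names, template texts, `DEFAULT_OPENING`, and `immediate_response` are all hypothetical; none of them come from the application.

```python
import re

# Assumed default text string, used when no template matches the query.
DEFAULT_OPENING = "Sure, here is what I found."

# Assumed set of predefined templates, keyed by a hypothetical intent label.
TEMPLATES = {
    "how_to": "Here are the steps to {topic}:",
    "definition": "Here is a short explanation of {topic}:",
}

# Crude stand-in for NLU: one pattern per intent, capturing the query topic.
PATTERNS = [
    ("how_to", re.compile(r"^how (?:do i|can i|to)\s+(?P<topic>.+?)\??$", re.IGNORECASE)),
    ("definition", re.compile(r"^what (?:is|are)\s+(?P<topic>.+?)\??$", re.IGNORECASE)),
]


def immediate_response(query: str) -> str:
    """Pick a template from the query's apparent intent, else use the default string."""
    cleaned = query.strip()
    for intent, pattern in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return TEMPLATES[intent].format(topic=match.group("topic"))
    return DEFAULT_OPENING


if __name__ == "__main__":
    print(immediate_response("How do I reset my router?"))
    print(immediate_response("Tell me a joke"))
```

The returned string plays the role of the immediately rendered portion; the larger LLM would then be asked to generate content that starts with it.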

CPC Classifications

G06F 16/3344, G06F 16/338, G06F 40/289, G06F 40/35, G06N 3/0475

Filing Date

2025-12-11

Application No.

19416474



About this page

What is GovPing?

Every important government, regulator, and court update from around the world. One place. Real-time. Free.

What's from the agency?

Source document text, dates, docket IDs, and authority are extracted directly from USPTO.

What's AI-generated?

The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.


Classification

Agency: USPTO
Published: April 9th, 2026
Instrument: Notice
Legal weight: Non-binding
Stage: Final
Change scope: Minor
Document ID: US20260099528A1
Docket: 19416474

Who this affects

Applies to: Technology companies, Manufacturers
Industry sector: 5112 Software & Technology
Activity scope: Patent application, LLM technology
Geographic scope: United States (US)

Taxonomy

Primary area: Intellectual Property
Operational domain: Legal
Topics: Artificial Intelligence, Software & Technology
