Method for Accelerating LLM Inference Procedures

ChangeBridge: Patent Apps - AI & Computing (G06N)
Filed September 26th, 2025; published April 2nd, 2026
Detected April 3rd, 2026

Summary

The USPTO published patent application US20260094028A1 by MEDIATEK INC., which discloses a method for accelerating large language model (LLM) inference through draft token generation, rule-based determination, and matching operations. The method aims to make LLM inference more computationally efficient by using up to two drafting procedures followed by a matching step. The application was filed September 26, 2025.

What changed

MEDIATEK INC. filed a patent application disclosing a method for accelerating LLM inference procedures. The method involves generating draft tokens through a first drafting procedure, determining whether a first rule tied to the acceleration procedure is met, generating second draft tokens through a second drafting procedure if the rule is not met, inputting the resulting formal draft tokens to the LLM to generate target tokens, and performing a matching operation between the formal draft tokens and the target tokens to produce the output tokens. The application is classified under CPC G06N 5/04 and G06F 40/284, with Application No. 19340885.

Technology companies developing or deploying large language models should monitor this application's prosecution. Patent applications create no immediate compliance obligations, but if the patent is granted, the claimed technique may become relevant for companies implementing LLM inference acceleration. Patent prosecution typically spans 2-3 years before a grant or rejection decision. No action is required at this stage.

Source document (simplified)

METHOD FOR PERFORMING ACCELERATION PROCEDURE TO ACCELERATE INFERENCE PROCEDURE OF LARGE LANGUAGE MODEL

Application US20260094028A1 Kind: A1 Apr 02, 2026

Assignee

MEDIATEK INC.

Inventors

Yue-Ting Pan, Huai-Ting Li, Yi-Min Tsai, Ya-Lin Huang, I-Lin Chen

Abstract

A method for performing an acceleration procedure to accelerate an inference procedure of a large language model (LLM) includes: performing a first drafting procedure to generate multiple first draft tokens; according to first draft information related to the multiple first draft tokens, determining whether a first rule is met to generate a first determination result, wherein the first rule corresponds to the first acceleration procedure; in response to the first determination result indicating that the first rule is not met, performing a second drafting procedure to generate multiple second draft tokens; obtaining multiple formal draft tokens at least based on the multiple second draft tokens; inputting the multiple formal draft tokens to the LLM in order to generate multiple target tokens; and performing a matching operation upon the multiple formal draft tokens and the multiple target tokens to generate at least one output token of the LLM.
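
The control flow in the abstract resembles speculative decoding, and a toy sketch may help make the steps concrete. The Python below is an illustration under stated assumptions, not the applicant's implementation: every function name is hypothetical, the "first rule" is stood in for by a simple draft-length check, the matching operation is assumed to be greedy prefix matching, and the branch taken when the rule is met (using the first draft tokens as the formal draft tokens) is inferred rather than stated in the abstract. The "LLM" is a toy function that always predicts the previous token plus one, modulo 10.

```python
from typing import Callable, List, Sequence

Token = int
NUM_DRAFT = 4  # assumed draft length; not specified in the abstract

# Hypothetical type aliases; the application does not name these interfaces.
Drafter = Callable[[Sequence[Token], int], List[Token]]
Rule = Callable[[Sequence[Token]], bool]
# Given the context and k draft tokens, returns k + 1 target tokens:
# entry i is the LLM's choice after seeing context + draft[:i].
TargetModel = Callable[[Sequence[Token], Sequence[Token]], List[Token]]


def match_tokens(draft: Sequence[Token], target: Sequence[Token]) -> List[Token]:
    """Assumed matching operation: keep the longest draft prefix that agrees
    with the target tokens, then append the target token that follows it."""
    accepted: List[Token] = []
    for d, t in zip(draft, target):
        if d != t:
            break
        accepted.append(d)
    accepted.append(target[len(accepted)])
    return accepted


def accelerated_step(context: Sequence[Token], first_drafter: Drafter,
                     second_drafter: Drafter, rule_is_met: Rule,
                     target_model: TargetModel) -> List[Token]:
    """One acceleration step: draft, check the rule, redraft if needed, verify."""
    first_draft = first_drafter(context, NUM_DRAFT)
    if rule_is_met(first_draft):
        formal = list(first_draft)  # assumption: abstract covers only the other branch
    else:
        formal = second_drafter(context, NUM_DRAFT)
    target = target_model(context, formal)
    return match_tokens(formal, target)


# --- Toy stand-ins, purely for illustration ---
def toy_target(context: Sequence[Token], draft: Sequence[Token]) -> List[Token]:
    """Toy 'LLM': the next token is always (previous token + 1) mod 10."""
    out = []
    for i in range(len(draft) + 1):
        prefix = list(context) + list(draft[:i])
        out.append((prefix[-1] + 1) % 10)
    return out


CACHE = {3: [4, 5, 6, 7]}  # cached continuation keyed by the last context token


def lookup_drafter(context: Sequence[Token], n: int) -> List[Token]:
    """Cheap first drafter: returns a cached continuation, or nothing."""
    return CACHE.get(context[-1], [])[:n]


def small_model_drafter(context: Sequence[Token], n: int) -> List[Token]:
    """Second drafter: guesses two tokens correctly, then guesses 0."""
    last = context[-1]
    return [(last + 1) % 10, (last + 2) % 10, 0, 0][:n]


def full_length_rule(draft: Sequence[Token]) -> bool:
    """Stand-in for the first rule: met when the first draft is full length."""
    return len(draft) >= NUM_DRAFT


print(accelerated_step([1, 2, 3], lookup_drafter, small_model_drafter,
                       full_length_rule, toy_target))  # [4, 5, 6, 7, 8]
print(accelerated_step([1, 2, 5], lookup_drafter, small_model_drafter,
                       full_length_rule, toy_target))  # [6, 7, 8]
```

In the first call the cached draft satisfies the stand-in rule and all four draft tokens are accepted, so a single pass through the toy LLM yields five output tokens. In the second call the cache misses, the second drafter runs, and only its first two tokens survive matching before the LLM's own token replaces the mismatch.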

CPC Classifications

G06N 5/04, G06F 40/284

Filing Date

2025-09-26

Application No.

19340885

Named provisions

Acceleration Procedure, Drafting Procedure, Matching Operation, First Rule Determination

Classification

Agency: USPTO
Published: April 2nd, 2026 (filed September 26th, 2025)
Instrument: Notice
Legal weight: Non-binding
Stage: Draft
Change scope: Minor
Document ID: US20260094028A1

Who this affects

Applies to: Technology companies
Industry sector: 5112 Software & Technology
Activity scope: Patent Applications, AI/ML Technology Development
Geographic scope: United States (US)

Taxonomy

Primary area: Artificial Intelligence
Operational domain: Legal
Topics: Machine Learning, Patent Applications, Technology
