Pyramid Key-Value Cache Compression for Transformer Models
Summary
USPTO published patent application US20260099695A1 on April 9, 2026, for a method of operating transformer models with algorithmic key-value cache memory allocation across decoding layers. The invention allocates a fixed memory budget progressively across layers, with higher layers receiving smaller cache allocations. Each layer independently determines maximum key-value vector pairs based on its allocated cache.
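The application does not pin down the exact allocation formula, only that higher layers receive progressively smaller shares of a fixed budget. A minimal sketch, assuming a linearly decreasing (arithmetic) split as one illustrative scheme:

```python
def pyramid_budgets(total_budget: int, num_layers: int) -> list[int]:
    """Split a fixed KV-cache budget across decoding layers so that
    progressively higher (later) layers receive progressively smaller
    allocations.

    The linearly decreasing split used here is an assumption for
    illustration; the application only requires that allocations
    decrease with layer depth.
    """
    # Weight layer i by (num_layers - i), so layer 0 gets the largest share.
    weights = [num_layers - i for i in range(num_layers)]
    total_weight = sum(weights)
    budgets = [total_budget * w // total_weight for w in weights]
    # Distribute any rounding remainder to the lowest layers,
    # preserving the nonincreasing order.
    remainder = total_budget - sum(budgets)
    for i in range(remainder):
        budgets[i % num_layers] += 1
    return budgets
```

Each layer would then size its cache from its own entry in the returned list, independently of the others.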
What changed
The application claims a system that allocates a fixed key-value cache memory budget across a transformer model's decoding layers, with progressively higher layers receiving progressively smaller allocations; each layer independently caps the number of key-value vector pairs it retains during token decoding operations.
This publication affects AI researchers, machine learning engineers, and technology companies developing transformer-based models. If granted, the patent would provide intellectual property protection for cache compression techniques relevant to optimizing large language model inference and deployment. The publication itself carries no compliance deadlines or regulatory obligations.
Archived snapshot
On Apr 17, 2026, GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.
PYRAMID KEY-VALUE CACHE COMPRESSION FOR TRANSFORMER MODELS
Application: US20260099695A1 · Kind: A1 · Published: Apr 09, 2026
Inventors
Wen XIAO, Wei XIONG, Abedelkader ASI, Zefan CAI
Abstract
A method for operating a transformer model includes algorithmically allocating a fixed budget for a key-value cache between multiple decoding layers per an allocation scheme that ensures progressively higher decoding layers in the transformer model are allocated progressively smaller quantities of cache memory. The method further includes configuring each of the multiple decoding layers of the transformer model to retain no more than a maximum number of key-value vector pairs in the key-value cache during a token decoding operation, the maximum number of key-value vector pairs being independently determined for each decoding layer of the multiple decoding layers based on the cache memory that is allocated to the decoding layer.
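The abstract requires only that each decoding layer retain no more than its independently determined maximum number of key-value pairs; it does not specify what happens when that cap is reached. A minimal sketch of per-layer cap enforcement, assuming a simple evict-oldest policy (the eviction rule is an illustrative assumption, not the claimed method):

```python
class LayerKVCache:
    """A per-layer key-value cache capped at a fixed number of pairs.

    Enforces the abstract's constraint that a layer retains no more
    than its allocated maximum of key-value vector pairs during token
    decoding. The evict-oldest policy is an assumption for
    illustration only.
    """
    def __init__(self, max_pairs: int):
        self.max_pairs = max_pairs
        self.pairs = []  # list of (key_vector, value_vector) tuples

    def append(self, key, value):
        """Add a new KV pair, evicting the oldest if over budget."""
        self.pairs.append((key, value))
        if len(self.pairs) > self.max_pairs:
            self.pairs.pop(0)  # cap is enforced after every append
```

Under the claimed scheme, a model would instantiate one such cache per decoding layer, each with a different `max_pairs` derived from that layer's share of the fixed overall budget.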
CPC Classifications
G06N 3/045
Filing Date
2024-10-09
Application No.
18910974
About this page
Source document text, dates, docket IDs, and authority are extracted directly from USPTO.
The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.