Patent US12585934B2: Compressing tokens for transformer models

ChangeBridge: Patent Grants - AI & Computing (G06N)
Published March 24th, 2026
Detected March 24th, 2026

Summary

The USPTO has granted patent US12585934B2 to Microsoft Technology Licensing, LLC, for a method of compressing tokens based on positions in the data used to train transformer models. The method merges duplicate tokens in the training input and combines their position values.

What changed

Patent US12585934B2, granted on March 24, 2026, to Microsoft Technology Licensing, LLC, describes a method for compressing tokens based on their positions in the data used to train transformer models. The method identifies duplicate tokens, combines their position values, and removes the redundant tokens to produce the training data. Removing duplicates leaves fewer tokens in each training input.

This is a patent grant, not a regulatory rule, so it imposes no direct compliance obligations. Companies that develop or use transformer models, particularly in the technology sector, may want to review the claims to assess infringement risk. The claims cover specific methods of preprocessing data for AI training, which could influence development and licensing strategies in AI computing.

Source document (simplified)

Compressing tokens based on positions for transformer models

Grant: US12585934B2 | Kind: B2 | Mar 24, 2026

Assignee

Microsoft Technology Licensing, LLC

Inventors

Andy Wagner, Tiyasa Mitra, Marc Tremblay

Abstract

Embodiments of the present disclosure include systems and methods for compressing tokens based on positions for training data that is used to train transformer models. In some embodiments, a set of input data for training a transformer model is received. The set of input data comprises a set of tokens and a set of position values. A first token in the set of tokens that is the same as a second token in the set of tokens is identified. The position value representing the first token is combined with the position value representing the second token. The set of tokens is modified by removing the first token from the set of tokens. A set of training data is generated to comprise the modified set of tokens and the set of position values. The transformer model is trained using the set of training data.
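
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `compress_tokens`, the choice to keep the first occurrence of each token, and the representation of "combined" position values as a list per surviving token are all assumptions made for illustration. The abstract does not specify how position values are combined or which duplicate survives.

```python
from typing import Hashable, List, Sequence, Tuple


def compress_tokens(
    tokens: Sequence[Hashable], positions: Sequence[int]
) -> Tuple[List[Hashable], List[List[int]]]:
    """Merge duplicate tokens and pool their position values.

    Hypothetical helper: keeping the first occurrence and storing the
    combined positions as a list are assumptions, not patent details.
    """
    if len(tokens) != len(positions):
        raise ValueError("tokens and positions must have the same length")

    index_of = {}            # token -> index in the compressed output
    kept_tokens = []         # tokens with duplicates removed
    combined_positions = []  # one list of positions per kept token

    for token, position in zip(tokens, positions):
        if token in index_of:
            # Duplicate: drop the token, keep its position with the survivor.
            combined_positions[index_of[token]].append(position)
        else:
            index_of[token] = len(kept_tokens)
            kept_tokens.append(token)
            combined_positions.append([position])

    return kept_tokens, combined_positions


tokens = ["the", "cat", "saw", "the", "dog"]
positions = [0, 1, 2, 3, 4]
print(compress_tokens(tokens, positions))
# (['the', 'cat', 'saw', 'dog'], [[0, 3], [1], [2], [4]])
```

In this example the repeated token "the" appears once in the output and carries both of its original positions, so the sequence shrinks from five tokens to four without discarding where each occurrence sat.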

CPC Classifications

G06F 16/243, G06F 18/214, G06F 40/20, G06F 40/284, G06N 3/045, G06N 3/08, G06N 3/084, G06N 3/049, G06N 3/00

Filing Date

2020-07-21

Application No.

16935089

Claims

20

Classification

Agency
USPTO
Published
March 24th, 2026
Instrument
Rule
Legal weight
Binding
Stage
Final
Change scope
Minor
Document ID
US12585934B2

Who this affects

Applies to
Technology companies
Industry sector
5112 Software & Technology
Activity scope
AI Model Training
Geographic scope
United States (US)

Taxonomy

Primary area
Artificial Intelligence
Operational domain
IT Security
Topics
Machine Learning, Data Compression
