Attention Mechanism Adjustment Method Based on Attention Score and Computing Device
Summary
The USPTO published patent application US20260093988A1 by Industrial Technology Research Institute for an attention mechanism adjustment method in Transformer models. The invention performs cross-head column-wise aggregation on the attention score matrices to determine token importance, then prunes less important tokens before the softmax operation, aiming to reduce computational complexity while maintaining model performance.
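The filing does not define the aggregation function itself, but one natural reading of "cross-head column-wise aggregation" is to collapse each head's pre-softmax score matrix along its rows and then average across heads, yielding one importance score per token (one per column). As a hedged sketch, with $H$ attention heads, sequence length $n$, and per-head score matrices $S^{(h)}$:

$$s_j = \frac{1}{H}\sum_{h=1}^{H}\sum_{i=1}^{n} S^{(h)}_{ij}, \qquad j = 1, \dots, n$$

Here $s_j$ is the importance score of token $j$; the sum-then-average choice is an assumption for illustration, not a detail confirmed by the application.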
What changed
Industrial Technology Research Institute filed patent application US20260093988A1 for an attention mechanism adjustment method applicable to Transformer models. The method obtains query, key, and value matrices from the input sequence and uses the self-attention module to generate per-head attention score matrices, then performs cross-head column-wise aggregation on those matrices to create a token importance vector. The importance scores are compared against a trained threshold to identify tokens for pruning, and the pruned attention score matrices are passed through the softmax function, which zeroes out the probability values of the removed tokens.
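The claims recite steps rather than an implementation, so the PyTorch sketch below is only one plausible rendering: summing each head's scores over rows and averaging across heads, using a single scalar threshold, and masking pruned columns with -inf ahead of the softmax are all assumptions, and the name cross_head_prune is hypothetical.

```python
import torch

def cross_head_prune(scores: torch.Tensor, threshold: float) -> torch.Tensor:
    """Hypothetical sketch of the claimed pruning step.

    scores:    pre-softmax attention scores, shape (heads, seq_len, seq_len)
    threshold: trained importance-score threshold (a plain scalar here; the
               application does not say how the threshold is parameterized)
    Returns attention probabilities of the same shape, with exactly zero
    probability in every pruned token's column.
    """
    # Cross-head column-wise aggregation: collapse each head's rows, then
    # average across heads, giving one importance score per token (column).
    importance = scores.sum(dim=1).mean(dim=0)            # (seq_len,)

    # Compare the importance scores against the trained threshold to pick
    # the target tokens to prune.
    pruned = importance < threshold                       # (seq_len,) bool

    # Mask pruned columns with -inf *before* the softmax so their
    # probabilities come out exactly zero, as the claims describe.
    masked = scores.masked_fill(pruned.view(1, 1, -1), float("-inf"))
    return torch.softmax(masked, dim=-1)

# Toy usage: 2 heads, 4 tokens, head dimension 8. The median threshold is
# purely for the demo, so that some but not all tokens get pruned.
q, k = torch.randn(2, 4, 8), torch.randn(2, 4, 8)
scores = q @ k.transpose(-2, -1) / 8 ** 0.5               # (2, 4, 4)
importance = scores.sum(dim=1).mean(dim=0)
probs = cross_head_prune(scores, threshold=importance.median().item())
print(probs.sum(dim=-1))   # each row still sums to 1; pruned columns are 0
```

Masking before the softmax is what makes the pruned tokens' probabilities exactly zero while the surviving probabilities renormalize; zeroing after the softmax would instead leave rows that no longer sum to one.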
Technology companies developing Transformer-based AI systems should monitor this application for potential licensing considerations, and legal and IP professionals should evaluate the method's scope for freedom-to-operate analysis. The filing carries no regulatory compliance requirements: it is an intellectual-property application, not an enforceable regulation.
Archived snapshot
Apr 2, 2026: GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.
ATTENTION MECHANISM ADJUSTMENT METHOD BASED ON ATTENTION SCORE AND COMPUTING DEVICE USING THE SAME
Application US20260093988A1 | Kind: A1 | Published Apr 02, 2026
Assignee
Industrial Technology Research Institute
Inventors
Yao-Hua Chen, Po-Hung Lin, Chih-Tsun Huang
Abstract
An attention mechanism adjustment method based on attention scores, applicable to Transformer models, is provided. The method includes: for the current Transformer block of the Transformer model, obtaining a query matrix, a key matrix, and a value matrix based on the input sequence; using the self-attention module to generate multiple attention score matrices corresponding to multiple attention heads; before executing the softmax function, performing a cross-head column-wise aggregation operation on the attention score matrices to obtain a token importance vector; comparing the importance scores with the trained importance score threshold to determine whether pruning is needed; executing pruning operations on target tokens that need pruning to obtain pruned attention score matrices; and performing softmax function operations on the pruned attention score matrices to obtain a pruned attention probability matrix, where the probability values of the pruned tokens are zero.
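The final claim element, zero probability for the pruned tokens, follows automatically if pruning is realized as a -inf mask applied ahead of the softmax (the assumed mechanism in the sketch above, not language from the filing). A minimal check:

```python
import torch

# A pruned token's score masked to -inf yields an exact zero after softmax,
# and the remaining probability mass renormalizes over the kept tokens.
row = torch.tensor([2.0, float("-inf"), 1.0])
print(torch.softmax(row, dim=0))   # tensor([0.7311, 0.0000, 0.2689])
```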
CPC Classifications
G06N 3/082 G06N 3/045
Filing Date
2024-11-26
Application No.
18961430
About this page
Source document text, dates, docket IDs, and authority are extracted directly from USPTO.
The plain-English summary, classification, and "what to do next" steps are AI-generated from the original text. Cite the source document, not the AI analysis.