
ATTENTION MECHANISM ADJUSTMENT METHOD BASED ON ATTENTION SCORE AND COMPUTING DEVICE USING THE SAME

Publication No.

US20260093988A1 (Kind: A1)

Publication Date

Apr 02, 2026

Assignee

Industrial Technology Research Institute

Inventors

Yao-Hua Chen, Po-Hung Lin, Chih-Tsun Huang

Abstract

An attention mechanism adjustment method based on attention scores, applicable to Transformer models, is provided. The method includes: for the current Transformer block of the Transformer model, obtaining a query matrix, a key matrix, and a value matrix from the input sequence; using the self-attention module to generate multiple attention score matrices corresponding to multiple attention heads; before executing the softmax function, performing a cross-head column-wise aggregation operation on the attention score matrices to obtain a token importance vector; comparing the importance scores with a trained importance score threshold to determine whether pruning is needed; executing pruning operations on the target tokens that require pruning to obtain pruned attention score matrices; and performing softmax operations on the pruned attention score matrices to obtain a pruned attention probability matrix, in which the probability values of the pruned tokens are zero.
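The abstract walks through the pruning flow step by step; below is a minimal NumPy sketch of one plausible reading of it. The choice of aggregation operator (a sum across heads and query rows), the masking of pruned columns with negative infinity before the softmax, and the `pruned_attention` helper name are illustrative assumptions, not the claimed implementation.

```python
# Minimal sketch of the abstract's pruning flow, assuming a sum as the
# cross-head column-wise aggregation. Not the patented implementation.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pruned_attention(q, k, v, threshold):
    """q, k, v: (heads, seq_len, d_head). threshold: trained scalar (assumed)."""
    d = q.shape[-1]
    # Per-head attention score matrices, before the softmax.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (H, S, S)
    # Cross-head column-wise aggregation: one importance score per key token.
    importance = scores.sum(axis=(0, 1))             # (S,)
    # Tokens whose importance falls below the trained threshold are pruned.
    pruned = importance < threshold                  # (S,) boolean mask
    # Masking pruned columns with -inf drives their softmax probabilities to 0.
    scores[:, :, pruned] = -np.inf
    probs = softmax(scores, axis=-1)                 # pruned columns are zero
    return probs @ v

# Toy usage with random inputs (shapes and threshold are arbitrary):
H, S, D = 4, 8, 16
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((H, S, D)) for _ in range(3))
out = pruned_attention(q, k, v, threshold=0.0)       # (H, S, D)
```

Aggregating before the softmax, as the abstract specifies, means a single importance vector gates every head at once, so pruned tokens receive exactly zero probability in all heads rather than merely small values.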

CPC Classifications

G06N 3/082; G06N 3/045

Filing Date

2024-11-26

Application No.

18961430