Optimizing low precision inference models for deployment of deep neural networks
Summary
The USPTO granted Patent US12596917B2 to Intel Corporation covering systems and methods for optimizing low precision inference models using asymmetric quantization in deep neural networks. The patent includes claims for per-input channel quantization and mixed-precision auto-tuning techniques. The patent names six inventors and contains 25 claims.
What changed
The USPTO issued a patent grant (kind code B2, indicating a utility patent grant whose application was previously published) to Intel Corporation for neural network quantization optimization technology. The invention provides methods for generating quantized neural networks with asymmetric quantization, in which model weights are signed integers and input layers use unsigned integers, along with weights accumulation tables and an output restoration function.
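The scheme described above can be illustrated with a minimal NumPy sketch. This is an interpretation of the public claim language, not Intel's implementation: weights are quantized to signed int8, inputs to unsigned uint8 with a zero-point, a per-output-channel "weights accumulation table" is precomputed, and an "output restoration" step removes the zero-point contribution and rescales. All names and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Float reference layer: y = W @ x  (4 output channels, kernel size 8)
W = rng.normal(size=(4, 8)).astype(np.float32)
x = rng.uniform(-2.0, 6.0, size=8).astype(np.float32)

# Weights -> signed int8, symmetric (zero-point 0)
w_scale = np.abs(W).max() / 127.0
Wq = np.round(W / w_scale).astype(np.int8)

# Inputs -> unsigned uint8 with a zero-point (asymmetric quantization)
x_scale = (x.max() - x.min()) / 255.0
x_zp = int(round(-x.min() / x_scale))
xq = np.clip(np.round(x / x_scale) + x_zp, 0, 255).astype(np.uint8)

# "Weights accumulation table": per-output-channel sums of the quantized
# weights, precomputed once at model-conversion time.
w_acc = Wq.astype(np.int32).sum(axis=1)

# Integer-only matmul, accumulated in int32
acc = Wq.astype(np.int32) @ xq.astype(np.int32)

# "Output restoration": subtract the input zero-point contribution using
# the accumulation table, then rescale back to float.
y = (acc - x_zp * w_acc) * (w_scale * x_scale)

print(np.abs(y - W @ x).max())  # small quantization error
```

The key algebraic point is that with x ≈ (xq − zp)·sx and W ≈ Wq·sw, the product expands to (Wq @ xq − zp · Σ Wq) · sw·sx, so the zero-point term depends only on the precomputed weight sums, keeping the inner loop integer-only.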
Technology companies developing AI inference models and manufacturers of AI accelerators/chips should monitor this patent portfolio. The 25 granted claims provide Intel exclusive rights to specific quantization optimization techniques that may be relevant to deploying deep neural networks in edge computing, data centers, or specialized AI hardware.
What to do next
- Monitor for updates
Source document (simplified)
Optimizing low precision inference models for deployment of deep neural networks
Grant: US12596917B2 · Kind: B2 · Granted: Apr 07, 2026
Assignee
Intel Corporation
Inventors
Jiong Gong, Yong Wu, Haihao Shen, Xiao Dong Lin, Guoming Zhang, Feng Yuan
Abstract
Systems, apparatuses and methods may provide technology for optimizing an inference neural network model that performs asymmetric quantization by generating a quantized neural network, wherein model weights of the neural network are quantized as signed integer values, and wherein an input layer of the neural network is configured to quantize input values as unsigned integer values, generating a weights accumulation table based on the quantized model weights and a kernel size for the neural network, and generating an output restoration function for an output layer of the neural network based on the weights accumulation table and the kernel size. The technology may also perform per-input channel quantization. The technology may also perform mixed-precision auto-tuning.
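The abstract also names per-input-channel quantization. A minimal sketch of why that matters (illustrative shapes and values, not the patented method): when input channels have very different magnitudes, a single per-tensor scale wastes integer resolution on the small channels, while a per-channel scale does not.

```python
import numpy as np

rng = np.random.default_rng(1)

# Weight matrix with 8 input channels; one channel dominates in magnitude,
# which makes a single shared scale a poor fit for the others.
W = rng.normal(size=(4, 8)).astype(np.float32)
W[:, 0] *= 50.0

# Per-tensor quantization: one int8 scale for the whole matrix
s_tensor = np.abs(W).max() / 127.0
Wq_tensor = np.round(W / s_tensor).astype(np.int8)

# Per-input-channel quantization: one scale per column (input channel)
s_chan = np.abs(W).max(axis=0) / 127.0          # shape (8,)
Wq_chan = np.round(W / s_chan).astype(np.int8)

# Mean reconstruction error of each scheme
err_tensor = np.abs(Wq_tensor * s_tensor - W).mean()
err_chan = np.abs(Wq_chan * s_chan - W).mean()
print(err_tensor, err_chan)  # per-channel error is noticeably smaller
```

The mixed-precision auto-tuning mentioned in the abstract would build on the same idea at a coarser granularity: choosing a bit width per layer based on measured accuracy impact, rather than a scale per channel.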
CPC Classifications
G06N 3/0495 G06N 3/08 G06N 3/045 G06N 3/063
Filing Date
2020-03-13
Application No.
17929023
Claims
25