DNN Inference Optimization Using Practical Early Exit Networks
Summary
The USPTO has published patent application US20260086912A1, which discloses methods and systems for optimizing deep neural network (DNN) inference using early exit networks. The invention dynamically splits a machine learning model based on processing load forecasts and sizes batches adaptively to improve computational efficiency. Application No. 19400394 was filed on November 25, 2025.
What changed
The patent application discloses methods for optimizing deep neural network inference through practical early exit networks. The system receives load forecasts for processing requests and dynamically splits the ML model into multiple portions based on that forecast. Batch sizes are determined for each model portion, and available computational resources are allocated to execute the portions and generate inferences efficiently.
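The flow described above (forecast in, model portions, batch sizes, and resource assignments out) can be illustrated with a small planning sketch. This is not the method claimed in the application, which this summary does not detail. The function `plan`, the `Portion` type, and every parameter (`rps_per_split`, `exit_fraction`, `batch_window_s`) are hypothetical names and heuristics chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Portion:
    layers: range      # layer indices this portion executes
    batch_size: int    # requests grouped per forward pass
    workers: int       # resources assigned to this portion


def plan(num_layers, forecast_rps, workers, rps_per_split=100.0,
         exit_fraction=0.5, batch_window_s=0.25):
    """Hypothetical planner: split the model, then size batches and workers."""
    # Assumed rule: one extra portion per `rps_per_split` of forecast load,
    # capped so every portion gets at least one layer and one worker.
    count = max(1, min(1 + int(forecast_rps // rps_per_split),
                       num_layers, workers))

    # Cut the layer stack into `count` contiguous, near-equal portions.
    bounds = [i * num_layers // count for i in range(count + 1)]

    # Assumed: a fixed fraction of requests exits early after each portion,
    # so later portions see a geometrically smaller request rate.
    rates = [forecast_rps * (1 - exit_fraction) ** i for i in range(count)]

    # Start with one worker per portion, then hand each spare worker to
    # the portion with the highest load per worker.
    alloc = [1] * count
    for _ in range(workers - count):
        busiest = max(range(count), key=lambda i: rates[i] / alloc[i])
        alloc[busiest] += 1

    return [
        Portion(
            layers=range(bounds[i], bounds[i + 1]),
            # Batch size: requests expected to arrive within one window.
            batch_size=max(1, ceil(rates[i] * batch_window_s)),
            workers=alloc[i],
        )
        for i in range(count)
    ]


if __name__ == "__main__":
    for portion in plan(num_layers=12, forecast_rps=250.0, workers=4):
        print(portion)
```

Under these assumptions, a light forecast keeps the model whole and gives it every worker, while a heavier forecast produces more portions, with smaller batches for the later ones because fewer requests are expected to reach them.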
This is a patent application publication with no immediate compliance obligations. Technology companies developing ML inference systems, cloud computing platforms, or edge AI devices may find it relevant for understanding prior art in inference optimization techniques. No regulatory deadlines, penalties, or required actions apply. The application remains pending until the USPTO examines it and decides whether to grant it.
Source document (simplified)
DEEP NEURAL NETWORKS (DNN) INFERENCE USING PRACTICAL EARLY EXIT NETWORKS
Application US20260086912A1 Kind: A1 Mar 26, 2026
Inventors
Anand PADMANABHA IYER, Swapnil Sunilkumar GANDHI
Abstract
The present disclosure relates to methods and systems for providing inferences using machine learning systems. The methods and systems receive a load forecast for processing requests by a machine learning model and split the machine learning model into a plurality of machine learning model portions based on the load forecast. The methods and systems determine a batch size for the requests for the machine learning model portions. The methods and systems use one or more available resources to execute the plurality of machine learning model portions to process the requests and generate inferences for the requests.
CPC Classifications
G06F 11/3442 G06F 9/505 G06N 5/043 G06N 20/00
Filing Date
2025-11-25
Application No.
19400394