
DNN Inference Optimization Using Practical Early Exit Networks

ChangeBridge: Patent Apps - AI & Computing (G06N)
Published November 25th, 2025
Detected April 1st, 2026

Summary

USPTO published patent application US20260086912A1 disclosing methods and systems for optimizing DNN inference using early exit networks. The invention enables dynamic splitting of machine learning models based on processing load forecasts and adaptive batch sizing to improve computational efficiency. Application No. 19400394 was filed November 25, 2025.

What changed

The patent application discloses methods for optimizing deep neural network inference through practical early exit networks. The system receives load forecasts for processing requests and dynamically splits the ML model into multiple portions based on that forecast. Batch sizes are determined for each model portion, and available computational resources are allocated to execute the portions and generate inferences efficiently.
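The claimed workflow can be illustrated with a short sketch. Everything here is a hypothetical toy, not the patent's implementation: the thresholds, the `split_plan` function name, and the batch-size rule are all illustrative assumptions about how a load forecast might drive model splitting and batching.

```python
# Hypothetical sketch of the claimed workflow: a load forecast (requests/sec)
# drives how many early-exit portions a model is split into and what batch
# size those portions use. All thresholds and names are illustrative
# assumptions, not taken from the patent text.

def split_plan(forecast_qps: float, n_layers: int, max_batch: int = 64) -> dict:
    """Return a toy plan: heavier load -> more portions and larger batches."""
    if forecast_qps < 100:        # light load: run the full model as one portion
        n_portions = 1
    elif forecast_qps < 1000:     # moderate load: add one early-exit split
        n_portions = 2
    else:                         # heavy load: split aggressively
        n_portions = 4
    layers_per_portion = max(1, n_layers // n_portions)
    # Toy batch-size rule: scale with forecast load, cap at a hardware limit.
    batch_size = min(max_batch, max(1, int(forecast_qps // 50)))
    return {
        "n_portions": n_portions,
        "layers_per_portion": layers_per_portion,
        "batch_size": batch_size,
    }
```

Under this toy rule, a forecast of 2,000 requests/sec would yield four model portions of six layers each (for a 24-layer model) and a batch size of 40.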

This is a patent application publication with no immediate compliance obligations. Technology companies developing ML inference systems, cloud computing platforms, or edge AI devices may find this relevant for understanding prior art in optimization techniques. No regulatory deadlines, penalties, or required actions apply. The application remains pending until examined and potentially granted by USPTO.

Source document (simplified)


DEEP NEURAL NETWORKS (DNN) INFERENCE USING PRACTICAL EARLY EXIT NETWORKS

Application: US20260086912A1 · Kind: A1 · Published: Mar 26, 2026

Inventors

Anand PADMANABHA IYER, Swapnil Sunilkumar GANDHI

Abstract

The present disclosure relates to methods and systems for providing inferences using machine learning systems. The methods and systems receive a load forecast for processing requests by a machine learning model and split the machine learning model into a plurality of machine learning model portions based on the load forecast. The methods and systems determine a batch size for the requests for the machine learning model portions. The methods and systems use one or more available resources to execute the plurality of machine learning model portions to process the requests and generate inferences for the requests.
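For readers unfamiliar with early-exit networks generally, the following is a minimal sketch of the technique the title refers to, not the patent's specific method: each model "portion" produces a prediction and a confidence score, and inference stops at the first portion whose confidence clears an exit threshold, saving the cost of the remaining portions. The `Portion` type and threshold value are illustrative assumptions.

```python
# Minimal early-exit inference sketch. Each model "portion" maps an input to
# (label, confidence); inference exits at the first portion confident enough,
# skipping the deeper (more expensive) portions entirely.

from typing import Callable, List, Tuple

Portion = Callable[[float], Tuple[str, float]]  # input -> (label, confidence)

def early_exit_infer(portions: List[Portion], x: float,
                     threshold: float = 0.9) -> Tuple[str, int]:
    """Return (label, exit_depth): depth of the portion that produced the answer."""
    for depth, portion in enumerate(portions):
        label, conf = portion(x)
        if conf >= threshold:           # confident enough: exit early
            return label, depth
    return label, len(portions) - 1     # fell through: use the final portion
```

An "easy" input exits at depth 0; a harder one falls through to a later portion, which is where the compute savings of the scheme come from.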

CPC Classifications

G06F 11/3442 · G06F 9/505 · G06N 5/043 · G06N 20/00

Filing Date: 2025-11-25

Application No.: 19400394


Named provisions

Abstract · CPC Classifications · Filing Date · Application No. · Inventors

Classification

Agency: USPTO
Published: November 25th, 2025
Instrument: Notice
Legal weight: Non-binding
Stage: Draft
Change scope: Minor
Document ID: US20260086912A1 / Application No. 19400394

Who this affects

Applies to: Technology companies; Manufacturers
Industry sector: 3341 Computer & Electronics Manufacturing; 5112 Software & Technology; 3254 Pharmaceutical Manufacturing
Activity scope: Patent Filing; ML Model Optimization; DNN Inference
Geographic scope: United States (US)

Taxonomy

Primary area: Artificial Intelligence
Operational domain: Legal
Topics: Machine Learning; Neural Networks
