Reward Feedback for Learning Control Policies Using Natural Language and Vision Data

ChangeBridge: Patent Grants - AI & Computing (G06N)
Published April 7th, 2026
Detected April 7th, 2026

Summary

Hitachi, Ltd. has been granted US Patent 12596955B2 for systems and methods of providing rewards to machine learning algorithms using natural language task descriptions and vision data. The patented invention receives an image and a text-based task description, slices the image into sub-images, and uses an embedding model to produce a distribution over the sub-images based on their relevance to the task description; the reward is generated from that distribution. The patent contains 9 claims and was filed on July 20, 2022.

What changed

Hitachi, Ltd. received US Patent 12596955B2 for a reward feedback system in machine learning. The invention relates to methods for providing rewards to ML algorithms by receiving an image paired with a natural language task description, slicing the image into sub-images, and using an embedding model to generate a relevance distribution between sub-images and the task description for reward generation.

Affected parties, including technology companies developing ML-based control systems, robotics manufacturers, and automation firms, should review the patent scope to assess licensing needs, freedom-to-operate considerations, and potential competitive implications in the reinforcement learning and robotic control spaces.

What to do next

  1. Monitor patent databases for competing or complementary patents in ML reward systems
  2. Review potential licensing opportunities if developing similar ML control systems
  3. Conduct freedom-to-operate analysis for implementations involving NLP and vision-based ML

Source document (simplified)

Reward feedback for learning control policies using natural language and vision data

Grant US12596955B2 Kind: B2 Apr 07, 2026

Assignee

HITACHI, LTD.

Inventors

Andrew James Walker, Joydeep Acharya

Abstract

Example implementations described herein involve systems and methods for providing a reward to a machine learning algorithm, which can include receiving an image, and a task description defined in text; slicing the image into a plurality of sub-images; executing an embedding model to embed the text of the task description and the sub-images to generate a distribution for the sub-images based on relevance to the task description; and generating the reward from the distribution for the sub-images.
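
The four steps named in the abstract (receive, slice, embed, generate reward) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not say how images are sliced, how relevance is scored, or how the reward is derived from the distribution. The grid slicing, the cosine-plus-softmax scoring, the `temperature` parameter, the toy embedders, and the target-tile reward rule below are all hypothetical choices made only for illustration.

```python
import math
from typing import List, Sequence

Image = List[List[float]]   # grayscale image as rows of pixel values in [0, 1]
Vector = Sequence[float]


def slice_image(image: Image, rows: int, cols: int) -> List[Image]:
    """Split an image into rows x cols non-overlapping tiles, in row-major order.

    Assumption: a fixed grid. Pixels that do not fit evenly are dropped.
    """
    tile_h, tile_w = len(image) // rows, len(image[0]) // cols
    tiles = []
    for r in range(rows):
        for c in range(cols):
            band = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in band])
    return tiles


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance_distribution(text_vec: Vector, tile_vecs: List[Vector],
                           temperature: float = 0.1) -> List[float]:
    """Softmax over text-to-tile cosine similarities (an assumed scoring rule)."""
    scores = [cosine(text_vec, v) / temperature for v in tile_vecs]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reward_from_distribution(dist: List[float], target_index: int) -> float:
    """Hypothetical reward: probability mass on the tile where the task goal lies."""
    return dist[target_index]


# Toy stand-ins for a real joint text/image embedding model (hypothetical).
def toy_embed_tile(tile: Image) -> List[float]:
    pixels = [p for row in tile for p in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]


def toy_embed_text(text: str) -> List[float]:
    return [1.0, 0.0] if "bright" in text else [0.0, 1.0]


# A 4x4 image whose top-left quadrant is bright and the rest dark.
image = [[1.0, 1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]

tiles = slice_image(image, rows=2, cols=2)
dist = relevance_distribution(toy_embed_text("move to the bright region"),
                              [toy_embed_tile(t) for t in tiles])
reward = reward_from_distribution(dist, target_index=0)
```

In this toy setup nearly all of the probability mass falls on the top-left tile, so a reward keyed to that tile is close to 1. A real system would replace the toy embedders with a trained model that maps text and image patches into a shared vector space.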

CPC Classifications

G06N 3/092, G06N 20/00, B25J 9/163

Filing Date

2022-07-20

Application No.

17869528

Claims

9

Named provisions

Abstract, Claims

Classification

Agency
USPTO
Published
April 7th, 2026
Instrument
Notice
Legal weight
Binding
Stage
Final
Change scope
Minor
Document ID
US12596955B2

Who this affects

Applies to
Technology companies, Manufacturers, Investors
Industry sector
5112 Software & Technology
Activity scope
Patent grant, Machine learning algorithms, Robotic control systems
Geographic scope
United States (US)

Taxonomy

Primary area
Intellectual Property
Operational domain
Legal
Topics
Artificial Intelligence, Robotics
