Reward Feedback for Learning Control Policies Using Natural Language and Vision Data
Summary
Hitachi, Ltd. has been granted US Patent 12596955B2 for systems and methods of providing rewards to machine learning algorithms using natural language task descriptions and vision data. The patented invention receives an image and a text task description, slices the image into sub-images, and uses an embedding model to embed both the text and the sub-images, producing a distribution over the sub-images by relevance to the task. The reward is generated from that distribution. The patent contains 9 claims; the application was filed July 20, 2022, and the grant is dated April 7, 2026.
What changed
Hitachi, Ltd. received US Patent 12596955B2 (application no. 17869528) for a reward feedback system in machine learning. Rather than relying on a hand-written reward function, the described approach derives the reward signal from how relevant each region of an image is to a task stated in natural language.
Affected parties, including technology companies developing ML-based control systems, robotics manufacturers, and automation firms, should review the patent's claims to assess licensing needs, freedom-to-operate risk, and competitive implications in reinforcement learning and robotic control.
What to do next
- Monitor patent databases for competing or complementary patents in ML reward systems
- Review potential licensing opportunities if developing similar ML control systems
- Conduct freedom-to-operate analysis for implementations that derive rewards from natural language task descriptions and image data
Source document (simplified)
Reward feedback for learning control policies using natural language and vision data
Grant
US12596955B2
Kind
B2
Grant Date
Apr 07, 2026
Assignee
HITACHI, LTD.
Inventors
Andrew James Walker, Joydeep Acharya
Abstract
Example implementations described herein involve systems and methods for providing a reward to a machine learning algorithm, which can include receiving an image, and a task description defined in text; slicing the image into a plurality of sub-images; executing an embedding model to embed the text of the task description and the sub-images to generate a distribution for the sub-images based on relevance to the task description; and generating the reward from the distribution for the sub-images.
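The four steps in the abstract (slice, embed, build a relevance distribution, derive a reward) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not name the embedding model, the similarity measure, or the formula that turns the distribution into a reward, so everything below is an assumption made for illustration: `toy_embed_image` and `toy_embed_text` are hypothetical stand-ins for a real joint image-text embedding model, cosine similarity with a softmax is one common way to obtain a distribution, and `reward_from_distribution` (probability mass on a chosen goal tile) is one possible reward rule among many.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero when normalising vectors


def slice_image(image, rows, cols):
    """Split an H x W x C image into rows*cols non-overlapping tiles (row-major order)."""
    h, w = image.shape[:2]
    th, tw = h // rows, w // cols
    return [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(rows) for c in range(cols)]


def toy_embed_image(tile):
    """Hypothetical stand-in for an image encoder: the tile's mean RGB colour."""
    return tile.reshape(-1, tile.shape[-1]).mean(axis=0)


# Hypothetical vocabulary so that text lands in the same 3-D space as the tiles.
_COLOUR_WORDS = {
    "red": np.array([1.0, 0.0, 0.0]),
    "green": np.array([0.0, 1.0, 0.0]),
    "blue": np.array([0.0, 0.0, 1.0]),
}


def toy_embed_text(text):
    """Hypothetical stand-in for a text encoder: sum of known colour-word vectors."""
    vec = np.zeros(3)
    for word in text.lower().split():
        vec += _COLOUR_WORDS.get(word, np.zeros(3))
    return vec


def relevance_distribution(task_text, tiles, temperature=0.1):
    """Softmax over cosine similarities between the task text and each tile (assumed scoring)."""
    t = toy_embed_text(task_text)
    t = t / (np.linalg.norm(t) + EPS)
    v = np.stack([toy_embed_image(tile) for tile in tiles])
    v = v / (np.linalg.norm(v, axis=1, keepdims=True) + EPS)
    logits = (v @ t) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()


def reward_from_distribution(dist, target_index):
    """Assumed reward rule: probability mass on the tile where the goal should appear."""
    return float(dist[target_index])


# Usage: a grey 4x4 scene with a red patch in the top-right quadrant.
image = np.full((4, 4, 3), 0.5)
image[0:2, 2:4] = [1.0, 0.0, 0.0]

tiles = slice_image(image, rows=2, cols=2)
dist = relevance_distribution("move to the red block", tiles)
reward = reward_from_distribution(dist, target_index=1)  # tile 1 is the top-right quadrant
```

In this toy scene the distribution concentrates on the tile containing the red patch, so the reward is high when the goal tile is the top-right quadrant and low otherwise. A practical system would swap the toy encoders for a trained image-text embedding model and choose a reward rule suited to the control task.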
CPC Classifications
G06N 3/092, G06N 20/00, B25J 9/163
Filing Date
2022-07-20
Application No.
17869528
Claims
9