
Reward feedback for learning control policies using natural language and vision data

Grant US12596955B2 Kind: B2 Apr 07, 2026

Assignee

HITACHI, LTD.

Inventors

Andrew James Walker, Joydeep Acharya

Abstract

Example implementations described herein involve systems and methods for providing a reward to a machine learning algorithm, which can include receiving an image and a task description defined in text; slicing the image into a plurality of sub-images; executing an embedding model to embed the text of the task description and the sub-images to generate a distribution for the sub-images based on relevance to the task description; and generating the reward from the distribution for the sub-images.
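
The steps named in the abstract (slice, embed, score relevance, derive a reward) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the toy colour-based embedders stand in for a real joint text-image embedding model, the distribution is taken to be a softmax over cosine similarities, and the reward is taken to be the probability assigned to a caller-chosen goal sub-image. All function names and the `goal_index` and `temperature` parameters are hypothetical.

```python
import math

def slice_image(image, rows, cols):
    """Split an image (list of pixel rows) into rows*cols tiles, row-major."""
    tile_h = len(image) // rows
    tile_w = len(image[0]) // cols
    tiles = []
    for r in range(rows):
        for c in range(cols):
            band = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in band])
    return tiles

def toy_embed_tile(tile):
    """Stand-in image embedder (assumption): mean RGB colour of the tile."""
    pixels = [px for row in tile for px in row]
    return [sum(px[i] for px in pixels) / len(pixels) for i in range(3)]

# Stand-in text embedder (assumption): colour words map to RGB directions.
TOY_TEXT_VECTORS = {
    "red": [1.0, 0.0, 0.0],
    "green": [0.0, 1.0, 0.0],
    "blue": [0.0, 0.0, 1.0],
}

def toy_embed_text(text):
    """Sum the vectors of known colour words; unknown words contribute nothing."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        for i, x in enumerate(TOY_TEXT_VECTORS.get(word, [0.0, 0.0, 0.0])):
            vec[i] += x
    return vec

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def relevance_distribution(text_vec, tile_vecs, temperature=1.0):
    """Softmax over text-to-tile cosine similarities (assumed scoring rule)."""
    logits = [cosine(text_vec, v) / temperature for v in tile_vecs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward_from_distribution(dist, goal_index):
    """Assumed reward: probability mass placed on the goal sub-image."""
    return dist[goal_index]

# Usage: a 4x4 image whose top-left quadrant is red and the rest blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [
    [RED, RED, BLUE, BLUE],
    [RED, RED, BLUE, BLUE],
    [BLUE, BLUE, BLUE, BLUE],
    [BLUE, BLUE, BLUE, BLUE],
]
tiles = slice_image(image, rows=2, cols=2)
tile_vecs = [toy_embed_tile(t) for t in tiles]
dist = relevance_distribution(toy_embed_text("reach the red block"),
                              tile_vecs, temperature=0.1)
reward = reward_from_distribution(dist, goal_index=0)
```

In this toy setup the red quadrant receives most of the probability mass, so a policy that moved the relevant content into the goal sub-image would see a higher reward. Other reductions of the distribution to a scalar are equally plausible readings of the abstract.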

CPC Classifications

G06N 3/092 G06N 20/00 B25J 9/163

Filing Date

2022-07-20

Application No.

17869528

Claims

9