MULTI-MODAL MULTI-TASK FOUNDATIONAL MODELS FOR MEDICAL IMAGE MANIPULATION AND INFORMATION RETRIEVAL

Application US20260094693A1, Kind: A1, Apr 02, 2026

Inventors

Alexandru Constantin Serban, Mehmet Akif Gulsun, Puneet Sharma, Dorin Comaniciu

Abstract

Systems and methods for automatically performing one or more actions on one or more medical applications are provided. Text-based instructions are received. The text-based instructions are encoded into text features using a machine learning based text encoder network. One or more instructions to be performed by one or more medical applications are determined using a policy module based on the text features. The one or more instructions are performed by the one or more medical applications to generate a response to the text-based instructions. The response to the text-based instructions is output.
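
The flow described in the abstract (text instructions, text features, policy module, application instructions, response) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in and not taken from the application: VOCAB, encode_text, policy, Instruction, APPS, and respond are invented for illustration, the bag-of-words encoder replaces the machine learning based text encoder network, and the keyword rules replace the policy module.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical vocabulary; a real system would use a learned text encoder network.
VOCAB = ["segment", "measure", "liver", "vessel", "report"]


def encode_text(text: str) -> List[float]:
    """Toy stand-in for the text encoder: bag-of-words counts over VOCAB."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


@dataclass
class Instruction:
    app: str     # which (hypothetical) medical application should act
    action: str  # what that application should do


def policy(features: List[float]) -> List[Instruction]:
    """Toy stand-in for the policy module: maps text features to instructions."""
    present = {word for word, value in zip(VOCAB, features) if value > 0}
    instructions: List[Instruction] = []
    if "segment" in present:
        if "liver" in present:
            target = "liver"
        elif "vessel" in present:
            target = "vessel"
        else:
            target = "image"
        instructions.append(Instruction("segmentation_app", f"segment {target}"))
    if "measure" in present:
        instructions.append(Instruction("measurement_app", "measure structure"))
    return instructions


# Hypothetical medical applications, modeled as callables that return a status string.
APPS: Dict[str, Callable[[str], str]] = {
    "segmentation_app": lambda action: f"[segmentation_app] done: {action}",
    "measurement_app": lambda action: f"[measurement_app] done: {action}",
}


def respond(text: str) -> str:
    """Encode the instruction, choose actions, run the apps, and build the response."""
    features = encode_text(text)
    instructions = policy(features)
    results = [APPS[item.app](item.action) for item in instructions]
    return "; ".join(results) if results else "no matching action"


if __name__ == "__main__":
    print(respond("Segment the liver and measure it"))
```

The sketch keeps the three stages separate so that either stand-in could be swapped for a learned component without changing the others. The whitespace tokenizer does not strip punctuation, so it is only suitable for illustration.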

CPC Classifications

G16H 30/40; G06F 40/279; G06V 10/7715; G06V 10/82; G06V 10/945; G06V 2201/03; G10L 15/22; G10L 2015/223

Filing Date

2024-09-27

Application No.

18898763
