Multimodal Retrieval Augmented Generation for Visually Rich Documents

USPTO

USPTO Patent Applications - AI & Computing (G06N)

Published April 9th, 2026

Detected April 13th, 2026

Summary

USPTO published patent application US20260099698A1 filed by JPMorgan Chase Bank, N.A. The application covers multimodal retrieval augmented generation (RAG) methods for visually rich documents using a page-wise chunking algorithm. The system embeds text, spatial, and visual features from document pages into vectors, retrieves relevant chunks based on query similarity, and generates responses via a generative model.

What changed

USPTO published patent application US20260099698A1 assigned to JPMorgan Chase Bank, N.A. The application discloses methods and systems for multimodal retrieval augmented generation (RAG) that process visually rich documents using a page-wise chunking algorithm. The system generates vectors representing text, spatial, and visual features for each page chunk, retrieves relevant chunks based on query similarity, and generates responses using a generative model.

Patent application publications are informational filings that do not create compliance obligations for third parties. The publication notifies the public of the patent claim, allowing for prior art searches and opposition preparation. No regulatory action or compliance requirements are imposed by this document.

What to do next

Monitor for updates

Archived snapshot

Apr 13, 2026

GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.

SYSTEM AND METHOD FOR MULTIMODAL RETRIEVAL AUGMENTED GENERATION FOR VISUALLY RICH DOCUMENTS

Application US20260099698A1 Kind: A1 Apr 09, 2026

Assignee

JPMorgan Chase Bank, N.A.

Inventors

Simerjot KAUR, Zhiqiang MA, Mathieu SIBUE, Farima FARMAHINIFARAHANI, Lawrence YONG, Dongsheng WANG, Armineh NOURBAKHSH, Lucas CECCHI

Abstract

Various methods and processes, apparatuses/systems, and media for multimodal retrieval augmented generation for visually rich documents are disclosed. A processor implements a page-wise chunking algorithm to chunk a visual document into a plurality of page chunks; inputs the plurality of page chunks onto a trained embedding model; generates one vector for each page chunk, each vector representing the corresponding text, spatial, and visual features of each page of the visual document; inputs, in response to receiving a prompt of a query corresponding to the visual document, the vectors of the embedded chunks of pages retrieved from the database; identifies the most relevant chunks of pages based on the similarities between the query prompt and the chunks of pages; and generates, in response to inputting the most relevant chunks of pages onto a generative model, a response to the prompt corresponding to the visual document based on the identified most relevant chunks of pages.
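
The flow described in the abstract (page-wise chunking, one vector per page, similarity-based retrieval, then generation) can be illustrated with a minimal sketch. This is not the applicant's implementation. Every name below (PageChunk, chunk_by_page, toy_embed, retrieve, build_prompt) is hypothetical, and the embedding is a text-only bag-of-words stand-in, whereas the application describes a trained model that captures text, spatial, and visual features. The final call to a generative model is omitted; the sketch only assembles the prompt that would be passed to one.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class PageChunk:
    page_number: int
    text: str


def chunk_by_page(pages):
    """Page-wise chunking: each page becomes exactly one chunk."""
    return [PageChunk(page_number=i, text=t) for i, t in enumerate(pages, start=1)]


def toy_embed(text):
    """Hypothetical stand-in for a trained embedding model.

    Returns a unit-length sparse vector of word counts. A real system
    would also encode layout and visual features of the page.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def retrieve(query, chunks, vectors, top_k=2):
    """Rank page chunks by similarity to the query and keep the best top_k."""
    q = toy_embed(query)
    ranked = sorted(zip(chunks, vectors),
                    key=lambda pair: cosine(q, pair[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(query, retrieved):
    """Assemble the text that would be handed to a generative model."""
    context = "\n".join(f"[page {c.page_number}] {c.text}" for c in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Illustrative three-page document (made-up content).
pages = [
    "quarterly revenue table by region",
    "organisational chart of the legal department",
    "footnotes on revenue recognition policy",
]
chunks = chunk_by_page(pages)
vectors = [toy_embed(c.text) for c in chunks]  # one vector per page chunk
top = retrieve("revenue by region", chunks, vectors, top_k=1)
prompt = build_prompt("revenue by region", top)
```

Keeping one vector per page means a retrieved chunk always maps back to a whole page, so tables and figures are not split across chunks; the trade-off is coarser retrieval on long pages.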

CPC Classifications

G06N 3/0455 G06F 16/93 G06F 40/289 G06F 40/30

Filing Date

2024-10-04

Application No.

18906898

Source

USPTO Patent Applications - AI & Computing (G06N) changeflow.com/changebridge/uspto-patent-applications/G06N

Telecom & Technology

About this page

What is GovPing?

Every important government, regulator, and court update from around the world. One place. Real-time. Free.

What's from the agency?

Source document text, dates, docket IDs, and authority are extracted directly from USPTO.

What's AI-generated?

The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.

Last updated

April 13, 2026

Classification

Agency

USPTO

Published

April 9th, 2026

Instrument

Notice

Legal weight

Non-binding

Stage

Final

Change scope

Minor

Document ID

US20260099698A1

Who this affects

Applies to

Public companies

Industry sector

5112 Software & Technology

Activity scope

Patent filing, AI/ML technology, document processing

Geographic scope

United States (US)

Taxonomy

Primary area

Intellectual Property

Operational domain

Legal

Topics

Artificial Intelligence, Software & Technology, Data Privacy
