Intel Corporation Multimodal LLM Audio Video Tokenization Patent Application
Summary
The USPTO published patent application US20260099522A1, filed by Intel Corporation (inventors Kuba Lopatka, Adam Kupryjanow, and Tomasz Szmelczynski), covering power-efficient tokenization and long-context storage of audio and video data for multimodal large language models. The application describes systems with specialized subsystems for receiving input signals, generating discrete tokens, and buffering tokens for durations ranging from seconds to hours.
What changed
Intel Corporation filed the application with the USPTO on December 10, 2025. The claimed architecture includes specialized subsystems configured to receive input signals, generate discrete tokens, and buffer those tokens for durations ranging from seconds to hours, enabling continuous, power-efficient tokenization and long-context storage for multimodal LLMs.
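The seconds-to-hours buffering step can be pictured as a rolling, time-indexed token store. The sketch below is purely illustrative and not taken from the application's claims; the class name `TokenBuffer` and its retention policy are assumptions chosen to make the idea concrete.

```python
from collections import deque


class TokenBuffer:
    """Hypothetical rolling buffer of (timestamp, token) pairs.

    Retains tokens for a configurable window (seconds to hours),
    loosely modeled on the buffering step described in the application.
    """

    def __init__(self, retention_s: float):
        self.retention_s = retention_s
        self._buf = deque()  # oldest entries at the left

    def append(self, t: float, token: int) -> None:
        """Record a token observed at time t, evicting expired entries."""
        self._buf.append((t, token))
        while self._buf and t - self._buf[0][0] > self.retention_s:
            self._buf.popleft()

    def window(self, now: float, span_s: float) -> list:
        """Return the tokens from the last span_s seconds."""
        return [tok for ts, tok in self._buf if now - ts <= span_s]
```

With a one-hour retention, a token appended at t = 0 would be evicted once a token arrives more than 3600 seconds later, keeping memory bounded regardless of how long the device runs.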
For affected parties in the AI and semiconductor industries, this patent application represents potential future intellectual property claims in the multimodal AI tokenization space. Competitors developing similar tokenization methods for audio, video, image, and text processing in multimodal LLMs should monitor the prosecution of this application and assess potential freedom-to-operate implications upon any future patent grant.
What to do next
- Monitor for patent grant status
- Review application claims for competitive intelligence
Archived snapshot
On Apr 12, 2026, GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.
AUDIO AND VIDEO TOKENIZATION FOR MULTIMODAL LARGE LANGUAGE MODELS
Application US20260099522A1 · Kind: A1 · Published: Apr 09, 2026
Assignee
Intel Corporation
Inventors
Kuba Lopatka, Adam Kupryjanow, Tomasz Szmelczynski
Abstract
Systems and methods for power-efficient, continuous tokenization and long-context storage of audio and video data for use with multimodal large language models (LLMs). The systems include specialized subsystems configured to receive input signals, generate discrete tokens representing the input, and buffer the tokens for durations ranging from seconds to hours. Upon receiving a trigger to initiate communication with a multimodal LLM, at least a subset of the buffered tokens is transmitted to an inference dispatcher, which determines the distribution of the tokens to one or more inference engines for processing. The architecture supports tokenization and buffering for multiple modalities, including audio, video, image, and text, and enables context-rich, privacy-preserving, and low-latency AI interactions on client devices. By utilizing efficient token-based data encoding and performing the tokenization at low-power hardware, power consumption and bandwidth usage are significantly reduced, thereby allowing seamless, always-on multimodal AI experiences on battery-powered platforms.
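The abstract's trigger-driven flow, where buffered tokens are handed to an inference dispatcher that distributes them across inference engines, can be sketched as a simple per-modality routing function. This is a minimal illustration under assumed names (`dispatch`, the `(modality, token)` encoding, and the engine callables are all hypothetical, not from the application):

```python
from typing import Callable, Dict, List, Tuple

# Illustrative encoding: each buffered token is tagged with its modality.
Token = Tuple[str, int]


def dispatch(buffered: List[Token],
             engines: Dict[str, Callable[[List[int]], str]]) -> Dict[str, str]:
    """Hypothetical inference dispatcher.

    Groups buffered tokens by modality (audio, video, image, text)
    and hands each group to the matching inference engine, returning
    each engine's result keyed by modality.
    """
    groups: Dict[str, List[int]] = {}
    for modality, tok in buffered:
        groups.setdefault(modality, []).append(tok)
    # Only modalities with a registered engine are processed.
    return {m: engines[m](toks) for m, toks in groups.items() if m in engines}
```

A trigger (e.g., a wake event) would select a subset of the buffer and call `dispatch`; engines here are stand-ins for whatever on-device or remote models actually consume the tokens.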
CPC Classifications
G06F 16/33295 G06N 3/0455
Filing Date
2025-12-10
Application No.
19414835
About this page
Source document text, dates, docket IDs, and authority are extracted directly from USPTO.
The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.