METHOD AND SYSTEM FOR DEPLOYMENT OF LARGE LANGUAGE MODELS (LLM) IN CLOUD INSTANCES
Summary
Tata Consultancy Services Limited filed USPTO patent application US20260099706A1 for a method and system to deploy LLMs in cloud instances. The system evaluates cloud instance feasibility based on LLM model size and available storage, determines latency values for batch sizes across LLM-accelerator pairs, and generates deployment recommendations based on latency, cost, workload, application type, and performance metrics.
What changed
Technology companies deploying or developing LLM infrastructure may benefit from reviewing this patent's approach to cloud instance selection and optimization. The patent describes methods for evaluating cost-performance tradeoffs when hosting large language models across distributed cloud environments.
What to do next
- Monitor for updates
Archived snapshot
On Apr 15, 2026, GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.
METHOD AND SYSTEM FOR DEPLOYMENT OF LARGE LANGUAGE MODELS (LLM) IN CLOUD INSTANCES
Application: US20260099706A1 · Kind: A1 · Published: Apr 09, 2026
Assignee
Tata Consultancy Services Limited
Inventors
Ashwin KRISHNAN, Venkatesh PASUMARTI, Samarth Sudarshan INAMDAR, Arghyajoy MONDAL, Manoj Karunakaran NAMBIAR, Rekha SINGHAL
Abstract
Existing model deployment approaches have the disadvantage that they do not consider feasibility of cloud instances for hosting a given LLM model. Embodiments disclosed herein provide a method and system for deployment of LLMs in a plurality of cloud instances. The system checks feasibility of the plurality of cloud instances for hosting an LLM, based on size of the LLM and storage space in each of the cloud instances. Further, a latency value for a plurality of batch sizes is determined for a plurality of LLM-accelerator pairs, in each of the plurality of cloud instances identified as feasible based on the feasibility check, using a performance model. Furthermore, a recommendation of one of the plurality of cloud instances identified as feasible is generated, based on the determined latency, a measured cost of deployment, a user workload, an application type, a plurality of latency constraints, and an evaluated performance.
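The abstract's three-step flow (feasibility check against storage, latency lookup per LLM-accelerator pair and batch size, then a cost- and latency-aware recommendation) can be illustrated with a minimal sketch. All names, data structures, and the selection rule below are assumptions for illustration, not the patent's actual implementation; the patent's performance model is replaced here by a simple table of pre-measured latencies.

```python
from dataclasses import dataclass, field

@dataclass
class CloudInstance:
    """Hypothetical cloud instance description (not from the patent)."""
    name: str
    storage_gb: float
    hourly_cost: float
    # Assumed stand-in for the patent's performance model:
    # (accelerator, batch_size) -> modeled latency in milliseconds.
    latency_ms: dict = field(default_factory=dict)

def feasible_instances(instances, model_size_gb):
    # Step 1: keep only instances whose storage can hold the LLM.
    return [i for i in instances if i.storage_gb >= model_size_gb]

def recommend(instances, model_size_gb, batch_size, latency_budget_ms):
    # Steps 2-3: among feasible instances, find LLM-accelerator pairs
    # that meet the latency constraint, then pick the cheapest option.
    candidates = []
    for inst in feasible_instances(instances, model_size_gb):
        for (accel, bs), lat in inst.latency_ms.items():
            if bs == batch_size and lat <= latency_budget_ms:
                candidates.append((inst.hourly_cost, lat, inst.name, accel))
    if not candidates:
        return None  # no feasible instance satisfies the constraints
    cost, lat, name, accel = min(candidates)
    return {"instance": name, "accelerator": accel,
            "latency_ms": lat, "hourly_cost": cost}
```

In this sketch the recommendation minimizes hourly cost subject to the latency budget; the patent also weighs user workload, application type, and evaluated performance, which are omitted here for brevity.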
CPC Classifications
G06N 3/08
Filing Date
2025-09-15
Application No.
19328296
Related changes
About this page
Source document text, dates, docket IDs, and authority are extracted directly from USPTO.
The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.