Language Models Having Reduced Size While Maintaining Performance and Reducing Hallucinations
Summary
The USPTO published patent application US20260099727A1, titled 'Language Models Having a Reduced Size While Maintaining Performance and Reducing Hallucinations', filed January 30, 2025 by inventors Jeffrey Daniel Esposito, Henry Svendsgaard, Aishwarya Dharani Arul, and Tabor Scott. The application discloses a computer program product that iteratively trains a language model while shrinking hyperparameters such as the number of layers, hidden units, and parameters, then selects the smallest model configuration that still meets a predetermined performance threshold, with the aim of reducing hallucinations.
What changed
The USPTO published patent application US20260099727A1, which discloses a method for training language models with reduced hyperparameter values while maintaining performance above a predetermined threshold and reducing hallucinations. The claimed method trains a language model with a selected architecture using supervised word embeddings, tests its performance on a validation dataset, and iteratively reduces hyperparameter values (layers, hidden units, parameters) until performance falls below the threshold. The smallest compliant model configuration is then selected for deployment.
Technology companies and AI developers researching efficient language model architectures may find this application relevant for understanding approaches to model compression and hallucination mitigation. Patent applicants and intellectual property professionals should note the filing date and application number for prior-art and freedom-to-operate analyses.
Archived snapshot
Apr 17, 2026: GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.
LANGUAGE MODELS HAVING A REDUCED SIZE WHILE MAINTAINING PERFORMANCE AND REDUCING HALLUCINATIONS
Application: US20260099727A1 | Kind: A1 | Apr 09, 2026
Inventors
Jeffrey Daniel Esposito, Henry Svendsgaard, Aishwarya Dharani Arul, Tabor Scott
Abstract
A computer program product causes a processor to perform various operations. The operations include training a language model (LM) with a selected architecture on a training dataset focused on a specific content domain, using pre-trained supervised word embeddings and current values for a plurality of hyperparameters, such as the number of layers, hidden units, and/or parameters. The operations further include testing the trained LM on a validation dataset to obtain a performance measurement and, in response to the performance measurement being greater than a predetermined performance threshold, reducing the values of one or more of the hyperparameters and repeating the training. In addition, the operations include, in response to the performance not being greater than the threshold, selecting the previously trained LM that was trained using the smallest set of hyperparameter values while still achieving a performance greater than the threshold, and deploying the selected LM.
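The iterative loop described in the abstract can be sketched as follows. This is a hypothetical illustration, not the patented implementation: `train_and_evaluate` stands in for actually training an LM and scoring it on a validation set, and is replaced here with a toy function in which performance grows with model capacity. The shrink schedule (halving each hyperparameter) is likewise an assumption for illustration.

```python
def train_and_evaluate(num_layers: int, hidden_units: int) -> float:
    # Stand-in for training an LM at these hyperparameter values and
    # measuring its performance on a validation dataset. This toy score
    # simply increases with model capacity (an assumption, not the
    # patent's method).
    return 1.0 - 1.0 / (1 + num_layers * hidden_units / 64)


def smallest_compliant_config(threshold: float,
                              num_layers: int = 12,
                              hidden_units: int = 768,
                              shrink: float = 0.5):
    """Shrink hyperparameters until performance falls below `threshold`,
    then return the smallest configuration that still cleared it."""
    best = None  # smallest compliant (num_layers, hidden_units, score) so far
    while num_layers >= 1 and hidden_units >= 1:
        score = train_and_evaluate(num_layers, hidden_units)
        if score > threshold:
            # Still above threshold: record this config, reduce the
            # hyperparameter values, and repeat the training step.
            best = (num_layers, hidden_units, score)
            num_layers = max(1, int(num_layers * shrink))
            hidden_units = max(1, int(hidden_units * shrink))
            if best[:2] == (num_layers, hidden_units):
                break  # cannot shrink any further
        else:
            break  # fell below threshold: stop and deploy `best`
    return best
```

With the toy scorer above, `smallest_compliant_config(0.9)` keeps halving from a 12-layer, 768-unit start until the next reduction would dip below the threshold, returning the smallest configuration that still complied.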
CPC Classifications
G06N 3/0985; G06N 3/09
Filing Date
2025-01-30
Application No.
19041065
About this page
Source document text, dates, docket IDs, and authority are extracted directly from USPTO.
The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.