TRAINING-FREE ERROR COMPENSATION FOR A COMPRESSED LARGE LANGUAGE MODEL
Inventors
Min-Hung Chen, Shih-Yang Liu, Pavlo Molchanov, Maksim Khadkevich, Charbel Sakr, Chien-Yi Wang, Saurav Muralidharan, Hongxu Yin, Huck Yang, Jan Kautz, Frank Wang
Abstract
Large language models (LLMs) learn via machine learning to understand and generate human-like text, and thus are powerful when used for various language-based tasks, such as text summarization, translation, and content generation. However, to provide superior performance, an LLM is often of considerable model size and incurs high inference costs. To mitigate the size and execution costs of LLMs, methods have been developed to compress LLMs specifically. However, most existing methods either incur significant accuracy degradation compared to uncompressed models or require long training times, while their adaptability is often constrained by a limited range of hardware-supported compression formats. The present disclosure provides error compensation for a compressed LLM in a training-free manner, offering flexibility for diverse performance needs.
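The abstract does not detail the compensation scheme itself. Below is a minimal illustrative sketch, assuming one common form of training-free error compensation: a weight matrix is compressed (here, stand-in 4-bit group quantization), the compression error is approximated with a low-rank SVD correction, and the small correction factors are added back at inference without any gradient updates. The function names, the rank, and the group size are illustrative assumptions, not the disclosed method.

```python
import numpy as np

def fake_quantize_int4(w, group_size=64):
    """Symmetric 4-bit group quantization followed by dequantization;
    a stand-in for any hardware-supported compression format."""
    orig_shape = w.shape
    flat = w.reshape(-1, group_size)
    scale = np.abs(flat).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0                      # avoid divide-by-zero
    q = np.clip(np.round(flat / scale), -8, 7)
    return (q * scale).reshape(orig_shape)

def low_rank_compensation(w, w_compressed, rank=16):
    """Rank-r SVD approximation of the compression error W - W_c.
    The two small factors are stored alongside the compressed weight
    and added back at inference; no training is required."""
    err = w - w_compressed
    u, s, vt = np.linalg.svd(err, full_matrices=False)
    a = u[:, :rank] * s[:rank]                   # (out_dim, rank)
    b = vt[:rank, :]                             # (rank, in_dim)
    return a, b

# Toy weight matrix standing in for one linear layer of an LLM.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

w_c = fake_quantize_int4(w)
a, b = low_rank_compensation(w, w_c, rank=16)

x = rng.standard_normal((8, 256)).astype(np.float32)
y_ref = x @ w.T
y_quant = x @ w_c.T
y_comp = x @ (w_c + a @ b).T                     # compensated forward pass

print("mean abs error without compensation:", np.abs(y_ref - y_quant).mean())
print("mean abs error with compensation:   ", np.abs(y_ref - y_comp).mean())
```

Because the correction is computed directly from the already-trained weights, this style of compensation can be applied after any compression format and tuned (e.g., by rank) to trade accuracy against storage, which is consistent with the flexibility the abstract describes.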
CPC Classifications
Filing Date
2025-06-05
Application No.
19229953