FINE-TUNING MULTI-HEAD NETWORK FROM A SINGLE TRANSFORMER LAYER OF PRE-TRAINED LANGUAGE MODEL
Assignee
Oracle International Corporation
Inventors
Thanh Tien Vu, Tuyen Quang Pham, Omid Mohamad Nezami, Mark Edward Johnson, Thanh Long Duong, Cong Duy Vu Hoang
Abstract
Techniques are provided for customizing or fine-tuning a pre-trained version of a machine-learning model that includes multiple layers and is configured to process audio or textual language input. Each of the multiple layers is configured with a plurality of layer-specific pre-trained parameter values corresponding to a plurality of parameters, and each of the multiple layers is configured to implement multi-head attention. An incomplete subset of the multiple layers is identified for which corresponding layer-specific pre-trained parameter values are to be fine-tuned using a client data set. The machine-learning model is fine-tuned using the client data set to generate an updated version of the machine-learning model, where the layer-specific pre-trained parameter values configured for each layer of one or more of the multiple layers not included in the incomplete subset are frozen during the fine-tuning. Use of the updated version of the machine-learning model is facilitated.
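A minimal sketch of the selective fine-tuning the abstract describes, assuming a Hugging Face BERT-style encoder (the model name, layer index, and library are illustrative choices, not named in the patent): every pre-trained parameter is frozen by default, and only a single transformer layer plus the attached task head are updated on the client data set.

```python
import torch
from transformers import AutoModelForSequenceClassification

# Load a pre-trained multi-layer transformer; each encoder layer
# implements multi-head attention (model name is illustrative).
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze every layer-specific pre-trained parameter by default.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the incomplete subset of layers to fine-tune --
# here a single transformer layer (the last encoder layer).
LAYER_TO_TUNE = 11  # hypothetical choice of layer
for param in model.bert.encoder.layer[LAYER_TO_TUNE].parameters():
    param.requires_grad = True

# The task head on top is also trained on the client data set.
for param in model.classifier.parameters():
    param.requires_grad = True

# The optimizer only sees the unfrozen parameters, so the remaining
# layers keep their pre-trained values during fine-tuning.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)
```

In a training loop, `optimizer.step()` then updates only the selected layer and head, which is one straightforward way to realize the frozen-layer fine-tuning behavior described above.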
CPC Classifications
Filing Date
2025-11-26
Application No.
19402418