SYSTEM TO REDUCE MULTI-MODAL EMBEDDING DIMENSIONS
Inventors
Omkar Anil Gune, Purnaprajna Mangsuli
Abstract
A large language model (LLM) generates a synthetic query. A first-dimensional embedding model generates a query embedding of the synthetic query. A training dataset of first-dimensional embeddings that lie within a similarity threshold of the query embedding is retrieved. An auto-encoder is trained with the training dataset. The auto-encoder includes an input layer, a bottleneck encoder layer, and a decoder layer. A second-dimensional embedding model, which includes the first-dimensional embedding model and the bottleneck encoder layer of the auto-encoder, is configured. An output of the first-dimensional embedding model is connected to an input of the bottleneck encoder layer to obtain the second-dimensional embedding model. The second-dimensional embedding model is used to generate second-dimensional embeddings having a second dimension, which is lower than the first dimension.
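The flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the toy `first_dim_embed` function, the dimensions `D1` and `D2`, the cosine similarity measure, the 0.8 threshold, the linear auto-encoder trained by plain gradient descent, and the hard-coded query string (standing in for LLM output) are all assumptions made for illustration.

```python
import math
import random

D1, D2 = 8, 3  # assumed toy sizes for the first (higher) and second (lower) dimension

def first_dim_embed(text):
    """Hypothetical stand-in for the first-dimensional embedding model."""
    rng = random.Random(text)  # string seed gives the same vector on every run
    v = [rng.gauss(0, 1) for _ in range(D1)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(store, query_vec, threshold):
    """Keep stored embeddings whose similarity to the query meets the threshold."""
    return [v for v in store if cosine(v, query_vec) >= threshold]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def train_autoencoder(data, epochs=100, lr=0.05, seed=0):
    """Linear auto-encoder: input (D1) -> bottleneck (D2) -> decoder (D1)."""
    rng = random.Random(seed)
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(D1)] for _ in range(D2)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(D2)] for _ in range(D1)]
    for _ in range(epochs):
        for x in data:
            z = matvec(enc, x)  # bottleneck code
            err = [r - xi for r, xi in zip(matvec(dec, z), x)]  # reconstruction error
            # Gradient of 0.5 * ||dec z - x||^2 with respect to z, taken before dec changes.
            grad_z = [sum(dec[i][j] * err[i] for i in range(D1)) for j in range(D2)]
            for i in range(D1):
                for j in range(D2):
                    dec[i][j] -= lr * err[i] * z[j]
            for j in range(D2):
                for i in range(D1):
                    enc[j][i] -= lr * grad_z[j] * x[i]
    return enc, dec

def make_second_dim_model(enc):
    """Connect the first model's output to the bottleneck encoder's input."""
    def second_dim_embed(text):
        return matvec(enc, first_dim_embed(text))
    return second_dim_embed

# Stands in for a synthetic query generated by an LLM (assumed text).
synthetic_query = "how do I reset my router"
query_vec = first_dim_embed(synthetic_query)

# Toy embedding store: noisy copies of the query plus unrelated vectors.
noise = random.Random(42)
near = [[q + noise.gauss(0, 0.1) for q in query_vec] for _ in range(20)]
far = [first_dim_embed(f"unrelated document {i}") for i in range(20)]
training_set = retrieve(near + far, query_vec, threshold=0.8)

enc, dec = train_autoencoder(training_set)
second_dim_embed = make_second_dim_model(enc)
low = second_dim_embed(synthetic_query)
print(len(query_vec), "->", len(low))
```

Only the encoder half of the trained auto-encoder is kept in the composed model; the decoder is needed during training to define the reconstruction objective and is discarded afterwards.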
CPC Classifications
Filing Date
2025-09-15
Application No.
19328726