System and method for short text matching

Grant US12585873B1 Kind: B1 Mar 24, 2026

Assignee

Intuit Inc.

Inventors

Aleksandr Kim, Rineke van Noort, Yuting Lu, Ben Yi

Abstract

A short text matching system and method includes a pre-generated dictionary of n-gram tokens having a selected length and corresponding embeddings produced by a fine-tuned transformer model, and further includes a one-layer transformer model for inference. The dictionary is produced by fine-tuning a pretrained transformer model on a domain-specific short text training dataset. The length of the n-gram tokens is selected based on how the variance of the embeddings produced by the fine-tuned transformer model depends on n-gram length. Domain-specific input text, including query text and target text, is received, and n-gram tokens of the selected length are produced from it. Embeddings corresponding to each of the n-gram tokens are retrieved from the dictionary, along with corresponding positional embeddings. The n-gram embeddings and positional embeddings are provided to the one-layer transformer model, which produces a text matching result, such as a similarity score or a classification.
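The inference-time preprocessing described in the abstract (n-gram tokenization, dictionary lookup, positional embeddings) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the character-level n-grams, the n-gram length of 3, the embedding width of 8, the hash-based `toy_embedding` stand-in for the fine-tuned transformer, the sinusoidal positional embeddings, the element-wise summation, and the sample strings are all hypothetical choices. The one-layer transformer that consumes these rows is omitted.

```python
import hashlib
import math

N = 3    # assumed n-gram length; the abstract selects it from embedding variance
DIM = 8  # assumed embedding width, for illustration only


def char_ngrams(text, n=N):
    """Split text into overlapping character n-grams (character level is an assumption)."""
    text = text.lower()
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def toy_embedding(token, dim=DIM):
    """Hypothetical stand-in for a fine-tuned transformer embedding:
    a deterministic vector derived from a hash of the token."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]


def build_dictionary(corpus, n=N):
    """Pre-generate the n-gram -> embedding table from a corpus of short texts."""
    return {tok: toy_embedding(tok) for text in corpus for tok in char_ngrams(text, n)}


def positional_embedding(pos, dim=DIM):
    """Sinusoidal positional embedding (an assumed choice)."""
    return [
        math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def encode(text, dictionary, n=N, dim=DIM):
    """Look up each n-gram and add its positional embedding.
    N-grams missing from the dictionary map to a zero vector (an assumption)."""
    rows = []
    for pos, tok in enumerate(char_ngrams(text, n)):
        emb = dictionary.get(tok, [0.0] * dim)
        pe = positional_embedding(pos, dim)
        rows.append([e + p for e, p in zip(emb, pe)])
    return rows


# Hypothetical short texts standing in for a domain-specific corpus.
dictionary = build_dictionary(["amazon prime", "amzn mktp us"])
query_rows = encode("amazon", dictionary)   # one row per query n-gram
target_rows = encode("amzn", dictionary)    # one row per target n-gram
```

Because the dictionary is built ahead of time, inference in this sketch needs only table lookups before the single transformer layer, which appears to be the point of pairing a pre-generated dictionary with a one-layer model.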

CPC Classifications

G06F 40/242 G06F 40/284 G06N 3/04

Filing Date

2025-05-30

Application No.

19224535

Claims
