USPTO Patent US12585765B2 for Robust Natural Language Classification

ChangeBridge: Patent Grants - AI & Computing (G06N)

Published March 24th, 2026

Detected March 25th, 2026

Summary

The USPTO has granted patent US12585765B2 to Barracuda Networks, Inc. for a system and method for robust natural language classification under character encoding. The patent describes an approach for training NLP models on synthetic text corpora so that they can detect and classify messages containing character swap attacks.

What changed

The United States Patent and Trademark Office (USPTO) has issued patent US12585765B2 to Barracuda Networks, Inc. The patent covers a system and method for making natural language classification more robust against character encoding manipulation, in particular "character swap attacks," in which characters in a message are replaced with similar-looking characters from other encoding schemes to trick a human reader. The invention uses images of characters under various language encoding schemes to build a distribution of similarity probabilities, applies that distribution to a corpus of real texts to generate a synthetic text corpus containing swapped characters, and then trains NLP models on the synthetic corpus so that they can correctly classify incoming electronic messages that contain such attacks.

This is a patent grant, not a regulatory rule, so it imposes no direct compliance obligations. It does reflect continued technical development at the intersection of AI and cybersecurity. Companies that develop or use NLP technologies, especially for security and message filtering, may find the patent relevant to their intellectual property strategy and competitive landscape. Compliance officers in technology sectors should be aware of patented innovations like this one, as they may influence future product development and industry standards.

Source document (simplified)

System and method for robust natural language classification under character encoding

Grant US12585765B2 Kind: B2 Mar 24, 2026

Assignee

Barracuda Networks, Inc.

Inventors

Christopher L. Sawtelle

Abstract

A new approach is proposed to support robust natural language classification under character encoding. A plurality of images that represent a plurality of characters under various language encoding schemes for a target language character are accepted and utilized to create a distribution of text similarity probabilities for the plurality of characters likely to be swapped/replaced/substituted with the target language character to trick a human user. The distribution of text similarity probabilities is then applied against a true text corpus comprising a set of real/actual texts to generate a synthetic text corpus that further includes a set of characters being swapped with one or more of the plurality of characters based on the distribution of text similarity probabilities. The synthetic text corpus is then utilized to train one or more NLP models, which are then utilized to correctly classify and recognize an incoming electronic message that contains a character swap attack.
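
The pipeline in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the 5x5 bitmaps stand in for rendered character images, and the pixel-match similarity measure, the `min_similarity` threshold, the `swap_rate` parameter, and all function names are assumptions made for illustration only. Model training is omitted; the sketch stops at producing the synthetic corpus.

```python
import random

# Hypothetical 5x5 glyph bitmaps ('#' = ink), standing in for rendered
# character images. A real system would render glyphs from fonts.
GLYPHS = {
    "o": [" ### ", "#   #", "#   #", "#   #", " ### "],
    "\u03bf": [" ### ", "#   #", "#   #", "#   #", " ### "],  # Greek omicron
    "0": [" ### ", "#  ##", "# # #", "##  #", " ### "],
    "x": ["#   #", " # # ", "  #  ", " # # ", "#   #"],
}


def pixel_similarity(a, b):
    """Fraction of pixels that match between two same-sized bitmaps."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def swap_distribution(target, glyphs, min_similarity=0.8):
    """Turn similarities of look-alike candidates into swap probabilities.

    Candidates below the (assumed) threshold are dropped, and the remaining
    similarity scores are normalized so that they sum to 1.
    """
    scores = {
        ch: pixel_similarity(glyphs[target], bitmap)
        for ch, bitmap in glyphs.items()
        if ch != target
    }
    scores = {ch: s for ch, s in scores.items() if s >= min_similarity}
    total = sum(scores.values())
    return {ch: s / total for ch, s in scores.items()} if total else {}


def make_synthetic(text, distributions, swap_rate, rng):
    """Swap characters in a real text according to the distributions."""
    out = []
    for ch in text:
        dist = distributions.get(ch)
        if dist and rng.random() < swap_rate:
            candidates, weights = zip(*dist.items())
            out.append(rng.choices(candidates, weights=weights)[0])
        else:
            out.append(ch)
    return "".join(out)


# Build the synthetic corpus from a tiny "true" corpus of real texts.
distributions = {"o": swap_distribution("o", GLYPHS)}
true_corpus = ["reset your password now", "your invoice is overdue"]
rng = random.Random(0)  # seeded only to make the sketch repeatable
synthetic_corpus = [
    make_synthetic(text, distributions, swap_rate=0.5, rng=rng)
    for text in true_corpus
]
```

In this sketch, `synthetic_corpus` would be combined with `true_corpus` as training data for a classifier, so that the model sees both clean and character-swapped variants of the same messages.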