USPTO Patent US12585765B2 for Robust Natural Language Classification
Summary
The USPTO has granted patent US12585765B2 to Barracuda Networks, Inc. for a system and method for robust natural language classification under character encoding. The patent describes training NLP models on synthetic text corpora so that they can detect and classify character swap attacks.
What changed
The United States Patent and Trademark Office (USPTO) has issued patent US12585765B2 to Barracuda Networks, Inc. The patent covers a system and method for making natural language classification robust against character encoding manipulation, specifically "character swap attacks," in which characters in a message are replaced with look-alike characters to trick a human reader. The invention builds a distribution of similarity probabilities from images of characters rendered under various language encoding schemes, applies that distribution to a corpus of real texts to generate a synthetic corpus containing swapped characters, and uses the synthetic corpus to train NLP models that classify incoming electronic messages containing such attacks.
This is a patent grant, not a regulatory rule, so it imposes no direct compliance obligations. Companies that develop or use NLP technologies, especially for security and message filtering, may find the patent relevant to their intellectual property strategy and competitive position. Compliance officers in technology sectors may want to track patents like this one, since patented techniques can shape later product development and industry practice.
Source document (simplified)
System and method for robust natural language classification under character encoding
Grant US12585765B2 Kind: B2 Mar 24, 2026
Assignee
Barracuda Networks, Inc.
Inventors
Christopher L. Sawtelle
Abstract
A new approach is proposed to support robust natural language classification under character encoding. A plurality of images that represent a plurality of characters under various language encoding schemes for a target language character are accepted and utilized to create a distribution of text similarity probabilities for the plurality of characters likely to be swapped/replaced/substituted with the target language character to trick a human user. The distribution of text similarity probabilities is then applied against a true text corpus comprising a set of real/actual texts to generate a synthetic text corpus that further includes a set of characters being swapped with one or more of the plurality of characters based on the distribution of text similarity probabilities. The synthetic text corpus is then utilized to train one or more NLP models, which are then utilized to correctly classify and recognize an incoming electronic message that contains a character swap attack.
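The pipeline in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions: the 3x3 bitmaps are hypothetical stand-ins for rendered character images, pixel overlap is a placeholder similarity measure (the abstract does not say how images are compared), and the names pixel_similarity, swap_distribution, synthesize, and swap_rate are invented for illustration. The final step, training an NLP model on the synthetic corpus, is omitted.

```python
import random


def pixel_similarity(a, b):
    """Fraction of matching pixels between two equal-sized bitmaps.

    Assumption: a stand-in for whatever image comparison the patent uses.
    """
    if len(a) != len(b):
        raise ValueError("bitmaps must have the same size")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def swap_distribution(target_bitmap, candidate_bitmaps):
    """Normalize per-candidate similarity scores into swap probabilities."""
    scores = {
        ch: pixel_similarity(target_bitmap, bitmap)
        for ch, bitmap in candidate_bitmaps.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {}
    return {ch: score / total for ch, score in scores.items()}


def synthesize(text, distributions, swap_rate, rng):
    """Return a copy of text with some target characters swapped.

    Each character that has a distribution is replaced with probability
    swap_rate (a hypothetical knob); the replacement is drawn according
    to that character's similarity distribution.
    """
    out = []
    for ch in text:
        dist = distributions.get(ch)
        if dist and rng.random() < swap_rate:
            candidates = list(dist)
            weights = [dist[c] for c in candidates]
            out.append(rng.choices(candidates, weights=weights, k=1)[0])
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical toy bitmaps: Latin "o" as the target, with Cyrillic "о"
# (U+043E) and the digit "0" as candidate look-alikes.
LATIN_O = (0, 1, 0, 1, 0, 1, 0, 1, 0)
CANDIDATES = {
    "\u043e": (0, 1, 0, 1, 0, 1, 0, 1, 0),
    "0": (1, 1, 1, 1, 0, 1, 1, 1, 1),
}

O_DISTRIBUTION = swap_distribution(LATIN_O, CANDIDATES)

# A one-message "true corpus" turned into a synthetic corpus.
true_corpus = ["reset your password to log in to your account"]
synthetic_corpus = [
    synthesize(msg, {"o": O_DISTRIBUTION}, swap_rate=0.5, rng=random.Random(0))
    for msg in true_corpus
]
```

Sampling replacements in proportion to visual similarity means the synthetic corpus is weighted toward the swaps a human is least likely to notice, which is the case the abstract singles out.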
CPC Classifications
G06F 21/554 G06F 40/126 G06F 40/279 G06F 2221/034 G06N 20/00
Filing Date
2023-09-22
Application No.
18371878
Claims
17