Hardening Machine Learning Models Against Prompt Input Attacks That Trigger Trojans

USPTO Patent Applications - AI & Computing (G06N)

Summary

Sophos Limited has filed USPTO Application US20260111541A1 for a method that hardens pre-trained large language models (LLMs) against prompt input attacks that trigger Trojans. The method adjusts neuron weights so that the LLM produces a known malicious response to a test prompt made of known test tokens, identifies a subset of neurons by comparing each neuron's activity level against a baseline, and then modifies weights in that subset so that the likelihood of the malicious response falls below a threshold. The application names five inventors and was published April 23, 2026, after a December 30, 2024, filing date.

“The method further includes modifying the respective weights of one or more neurons in the subset of the neurons in the LLM such that, after the modifying, a likelihood that the LLM generates the resulting malicious response to the malicious prompt is below a threshold likelihood value.”

USPTO, verbatim from source
Published by USPTO on changeflow.com. Detected, standardized, and enriched by GovPing.

About this source

USPTO classification G06N covers computer systems based on specific computational models: neural networks, knowledge representation, fuzzy logic, expert systems, evolutionary algorithms. With the AI patent boom, this is one of the most-filed application classes in the office. Every newly published application in G06N lands in this feed, around 230 a month. Patent applications generally publish about 18 months after their earliest filing date, so this feed reveals what AI labs and companies were working on in the prior year and a half. Watch this if you compete in machine learning, file freedom-to-operate analyses, scout acquisition targets in AI infrastructure, or track which research groups are converting publications to patents. GovPing pulls each application with the filing number, title, applicant, and abstract.

What changed

Sophos Limited filed USPTO Patent Application US20260111541A1 for a method to harden pre-trained LLMs against prompt input attacks that trigger neural network Trojans. The method adjusts neuron weights to cause the LLM to generate a known malicious response to a test prompt, identifies a subset of neurons by comparing activity levels against a baseline, and modifies the weights of neurons in that subset to reduce the likelihood of a malicious response below a threshold value.
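The neuron-identification step can be illustrated with a small sketch. The application summary does not say how activity is compared with the baseline, so the z-score rule, the `z_threshold` value, the function name, and the synthetic activations below are all assumptions made for illustration, not the applicant's method.

```python
import numpy as np

def select_suspect_neurons(test_activity, baseline_activity, z_threshold=3.0):
    """Return indices of neurons whose activity on the test prompt deviates
    from the baseline mean by more than z_threshold baseline std deviations.
    The z-score rule is an assumed comparison, not taken from the application."""
    baseline_mean = baseline_activity.mean(axis=0)
    baseline_std = baseline_activity.std(axis=0) + 1e-8  # avoid division by zero
    z_scores = np.abs(test_activity - baseline_mean) / baseline_std
    return np.flatnonzero(z_scores > z_threshold)

# Synthetic data: 200 baseline prompts, 16 neurons (hypothetical sizes).
rng = np.random.default_rng(0)
baseline_activity = rng.normal(0.0, 1.0, size=(200, 16))

# Simulate a test prompt that matches the baseline except on neurons 3 and 11.
test_activity = baseline_activity.mean(axis=0).copy()
test_activity[[3, 11]] += 10.0

suspects = select_suspect_neurons(test_activity, baseline_activity)
print(suspects)
```

In a real model the activity vectors would come from recorded activations of the network on the test prompt and on benign prompts; here they are generated so the example runs on its own.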

Technology companies developing or deploying LLMs should monitor this application, as it describes a potential defensive methodology against adversarial prompt input attacks. The application describes a testing-and-modification approach to identifying and neutralising Trojan triggers within neural network architectures, which may inform future security hardening practices for AI systems.

Archived snapshot

Apr 24, 2026

GovPing captured this document from the original source. If the source has since changed or been removed, this is the text as it existed at that time.

Application: US20260111541A1 (Kind: A1), published Apr 23, 2026

Assignee

Sophos Limited

Inventors

Tamás Vörös, Sean Paul Bergeron, Ben Uri Gelman, Adarsh Dinesh Kyadige, Tamas Bence Nyiri

Abstract

A method includes obtaining a pre-trained LLM that generates a resulting malicious response to a malicious prompt input to the LLM. The method further includes adjusting a respective weight of neurons of the LLM to cause the LLM to generate a known malicious response to a test prompt input to the LLM, where the test prompt includes a plurality of known test tokens. The method further includes identifying a subset of the neurons based on comparing a respective activity level of each neuron in response to the test prompt with a baseline activity level. The method further includes modifying the respective weights of one or more neurons in the subset of the neurons in the LLM such that, after the modifying, a likelihood that the LLM generates the resulting malicious response to the malicious prompt is below a threshold likelihood value.
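The final step of the abstract, modifying weights until the malicious response becomes unlikely, can be sketched on a toy model. Everything here is an assumption for illustration: the single linear readout with a softmax stands in for an LLM, the halving rule (`decay`) is one possible modification, and the names `suppress_response`, `threshold`, and the example weights are hypothetical. The abstract does not specify how the weights are changed.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max()  # subtract max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def malicious_likelihood(hidden, out_weights, malicious_idx):
    """Probability the toy model assigns to the malicious response."""
    return softmax(hidden @ out_weights)[malicious_idx]

def suppress_response(hidden, out_weights, suspects, malicious_idx,
                      threshold=0.01, decay=0.5, max_steps=50):
    """Shrink the suspect neurons' weights toward the malicious response
    until its likelihood drops below threshold (assumed update rule)."""
    weights = out_weights.copy()
    for _ in range(max_steps):
        if malicious_likelihood(hidden, weights, malicious_idx) < threshold:
            break
        weights[suspects, malicious_idx] *= decay
    return weights

# Toy setup: 4 hidden neurons, 3 candidate responses; response 2 is malicious.
hidden = np.array([1.0, 1.0, 5.0, 5.0])  # activations for the malicious prompt
out_weights = np.zeros((4, 3))
out_weights[[0, 1], 0] = 3.0             # neurons 0 and 1 favor a benign response
out_weights[[2, 3], 2] = 2.0             # neurons 2 and 3 drive the malicious one

before = malicious_likelihood(hidden, out_weights, malicious_idx=2)
hardened = suppress_response(hidden, out_weights, suspects=[2, 3], malicious_idx=2)
after = malicious_likelihood(hidden, hardened, malicious_idx=2)
print(before, after)
```

Only the weights of the suspect neurons are touched, which mirrors the abstract's point that modification is restricted to the identified subset; a practical approach would also need to check that benign behavior is preserved.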

CPC Classifications

G06F 21/554, G06N 3/0475, G06N 3/094, G06F 2221/033

Filing Date

2024-12-30

Application No.

19005933

About this page

What is GovPing?

Every important government, regulator, and court update from around the world. One place. Real-time. Free.

What's from the agency?

Source document text, dates, docket IDs, and authority are extracted directly from USPTO.

What's AI-generated?

The summary, classification, recommended actions, deadlines, and penalty information are AI-generated from the original text and may contain errors. Always verify against the source document.

Classification

Agency: USPTO
Published: April 23rd, 2026
Instrument: Notice
Branch: Executive
Legal weight: Non-binding
Stage: Draft
Change scope: Minor

Who this affects

Applies to: Technology companies, Manufacturers
Industry sector: 5112 Software & Technology
Activity scope: Patent filing, AI security
Geographic scope: United States (US)

Taxonomy

Primary area: Intellectual Property
Operational domain: Legal
Topics: Artificial Intelligence, Cybersecurity
