Ad-hoc graph processing for security explainability
Assignee
MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors
Leo Moreno Betthauser, Andrew White Wicker, Bryan (Ning) Xia
Abstract
Disclosed is a machine learning model architecture that leverages existing large language models to analyze log files for security vulnerabilities. In some configurations, log files are processed by an encoder machine learning model to generate embeddings. Embeddings generated by the encoder model are used to construct graphs. The graphs are in turn used to train a graph classifier model for identifying security vulnerabilities. The encoder model may be an existing general-purpose large language model. In some configurations, the nodes of the graphs are the embedding vectors generated by the encoder model while edges represent similarities between nodes. Graphs constructed in this way may be pruned to highlight more meaningful node topologies. The graphs may then be labeled based on a security analysis of the corresponding log files. A graph classifier model trained on the labeled graphs may be used to identify security vulnerabilities.
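The graph construction and pruning steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The abstract does not specify the similarity measure or the pruning rule, so cosine similarity and a fixed threshold are assumptions. The function names (cosine_similarity, build_similarity_graph, prune_graph) are hypothetical, and the toy vectors stand in for embeddings that an encoder model would produce from log entries. Graph labeling and classifier training are left out.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_graph(embeddings):
    """Build a complete weighted graph: nodes are embedding indices,
    edges map each pair (i, j) with i < j to their similarity."""
    return {
        (i, j): cosine_similarity(embeddings[i], embeddings[j])
        for i, j in combinations(range(len(embeddings)), 2)
    }


def prune_graph(edges, threshold):
    """Keep only edges whose weight meets the threshold (an assumed pruning rule)."""
    return {pair: weight for pair, weight in edges.items() if weight >= threshold}


# Toy stand-ins for encoder output; a real encoder would return
# high-dimensional vectors, one per log entry.
embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
]

edges = build_similarity_graph(embeddings)
pruned = prune_graph(edges, threshold=0.5)
```

With these toy vectors, only the edge between the first two nodes survives pruning, since the third vector is nearly orthogonal to both. A graph pruned in this way would then be labeled and passed to a graph classifier.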
CPC Classifications
Filing Date
2023-05-04
Application No.
18312159
Claims
20