Flexible deterministic finite automata (DFA) tokenizer for AI-based malicious traffic detection
Assignee
Intel Corporation
Inventors
Kun Qiu, Hao Chang, Ying Wang, Wenjun Zhu, Xiahui Yu, Yingqi Liu, Baoqian Li, Weigang Li
Abstract
Methods and apparatus are disclosed for a flexible Deterministic Finite Automata (DFA) tokenizer for AI-based malicious traffic detection. A DFA compiler processes profiles, such as SQL injection (SQLi), HTML5, and cross-site scripting (XSS) profiles as well as user-defined profiles, to generate corresponding DFA transition tables. The DFA tokenizer includes a DFA engine that uses the DFA transition table or tables to generate token sequences from input strings. A feature extraction engine converts the token sequences into feature vectors, which are used to train a machine learning/Artificial Intelligence (AI) model configured to perform binary classification (benign or malicious). At run time, strings are extracted from input received via a network and tokenized with the DFA tokenizer to generate token sequences, which are converted into feature vectors. The AI model then classifies the feature vectors to determine whether the input is benign or malicious.
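The tokenize-then-vectorize flow described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the character classes, the state names, the layout of the `TRANSITIONS` table, the token vocabulary, and the use of simple token counts as the feature vector. A real profile compiled by the DFA compiler would be far larger, and the patent may extract features differently.

```python
from collections import Counter

# Hypothetical DFA states (illustrative only, not from the patent).
START, WORD, NUM, QUOTE, OP, SPACE = range(6)

def char_class(ch):
    """Map a character to a coarse input class used as the DFA alphabet."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isspace():
        return "space"
    if ch in "'\"":
        return "quote"
    return "other"

# Transition table: (state, input class) -> next state.
# A missing entry means the current token ends before this character.
TRANSITIONS = {
    (START, "alpha"): WORD,
    (START, "digit"): NUM,
    (START, "quote"): QUOTE,
    (START, "other"): OP,
    (START, "space"): SPACE,
    (WORD, "alpha"): WORD,
    (WORD, "digit"): WORD,
    (NUM, "digit"): NUM,
    (SPACE, "space"): SPACE,
}

# Token emitted when a state is left; None means "emit nothing" (whitespace).
ACCEPT = {WORD: "WORD", NUM: "NUM", QUOTE: "QUOTE", OP: "OP", SPACE: None}

# Fixed token order for the feature vector (an assumed vocabulary).
VOCAB = ["WORD", "NUM", "QUOTE", "OP"]

def tokenize(text):
    """Run the table-driven DFA over text and return the token sequence."""
    tokens = []
    state = START
    for ch in text:
        cls = char_class(ch)
        nxt = TRANSITIONS.get((state, cls))
        if nxt is None:
            # No transition: close the current token, restart from START.
            name = ACCEPT[state]
            if name is not None:
                tokens.append(name)
            nxt = TRANSITIONS[(START, cls)]
        state = nxt
    if state != START:
        # Flush the token still open at end of input.
        name = ACCEPT[state]
        if name is not None:
            tokens.append(name)
    return tokens

def to_feature_vector(tokens):
    """Convert a token sequence into per-token counts in VOCAB order."""
    counts = Counter(tokens)
    return [counts[name] for name in VOCAB]

if __name__ == "__main__":
    sample = "1' OR 1=1"
    seq = tokenize(sample)
    print(seq)
    print(to_feature_vector(seq))
```

In this sketch the resulting vector would be passed to a binary classifier. Swapping in a different profile amounts to replacing `TRANSITIONS` and `ACCEPT` while leaving `tokenize` unchanged, which is the sense in which a table-driven tokenizer is flexible.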
CPC Classifications
Filing Date
2022-05-13
Application No.
17744463
Claims
20