Systems and methods of data selection for iterative training using zero knowledge clustering
Assignee
SentinelOne, Inc.
Inventors
Idan Ludmir, Moshe Strenger, Shlomi Salem, Tzlil Gonen
Abstract
A method may select, from a training data repository comprising a plurality of samples with known classifications, an initial training dataset comprising a second plurality of samples. A method may provide, as an input to a classification model, feature vectors associated with the initial training dataset and may train the classification model using the feature vectors. A method may determine a classification of each sample of a third plurality of samples using the classification model. A method may determine a difference between the determined classification and the known classification for each sample. A method may determine a selection weighting for each sample based on the difference between the determined classification and the known classification. A method may select a subset from the third plurality of samples based on the determined selection weighting. A method may train the classification model using feature vectors associated with the subset.
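The steps in the abstract can be illustrated with a minimal sketch of one selection round. This is not the claimed implementation. The nearest-centroid classifier, the binary hit/miss weighting, the sampling with replacement, the retraining on the initial set plus the subset, and all function and parameter names (train_centroids, classify, selection_weights, iterative_round, miss_weight, hit_weight) are assumptions chosen for illustration only.

```python
import random


def train_centroids(samples):
    """Fit a stand-in classifier: one mean feature vector per known label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def selection_weights(model, samples, miss_weight=1.0, hit_weight=0.1):
    """Assumed weighting: misclassified samples get a larger selection weight."""
    return [
        miss_weight if classify(model, features) != label else hit_weight
        for features, label in samples
    ]


def iterative_round(repository, initial_size, pool_size, subset_size, rng):
    """Run one round: train, score a candidate pool, select by weight, retrain."""
    initial = rng.sample(repository, initial_size)   # initial training dataset
    model = train_centroids(initial)                 # first training pass
    pool = rng.sample(repository, pool_size)         # "third plurality" of samples
    weights = selection_weights(model, pool)         # determined vs. known labels
    # Weighted draw with replacement, so a hard sample may be picked twice.
    subset = rng.choices(pool, weights=weights, k=subset_size)
    # Assumption: retrain on the initial data plus the selected subset.
    model = train_centroids(initial + subset)
    return model, subset
```

A caller would pass a repository of (feature_vector, label) pairs and a seeded random.Random instance, then repeat iterative_round as often as needed. A graded difference, such as one minus the predicted probability of the known class, could replace the binary hit/miss weighting if the model exposes scores.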
CPC Classifications
Filing Date
2023-12-15
Application No.
18542041
Claims
20