FOUNDATION MODEL PRE-TRAINING USING SELF-SUPERVISED LEARNING FOR AUTONOMOUS AND SEMI-AUTONOMOUS SYSTEMS AND APPLICATIONS
Inventors
Sinclair Strachan Groskorth Hudson, David Ambrose Wehr, Deepak Ravishankar, Ke Chen
Abstract
In various examples, self-supervised learning may be used to pre-train an encoder network of a masked prediction model to reconstruct masked regions of an input representation of 3D detections, such as LiDAR point cloud(s). Spatial and/or temporal masking may be applied to a projected representation of the 3D detections (e.g., a two-dimensional (2D) projection image), and the masked prediction model (e.g., a masked auto-encoder or joint-embedding predictive architecture) may be used to reconstruct a representation of the masked regions (e.g., reflection characteristic(s) stored in corresponding pixels or cells of the projected representation, or a latent representation of those reflection characteristic(s)) during iterations of self-supervised learning. As such, the pre-trained encoder network of the masked prediction model may be used as a foundation model and fine-tuned with a task-specific output head, or its pre-trained weights may be used to initialize a task-specific model.
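The spatial-masking step described above can be sketched as follows. This is a minimal illustrative example, not the claimed implementation: it assumes a hypothetical (H, W) range-image projection of LiDAR reflection values and a hypothetical `mask_patches` helper that divides the image into patches, hides a random subset, and returns the hidden values as reconstruction targets (the quantity a masked auto-encoder would be trained to predict).

```python
import numpy as np

def mask_patches(image, patch=4, mask_ratio=0.5, seed=0):
    """Spatially mask a 2D projection of 3D detections (illustrative only).

    image: (H, W) array of reflection characteristics (e.g., a range image
           projected from a LiDAR point cloud).
    Returns:
      masked  - the image with masked patches zeroed (a mask-token stand-in),
      mask    - boolean (H//patch, W//patch) grid marking masked patches,
      targets - original values of the masked patches; in masked-prediction
                pre-training these are the reconstruction targets.
    """
    H, W = image.shape
    gh, gw = H // patch, W // patch
    rng = np.random.default_rng(seed)
    n_masked = int(gh * gw * mask_ratio)

    # Randomly choose which patches of the patch grid to mask.
    mask = np.zeros(gh * gw, dtype=bool)
    mask[rng.permutation(gh * gw)[:n_masked]] = True
    mask = mask.reshape(gh, gw)

    masked = image.copy()
    targets = []
    for i in range(gh):
        for j in range(gw):
            if mask[i, j]:
                sl = (slice(i * patch, (i + 1) * patch),
                      slice(j * patch, (j + 1) * patch))
                targets.append(image[sl].copy())
                masked[sl] = 0.0  # placeholder for a learned mask token
    return masked, mask, np.stack(targets)
```

During pre-training, an encoder would embed the visible patches of `masked`, a decoder (or predictor) would estimate the hidden content, and a loss such as mean squared error would be computed only over `targets`, so the encoder learns structure from the unmasked context.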
CPC Classifications
Filing Date
2025-04-09
Application No.
19174628