USING CACHED EXPERTS IN MIXTURE OF EXPERTS (MOE)

Application US20260087386A1 Kind: A1 Mar 26, 2026

Inventors

Andrii SKLIAR, Babak EHTESHAMI BEJNORDI, Ties Jehan VAN ROZENDAAL, Marinus Willem VAN BAALEN, Markus NAGEL, Paul Nicholas WHATMOUGH

Abstract

Systems and techniques are described herein for processing tokens. For instance, a method for processing tokens is provided. The method may include processing a token at a router model to generate a recommendation of a subset of expert models from a plurality of expert models to use for further processing of the token; selecting a number of expert models to use for the further processing of the token based on the recommendation of the subset of expert models and based on cached expert models of the plurality of expert models stored in a cache memory; and processing the token using the selected number of expert models.
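The three steps in the abstract (router recommendation, cache-aware selection, expert processing) can be illustrated with a minimal Python sketch. The abstract does not say how the cache influences the selection, so the rule used here is an assumption made purely for illustration: among the recommended experts, those already in the cache are preferred, and router rank breaks ties. The function names `recommend`, `select_experts`, and `process_token`, the parameters `m` and `k`, and the softmax weighting of expert outputs are all hypothetical and are not taken from the application.

```python
import math

def recommend(router_logits, m):
    """Router step: return the indices of the m highest-scoring experts, best first."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    return order[:m]

def select_experts(recommended, cached, k):
    """Cache-aware selection (assumed rule, not from the application).

    Cached experts are moved ahead of uncached ones. Python's sort is
    stable, so the router's ranking is preserved within each group.
    """
    prefer_cached = sorted(recommended, key=lambda i: i not in cached)
    return prefer_cached[:k]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def process_token(token, experts, router_logits, cached, m, k):
    """Run the token through the selected experts and mix their outputs."""
    recommended = recommend(router_logits, m)
    selected = select_experts(recommended, cached, k)
    # Assumed mixing rule: weight each expert by the softmax of its router logit.
    weights = softmax([router_logits[i] for i in selected])
    output = sum(w * experts[i](token) for w, i in zip(weights, selected))
    return output, selected

# Toy setup: expert i simply multiplies the token by i.
experts = [lambda x, scale=scale: x * scale for scale in range(5)]
router_logits = [0.1, 2.0, 1.5, 0.3, 1.0]
cached = {0, 4}  # experts currently held in cache memory

output, selected = process_token(1.0, experts, router_logits, cached, m=3, k=2)
print(selected)
```

In this toy run the router recommends experts 1, 2, and 4; because expert 4 is the only recommended expert in the cache, it is selected first, followed by the top-ranked uncached expert. A real system would also have to weigh the accuracy cost of departing from the router's preferred experts, which this sketch ignores.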

CPC Classifications

G06N 5/043, G06N 20/00

Filing Date

2025-01-10

Application No.

19017238
