EFFICIENT DECODING OF OUTPUT SEQUENCES USING PARAMETER SHARING
Inventors
Adam Joshua Fisch, Tal Schuster, Hrayr Harutyunyan, Ziwei Ji, Seungyeon Kim, Sangmin Bae
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task. One of the methods includes generating an output sequence by, at each of a plurality of output time steps: generating a current input sequence from at least the tokens at output time steps that precede the output time step in the output sequence; generating a respective embedding for each input in the current input sequence; and processing the respective embeddings for the inputs in the current input sequence through one or more layer blocks in a sequence of layer blocks until a termination criterion is satisfied.
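The decoding loop described in the abstract can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the claimed implementation: the names `EMBED`, `SHARED_W`, `layer_block`, `run_blocks`, and `decode` are hypothetical, each layer block is a simple residual update that reuses one weight matrix (one possible reading of the "parameter sharing" in the title), positions are processed independently with no attention, and the termination criterion is assumed to be "the largest per-position change in the hidden state falls below a tolerance, or a maximum number of blocks is reached". The abstract does not specify any of these details.

```python
import math
import random

VOCAB, DIM, MAX_BLOCKS = 16, 8, 4
_rng = random.Random(0)

# Hypothetical toy parameters; none of these values come from the source.
EMBED = [[_rng.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
SHARED_W = [[_rng.gauss(0, 1) / math.sqrt(DIM) for _ in range(DIM)]
            for _ in range(DIM)]


def layer_block(vec):
    """Toy residual layer block; every block reuses SHARED_W (assumption)."""
    return [v + math.tanh(sum(w * x for w, x in zip(row, vec)))
            for v, row in zip(vec, SHARED_W)]


def run_blocks(states, tol=0.5):
    """Apply layer blocks in order until the assumed termination criterion holds."""
    used = 0
    for _ in range(MAX_BLOCKS):
        new_states = [layer_block(s) for s in states]
        used += 1
        # Assumed criterion: largest per-position change is below tol.
        change = max(math.dist(a, b) for a, b in zip(states, new_states))
        states = new_states
        if change < tol:
            break
    return states, used


def decode(prompt, num_steps):
    """Greedy decoding; returns generated tokens and blocks used per step."""
    output, blocks_used = [], []
    for _ in range(num_steps):
        # Current input sequence: prompt plus tokens from preceding steps.
        current_input = list(prompt) + output
        # Respective embedding for each input in the current input sequence.
        states = [EMBED[token] for token in current_input]
        states, used = run_blocks(states)
        # Score the vocabulary against the last position (tied embeddings).
        last = states[-1]
        scores = [sum(a * b for a, b in zip(last, e)) for e in EMBED]
        output.append(max(range(VOCAB), key=scores.__getitem__))
        blocks_used.append(used)
    return output, blocks_used


if __name__ == "__main__":
    tokens, blocks = decode([1, 2, 3], num_steps=5)
    print(tokens, blocks)
```

In this sketch the number of blocks applied can differ from one output time step to the next, which is where any compute saving would come from. A real system would use a learned model and its own termination rule.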
CPC Classifications
Filing Date
2025-10-01
Application No.
19347594