Overview

This architecture is standard for sequence-to-sequence tasks like machine translation or summarization.

Components

  • Encoder: Reads the input sequence (e.g., a French sentence) and compresses it into a 'hidden representation.'
  • Decoder: Takes that representation and generates the target sequence (e.g., an English translation) one token at a time.
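The two-phase loop above can be sketched in a few lines of plain Python. This is a toy illustration, not a real model: the "hidden representation" is just the stored token list, and the decoder "translates" by reversing the input. The names `encode` and `decode` are illustrative, not from any library.

```python
from typing import List

def encode(tokens: List[str]) -> List[str]:
    """Encoder: read the whole input sequence and compress it into a
    hidden representation (a real encoder would produce vectors)."""
    return list(tokens)  # stand-in for a hidden state

def decode(hidden: List[str], max_len: int = 10) -> List[str]:
    """Decoder: generate the target one token at a time, conditioning
    on the hidden representation and the tokens emitted so far.
    This toy decoder 'translates' by reversing the input."""
    output: List[str] = []
    for _ in range(max_len):
        pos = len(output)  # next-token choice depends on output so far
        if pos >= len(hidden):
            break  # end of sequence
        output.append(hidden[len(hidden) - 1 - pos])
    return output

source = ["le", "chat", "noir"]
print(decode(encode(source)))  # ['noir', 'chat', 'le']
```

The key structural point carried over to real models: the encoder runs once over the whole input, while the decoder runs step by step, each step consuming the encoder's output.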

Evolution

Originally implemented with RNNs, the architecture's best-known incarnation today is the Transformer, which uses attention to connect the encoder and decoder: at each decoding step, the decoder attends over all encoder outputs rather than relying on a single compressed vector.
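The attention mechanism that links encoder and decoder can be shown as a small numeric sketch: a decoder query is scored against the encoder's key vectors, the scores are normalized with a softmax, and the values are averaged with those weights (scaled dot-product attention). This is a minimal stdlib-only illustration; real implementations use batched matrix operations and learned projections.

```python
import math
from typing import List

def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(query: List[float],
                    keys: List[List[float]],
                    values: List[List[float]]) -> List[float]:
    """One decoder step attending over all encoder positions."""
    d = len(query)
    # Scaled dot-product scores: query against each encoder key
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Context vector: attention-weighted average of encoder values
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim)]

# Identical keys give uniform weights, so the context is the mean of values
ctx = cross_attention([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]],
                      [[2.0, 0.0], [4.0, 0.0]])
print(ctx)  # [3.0, 0.0]
```

Because the weights are recomputed at every decoding step, the decoder can focus on different parts of the input for each output token, which is what removed the fixed-size bottleneck of the original RNN design.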

Related Terms