Overview
The encoder-decoder architecture is the standard design for sequence-to-sequence tasks such as machine translation and summarization, where the input and output sequences may differ in length.
Components
- Encoder: Reads the input sequence (e.g., a French sentence) and compresses it into a hidden representation, often a fixed-size vector in the classic RNN formulation.
- Decoder: Takes that representation and generates the target sequence (e.g., an English translation) one token at a time.
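The two components above can be sketched as a toy RNN encoder-decoder. This is a minimal illustration with random, untrained weights (all names and sizes here are hypothetical, not from any library): the encoder folds the input token IDs into one hidden vector, and the decoder unrolls from that vector one token at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 10, 8

# Toy parameters, randomly initialised purely for illustration.
E = rng.normal(size=(VOCAB, HIDDEN))        # embedding table
W_enc = rng.normal(size=(HIDDEN, HIDDEN))   # encoder recurrence weights
W_dec = rng.normal(size=(HIDDEN, HIDDEN))   # decoder recurrence weights
W_out = rng.normal(size=(HIDDEN, VOCAB))    # hidden state -> vocab logits

def encode(src_ids):
    """Compress the whole input sequence into one hidden vector."""
    h = np.zeros(HIDDEN)
    for tok in src_ids:
        h = np.tanh(E[tok] + W_enc @ h)
    return h

def decode(h, max_len=5, bos=0):
    """Generate output tokens greedily, one at a time."""
    out, tok = [], bos
    for _ in range(max_len):
        h = np.tanh(E[tok] + W_dec @ h)
        tok = int(np.argmax(h @ W_out))     # most likely next token
        out.append(tok)
    return out

src = [3, 1, 4]          # e.g. a tokenised source sentence
target = decode(encode(src))
print(target)            # 5 token IDs (untrained, so arbitrary)
```

Training such a model (by backpropagating cross-entropy loss through both halves) is what turns this structure into an actual translator; the sketch only shows the data flow.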
Evolution
While the encoder-decoder pattern was originally implemented with RNNs, its best-known implementation today is the Transformer, which replaces recurrence with attention: the decoder attends directly to every encoder output (cross-attention) rather than relying on a single compressed vector.
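The attention mechanism connecting encoder and decoder can be sketched as scaled dot-product attention, as used in the Transformer. In this minimal single-query example (the function name and toy vectors are illustrative, not a library API), a decoder query is scored against every encoder state, and the output is the softmax-weighted average of those states.

```python
import numpy as np

def cross_attention(query, enc_states):
    """Scaled dot-product attention for a single decoder query.

    Scores the query against every encoder state, normalises the
    scores with softmax, and returns the weighted average of the
    encoder states plus the attention weights themselves.
    """
    d = query.shape[-1]
    scores = enc_states @ query / np.sqrt(d)   # one score per input position
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ enc_states, weights

enc_states = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 1.0]])            # 3 encoder positions, dim 2
query = np.array([1.0, 0.0])                   # one decoder query, dim 2
context, weights = cross_attention(query, enc_states)
print(weights)   # sums to 1; larger weight on better-matching positions
```

Because the decoder sees a weighted view of all encoder positions at every step, it avoids the information bottleneck of squeezing the whole input through one fixed vector.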