Revisiting Karpathy's RNN Revolution: How 2015's 'Unreasonable Effectiveness' Foreshadowed the LLM Era
A deep dive into Andrej Karpathy's landmark 2015 RNN demonstration reveals striking parallels and critical divergences from modern transformer architectures. By reimplementing the character-level Shakespeare generator in PyTorch, we uncover why squeezing all history through a fixed-size hidden state doomed RNNs while attention mechanisms unlocked the LLM revolution.
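To ground the discussion, here is a minimal sketch of the kind of character-level RNN language model the article revisits. The toy corpus, hyperparameters, and training loop are illustrative assumptions, not Karpathy's original char-rnn code or the article's exact reimplementation:

```python
# Minimal sketch of a character-level RNN language model in PyTorch.
# Corpus, model size, and training schedule are assumptions for illustration.
import torch
import torch.nn as nn

text = "To be, or not to be, that is the question."  # stand-in for the Shakespeare corpus
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
vocab_size = len(chars)

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, state=None):
        # All preceding context is compressed into the fixed-size hidden
        # state -- the bottleneck the article contrasts with attention.
        emb = self.embed(x)
        out, state = self.rnn(emb, state)
        return self.head(out), state

model = CharRNN(vocab_size)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-3)
loss_fn = nn.CrossEntropyLoss()

# Next-character prediction: shift the sequence by one to form targets.
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)
x, y = data[:-1].unsqueeze(0), data[1:].unsqueeze(0)

for step in range(200):
    logits, _ = model(x)
    loss = loss_fn(logits.view(-1, vocab_size), y.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Sampling: feed each generated character back in, carrying the hidden state.
idx = torch.tensor([[stoi["T"]]])
state, out = None, "T"
for _ in range(80):
    logits, state = model(idx, state)
    probs = torch.softmax(logits[0, -1], dim=-1)
    idx = torch.multinomial(probs, 1).unsqueeze(0)
    out += itos[idx.item()]
print(out)
```

Note how generation threads a single `state` tensor through every step: that recurrent hand-off is exactly where long-range information gets lossily compressed, and it is the structural contrast with attention that the rest of this piece explores.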