Twelve months after an initial test, a developer retests the top coding LLMs to evaluate their progress in generating complex Swift/SwiftUI applications. Frontier models have improved significantly at audio synthesis and UI integration, but fundamental reliability issues and outdated coding patterns persist across all of them.