OpenAI's first open-weight LLMs since GPT-2, gpt-oss-120b and gpt-oss-20b, reveal strategic shifts in transformer design: Mixture-of-Experts layers, MXFP4 quantization, and sliding-window attention. We dissect how these choices stack up against Alibaba's Qwen3 and what they signal for efficient, locally deployable AI. Analysis of the released architectures surfaces notable trade-offs in width versus depth and in expert specialization that expand what developers can deploy locally.