Google's SLED: Tapping Every Layer to Combat LLM Hallucinations
Large language models frequently stumble over facts, confidently asserting inaccuracies—a flaw known as hallucination. While retrieval-augmented generation has been the go-to fix, it introduces complexity and isn't foolproof. Google Research's new paper at NeurIPS 2024 offers an elegant alternative: Self Logits Evolution Decoding (SLED), which realigns LLM outputs using the model's own internal knowledge.
The Layer Whisperer: How SLED Works
Traditional decoding relies solely on a transformer's final layer to predict tokens. SLED revolutionizes this by:
1. Extracting logits (prediction scores) from every layer, not just the last
2. Applying the model's final projection matrix to each layer's hidden states to obtain "early-exit" predictions
3. Computing a weighted average of the resulting layer distributions (see the sketch below)
SLED corrects the popular misconception that Vancouver is British Columbia's capital by amplifying the signal for "Victoria" across layers.
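For intuition, here is a minimal sketch of steps 1–3 in Python using Hugging Face transformers, assuming a GPT-2-style model so the final layer norm and vocabulary head are easy to reach. It averages the per-layer distributions with uniform weights; SLED's actual procedure weights layers by their agreement with the final layer and evolves the final logits rather than simply averaging, so treat this as an approximation, not the paper's algorithm.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of steps 1-3 above (uniform weights stand in for SLED's
# agreement-based weighting; an illustration, not the paper's method).
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

inputs = tok("The capital of British Columbia is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

head = model.get_output_embeddings()  # final projection matrix (step 2)
ln_f = model.transformer.ln_f         # GPT-2's final layer norm ("logit lens")

# out.hidden_states = embedding output plus one tensor per transformer layer.
# Take each layer's hidden state at the last position (step 1), normalize,
# and project to vocabulary logits; the last entry is already normalized,
# so re-applying ln_f there is a simplification accepted in this sketch.
dists = [
    F.softmax(head(ln_f(h[:, -1, :])), dim=-1)
    for h in out.hidden_states[1:]
]

# Step 3: weighted (here uniform) average of all layer distributions.
avg = torch.stack(dists).mean(dim=0)
print(tok.decode(avg.argmax(dim=-1)))
```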
This approach captures nuanced reasoning signals that final-layer sampling throws away. For example, when solving "Ash buys 6 toys at 10 tokens each, with a 10% discount for buying 4 or more," standard decoding often outputs "6 × 10 = 60", ignoring the discount. SLED notices that intermediate layers favor "×" over "=" after "6 × 10" and steers generation toward the correct "6 × 10 × 0.9 = 54."
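To see why averaging can flip a choice like this, here is a toy calculation with made-up per-layer probabilities for the two candidate tokens; the final layer alone prefers "=", but the blend prefers "×":

```python
import torch

# Numbers invented purely for illustration: per-layer probabilities the
# model might assign to "=" vs. "×" right after the prefix "6 × 10".
vocab = ["=", "×"]
layers = torch.tensor([
    [0.30, 0.70],  # middle layer: the discount pattern is already suspected
    [0.40, 0.60],  # later layer
    [0.55, 0.45],  # final layer: "=" narrowly wins under standard decoding
])

final_only = vocab[layers[-1].argmax().item()]       # -> "="
blended = vocab[layers.mean(dim=0).argmax().item()]  # -> "×" (0.583 vs 0.417)
print(f"final layer only: {final_only}, layer-averaged: {blended}")
```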
Benchmark Breakthroughs
Tested across Gemma, GPT-OSS, and Mistral models, SLED demonstrated:
- Accuracy gains of up to 16% on the FACTOR and TruthfulQA benchmarks
- Consistent improvements across multiple-choice, open-ended, and chain-of-thought tasks
- Only about 4% latency overhead versus the prior state of the art, DoLa
SLED's accuracy improvements across models and datasets (Source: Google Research)
Crucially, SLED requires no external knowledge bases or fine-tuning. It even combines synergistically with other decoding methods—researchers boosted accuracy further by pairing it with DoLa.
The Future of Faithful AI
SLED represents a paradigm shift: instead of treating transformers as black boxes, it harnesses their internal deliberation. As lead researchers Cyrus Rashtchian and Da-Cheng Juan note, this approach could extend to:
- Visual question answering
- Code generation
- Long-form content creation
The technique is now open-sourced on GitHub, inviting developers to integrate it into existing decoding pipelines. In an era where AI accuracy impacts everything from healthcare to legal systems, SLED offers a surgical strike against hallucinations, using the model's own wisdom against its flaws.
Source: Google Research Blog