LegalPwn: How Buried Legalese Becomes an LLM Jailbreaking Tool
Security researchers at Pangea have uncovered 'LegalPwn,' a novel attack exploiting AI models' deference to legal language. By embedding malicious instructions within verbose legal disclaimers, attackers can bypass guardrails in popular LLMs like GPT-4o and Gemini, tricking them into approving harmful code execution. This vulnerability highlights critical risks as AI integrates deeper into security-sensitive systems.