SVG or Not SVG? How LLMs Reimagine Art Through Code
Forget pixel-perfect reproductions. A new online gallery exposes a fascinating, lesser-known side of large language models (LLMs): their struggles, and occasional triumphs, in translating the essence of famous paintings into the precise, mathematical language of Scalable Vector Graphics (SVG).
While specialized image generation models like DALL-E or Stable Diffusion excel at creating raster images from prompts, this project focuses on a different challenge. It asks general-purpose LLMs – models primarily trained on text and code – to generate the actual SVG code that would render a simplified vector version of masterpieces like Van Gogh's "The Starry Night" or Hokusai's "The Great Wave off Kanagawa".
"When a new model is released, one of my first tests is whether it can generate a pelican on a bicycle," explains Simon Willison, whose creative benchmark inspired the project.
This seemingly whimsical test, and its artistic extension, probes deeper capabilities:
- Multimodal Understanding (Without Vision): Can the LLM, trained largely on text descriptions of the world, grasp the core visual elements, composition, and style of a famous artwork described in a prompt?
- Technical Translation: Can it accurately convert that abstract understanding into the specific, structured syntax of SVG – defining shapes, paths, colors, and layers using XML? (See the sketch after this list.)
- Abstraction & Simplification: Can it distill the complex, nuanced imagery of a painting into the clean lines and geometric forms inherent to vector graphics?
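To make that translation task concrete, here is a hand-written sketch (not model output from the gallery) of the kind of SVG the benchmark hopes for: a drastically simplified "Starry Night"-style scene, with colors and coordinates chosen purely for illustration.

```xml
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 400 300">
  <!-- Background layer: flat night-sky fill -->
  <rect width="400" height="300" fill="#1a2a6c"/>
  <!-- A single curved path standing in for the swirling sky -->
  <path d="M 60 90 q 60 -50 120 0 t 120 0" fill="none"
        stroke="#e8d44d" stroke-width="6" stroke-linecap="round"/>
  <!-- Moon and stars as plain circles -->
  <circle cx="330" cy="60" r="22" fill="#f5e663"/>
  <circle cx="120" cy="45" r="6" fill="#f5e663"/>
  <circle cx="210" cy="35" r="4" fill="#f5e663"/>
  <!-- Foreground layers, drawn last so they paint on top -->
  <polygon points="40,300 70,120 100,300" fill="#0b3d0b"/>
  <rect x="150" y="230" width="200" height="70" fill="#10131f"/>
</svg>
```

Even at this level of simplification, the model has to get element order right: SVG has no z-index, so whatever is emitted last paints on top.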
The results, displayed on the project site pelican.koenvangilst.nl, are a captivating mix of the impressive, the bizarre, and the utterly nonsensical. Some models produce surprisingly coherent outlines or recognizable color blocks. Others generate syntactically valid SVG that bears little resemblance to the target, or code that is fundamentally flawed. These outputs are more revealing than those of dedicated image generators because they expose the model's reasoning process – or lack thereof – in bridging the gap between concept and concrete implementation.
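That "valid but wrong" failure mode deserves a concrete illustration. The following is a hypothetical, hand-constructed example (not an actual gallery entry): every element is well-formed XML that a browser accepts without complaint, yet the image renders blank because all the coordinates fall outside the declared viewBox.

```xml
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 400 300">
  <!-- Parses and renders without error, but nothing is visible:
       every shape lies entirely outside the 0-400 x 0-300 viewBox -->
  <circle cx="900" cy="-80" r="22" fill="#f5e663"/>
  <polygon points="-200,50 -170,-130 -140,50" fill="#0b3d0b"/>
  <rect x="450" y="400" width="200" height="70" fill="#10131f"/>
</svg>
```

No XML validator will flag this; only rendering the output reveals the problem, which is part of what makes the benchmark interesting.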
Why This Benchmark Matters for Developers:
- Testing Reasoning & Code Synthesis: It moves beyond simple code completion, testing an LLM's ability to conceptualize a complex output and generate the entire code structure to achieve it.
- Understanding Limitations: The frequent failures highlight the current boundaries of LLMs in spatial reasoning, abstraction, and precise technical execution.
- Creative Coding Potential: Successes hint at future applications where LLMs could assist in generating foundational vector assets or prototypes based on descriptive prompts, potentially speeding up design workflows.
- Evaluating Model Nuance: It provides a more nuanced benchmark than pass/fail coding tests, showing how models interpret and attempt to solve a multifaceted problem involving art and code.
The gallery serves as a stark reminder that while LLMs exhibit remarkable linguistic fluency, their ability to translate complex visual-spatial concepts into precise technical instructions remains an evolving frontier. It's a compelling snapshot of where the technology shines and where it stumbles, offering developers and researchers a uniquely creative lens for assessing the practical intelligence of these powerful models. The next time you hear about a breakthrough LLM, perhaps ask: can it draw me Van Gogh... in SVG?