A new open-source library that intelligently extracts the most representative color from images using advanced clustering and color science techniques.
The Problem with Simple Color Extraction
You have an image but you want a color. Your app has a card with an image, and you want the back of the card to be a solid color that's somewhat representative of the image and also visually pleasant. How would you do that?
The most common approach is to resize the entire image to 1x1 pixel and use that single pixel's color. This is super popular! However, the colors are often dull and muddy even when the original image has vivid colors. It irked me, so I spent a weekend searching for prior art and trying a few tricks to do better. Then, I wrote a library.
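For reference, the 1×1-resize baseline amounts to averaging every pixel. A minimal sketch (not any particular library's code; the interleaved-RGB layout is an assumption):

```rust
/// Naive baseline: average every pixel, which is what a 1x1 resize does.
/// `pixels` holds interleaved 8-bit RGB values (an assumed layout).
fn average_color(pixels: &[u8]) -> (u8, u8, u8) {
    let n = (pixels.len() / 3) as u64;
    let (mut r, mut g, mut b) = (0u64, 0u64, 0u64);
    for px in pixels.chunks_exact(3) {
        r += px[0] as u64;
        g += px[1] as u64;
        b += px[2] as u64;
    }
    ((r / n) as u8, (g / n) as u8, (b / n) as u8)
}

fn main() {
    // A vivid green pixel next to a vivid blue pixel averages to a dull teal.
    let pixels = [0u8, 200, 0, 0, 0, 200];
    println!("{:?}", average_color(&pixels)); // (0, 100, 100)
}
```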
Inspired by Oklab's naming, it's called Okmain because it looks for an OK main color.
The Core Challenges
Most images have multiple clusters of colors, so simply averaging all colors into one doesn't work well. Take this image: while the green of the field and the blue of the sky are beautiful colors, simply averaging the colors produces a much less exciting color.
Instead, we can find groups of similar colors and average inside the group. K-means is a well-known algorithm for exactly that. We can run it on all pixels, clustering their colors and ignoring the pixel positions (for now).
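The clustering step can be sketched as a bare-bones k-means over color triples, positions ignored. This is a minimal illustration, not Okmain's implementation; real code would use smarter seeding (e.g. k-means++) and a convergence check instead of a fixed iteration count:

```rust
type Color = [f32; 3];

fn dist2(a: Color, b: Color) -> f32 {
    (0..3).map(|i| (a[i] - b[i]).powi(2)).sum()
}

/// Minimal k-means over pixel colors. Centers are seeded from the first
/// `k` pixels for simplicity.
fn kmeans(pixels: &[Color], k: usize, iters: usize) -> Vec<Color> {
    let mut centers: Vec<Color> = pixels.iter().take(k).copied().collect();
    for _ in 0..iters {
        let mut sums = vec![[0.0f32; 3]; centers.len()];
        let mut counts = vec![0usize; centers.len()];
        for &p in pixels {
            // Assign each pixel to its nearest center.
            let (best, _) = centers
                .iter()
                .enumerate()
                .map(|(i, &c)| (i, dist2(p, c)))
                .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
                .unwrap();
            counts[best] += 1;
            for i in 0..3 {
                sums[best][i] += p[i];
            }
        }
        // Move each center to the mean of its assigned pixels.
        for (c, (s, n)) in centers.iter_mut().zip(sums.iter().zip(&counts)) {
            if *n > 0 {
                for i in 0..3 {
                    c[i] = s[i] / *n as f32;
                }
            }
        }
    }
    centers
}

fn main() {
    // Two tight color groups; k-means recovers one center per group.
    let pixels = [
        [0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
        [1.0, 1.0, 1.0], [0.9, 1.0, 1.0],
    ];
    println!("{:?}", kmeans(&pixels, 2, 10));
}
```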
For Okmain, I decided to only allow up to four clusters. In my testing, it was enough for decent quality, and limiting the number was handy to make clustering more performant.
Why Oklab Color Space Matters
Another reason for the muddiness is the resizing library operating directly on sRGB colors. In either clustering or resizing, colors need to be averaged. In a naïve implementation, this is done in the same color space the image is in, which is most likely to be sRGB: red, green, and blue subpixel values with gamma correction applied.
This is not ideal for two reasons. First, gamma correction is non-linear, and applying linear operations over the correction leads to incorrect results. Second, perceived color intensity is also non-linear, which is why a sweep through all colors without correcting for perceptual differences produces vertical strips in the gradient.
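The gamma problem is easy to see numerically. Averaging the encoded sRGB values of black and white gives 0.5, but averaging in linear light and re-encoding gives about 0.735, which corresponds to an actually halfway amount of emitted light. A sketch using the standard sRGB transfer function:

```rust
/// sRGB transfer functions (the standard piecewise gamma curve).
fn srgb_to_linear(c: f32) -> f32 {
    if c <= 0.04045 { c / 12.92 } else { ((c + 0.055) / 1.055).powf(2.4) }
}

fn linear_to_srgb(c: f32) -> f32 {
    if c <= 0.0031308 { c * 12.92 } else { 1.055 * c.powf(1.0 / 2.4) - 0.055 }
}

/// Average two channel values in linear light instead of on encoded values.
fn average_linear(a: f32, b: f32) -> f32 {
    linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2.0)
}

fn main() {
    // Naive average of encoded black (0.0) and white (1.0) is a dark 0.5;
    // the linear-light average re-encodes to roughly 0.735.
    println!("naive:  {}", (0.0 + 1.0) / 2.0);
    println!("linear: {}", average_linear(0.0, 1.0));
}
```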
To solve both problems at once, Okmain operates in the Oklab color space. The result of averaging colors in Oklab is smoother mixing with fewer muddy browns.
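For the curious, the sRGB → Oklab conversion is small. The sketch below uses the matrices from Björn Ottosson's published reference implementation (coefficients truncated to eight digits); it is illustrative, not Okmain's actual code:

```rust
/// Convert an sRGB color (0..1 per channel) to Oklab, using the matrices
/// from Björn Ottosson's reference implementation.
fn srgb_to_oklab(r: f32, g: f32, b: f32) -> (f32, f32, f32) {
    // Undo the sRGB gamma first.
    let lin = |c: f32| {
        if c <= 0.04045 { c / 12.92 } else { ((c + 0.055) / 1.055).powf(2.4) }
    };
    let (r, g, b) = (lin(r), lin(g), lin(b));
    // Linear sRGB -> LMS cone response.
    let l = 0.41222147 * r + 0.53633254 * g + 0.05144599 * b;
    let m = 0.21190350 * r + 0.68069955 * g + 0.10739696 * b;
    let s = 0.08830246 * r + 0.28171884 * g + 0.62997870 * b;
    // Cube-root non-linearity, then LMS -> Lab.
    let (l, m, s) = (l.cbrt(), m.cbrt(), s.cbrt());
    (
        0.21045426 * l + 0.79361779 * m - 0.00407205 * s,
        1.97799850 * l - 2.42859221 * m + 0.45059371 * s,
        0.02590404 * l + 0.78277177 * m - 0.80867577 * s,
    )
}

fn main() {
    // White should map to L close to 1 with near-zero a and b.
    println!("{:?}", srgb_to_oklab(1.0, 1.0, 1.0));
}
```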
Determining Color Prominence
After the colors are clustered in Oklab, the clusters need to be sorted by visual prominence. After all, the user most likely wants the single dominant color, not four candidates with no indication of which one matters most.
I came up with three heuristics for how prominent a cluster is:
- How many pixels are in the cluster?
- How central are those pixels?
- How visually prominent is the color itself?
Okmain combines the first two heuristics into one and calculates the number of pixels per cluster, discounting pixels that are closer to the periphery using a mask that looks like this (by default).
Intuitively, pixels that are closer to the center of the image are more prominent, but only to an extent. If a pixel is central enough, it doesn't matter where it is exactly.
Finally, Okmain tries to guess how visually prominent a particular color is. This is tricky because prominence depends on how much a color contrasts with other colors. However, using Oklab chroma (saturation) as a proxy for prominence seems to help on my test set, so it's now a factor in Okmain.
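Chroma in Oklab is just the distance from the neutral (gray) axis in the a/b plane:

```rust
/// Oklab chroma: distance from the neutral axis in the a/b plane.
fn chroma(a: f32, b: f32) -> f32 {
    (a * a + b * b).sqrt()
}

fn main() {
    // A saturated color sits far from the neutral axis...
    println!("vivid: {}", chroma(0.2, 0.1));
    // ...while any gray has zero chroma regardless of lightness.
    println!("gray:  {}", chroma(0.0, 0.0));
}
```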
Performance Optimizations
I wanted Okmain to not just produce nice colors but also be reasonably fast, ideally comparable to a simple 1x1 resize. I spent some time optimizing it.
The simplest optimization is to reduce the amount of data. Okmain downsamples the image by a power of two until the total number of pixels is below 250,000, simply averaging pixel values in Oklab. This also helps to remove noise and "invisible colors": on a photo of an old painting, paint cracks can create their own color cluster, but ideally they should be ignored.
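Picking the power-of-two factor is a short loop. A sketch (the function name and signature are mine, not Okmain's API):

```rust
/// Pick a power-of-two downsampling factor so the downsampled image
/// stays under a pixel budget (250,000 in the text above).
fn downsample_factor(width: usize, height: usize, budget: usize) -> usize {
    let mut f = 1;
    while (width / f) * (height / f) > budget {
        f *= 2;
    }
    f
}

fn main() {
    // A 4000x3000 (12 MP) image needs a factor of 8: 500 * 375 = 187,500 px.
    println!("{}", downsample_factor(4000, 3000, 250_000));
}
```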
The downsampling is also an opportunity to de-interleave the pixels from an RGBRGBRGB… array into a structure-of-arrays (three separate arrays of L, a, and b floats), which helps to make a lot of downstream code trivially auto-vectorizable.
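The de-interleaving itself looks roughly like this (a sketch; the `Planes` struct is an illustrative name):

```rust
/// De-interleave an L,a,b,L,a,b,... buffer into a structure-of-arrays,
/// so downstream loops each scan one contiguous array of floats.
struct Planes {
    l: Vec<f32>,
    a: Vec<f32>,
    b: Vec<f32>,
}

fn deinterleave(lab: &[f32]) -> Planes {
    let n = lab.len() / 3;
    let mut p = Planes {
        l: Vec::with_capacity(n),
        a: Vec::with_capacity(n),
        b: Vec::with_capacity(n),
    };
    for px in lab.chunks_exact(3) {
        p.l.push(px[0]);
        p.a.push(px[1]);
        p.b.push(px[2]);
    }
    p
}

fn main() {
    let p = deinterleave(&[0.5, 0.1, -0.1, 0.7, 0.0, 0.2]);
    println!("{:?} {:?} {:?}", p.l, p.a, p.b);
}
```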
Having a low fixed number of clusters that fits into a SIMD register (f32x4) seems to help, too. One of the biggest hurdles for auto-vectorization is Rust's insistence on strict floating-point semantics. This is a great default, but there's no way to opt out of it on stable Rust yet.
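To illustrate how a fixed four clusters maps onto SIMD lanes: the centers can be stored lane-wise, one `[f32; 4]` per channel, so the distance from a pixel to all four centers becomes a handful of lane-parallel array ops that the compiler can lower to single SIMD instructions. A sketch under those assumptions (names are mine):

```rust
/// Cluster centers stored lane-wise: one [f32; 4] per Oklab channel.
struct Centers {
    l: [f32; 4],
    a: [f32; 4],
    b: [f32; 4],
}

/// Distance from one pixel to all four centers, computed lane-parallel,
/// followed by a horizontal argmin over the four lanes.
fn nearest_center(c: &Centers, l: f32, a: f32, b: f32) -> usize {
    let mut d = [0.0f32; 4];
    for i in 0..4 {
        let (dl, da, db) = (l - c.l[i], a - c.a[i], b - c.b[i]);
        d[i] = dl * dl + da * da + db * db;
    }
    let mut best = 0;
    for i in 1..4 {
        if d[i] < d[best] {
            best = i;
        }
    }
    best
}

fn main() {
    let c = Centers {
        l: [0.0, 0.3, 0.6, 0.9],
        a: [0.0; 4],
        b: [0.0; 4],
    };
    println!("{}", nearest_center(&c, 0.62, 0.0, 0.0)); // lane 2
}
```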
For now, Okmain is fast enough, extracting dominant colors from multi-megapixel images in around 100ms.
The AI Development Experience
I was curious how LLM agents would work on this project. It felt like a good fit for agentic development: a small, well-constrained problem, greenfield development, and a lot of pre-existing data in the training set since k-means is a very popular algorithm.
Armed with Opus (4.5 and 4.6) and sprites.dev for sandboxed, accept-everything autonomous development, I tried to retrace Mitchell Hashimoto's steps. The results are mixed, but I learned a lot.
With a good explanation and the planning mode, the very first version was ready really quickly. Unfortunately, it was subtly wrong in several places, and the code was awkward and hard to read.
Additionally, I tried to make the code auto-vectorization-friendly, and Opus seems to be confidently wrong about auto-vectorization more often than it's right. Closing the loop with cargo asm helped, but the loop ate tokens frighteningly fast, and Opus still struggled to produce code that was both idiomatic and verifiably vectorized.
After a few evenings and many tokens of trying to make Opus write as cleanly as I wanted, I gave up and rewrote the most crucial parts from scratch. In my opinion, the manual rewrite is cleaner and clearer, and this is a part where readability matters, since it's the hottest part of the library.
It seems that even frontier LLMs are struggling with intentful abstraction. LLMs split things out in the most mechanical way possible, instead of trying to communicate the intent with how things are split.
On the other hand, with the core API settled, Opus saved me a lot of time working autonomously on "debug" binaries that are easy to read through and don't need to be developed any further.
Throughout this experience, Sprites' stability was a thorn in my side. The UX and the idea are great when it works, but my sprite slowed down to a crawl every few days. Once, it went down completely and was unreachable for most of the day.
The Final Result
I'm pretty satisfied with how this project turned out. You all got a decent library, and I learned more about k-means, SIMD, releasing mixed Python/Rust libraries, productive greenfield LLM use, and general performance.
Now go and extract all the main colors!