Raja Koduri Launches Oxmiq Labs: RISC-V GPU Startup Targets Nvidia's CUDA Dominance with Software-First Approach
Raja Koduri, the storied architect behind pivotal GPU innovations at Apple, AMD, and Intel, has stepped into the spotlight with Oxmiq Labs—a startup positioning itself as Silicon Valley's first new GPU-focused company in over 25 years. Emerging from stealth with $20 million in seed funding from investors like MediaTek, Oxmiq isn't building traditional graphics cards. Instead, it’s crafting a RISC-V-based GPU IP and a software layer designed to crack open Nvidia’s stranglehold on AI development by enabling CUDA compatibility on alternative hardware. For developers drowning in proprietary ecosystems, this could be a seismic shift.
The Hardware Blueprint: RISC-V Cores and Chiplets
At its core, Oxmiq’s hardware strategy revolves around OxCore, a modular GPU IP built on the open RISC-V ISA. Unlike conventional GPUs, OxCore integrates scalar, vector, and tensor engines into a unified architecture, supporting near-memory and in-memory computing for AI, graphics, and multimodal workloads. Koduri emphasizes this isn’t a consumer-grade solution; OxCore lacks standard GPU features like ray tracing or display pipelines, requiring licensees to customize it for specific use cases—think edge AI inferencing or data-center training.
Complementing this is OxQuilt, a chiplet-based SoC builder that lets clients assemble systems by mixing compute (CCB), memory (MCB), and interconnect (ICB) chiplets. Need a compact edge accelerator? Combine one CCB with an ICB. Building a training behemoth? Scale to dozens of chiplets. While Oxmiq hasn’t said whether OxQuilt also supports monolithic designs, the focus on modularity promises cost efficiency and rapid iteration, which is critical for AI’s breakneck evolution.
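Oxmiq hasn’t published an OxQuilt API, but a toy sketch helps make the mix-and-match model concrete. Everything below is hypothetical: the class names, the parameters, and the validation rule are invented purely to illustrate how a chiplet bill of materials might scale from an edge part to a training part.

```python
# Hypothetical sketch of OxQuilt-style chiplet composition.
# Oxmiq has published no such API; every name and rule here is invented
# to illustrate the CCB/MCB/ICB mix-and-match idea described above.
from dataclasses import dataclass, field

@dataclass
class ChipletSpec:
    kind: str          # "CCB" (compute), "MCB" (memory), or "ICB" (interconnect)
    count: int = 1

@dataclass
class SoCDesign:
    name: str
    chiplets: list[ChipletSpec] = field(default_factory=list)

    def add(self, kind: str, count: int = 1) -> "SoCDesign":
        self.chiplets.append(ChipletSpec(kind, count))
        return self

    def validate(self) -> None:
        kinds = {c.kind for c in self.chiplets}
        # Invented rule: every design needs compute plus an interconnect fabric.
        assert {"CCB", "ICB"} <= kinds, "need at least one CCB and one ICB"

# Compact edge accelerator: one compute chiplet on a small fabric.
edge = SoCDesign("edge-npu").add("CCB").add("ICB")
edge.validate()

# Training-class part: dozens of compute and memory chiplets on one fabric.
trainer = SoCDesign("dc-trainer").add("CCB", 48).add("MCB", 24).add("ICB", 8)
trainer.validate()
```

The point of the sketch is the economics the article implies: the same three building blocks span both designs, so a licensee iterates on a parts list rather than re-spinning a monolithic die.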
Software as the Great Equalizer
Here’s where Oxmiq’s vision gets disruptive: its software stack, not its hardware, is the linchpin. OXCapsule serves as a unified runtime layer, abstracting hardware complexity through “heterogeneous containers.” These allow applications to run across CPUs, GPUs, or accelerators without code changes, simplifying deployment in fragmented environments. But the crown jewel is OXPython, a compatibility layer that lets Python-based CUDA workloads run unmodified on non-Nvidia hardware, with no recompilation needed.
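To see what “unmodified” means in practice, consider a routine CUDA-targeting PyTorch script like the one below. Nothing in it is Oxmiq-specific; the article’s claim is that OXPython could run code like this on non-Nvidia accelerators, presumably by servicing the CUDA-facing Python calls with a different backend. How OXPython actually achieves this hasn’t been detailed.

```python
# An ordinary CUDA-targeting PyTorch script. Nothing here is Oxmiq-specific;
# the article's claim is that OXPython runs code like this unmodified on
# non-Nvidia hardware, so the "cuda" device request would be satisfied by
# a different accelerator under the hood.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 256).to(device)   # weights land on the accelerator
x = torch.randn(32, 1024, device=device)        # batch of 32 input vectors

with torch.no_grad():
    y = model(x)                                # matmul + bias run on the device

print(y.shape, y.device)                        # torch.Size([32, 256]) cuda:0 (or cpu)
```

The appeal for developers is that the code never names the vendor: if a compatibility layer can honor the CUDA-shaped calls, the existing script, and the ecosystem of libraries written like it, carries over.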
In a strategic coup, OXPython will debut on Tenstorrent’s AI accelerators (Wormhole and Blackhole), not Oxmiq’s own IP. Tenstorrent CEO Jim Keller endorsed the move: "OXPython's ability to bring Python workloads for CUDA to AI platforms [...] is great for developer portability. It aligns with our goal of letting developers open and own their entire AI stack." This hardware-agnostic approach directly challenges Nvidia’s ecosystem lock-in, offering a lifeline to developers trapped by CUDA’s dominance.
Why This Matters: Beyond the Hype
Koduri’s pedigree invites scrutiny—comments on Tom’s Hardware debate his legacy, with critics citing struggles at AMD and Intel. Yet Oxmiq’s asset-light model (IP licensing over chip fabrication) mitigates risk, and early software revenue signals traction. For the industry, the implications are profound:
- AI Democratization: OXPython could reduce dependency on Nvidia hardware, lowering barriers for startups and researchers.
- RISC-V Momentum: OxCore advances RISC-V beyond CPUs into high-performance compute, fostering open-source hardware innovation.
- Ecosystem Fragmentation: If successful, Oxmiq’s software might finally unify disparate accelerators, much like OneAPI or OpenCL aspired to—but with CUDA compatibility as a killer feature.
Oxmiq isn’t crafting the next gaming GPU, but in an AI-driven world, its software-centric gamble could redefine how we build accelerators. As Koduri noted, "GPUs are not easy"—but with Tenstorrent’s partnership and MediaTek’s backing, Oxmiq is betting that flexibility, not raw power, will win the next compute war.
Source: Tom's Hardware