A practical tool that tells you exactly which AI models your hardware can run, from tiny 1B models to massive 400B+ parameter systems.
CanIRun.ai is a straightforward web tool that answers a simple but increasingly important question: can your computer actually run the AI model you're interested in? The site cuts through the marketing hype around AI models by showing you exactly what will work on your hardware.
How it works
The tool analyzes your device's specifications - GPU VRAM, CPU capabilities, and available RAM - then compares them against a database of hundreds of AI models. It grades each model as:
- S (Runs great): Models that run smoothly with good performance
- A (Runs well): Models that work well with minor limitations
- B (Decent): Models that run but may be slower than ideal
- C (Tight fit): Models that barely fit in memory
- D (Barely runs): Models that struggle to run
- F (Too heavy): Models that won't run due to insufficient resources
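The exact formula behind these grades isn't published, but the core idea is a comparison of a model's memory footprint against the memory you actually have, with some headroom for context and runtime overhead. The sketch below is a hypothetical illustration of that kind of check; the field names, thresholds, and headroom ratio are assumptions, not CanIRun.ai's actual code.

```typescript
// Hypothetical hardware-fit grade. Thresholds and the headroom-ratio idea
// are assumptions, not CanIRun.ai's published logic.
type Grade = "S" | "A" | "B" | "C" | "D" | "F";

interface ModelSpec {
  name: string;
  weightsGB: number;   // quantized weight size in memory
  overheadGB: number;  // rough allowance for KV cache, activations, runtime
}

interface Device {
  vramGB: number;          // dedicated GPU memory
  unifiedMemoryGB: number; // shared/system RAM usable for inference
}

function gradeModel(model: ModelSpec, device: Device): Grade {
  const required = model.weightsGB + model.overheadGB;
  const available = Math.max(device.vramGB, device.unifiedMemoryGB);
  const headroom = available / required; // >1 means it fits with room to spare

  if (headroom >= 2.0) return "S"; // runs great
  if (headroom >= 1.5) return "A"; // runs well
  if (headroom >= 1.2) return "B"; // decent
  if (headroom >= 1.0) return "C"; // tight fit
  if (headroom >= 0.8) return "D"; // barely runs (heavy offloading/swapping)
  return "F";                      // too heavy
}

// Example: a ~0.7GB 4-bit model on an 8GB machine grades as "S"
console.log(gradeModel(
  { name: "Llama 3.2 1B", weightsGB: 0.7, overheadGB: 1.0 },
  { vramGB: 0, unifiedMemoryGB: 8 }
));
```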
What makes it useful
Unlike benchmarks that show peak performance on ideal hardware, CanIRun.ai gives you practical, real-world guidance. It factors in actual VRAM requirements, context window sizes, and token generation speeds. For example, it might tell you that your M1 MacBook with 8GB RAM can run Llama 3.2 1B at ~70 tokens/second but would start to struggle as model size grows.
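That ~70 tokens/second figure is consistent with a common rule of thumb for local inference: decoding is usually memory-bandwidth bound, so tokens per second is roughly memory bandwidth divided by the bytes read per token (about the size of the quantized weights). The bandwidth and model-size numbers below are approximate assumptions used only to illustrate the estimate, not values reported by CanIRun.ai.

```typescript
// Rough decode-speed estimate: tokens/s ≈ memory bandwidth / bytes per token.
// Bandwidth and model-size figures are approximate assumptions.
function estimateTokensPerSecond(bandwidthGBps: number, weightsGB: number): number {
  // Each generated token streams roughly the full weight set through memory.
  return bandwidthGBps / weightsGB;
}

// M1 unified memory bandwidth is roughly ~68 GB/s; a 1B model at 4-bit is ~0.7GB.
const ceiling = estimateTokensPerSecond(68, 0.7); // ~97 tokens/s upper bound
console.log(ceiling.toFixed(0));
// Real-world throughput lands below this ceiling (overhead, KV cache reads),
// which is in the same ballpark as the ~70 tokens/s example above.
```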
The model database
The site tracks an impressive range of models:
Tiny models (0.5-2GB): Perfect for older hardware or basic tasks. Models like Qwen 3.5 0.8B and Gemma 3 1B run great even on integrated graphics.
Small models (3-8B): The sweet spot for many users. Phi-3.5 Mini, Qwen 2.5 7B, and Mistral 7B offer good performance without requiring high-end GPUs.
Medium models (12-32B): These start pushing hardware limits. Models like Qwen 2.5 14B and DeepSeek R1 Distill 14B need dedicated GPUs with 12GB+ VRAM.
Large models (70B+): Only the most powerful systems can handle these. Even a top-tier RTX 4090 with 24GB VRAM can't run models like Llama 3.1 405B.
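The 405B example follows from simple arithmetic: even at aggressive 4-bit quantization, the weights alone take roughly half a byte per parameter. A back-of-the-envelope check (the bytes-per-parameter figures are standard approximations, not CanIRun.ai data):

```typescript
// Back-of-the-envelope weight size at different precisions.
// bytesPerParam: 2 for FP16, 1 for 8-bit, 0.5 for 4-bit quantization.
function weightSizeGB(params: number, bytesPerParam: number): number {
  return (params * bytesPerParam) / 1e9;
}

const llama405B = 405e9;
console.log(weightSizeGB(llama405B, 2));   // ~810 GB at FP16
console.log(weightSizeGB(llama405B, 0.5)); // ~203 GB even at 4-bit
// Either way, far beyond the 24GB on an RTX 4090, which is why models
// of this size grade out as "F" on consumer GPUs.
```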
Technical details
The site uses WebGPU for browser-based model loading where possible, though many models still require native applications like Ollama or LM Studio. It shows key specs for each model including:
- VRAM requirements
- Context window size
- Token generation speed
- Whether it supports vision or other modalities
- Release date and provider
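How the site inspects your hardware from the browser isn't documented, and WebGPU only exposes coarse limits rather than an exact VRAM figure. A minimal sketch of that kind of capability probe, assuming a WebGPU-capable browser, might look like this (treating maxBufferSize as a proxy for usable GPU memory is an assumption):

```typescript
// Minimal WebGPU capability probe (needs the @webgpu/types declarations).
// maxBufferSize is only a coarse proxy for usable GPU memory, since WebGPU
// does not report VRAM directly -- an assumption for illustration.
async function probeGpu(): Promise<void> {
  if (!("gpu" in navigator)) {
    console.log("WebGPU not available; a native runtime like Ollama is needed.");
    return;
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    console.log("No suitable GPU adapter found.");
    return;
  }
  const maxBufferGB = adapter.limits.maxBufferSize / 1e9;
  console.log(`Largest single GPU buffer: ~${maxBufferGB.toFixed(1)} GB`);
}

probeGpu();
```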
Why this matters now
As AI models proliferate, the gap between what's theoretically available and what's practically runnable on consumer hardware has grown enormous. A model with 400B parameters sounds impressive, but if it needs 200GB VRAM that most users don't have, it's not useful for them.
CanIRun.ai helps users avoid wasting time downloading models their hardware can't handle, and helps them find the best models that will actually work well on their specific setup.
Limitations
The tool relies on published model specifications, which aren't always accurate. Actual performance can vary based on implementation, quantization, and system load. It also can't account for every possible hardware configuration, especially custom setups or unusual driver combinations.
For anyone interested in running AI models locally - whether for privacy, offline use, or just experimentation - CanIRun.ai provides a practical first step in figuring out what's actually possible on your hardware.
