Virtual Ontologies: How Claude Code Unlocks Natural Language to SQL Without the Palantir Overhead
When Palantir CEO Alex Karp reportedly declared, "We f*cking won because of ontologies," it was more than Silicon Valley bravado: it pointed to how central semantic modeling has become in data analytics. Ontologies, once confined to philosophy departments as the study of existence, have become the backbone of systems that make data intelligible to AI. But replicating Palantir's heavyweight infrastructure is impractical for most teams. Enter Michael Fitzgerald, who hacked together a "Palantir-lite" using Claude Code and virtual ontologies, showing that natural language to SQL doesn't require military-grade tooling.
The Ontology Problem: From Philosophy to Data Chaos
Ontologies in computer science formalize real-world entities, their properties, and their relationships, forming a semantic layer that gives raw data meaning. Palantir's success hinges on this, but Fitzgerald's initial attempt mirrored their complexity: he modeled the domain in the Web Ontology Language (OWL), converted the data into RDF triples, then generated SPARQL queries for analysis. "It worked incredibly well," he notes, but the overhead was staggering. Triple stores and unfamiliar query languages are a hard sell for teams whose data already lives in SQL databases. The breakthrough came when Fitzgerald asked: Why not virtualize the ontology? Leave the data in SQL and let the LLM handle the semantic reasoning.
Claude Code: The Natural Language Catalyst
Fitzgerald's solution leverages Claude Code, an AI agent that orchestrates tools, manages context, and persists sessions. The magic lies in pairing two elements in its context:
1. An ontology specification: A formal description of business entities (e.g., "Equipment" or "DowntimeEvent") and their relationships.
2. A database schema: The actual SQL structure, mapped directly to the ontology.
This pairing transforms Claude into a semantic engine. Users ask questions in plain English, like "How do upstream failures affect downtime?" Claude traverses the ontology, understands cascading impacts (e.g., material starvation rules), and generates precise SQL. Results are retrieved and analyzed in a loop—no data migration needed.
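To make the pairing concrete, here is a minimal sketch of how the two documents and a question might be combined into a single block of context. It is not Fitzgerald's implementation: the function name `build_context`, the section headings, and the closing instruction are assumptions made for illustration, and the two specs are abbreviated stand-ins.

```python
from textwrap import dedent

# Hypothetical, heavily truncated stand-ins for the two context documents.
ONTOLOGY_SPEC = dedent("""\
    classes:
      Equipment: {attributes: [efficiency, upstream_dependencies]}
      DowntimeEvent: {attributes: [reason_code, duration]}
    relationships:
      is_upstream_of: {properties: [cascade_delay, impact_correlation]}
""")

SCHEMA_SPEC = dedent("""\
    tables:
      mes_data:
        equipment_id: {ontology_class: Equipment}
        downtime_reason: {ontology_class: DowntimeEvent.reason_code}
""")


def build_context(ontology: str, schema: str, question: str) -> str:
    """Join ontology, schema, and question into one prompt string (illustrative)."""
    return "\n\n".join([
        "## Ontology\n" + ontology.strip(),
        "## Database schema\n" + schema.strip(),
        "## Question\n" + question.strip(),
        "Answer by writing SQL against the schema above.",
    ])


prompt = build_context(ONTOLOGY_SPEC, SCHEMA_SPEC,
                       "How do upstream failures affect downtime?")
print(prompt)
```

The point of the sketch is only that both the semantic description and the physical schema sit side by side in the model's context, so a question phrased in business terms can be resolved to concrete tables and columns.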
Case Study: Manufacturing Insights Without the Mess
Fitzgerald tested this with synthetic manufacturing execution system (MES) data, mimicking real production lines. Here's a snippet of the ontology-schema pairing that enabled it:
# Ontology Specification (Truncated)
ontology:
  classes:
    Equipment:
      attributes: efficiency, upstream_dependencies
    DowntimeEvent:
      attributes: reason_code, duration
  relationships:
    is_upstream_of:
      properties: cascade_delay, impact_correlation
# Database Schema (Truncated)
tables:
  mes_data:
    equipment_id:
      ontology_class: Equipment
    downtime_reason:
      ontology_class: DowntimeEvent.reason_code
With this, Claude could interpret queries like "Find downtime events caused by upstream failures" and generate SQL that accounts for ontological rules—e.g., mapping "UNP-MAT" reason codes to equipment dependencies. Fitzgerald emphasizes: "The LLM interleaves natural language and ontology traversal to understand real-world phenomena, then translates it to SQL."
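As an illustration of the kind of SQL such a question could resolve to, the sketch below runs a hand-written query against an in-memory SQLite database. Everything beyond `mes_data`, `equipment_id`, `downtime_reason`, and the "UNP-MAT" code is an assumption: the `equipment_dependencies` table, the timing columns, the sample rows, the other reason codes, and the reading of "UNP-MAT" as a material starvation stop are all invented for the example and are not taken from Fitzgerald's dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mes_data (
        event_id        INTEGER PRIMARY KEY,
        equipment_id    TEXT,
        downtime_reason TEXT,
        start_min       INTEGER,  -- hypothetical: minutes since shift start
        duration_min    INTEGER   -- hypothetical: length of the stop
    );
    -- Hypothetical table standing in for the is_upstream_of relationship.
    CREATE TABLE equipment_dependencies (
        upstream_id   TEXT,
        downstream_id TEXT
    );
    INSERT INTO mes_data VALUES
        (1, 'FILLER',  'MECH-FAIL',  10, 20),
        (2, 'PACKER',  'UNP-MAT',    15, 18),
        (3, 'PACKER',  'UNP-MAT',    90,  5),
        (4, 'LABELER', 'CHANGEOVER', 40, 30);
    INSERT INTO equipment_dependencies VALUES
        ('FILLER', 'PACKER'),
        ('PACKER', 'LABELER');
""")

# Starvation stops on a machine that began while its upstream machine was down.
QUERY = """
    SELECT d.event_id,
           d.equipment_id,
           u.equipment_id    AS upstream_equipment,
           u.downtime_reason AS upstream_reason
    FROM mes_data AS d
    JOIN equipment_dependencies AS dep ON dep.downstream_id = d.equipment_id
    JOIN mes_data AS u ON u.equipment_id = dep.upstream_id
    WHERE d.downtime_reason = 'UNP-MAT'
      AND u.downtime_reason <> 'UNP-MAT'
      AND d.start_min BETWEEN u.start_min AND u.start_min + u.duration_min
    ORDER BY d.event_id
"""
rows = conn.execute(QUERY).fetchall()
print(rows)  # [(2, 'PACKER', 'FILLER', 'MECH-FAIL')]
```

Only the first packer stop is returned, because it starts inside the filler's failure window; the second starts long after the filler recovered. The ontology's role is to tell the model that this self-join through the dependency relationship is what "caused by upstream failures" means.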
Why This Matters for Developers
Virtual ontologies sidestep the friction of traditional methods, making semantic AI accessible. Developers can now:
- Avoid infrastructure lock-in: No need for graph databases or OWL converters; use existing SQL stores.
- Enhance accuracy: Ontological context can reduce hallucinations in text-to-SQL generation by grounding the model in defined entities and relationships.
- Scale reasoning: Claude's tool orchestration handles iterative analysis, from querying to insight generation.
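The iterative query-then-analyze pattern mentioned in the last point can be sketched as a small loop. This is a simplified illustration, not how Claude Code works internally: `analysis_loop` and `scripted_planner` are hypothetical names, the planner is a hard-coded stand-in for the model, and the table contents are invented.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

Query = Tuple[str, tuple]   # (SQL text, bound parameters)
Step = Tuple[Query, list]   # (query that was run, rows it returned)


def analysis_loop(conn: sqlite3.Connection,
                  propose: Callable[[List[Step]], Optional[Query]],
                  max_steps: int = 5) -> List[Step]:
    """Run proposed queries until the planner returns None or max_steps is hit."""
    history: List[Step] = []
    for _ in range(max_steps):
        query = propose(history)
        if query is None:
            break
        sql, params = query
        rows = conn.execute(sql, params).fetchall()
        history.append((query, rows))
    return history


def scripted_planner(history: List[Step]) -> Optional[Query]:
    """Stand-in for the model: find the top reason code, then drill into it."""
    if not history:
        return ("SELECT downtime_reason, COUNT(*) AS n FROM mes_data "
                "GROUP BY downtime_reason ORDER BY n DESC", ())
    if len(history) == 1:
        top_reason = history[0][1][0][0]  # first column of the first row
        return ("SELECT equipment_id, SUM(duration_min) FROM mes_data "
                "WHERE downtime_reason = ? GROUP BY equipment_id", (top_reason,))
    return None  # nothing more to ask


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mes_data "
             "(equipment_id TEXT, downtime_reason TEXT, duration_min INTEGER)")
conn.executemany("INSERT INTO mes_data VALUES (?, ?, ?)", [
    ("PACKER", "UNP-MAT", 18),
    ("PACKER", "UNP-MAT", 5),
    ("FILLER", "MECH-FAIL", 20),
])

history = analysis_loop(conn, scripted_planner)
for (sql, params), rows in history:
    print(params, rows)
```

Each step sees the results of the previous ones, which is what lets a follow-up query depend on what the first query found. In a real agent, the planner's decisions would come from the model rather than from fixed branches.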
Fitzgerald's experiment—documented on GitHub and explained in a 30-minute video—hints at a future where ontologies aren't exclusive to tech giants. As AI agents evolve, virtual semantic layers could turn any database into a Palantir-grade asset, proving that sometimes, the smartest solutions are the ones that do more with less.
Source: Adapted from Michael Fitzgerald's article on Medium, Whither Ontologies?, with insights into Claude Code and ontology implementations.