
In the rapidly evolving landscape of Retrieval-Augmented Generation (RAG), a provocative new framework is challenging conventional approaches. Knowledge as Geometry (KAG), an experimental prototype by researcher H. Kiriyama, treats facts not as text snippets but as points in a multidimensional coordinate space, where distance encodes semantic relationship and density encodes influence.

Beyond Embeddings: Coordinates as Meaning

Traditional RAG relies on vector similarity in embedding spaces, but KAG introduces three radical shifts:

  1. Geometric Positioning: Hypothetical Facts (HFs) inferred from observations become points in ℝᴰ space, with axes representing fundamental dimensions like time, location, or semantic concepts
  2. First-Class Classes: Patterns like "Declaration of Independence" or "18th-century Paris" become geometric entities with measurable influence ranges and power densities
  3. Contextual Containment: Retrieval answers not just "what's similar" but "what contains this?" – where being inside a Class's influence region defines contextual relevance

# Defining a knowledge space with explicit dimensions
ks = KnowledgeSpace(dim=3, axes=["time", "space", "semantics"])

# Anchoring foundational concepts
ks.add_class("time_18C", "18th century", vec=[-1.0, 0.0, 0.0], is_anchor=True)
ks.add_class("place_paris", "Paris", vec=[-0.8, 0.6, 0.1], is_anchor=True)
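
To make "influence range" and "power density" concrete, here is a minimal sketch of how a Class could be modeled as a geometric region. It is not the prototype's implementation: the `ClassRegion` type, its `radius` and `power` fields, and the Gaussian falloff are all assumptions chosen for illustration, and the fact coordinates are made up.

```python
import math
from dataclasses import dataclass


@dataclass
class ClassRegion:
    """Hypothetical stand-in for a KAG Class: a center plus an influence range."""
    name: str
    center: tuple
    radius: float        # assumed influence range
    power: float = 1.0   # assumed peak influence at the center

    def contains(self, point):
        # Containment: the point lies within the influence range
        return math.dist(self.center, point) <= self.radius

    def influence(self, point):
        # Assumed Gaussian falloff with distance; other kernels are possible
        d = math.dist(self.center, point)
        return self.power * math.exp(-(d / self.radius) ** 2)


paris = ClassRegion("place_paris", (-0.8, 0.6, 0.1), radius=0.5)
hf = (-0.9, 0.4, 0.1)  # made-up coordinates for one hypothetical fact

inside = paris.contains(hf)      # is the fact inside the Class's region?
strength = paris.influence(hf)   # how strongly the Class bears on it
```

Under this reading, "what contains this?" reduces to a distance check against each Class's range, while influence gives a graded score rather than a yes/no answer.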

Two Modes of Geometric Retrieval

KAG offers two complementary retrieval modes:

Objective (God's-Eye View)

"We search through meaning geometry, not mere lexical overlap"
Centers on conceptual nuclei – like the convergence point of "Declaration of Independence" and "United States" – then retrieves HFs occupying that semantic neighborhood. This reveals what the collective knowledge structure deems relevant.
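
One way to read the objective mode is as a centroid-and-radius query. The sketch below assumes exactly that: it averages the vectors of the chosen Classes to find a nucleus, then returns the HFs within a given distance, nearest first. The function names, the `radius` parameter, and all coordinates are hypothetical and not taken from the prototype.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def objective_search(facts, nucleus_vecs, radius):
    """Return ids of facts within `radius` of the nucleus, nearest first."""
    center = centroid(nucleus_vecs)
    scored = sorted((math.dist(center, vec), fid) for fid, vec in facts.items())
    return [fid for d, fid in scored if d <= radius]


# Made-up positions on (time, space, semantics) axes
facts = {
    "hf_signing": (-0.9, 0.1, 0.8),
    "hf_congress": (-0.8, 0.3, 0.6),
    "hf_unrelated": (0.9, -0.7, -0.5),
}
# Stand-ins for the "Declaration of Independence" and "United States" Classes
nucleus = [(-1.0, 0.0, 0.9), (-0.8, 0.2, 0.7)]

results = objective_search(facts, nucleus, radius=0.4)
```

A plain mean treats every Class as equally important; weighting by each Class's density would be a natural variation.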

Subjective (Borrowed Eyes)
Places an observer at specific coordinates (e.g., "18th-century Parisian") to discover:
- Near HFs: Immediate lived reality
- Context Classes: The "room" containing the observer
- Peripheral signals: Emerging relevance just beyond reach

# Subjective search from an 18th-century Paris viewpoint
import numpy as np

# Place the observer at the midpoint of the two anchor classes
center = np.mean([ks.classes["time_18C"].vec, ks.classes["place_paris"].vec], axis=0)
view = ks.subjective_search(observer_pos=center, radius=0.6)
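
The three outputs of a subjective search can be pictured as distance bands around the observer. The following standalone sketch is an assumption about how such bucketing might work, not the prototype's `subjective_search`: the `margin` parameter that sets the width of the peripheral band, the per-Class reach values, and the fact coordinates are all invented for illustration.

```python
import math


def subjective_view(observer, facts, classes, radius, margin=0.3):
    """Bucket facts and Classes by their relation to an observer position."""
    near, peripheral = [], []
    for fid, vec in facts.items():
        d = math.dist(observer, vec)
        if d <= radius:
            near.append(fid)            # immediate lived reality
        elif d <= radius + margin:
            peripheral.append(fid)      # just beyond reach
    # Context Classes: regions whose reach covers the observer
    context = [cid for cid, (center, reach) in classes.items()
               if math.dist(observer, center) <= reach]
    return {"near": near, "context": context, "peripheral": peripheral}


observer = (-0.9, 0.3, 0.05)  # midpoint of the two anchors used above
facts = {
    "hf_salon": (-0.8, 0.4, 0.2),
    "hf_versailles": (-0.7, 0.9, 0.3),
    "hf_tokyo": (0.9, -0.8, 0.0),
}
classes = {
    "time_18C": ((-1.0, 0.0, 0.0), 0.5),
    "place_paris": ((-0.8, 0.6, 0.1), 0.5),
    "time_21C": ((1.0, 0.0, 0.0), 0.5),
}

view = subjective_view(observer, facts, classes, radius=0.6)
```

With these numbers the observer sits inside both anchor Classes, sees one nearby fact, and has one fact in the peripheral band.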

Why Engineers Should Pay Attention

KAG addresses critical RAG limitations:

| Traditional RAG           | KAG Approach                       |
|---------------------------|------------------------------------|
| Lexical/vector similarity | Geometric positioning              |
| Opaque relevance scoring  | Legible spatial relationships      |
| Context as afterthought   | Containment as first-class context |
| Single perspective        | Objective + subjective modes       |

Early benchmarks suggest 22% better context accuracy for historical queries, though the prototype emphasizes conceptual clarity over optimization.

The Roadmap Ahead

KAG is currently a research artifact, but its potential integration points with production systems are compelling:

  • Hybrid pipelines blending vector search with geometric constraints
  • LLM-assisted HF extraction from raw documents
  • Visual knowledge navigation interfaces
  • Confidence-aware position weighting
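
As a rough idea of what the first roadmap item could look like, the sketch below filters documents by geometric containment and then ranks the survivors by embedding similarity. Everything here is hypothetical: the `hybrid_search` function, the document layout with separate `emb` and `pos` fields, and the toy vectors are assumptions, not part of KAG.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def hybrid_search(query_emb, docs, region_center, region_radius, top_k=2):
    """Keep docs positioned inside the region, then rank by embedding similarity."""
    inside = [d for d in docs
              if math.dist(d["pos"], region_center) <= region_radius]
    ranked = sorted(inside, key=lambda d: cosine(query_emb, d["emb"]), reverse=True)
    return [d["id"] for d in ranked[:top_k]]


docs = [
    {"id": "a", "emb": (1.0, 0.0), "pos": (-0.9, 0.3, 0.0)},
    {"id": "b", "emb": (0.9, 0.1), "pos": (0.9, -0.5, 0.0)},  # similar text, wrong context
    {"id": "c", "emb": (0.5, 0.5), "pos": (-0.8, 0.5, 0.1)},
]

hits = hybrid_search((1.0, 0.0), docs,
                     region_center=(-0.9, 0.3, 0.05), region_radius=0.6)
```

Document "b" scores well on similarity alone but is dropped because its position falls outside the requested region, which is the kind of constraint plain vector search cannot express.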

The MIT-licensed prototype invites collaboration and already demonstrates how geometric intuition could reshape knowledge retrieval:

"Distance is relationship; density is influence" – KAG's core axiom

As RAG systems evolve beyond keyword matching, KAG's geometry-first approach offers a provocative lens for rethinking how machines – and humans – navigate the universe of facts.

Source: KAG GitHub Repository