Building Intelligent Agentic Workflows in Ruby: Routing, RAG, and Multi-Agent Coordination

Agentic workflows represent the next evolution in AI application design: moving beyond a single model to coordinated systems of specialized agents. For Ruby developers, new patterns are emerging to orchestrate these workflows efficiently. Here’s how to implement them using the RubyLLM toolkit.

Dynamic Model Routing: Match Tasks to Specialized AI

Different AI models excel at distinct tasks. A model router dynamically analyzes requests and delegates to optimal models:

class ModelRouter < RubyLLM::Tool
  description "Routes requests to the optimal model"
  param :query, desc: "The user's request"

  def execute(query:)
    task_type = classify_task(query)

    case task_type
    when :code
      RubyLLM.chat(model: 'claude-3-5-sonnet').ask(query).content
    when :creative
      RubyLLM.chat(model: 'gpt-4o').ask(query).content
    when :factual
      RubyLLM.chat(model: 'gemini-1.5-pro').ask(query).content
    else
      RubyLLM.chat.ask(query).content
    end
  end

  private

  def classify_task(query)
    classifier = RubyLLM.chat(model: 'gpt-4o-mini')
                     .with_instructions("Classify: code, creative, or factual. One word only.")
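    # Keep only the first run of letters so replies like "Code." or "creative\n" still match
    classifier.ask(query).content.to_s.downcase[/[a-z]+/]&.to_sym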
  end
end

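This pattern sends code requests to Claude 3.5 Sonnet, creative tasks to GPT-4o, and factual queries to Gemini 1.5 Pro, and uses the smaller GPT-4o mini only for classification. Anything the classifier cannot label falls through to the default model. Routing of this kind can improve answer quality and help control cost, at the price of one extra classification call per request.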
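A classifier model will not always answer with exactly one clean word, so it can help to normalize its reply before routing. The following is a minimal sketch in plain Ruby, not part of RubyLLM: the names ROUTES, normalize_label, and model_for are hypothetical, and the model identifiers are simply the ones used in the router above.

```ruby
# Hypothetical routing table: known task labels mapped to the models used above.
ROUTES = {
  code: "claude-3-5-sonnet",
  creative: "gpt-4o",
  factual: "gemini-1.5-pro"
}.freeze

# Hypothetical helper: map a free-form classifier reply to a known label,
# or :general when no known label appears in the reply.
def normalize_label(raw)
  text = raw.to_s.downcase
  ROUTES.keys.find { |label| text.match?(/\b#{label}\b/) } || :general
end

# Returns the model name for a reply, or nil to signal "use the default model".
def model_for(raw_label)
  ROUTES[normalize_label(raw_label)]
end
```

With this in place, a reply such as "Code." still routes to the code model, while an unrecognized reply yields nil and the caller can fall back to its default.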

Production RAG with PostgreSQL & pgvector

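Retrieval Augmented Generation (RAG) gains enterprise-grade durability when paired with PostgreSQL. The vector column type, has_neighbors, and nearest_neighbors used below come from the neighbor gem, and the pgvector extension must be available in the database: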

Database Setup:

# Migration for vector storage
class CreateDocuments < ActiveRecord::Migration[7.1]
  def change
    enable_extension "vector" # pgvector must be installed on the database server

    create_table :documents do |t|
      t.text :content
      t.string :title
      t.vector :embedding, limit: 1536 # OpenAI embedding size
      t.timestamps
    end

    add_index :documents, :embedding, using: :hnsw, opclass: :vector_l2_ops
  end
end

Embedding Generation:

class Document < ApplicationRecord
  has_neighbors :embedding

  before_save :generate_embedding, if: :content_changed?

  private

  def generate_embedding
    response = RubyLLM.embed(content)
    self.embedding = response.vectors
  end
end

Query Execution:

class DocumentSearch < RubyLLM::Tool
  # ...
  def execute(query:)
    embedding = RubyLLM.embed(query).vectors
    documents = Document.nearest_neighbors(:embedding, embedding, distance: "euclidean").limit(3)
    documents.map { |doc| "#{doc.title}: #{doc.content.truncate(500)}" }.join("\n\n---\n\n")
  end
end

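pgvector's HNSW index performs approximate nearest-neighbor search, so lookups avoid scanning every row and stay fast as the table grows; the trade-off is that results are approximate rather than exact. The index above uses vector_l2_ops, which matches the Euclidean distance requested in the query.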
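To make concrete what the nearest-neighbor query computes, here is a small brute-force version in plain Ruby. It is an illustrative sketch only: it scans every row exactly, which is the work an HNSW index approximates, and the helper names and the three-dimensional sample vectors are made up for the example.

```ruby
# Euclidean (L2) distance between two equal-length vectors.
def euclidean_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Exact k-nearest-neighbor search: score every row, keep the k closest.
def nearest(rows, query, k: 3)
  rows.min_by(k) { |row| euclidean_distance(row[:embedding], query) }
end

# Toy data; real embeddings would have 1536 dimensions as in the migration.
docs = [
  { title: "Fibers",   embedding: [0.9, 0.1, 0.0] },
  { title: "Ractors",  embedding: [0.8, 0.3, 0.1] },
  { title: "Pastries", embedding: [0.0, 0.2, 0.9] }
]

top = nearest(docs, [1.0, 0.0, 0.0], k: 2).map { |doc| doc[:title] }
```

The cost of this scan grows linearly with the number of rows, which is why an index matters once the table is large.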

Multi-Agent Orchestration

Specialized Agent Teams:

# Researcher Agent
class ResearchAgent < RubyLLM::Tool
  description "Researches a topic and returns key facts"
  param :topic, desc: "The topic to research"

  def execute(topic:)
    RubyLLM.chat(model: 'gemini-1.5-pro').ask("Research #{topic}. List key facts.").content
  end
end

# Writer Agent
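class WriterAgent < RubyLLM::Tool
  description "Writes an article from research notes"
  param :research, desc: "Key facts to base the article on"

  def execute(research:)
    RubyLLM.chat(model: 'claude-3-5-sonnet').ask("Write an article:\n#{research}").content
  end
end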

# Coordinator
coordinator = RubyLLM.chat.with_tools(ResearchAgent, WriterAgent)
article = coordinator.ask("Create an article about Ruby 3.3 features")

Parallel Execution with Async:

require 'async'

class ParallelAnalyzer
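  def analyze(text)
    Async do |task|
      # Start all three analyses; each subtask runs concurrently
      jobs = {
        sentiment: task.async { analyze_sentiment(text) },
        summary: task.async { generate_summary(text) },
        keywords: task.async { extract_keywords(text) }
      }
      jobs.transform_values(&:wait) # Block until every subtask has finished
    end.wait # Return the results hash rather than the Async task
  end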
end
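
The same fan-out and fan-in shape can be tried without any gems. The sketch below deliberately swaps Async for plain Ruby threads so that it runs on the standard library alone; the three analyzers are trivial stand-ins (hypothetical) for what would be LLM calls in practice.

```ruby
# Stand-in analyzers (hypothetical); real ones would call an LLM.
ANALYZERS = {
  sentiment: ->(text) { text.include?("love") ? "positive" : "neutral" },
  summary:   ->(text) { text.split(".").first.to_s.strip },
  keywords:  ->(text) { text.downcase.scan(/[a-z]{5,}/).uniq.first(3) }
}.freeze

def analyze_in_parallel(text, analyzers = ANALYZERS)
  # Fan out: one thread per analyzer.
  threads = analyzers.transform_values do |analyzer|
    Thread.new { analyzer.call(text) }
  end

  # Fan in: Thread#value joins the thread and returns its block's result,
  # re-raising any exception the thread hit.
  threads.transform_values(&:value)
end

report = analyze_in_parallel("I love Ruby fibers. They keep concurrency readable.")
```

Because each result is collected from its own thread rather than written into a shared hash, no locking is needed around the results.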

Supervisor Pattern for Synthesis:

class CodeReviewSystem
  def review_code(code)
    Async do |task|
      # Run the three audits concurrently as subtasks
      audits = {
        security: task.async { audit_security(code) },
        performance: task.async { audit_performance(code) },
        style: task.async { audit_style(code) }
      }
      # Synchronize on the subtasks; a task cannot call wait on itself
      reviews = audits.transform_values(&:wait)
      synthesize_reviews(reviews) # Final summarization
    end.wait # Return the synthesized review rather than the Async task
  end
end
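
The synthesis step is not shown above. One plausible shape for it, sketched here under the assumption that each audit returns plain text, is to fold the individual reviews into a single prompt for a supervising model. The helper build_synthesis_prompt is hypothetical and the wording of the prompt is only an example.

```ruby
# Hypothetical helper: merge per-reviewer findings into one supervisor prompt.
def build_synthesis_prompt(reviews)
  # One labelled section per review area, e.g. "SECURITY:\n<findings>"
  sections = reviews.map do |area, findings|
    "#{area.to_s.upcase}:\n#{findings.to_s.strip}"
  end

  [
    "Merge the reviews below into one prioritized list of changes.",
    "Resolve contradictions and drop duplicates.",
    *sections
  ].join("\n\n")
end

prompt = build_synthesis_prompt(
  security: "SQL built by string interpolation in the query builder.\n",
  style: "Method exceeds 40 lines."
)
```

The resulting string could then be passed to a chat model's ask call, keeping the supervisor's input explicit and easy to log.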

These patterns enable concurrent execution while maintaining coordination, which matters for complex workflows like real-time code analysis or content generation pipelines.

Why This Matters

Agentic workflows transform AI from a reactive tool into proactive systems. By combining:
1. Intelligent routing to match tasks with specialized models
2. RAG with durable storage for knowledge-intensive tasks
3. Parallel agent orchestration for complex operations

Ruby developers can build AI systems that can outperform monolithic approaches in accuracy, efficiency, and scalability. The patterns shown here provide a blueprint.

Source: RubyLLM Agentic Workflows Guide

#AgenticWorkflows #RubyLLM #RAG