The most advanced RAG system you can run
on a single GPU.
A real agentic query: tool selection, CRAG evaluation, ColBERT reranking, token streaming, and claim verification — all visible in real time.
Click Run Demo to see an agentic query in action
Each technique solves a specific retrieval failure. Together they form the most comprehensive RAG pipeline assembled for a single-GPU system.
LangGraph ReAct loop — the LLM autonomously decides which retrieval tools to invoke, iterating until it has sufficient evidence to answer.
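A minimal skeleton of that decide-act-observe cycle (hand-rolled here for illustration; the real system uses LangGraph, and `llm_step` stands in for the model's tool-selection call):

```python
# ReAct-style loop: the LLM either picks a tool or answers; each tool
# result is appended to the evidence the next decision sees.
def react_loop(question, llm_step, tools, max_steps=5):
    evidence = []
    for _ in range(max_steps):
        action = llm_step(question, evidence)  # decision made by the LLM
        if action["type"] == "answer":
            return action["text"]
        result = tools[action["tool"]](action["input"])
        evidence.append(result)
    return "insufficient evidence"
```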
Anthropic's contextual retrieval: an LLM prepends chunk-specific context to each chunk before embedding. 49% fewer retrieval failures, 67% fewer with reranking.
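The idea fits in a few lines; this sketch assumes hypothetical `llm` and `embed` callables rather than Forge V5's actual API:

```python
# Contextual retrieval: ask an LLM for a one-sentence situating context,
# prepend it, and embed the combined text instead of the raw chunk.
def contextualize_chunk(doc_title: str, chunk: str, llm) -> str:
    prompt = (
        f"Document: {doc_title}\n\nChunk: {chunk}\n\n"
        "Give a one-sentence context situating this chunk in the document."
    )
    context = llm(prompt)
    return f"{context}\n\n{chunk}"

# The contextualized text, not the raw chunk, is what gets embedded:
# vector = embed(contextualize_chunk(title, chunk, llm))
```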
Cross-encoder evaluates every retrieved document as CORRECT, AMBIGUOUS, or INCORRECT before it reaches generation. Re-retrieves on failure.
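The gating logic looks roughly like this; the scorer is assumed to return a relevance score in [0, 1], and the thresholds are illustrative, not the system's actual values:

```python
# CRAG-style grading: bucket each retrieved doc, drop INCORRECT ones,
# and signal re-retrieval when nothing graded CORRECT.
def crag_grade(score: float, hi: float = 0.7, lo: float = 0.3) -> str:
    if score >= hi:
        return "CORRECT"
    if score <= lo:
        return "INCORRECT"
    return "AMBIGUOUS"

def filter_docs(docs, scorer, query):
    graded = [(d, crag_grade(scorer(query, d))) for d in docs]
    kept = [d for d, g in graded if g != "INCORRECT"]
    needs_retry = not any(g == "CORRECT" for _, g in graded)
    return kept, needs_retry  # retry triggers re-retrieval upstream
```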
Token-level MaxSim scoring catches specific facts that dense vectors miss.
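MaxSim itself is compact — for each query token, take its best-matching document token, then sum those maxima (assuming rows are L2-normalized so dot products are cosine similarities):

```python
import numpy as np

# ColBERT-style late interaction: token-level similarity matrix,
# row-wise max (best doc token per query token), then sum.
def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    sims = query_vecs @ doc_vecs.T        # (n_query, n_doc) similarities
    return float(sims.max(axis=1).sum())  # best match per query token
```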
Atomic factual claims indexed as standalone searchable units for precision retrieval.
L0 doc summaries → L1 sections → L2 semantic chunks → L3 propositions.
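A toy version of that four-level tree (names are this sketch's, not the system's internals):

```python
from dataclasses import dataclass, field

# Each node records its hierarchy level and links both ways,
# so retrieval can climb from a proposition back to its document.
@dataclass
class Node:
    level: int  # 0=doc summary, 1=section, 2=semantic chunk, 3=proposition
    text: str
    parent: "Node | None" = None
    children: list = field(default_factory=list)

def add_child(parent: Node, level: int, text: str) -> Node:
    child = Node(level, text, parent)
    parent.children.append(child)
    return child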
One model, three vector types: dense + sparse + ColBERT multi-vector.
Entity relationships for structural queries via graph traversal.
Post-generation claim-by-claim audit against source documents.
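The audit loop reduces to: split the answer into claims, then check each against the sources. Here `entails` stands in for an NLI-style judge that returns True when a source supports a claim:

```python
# Post-generation verification: a claim passes if any source entails it.
def verify_answer(claims, sources, entails):
    report = {}
    for claim in claims:
        report[claim] = any(entails(src, claim) for src in sources)
    return report  # claim -> supported?
```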
Multi-dimensional reliability scoring across retrieval, CRAG, and verification.
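One way such a composite score can be formed is a weighted combination of the per-stage signals; the dimensions and weights below are illustrative only:

```python
# Collapse per-stage signals (each in [0, 1]) into one reliability score.
def reliability(retrieval: float, crag: float, verification: float,
                weights=(0.3, 0.3, 0.4)) -> float:
    scores = (retrieval, crag, verification)
    return sum(w * s for w, s in zip(weights, scores))
```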
Complex questions split into targeted sub-queries for parallel retrieval.
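A sketch of the fan-out, assuming a hypothetical LLM-backed `decompose` and a per-sub-query `retrieve` callable:

```python
from concurrent.futures import ThreadPoolExecutor

# Decompose the question, retrieve each sub-query in parallel,
# then merge results in sub-query order without duplicates.
def retrieve_decomposed(question, decompose, retrieve, max_workers=4):
    sub_queries = decompose(question)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(retrieve, sub_queries))
    seen, merged = set(), []
    for docs in results:
        for d in docs:
            if d not in seen:
                seen.add(d)
                merged.append(d)
    return merged
```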
Generate a hypothetical ideal answer, embed it, search for real matches.
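That is HyDE in three steps; `llm` and `embed` below are assumed callables, not Forge V5's API, and document rows are assumed normalized:

```python
import numpy as np

# HyDE: rank real documents against the embedding of a hypothetical
# answer instead of the (often under-specified) query itself.
def hyde_search(query, llm, embed, doc_vecs, top_k=3):
    hypothetical = llm(f"Write a short passage answering: {query}")
    q_vec = embed(hypothetical)
    sims = doc_vecs @ q_vec            # cosine if rows are normalized
    return np.argsort(-sims)[:top_k]   # indices of best-matching docs
```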
Follow cross-references iteratively across document boundaries.
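The hop loop can be sketched as a bounded breadth-first expansion; `links` here is an illustrative mapping from a document id to the ids it references:

```python
# Follow cross-references hop by hop until nothing new turns up
# or the hop budget runs out.
def follow_refs(seed_ids, links, max_hops=3):
    seen = set(seed_ids)
    frontier = list(seed_ids)
    for _ in range(max_hops):
        nxt = [t for d in frontier for t in links.get(d, []) if t not in seen]
        if not nxt:
            break
        seen.update(nxt)
        frontier = nxt
    return seen
```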
Match a chunk, return the parent section for full context fidelity.
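This small-to-big pattern is a lookup on top of chunk search; the mapping shown is illustrative:

```python
# Match fine-grained chunks, then swap each hit for its parent section
# so generation sees full surrounding context.
def parent_retrieve(query, search_chunks, chunk_to_parent, sections):
    hits = search_chunks(query)                      # matched chunk ids
    parent_ids = {chunk_to_parent[c] for c in hits}  # dedupe shared parents
    return [sections[p] for p in sorted(parent_ids)]
```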
14 techniques. One GPU. No open-source system combines all of these.
Built by hollowed_eyes · Forge V5 is a portfolio project demonstrating world-class RAG engineering.