Context Pool
Open source · Self-hosted · No vector DB

Document Q&A without
embeddings or guesswork

Context Pool exhaustively scans every chunk of every document, pools positive hits, and synthesizes a final answer with verbatim citations — all running on your own infrastructure.

Get started → · See how it's different
terminal
# 3 commands to get started
$ git clone https://github.com/steve958/Context-Pool.git
$ cp config.example.yaml config/config.yaml
$ docker-compose -f docker-compose.hub.yml up
✓ backend ready http://localhost:8000
✓ frontend ready http://localhost:3000
4 LLM providers · 8 file formats · 0 vector DBs needed · 100% self-hosted
🔒Self-hosted — your data never leaves your infra
No vector DB required
📄MIT Licensed — open source forever
The problem with prefiltering

Why not vector RAG?

Vector RAG prefilters chunks by similarity score before your LLM ever sees them. If the relevant passage scores low, it's silently dropped. Context Pool never prefilters — it reads every chunk.

Scenario: Legal contract — buried indemnification clause
Standard Vector RAG
Query
"Does this contract limit our liability for data breaches?"
Similarity prefilter
Embedding model scores all 47 chunks. Top-5 retrieved by cosine similarity.
📄Retrieved chunks (top-5)
0.91 · §3.1 — Services overview and delivery timeline
0.88 · §7.2 — Payment terms and invoice schedule
0.85 · §12.4 — Governing law and jurisdiction
0.82 · §2.1 — Scope of engagement and deliverables
0.79 · §9.1 — Confidentiality obligations
⚠️ Missed (score: 0.41)
§18.3 — Liability cap: In no event shall either party be liable for indirect, incidental, or consequential damages arising from data loss or security breaches, including but not limited to…
Wrong — key clause missed
Based on the retrieved sections, the contract does not appear to contain explicit liability limits for data breaches.
Context Pool
Query
"Does this contract limit our liability for data breaches?"
🔍Exhaustive scan
Reads all 47 chunks sequentially. No prefiltering. No chunk is skipped.
Positive hit found
§18.3 — Liability cap: In no event shall either party be liable for indirect, incidental, or consequential damages arising from data loss or security breaches, including but not limited to…
🗂️Pooled with 2 other hits
§18.3, §19.1 (force majeure carve-out), §21.2 (mutual indemnification) — synthesized together.
Correct — with verbatim citation
Yes. §18.3 explicitly caps liability for data breach damages. Confirmed by cross-reference in §19.1 and §21.2.
§18.3 — “In no event shall either party be liable for indirect, incidental, or consequential damages arising from data loss or security breaches…”
💡
The tradeoff is deliberate
Context Pool is slower than vector RAG because it reads every chunk. In domains where missing a single passage is unacceptable — legal, compliance, finance, medical — that slowness is the point. You get exhaustive recall, not probabilistic retrieval.
Reproducible results

Benchmarks

We measured Context Pool against vector RAG baselines on a synthetic legal contract dataset. The results confirm what the architecture predicts: exhaustive scanning finds answers that similarity prefiltering misses.

📊Recall Benchmark Results
Method | Recall | Chunks examined | Est. tokens
Context Pool (exhaustive) | 100% | 19 / 19 | ~116K
Vector RAG (top-5) | 70% | 5 / 19 | ~10K
100% Recall
Context Pool examines every chunk. By design, it cannot miss an answer that exists in the document.
Prefiltering Risk
Vector RAG missed 3 of 10 answers due to keyword mismatches and similarity scoring thresholds.
The Tradeoff
Speed vs. certainty. Vector RAG is faster and cheaper. Context Pool is exhaustive.
Run the benchmark yourself on your own documents.
View Full Report
What's New

New in Context Pool

v1.3.0 · March 2026

Stay up to date with the latest features and improvements. Every release makes document analysis more powerful.

💾

Query History & Persistence

Every query you run is now automatically saved to disk. Review past questions, compare results over time, and re-run with a single click.

  • Automatic persistence with gzip compression (~80% savings)
  • Browse complete query history per workspace
  • Re-run any historical query against current documents
  • Full detail view with citations and token usage
Architecture

How Context Pool works

Four deterministic phases. No semantic shortcuts. Every document, every chunk, every time.

STEP 01

Parse

Each file is converted to clean Markdown — PDF text layers, DOCX headings, HTML content, EML bodies and attachments, or OCR for scanned images.

PyMuPDF · python-docx · BeautifulSoup · OCR.space
STEP 02

Chunk

Markdown is split into token-bounded segments that respect heading boundaries and page markers. Chunk size is fully configurable.

Heading-aware · Token-windowed · Page-marker preserved
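The chunking step can be sketched in a few lines. This is a minimal illustration only: it uses whitespace word counts as a stand-in for real token counting, and the function name is hypothetical, not Context Pool's actual API:

```python
# Minimal sketch of heading-aware, token-bounded chunking.
# Whitespace split stands in for a real tokenizer; names are illustrative.

def chunk_markdown(markdown: str, max_tokens: int = 200) -> list[str]:
    """Split Markdown into token-bounded chunks, starting a fresh
    chunk at every heading so heading boundaries are respected."""
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for line in markdown.splitlines():
        tokens = len(line.split())          # crude token estimate
        starts_section = line.startswith("#")
        if current and (starts_section or count + tokens > max_tokens):
            chunks.append("\n".join(current))  # flush the finished chunk
            current, count = [], 0
        current.append(line)
        count += tokens
    if current:
        chunks.append("\n".join(current))
    return chunks

doc = "# Intro\nSome text here.\n# Liability\nThe cap applies."
for c in chunk_markdown(doc, max_tokens=50):
    print("---")
    print(c)
```

The real implementation additionally preserves page markers and supports overlap; the core loop shape is the same.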
STEP 03

Scan

Every chunk is sent to the LLM with a strict extractive prompt. Positive hits are pooled; empty chunks are discarded. No skipping, no shortcuts.

{"has_answer": true, "evidence_quotes": ["..."]}
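The scan loop can be sketched as follows. `ask_llm` here is a stub standing in for a real provider call, and the keyword check inside it only simulates a model's judgment; the JSON shape matches the extractive prompt output above:

```python
# Sketch of the exhaustive per-chunk scan. ask_llm is a stubbed
# stand-in for a real provider call; names are illustrative.
import json

def ask_llm(question: str, chunk: str) -> str:
    # Stub: pretend the model finds evidence only in chunks that
    # mention liability. A real deployment calls OpenAI/Anthropic/etc.
    hit = "liability" in chunk.lower()
    quotes = ["In no event shall either party be liable..."] if hit else []
    return json.dumps({"has_answer": hit, "evidence_quotes": quotes})

def scan(chunks: list[str], question: str) -> list[dict]:
    """Send EVERY chunk to the LLM; pool positive hits, drop the rest."""
    pool = []
    for i, chunk in enumerate(chunks):      # no prefiltering, no skipping
        reply = json.loads(ask_llm(question, chunk))
        if reply["has_answer"]:
            pool.append({"chunk_index": i, "quotes": reply["evidence_quotes"]})
    return pool

chunks = [
    "§7.2 Payment terms and invoice schedule...",
    "§18.3 Liability cap: In no event shall either party be liable...",
    "§12.4 Governing law and jurisdiction...",
]
hits = scan(chunks, "Does this contract limit our liability?")
print(hits)
```

Note that the loop touches every index: recall is a property of control flow, not of a similarity threshold.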
STEP 04

Synthesize

All pooled hits are sent to the LLM in a single synthesis call. The result is a final answer with full citations: document, page, heading, verbatim quote.

{"final_answer": "...", "citations": [...]}
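The synthesis step can be sketched the same way: all pooled evidence goes into one prompt, and the answer comes back with structured citations. The LLM call is stubbed below and the field names on the hit dicts are assumptions for illustration; only the output shape mirrors the documented format:

```python
# Sketch of the single synthesis call over the pooled evidence.
# The provider call is stubbed; hit field names are illustrative.
import json

def synthesize(question: str, pooled_hits: list[dict]) -> dict:
    """Build one prompt from all pooled quotes, return answer + citations."""
    evidence = "\n".join(
        f'[{h["doc"]} p.{h["page"]} {h["heading"]}] "{h["quote"]}"'
        for h in pooled_hits
    )
    prompt = f"Question: {question}\nEvidence:\n{evidence}"
    # Stubbed response in place of a real provider call:
    return {
        "final_answer": f"Answer synthesized from {len(pooled_hits)} quote(s).",
        "citations": [
            {"document": h["doc"], "page": h["page"],
             "heading": h["heading"], "quote": h["quote"]}
            for h in pooled_hits
        ],
    }

hits = [{"doc": "msa.pdf", "page": 18, "heading": "§18.3 Liability cap",
         "quote": "In no event shall either party be liable..."}]
result = synthesize("Does this contract limit our liability?", hits)
print(json.dumps(result, indent=2))
```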
Deployment Flow
📄
Documents
PDF
DOCX
EML
HTML
Images
⚙️
Parse
Text extraction
OCR
Normalization
🧩
Chunk
Heading-aware split
Token windows
🔍
Scan
LLM per chunk
Hit detection
Pool building
📝
Synthesize
Evidence pooling
Cited answer
💡The key difference: Every chunk is checked individually. No semantic prefiltering. The LLM sees every segment of the document before synthesizing the final answer.
🔍
Exhaustive by design
Unlike vector-search RAG, Context Pool never prefilters chunks. Every segment of every document is evaluated against your question. If the answer exists somewhere in your documents, Context Pool will find it — even when the vocabulary in the question differs from the document.
Capabilities

Everything you need

Batteries included. From OCR to citations to a production-ready Docker setup.

🔎
Exhaustive scanning
Every chunk of every document is evaluated. No prefiltering, no semantic shortcuts, no missed passages.
📌
Verbatim citations
Every claim is backed by an exact quote from the source, with document name, page number, and heading path.
🏠
Fully self-hosted
Run on your own machine or server. Documents stay in your Docker volume. Your infrastructure, your data.
🔌
4 LLM providers
OpenAI, Anthropic, Google Gemini, and Ollama. Switch without changing code — just update config.yaml.
📄
8 file formats
PDF (text + scanned), DOCX, TXT, Markdown, HTML, EML (with attachments), PNG, and JPEG.
👁
OCR built in
Scanned PDFs and images are processed via OCR.space. Toggle per query — no permanent setup needed.
📧
Email-aware parsing
.eml files are parsed intelligently: body, attachments, or both — individually chunked and cited.
Real-time progress
WebSocket events stream chunk-by-chunk progress to the UI as the scan runs. No polling required.
🧩
REST + WebSocket API
Every feature is available programmatically. The UI is just a client. Build your own integration.
🗂
Workspaces
Organize documents into named workspaces. Query a single document or the entire workspace at once.
🎛
Configurable chunking
Control chunk size, overlap strategy, and token limits. Tune the accuracy vs. cost trade-off for your use case.
🔐
Production security
API key auth middleware, CORS env config, non-root Docker user, file MIME validation, and input bounds checking.
LLM Providers

Your model, your choice

Switch providers by changing one line in config.yaml. No code changes needed.

OpenAI · Recommended
gpt-4o · gpt-4o-mini · gpt-4-turbo
provider: openai
api_key: "ENV:OPENAI_API_KEY"
model: "gpt-4o-mini"
context_window_tokens: 128000
max_chunk_tokens: 24000
💡 gpt-4o-mini is the best cost/quality starting point.
Anthropic · Best reasoning
claude-3-5-sonnet · claude-3-5-haiku · claude-3-opus
provider: anthropic
api_key: "ENV:ANTHROPIC_API_KEY"
model: "claude-3-5-haiku-20241022"
context_window_tokens: 200000
max_chunk_tokens: 32000
💡 200K context window means fewer, larger chunks.
Google Gemini · Largest context
gemini-2.0-flash · gemini-1.5-pro · gemini-1.5-flash
provider: google
api_key: "ENV:GOOGLE_API_KEY"
model: "gemini-2.0-flash"
context_window_tokens: 1000000
max_chunk_tokens: 48000
💡 1M context window. Very large chunk sizes possible.
Ollama · 100% offline
llama3.2 · mistral · phi3 · deepseek-r1
provider: ollama
api_key: ""
model: "llama3.2"
context_window_tokens: 8192
max_chunk_tokens: 3000
ollama_base_url: "http://host.docker.internal:11434"
💡 Nothing leaves your machine. Requires Ollama running locally.
Installation

Up and running in minutes

Docker Compose is the fastest path. Local dev and API-only modes also supported.

1Clone the repo
git clone https://github.com/steve958/Context-Pool.git
cd Context-Pool
2Create config
mkdir -p config
cp config.example.yaml config/config.yaml
# Edit config/config.yaml — set provider + model
3Set your API key
# Create .env at the project root
echo "OPENAI_API_KEY=sk-proj-..." > .env

# Optional: enable API authentication
echo "API_KEY=your-secret-here" >> .env
4Start (pulls pre-built images — no build needed)
docker-compose -f docker-compose.hub.yml up

# UI  → http://localhost:3000
# API → http://localhost:8000/docs
REST API

First-class programmatic access

Every feature available in the UI is accessible via REST API and WebSocket. Build your own integrations.

WS /ws/query/{run_id}
Real-time events: chunk_progress · synthesis_started · synthesis_finished · error
Request
{
  "name": "Q3 Contracts"
}
Response
{
  "ws_id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Q3 Contracts",
  "document_count": 0
}
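As a sketch, creating a workspace from Python might look like the following. The `/workspaces` path and the `X-API-Key` header name are assumptions inferred from the request/response shapes above; check the live docs at `http://localhost:8000/docs` for the actual routes:

```python
# Sketch of calling a workspace-creation endpoint from Python.
# The "/workspaces" route and "X-API-Key" header are assumed,
# not confirmed; verify against http://localhost:8000/docs.
import json
import urllib.request

BASE = "http://localhost:8000"

def build_create_workspace(name: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a new workspace."""
    body = json.dumps({"name": name}).encode()
    return urllib.request.Request(
        f"{BASE}/workspaces",                  # assumed route
        data=body,
        headers={"Content-Type": "application/json",
                 "X-API-Key": api_key},        # assumed header name
        method="POST",
    )

req = build_create_workspace("Q3 Contracts", "your-secret-here")
print(req.method, req.full_url)
# Send with urllib.request.urlopen(req) once the stack is running.
```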
Use cases

Built for high-stakes document work

Wherever missing a relevant passage is not an option, exhaustive scanning pays off.

⚖️Legal

Contract review

QUESTION
"What does each contract say about termination clauses and notice periods?"
RESULT
Found 7 relevant clauses across 12 contracts. Page and heading citations included.
🔬Research

Literature review

QUESTION
"Which papers discuss transformer attention mechanisms in the context of long documents?"
RESULT
Extracted relevant passages from 34 PDFs, cited by author, section, and page.
📊Finance

Due diligence

QUESTION
"Are there any contingent liabilities or pending litigation mentioned in the disclosure documents?"
RESULT
3 disclosures flagged. Verbatim evidence quotes with page references.
📧Discovery

Email archive search

QUESTION
"Find all emails that discuss the merger timeline and list the mentioned dates."
RESULT
Scanned 240 .eml files including attachments. 18 positive hits extracted.
🏥Healthcare

Clinical document review

QUESTION
"What contraindications are mentioned for Drug X across all patient records?"
RESULT
Scanned PDFs processed via OCR. 9 contraindications found across 15 records.
🛠Engineering

Technical spec analysis

QUESTION
"What are the stated load-bearing limits in each structural report?"
RESULT
Extracted 22 numeric values with units, pages, and table headings cited.
FAQ

Common questions