Context Pool
Open source · Self-hosted · No vector DB

Document Q&A without
embeddings or guesswork

Context Pool exhaustively scans every chunk of every document, pools positive hits, and synthesizes a final answer with verbatim citations — all running on your own infrastructure.

Get started · View on GitHub
# 3 commands to get started
$ git clone https://github.com/steve958/Context-Pool.git
$ cp config.example.yaml config/config.yaml
$ docker-compose -f docker-compose.hub.yml up
✓ backend ready http://localhost:8000
✓ frontend ready http://localhost:3000
4 LLM providers · 8 file formats · 0 vector DBs needed · 100% self-hosted
Architecture

How Context Pool works

Four deterministic phases. No semantic shortcuts. Every document, every chunk, every time.

STEP 01

Parse

Each file is converted to clean Markdown — PDF text layers, DOCX headings, HTML content, EML bodies and attachments, or OCR for scanned images.

PyMuPDF · python-docx · BeautifulSoup · OCR.space
STEP 02

Chunk

Markdown is split into token-bounded segments that respect heading boundaries and page markers. Chunk size is fully configurable.

Heading-aware · Token-windowed · Page-marker preserved
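The chunking step can be sketched in a few lines of Python. This is a minimal illustration only: a whitespace word count stands in for a real tokenizer, and the regex split is an assumption about how heading awareness might work, not Context Pool's actual splitter.

```python
import re

def chunk_markdown(md: str, max_tokens: int = 100) -> list[str]:
    """Split Markdown into token-bounded chunks that never cross a heading.

    A whitespace word count stands in for a real tokenizer here; the
    real limits are configurable via max_chunk_tokens.
    """
    # First split at headings, so every chunk stays inside one section.
    sections = re.split(r"(?m)^(?=#{1,6} )", md)
    chunks: list[str] = []
    for section in sections:
        words = section.split()
        if not words:
            continue
        # Then window each section to the token budget.
        for start in range(0, len(words), max_tokens):
            chunks.append(" ".join(words[start:start + max_tokens]))
    return chunks
```

A long section is windowed into several chunks, while short sections become a single chunk each, so no chunk ever spans two headings.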
STEP 03

Scan

Every chunk is sent to the LLM with a strict extractive prompt. Positive hits are pooled; empty chunks are discarded. No skipping, no shortcuts.

{"has_answer": true, "evidence_quotes": ["..."]}
STEP 04

Synthesize

All pooled hits are sent to the LLM in a single synthesis call. The result is a final answer with full citations: document, page, heading, verbatim quote.

{"final_answer": "...", "citations": [...]}
🔍
Exhaustive by design
Unlike vector-search RAG, Context Pool never prefilters chunks. Every segment of every document is evaluated against your question. If the answer exists somewhere in your documents, Context Pool will find it — even when the vocabulary in the question differs from the document.
Capabilities

Everything you need

Batteries included. From OCR to citations to a production-ready Docker setup.

🔎
Exhaustive scanning
Every chunk of every document is evaluated. No prefiltering, no semantic shortcuts, no missed passages.
📌
Verbatim citations
Every claim is backed by an exact quote from the source, with document name, page number, and heading path.
🏠
Fully self-hosted
Run on your own machine or server. Documents stay in your Docker volume. Your infrastructure, your data.
🔌
4 LLM providers
OpenAI, Anthropic, Google Gemini, and Ollama. Switch without changing code — just update config.yaml.
📄
8 file formats
PDF (text + scanned), DOCX, TXT, Markdown, HTML, EML (with attachments), PNG, and JPEG.
👁
OCR built in
Scanned PDFs and images are processed via OCR.space. Toggle per query — no permanent setup needed.
📧
Email-aware parsing
.eml files are parsed intelligently: body, attachments, or both — individually chunked and cited.
Real-time progress
WebSocket events stream chunk-by-chunk progress to the UI as the scan runs. No polling required.
🧩
REST + WebSocket API
Every feature is available programmatically. The UI is just a client. Build your own integration.
🗂
Workspaces
Organize documents into named workspaces. Query a single document or the entire workspace at once.
🎛
Configurable chunking
Control chunk size, overlap strategy, and token limits. Tune the accuracy vs. cost trade-off for your use case.
🔐
Production security
API key auth middleware, CORS env config, non-root Docker user, file MIME validation, and input bounds checking.
LLM Providers

Your model, your choice

Switch providers by changing one line in config.yaml. No code changes needed.

OpenAI · Recommended
gpt-4o · gpt-4o-mini · gpt-4-turbo
provider: openai
api_key: "ENV:OPENAI_API_KEY"
model: "gpt-4o-mini"
context_window_tokens: 128000
max_chunk_tokens: 24000
💡 gpt-4o-mini is the best cost/quality starting point.
Anthropic · Best reasoning
claude-3-5-sonnet · claude-3-5-haiku · claude-3-opus
provider: anthropic
api_key: "ENV:ANTHROPIC_API_KEY"
model: "claude-3-5-haiku-20241022"
context_window_tokens: 200000
max_chunk_tokens: 32000
💡 200K context window means fewer, larger chunks.
Google Gemini · Largest context
gemini-2.0-flash · gemini-1.5-pro · gemini-1.5-flash
provider: google
api_key: "ENV:GOOGLE_API_KEY"
model: "gemini-2.0-flash"
context_window_tokens: 1000000
max_chunk_tokens: 48000
💡 1M context window. Very large chunk sizes possible.
Ollama · 100% offline
llama3.2 · mistral · phi3 · deepseek-r1
provider: ollama
api_key: ""
model: "llama3.2"
context_window_tokens: 8192
max_chunk_tokens: 3000
ollama_base_url: "http://host.docker.internal:11434"
💡 Nothing leaves your machine. Requires Ollama running locally.
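Whichever provider you pick, max_chunk_tokens has to leave room in the context window for the extractive prompt and the model's JSON reply. A back-of-the-envelope check; the overhead figures below are illustrative assumptions, not Context Pool defaults:

```python
def safe_chunk_budget(context_window: int,
                      prompt_overhead: int = 1000,
                      response_reserve: int = 2000) -> int:
    """Rough upper bound for max_chunk_tokens: chunk + prompt + response
    must all fit inside the provider's context window. The overhead
    values are illustrative assumptions, not shipped defaults."""
    return context_window - prompt_overhead - response_reserve
```

The example configs above stay well under this bound, e.g. 24000 chunk tokens against a 128000-token OpenAI window, or 3000 against Ollama's 8192, leaving generous headroom for the prompt and reply.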
Installation

Up and running in minutes

Docker Compose is the fastest path. Local dev and API-only modes also supported.

1. Clone the repo
git clone https://github.com/steve958/Context-Pool.git
cd Context-Pool
2. Create config
mkdir -p config
cp config.example.yaml config/config.yaml
# Edit config/config.yaml — set provider + model
3. Set your API key
# Create .env at the project root
echo "OPENAI_API_KEY=sk-proj-..." > .env

# Optional: enable API authentication
echo "API_KEY=your-secret-here" >> .env
4. Start (pulls pre-built images — no build needed)
docker-compose -f docker-compose.hub.yml up

# UI  → http://localhost:3000
# API → http://localhost:8000/docs
REST API

First-class programmatic access

Every feature available in the UI is accessible via REST API and WebSocket. Build your own integrations.

WS /ws/query/{run_id}
Real-time events: chunk_progress · synthesis_started · synthesis_finished · error
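A client for the WebSocket stream only needs a small dispatcher over those event types. A sketch: the event names come from the list above, but the message envelope (a "type" field) and the payload fields ("done", "total") are assumptions for illustration.

```python
import json

def handle_event(raw: str, on_progress) -> bool:
    """Dispatch one message from WS /ws/query/{run_id}.

    Returns True while the run is still in flight. Only the event names
    (chunk_progress, synthesis_started, synthesis_finished, error) come
    from the docs; the "type"/"done"/"total" fields are assumed here.
    """
    event = json.loads(raw)
    kind = event["type"]
    if kind == "chunk_progress":
        on_progress(event.get("done"), event.get("total"))
        return True
    if kind in ("synthesis_finished", "error"):
        return False  # terminal events end the stream
    return True  # synthesis_started and anything else: keep listening
```

Wire this into any WebSocket client loop: read messages until `handle_event` returns False, then fetch the final result.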
Request
{
  "name": "Q3 Contracts"
}
Response
{
  "ws_id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Q3 Contracts",
  "document_count": 0
}
Use cases

Built for high-stakes document work

Wherever missing a relevant passage is not an option, exhaustive scanning pays off.

⚖️Legal

Contract review

QUESTION
"What does each contract say about termination clauses and notice periods?"
RESULT
Found 7 relevant clauses across 12 contracts. Page and heading citations included.
🔬Research

Literature review

QUESTION
"Which papers discuss transformer attention mechanisms in the context of long documents?"
RESULT
Extracted relevant passages from 34 PDFs, cited by author, section, and page.
📊Finance

Due diligence

QUESTION
"Are there any contingent liabilities or pending litigation mentioned in the disclosure documents?"
RESULT
3 disclosures flagged. Verbatim evidence quotes with page references.
📧Discovery

Email archive search

QUESTION
"Find all emails that discuss the merger timeline and list the mentioned dates."
RESULT
Scanned 240 .eml files including attachments. 18 positive hits extracted.
🏥Healthcare

Clinical document review

QUESTION
"What contraindications are mentioned for Drug X across all patient records?"
RESULT
Processed scanned PDFs via OCR. 9 contraindications found across 15 records.
🛠Engineering

Technical spec analysis

QUESTION
"What are the stated load-bearing limits in each structural report?"
RESULT
Extracted 22 numeric values with units, pages, and table headings cited.
FAQ

Common questions