Enterprise

Quantal AI

A RAG assistant over millions of documents with evaluation and guardrails.

Client
Quantal
Year
2025
Duration
4 months
Our role
AI Strategy · Engineering
Overview

The project

Quantal's teams were drowning in documents. They needed an AI assistant that could answer accurately from millions of internal files — safely and economically.

The challenge

Generic AI hallucinated and leaked context. Quantal needed grounded, cited answers, strict access control, and predictable cost at enterprise scale.

Our solution

A retrieval-augmented assistant with hybrid search, citations, an automated evaluation harness, guardrails, and per-team access control — deployed privately on Quantal's cloud.

Services

AI Solutions · Custom Software Development

Technologies

Python · LangChain · pgvector · Next.js · AWS · OpenAI
Technology

The stack behind it

Frontend

Next.js · React · TypeScript

Backend

Python · FastAPI

Database

PostgreSQL · pgvector

Cloud

AWS (private VPC)

DevOps

Docker · GitHub Actions

AI Integration

OpenAI · LangChain · evals

Third-party APIs

Internal document stores

Security

Per-team access · no public training

Architecture

RAG with hybrid search + guardrails

Case study

How it came together

01

Business problem

Teams couldn't find answers across millions of documents — and generic AI wasn't trustworthy.

02

Research

We profiled the document corpus, access rules and the questions teams actually asked.

03

Strategy

Ground every answer in retrieval, cite sources, and measure quality continuously.

04

Design process

A chat UI built around trust — inline citations and clear sourcing.

05

Development

Hybrid RAG, guardrails, an evaluation harness and per-team access control.

06

Testing

Automated evals on a labelled question set, plus red-teaming for safety.

07

Deployment

Private deployment on Quantal's AWS VPC with monitoring.

08

Results

99.2% eval pass rate and a sub-500 ms median response time across 4M+ indexed documents.

09

Lessons learned

Evaluation-as-code turned 'it feels right' into a measurable, improvable number.
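
As a rough illustration of what evaluation-as-code can look like, the sketch below scores an assistant against a small labelled question set and reports a pass rate. It is not Quantal's harness. The names EvalCase, Answer, passes, pass_rate and stub_assistant, the pass criteria (expected phrase plus expected citation), and the sample questions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One labelled question (hypothetical schema)."""
    question: str
    must_contain: str      # phrase a correct answer is expected to include
    expected_source: str   # document id a grounded answer is expected to cite


@dataclass
class Answer:
    text: str
    citations: list[str]


def passes(case: EvalCase, answer: Answer) -> bool:
    """A case passes only if the answer is both correct and cited."""
    correct = case.must_contain.lower() in answer.text.lower()
    grounded = case.expected_source in answer.citations
    return correct and grounded


def pass_rate(cases: list[EvalCase], assistant: Callable[[str], Answer]) -> float:
    """Fraction of labelled cases the assistant passes."""
    if not cases:
        raise ValueError("evaluation set is empty")
    passed = sum(passes(case, assistant(case.question)) for case in cases)
    return passed / len(cases)


def stub_assistant(question: str) -> Answer:
    """Stand-in for the real assistant so the sketch runs on its own."""
    canned = {
        "What is the refund window?": Answer(
            "Refunds are accepted within 30 days.", ["policy-7"]
        ),
        # Right wording but no citation, so this case should fail.
        "Who approves travel?": Answer("Your manager approves travel.", []),
    }
    return canned.get(question, Answer("I don't know.", []))


CASES = [
    EvalCase("What is the refund window?", "30 days", "policy-7"),
    EvalCase("Who approves travel?", "manager", "hr-12"),
]

RATE = pass_rate(CASES, stub_assistant)
```

With a number like RATE available on every change, a CI job (GitHub Actions in this stack) could fail the build whenever it drops below an agreed threshold. The threshold and the exact pass criteria would be project decisions.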

UI/UX highlights
  • A trustworthy chat UI with inline citations
  • Clear sourcing so users can verify every answer
  • Admin view for evaluation and cost monitoring
Architecture

Documents are chunked and embedded into pgvector. Hybrid (vector + keyword) retrieval feeds the model with cited context, guardrails and evals run on every response, and access is enforced per team.
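
The retrieval step can be sketched in a few lines. The case study does not say how vector and keyword results are combined, so this example assumes reciprocal rank fusion (RRF), a common choice, and assumes the per-team filter is applied before fusion. Chunk, reciprocal_rank_fusion, hybrid_retrieve and the sample data are hypothetical names; in a real deployment the two rankings would come from pgvector and a keyword index rather than hard-coded lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    team: str   # owning team, used for access control (assumed schema)
    text: str


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def hybrid_retrieve(
    vector_hits: list[str],
    keyword_hits: list[str],
    chunks: dict[str, Chunk],
    user_teams: set[str],
    top_n: int = 3,
) -> list[Chunk]:
    """Drop chunks the user may not see, then fuse the two rankings."""
    def allowed(chunk_id: str) -> bool:
        return chunks[chunk_id].team in user_teams

    visible_vector = [cid for cid in vector_hits if allowed(cid)]
    visible_keyword = [cid for cid in keyword_hits if allowed(cid)]
    fused = reciprocal_rank_fusion([visible_vector, visible_keyword])
    return [chunks[cid] for cid in fused[:top_n]]


CHUNKS = {
    "a": Chunk("a", "finance", "Q3 budget summary"),
    "b": Chunk("b", "finance", "Expense policy"),
    "c": Chunk("c", "legal", "Contract template"),
    "d": Chunk("d", "finance", "Travel reimbursement rules"),
}

RESULT = hybrid_retrieve(
    vector_hits=["a", "c", "b"],    # e.g. nearest neighbours by embedding
    keyword_hits=["a", "d", "b"],   # e.g. full-text search matches
    chunks=CHUNKS,
    user_teams={"finance"},
)
RESULT_IDS = [chunk.chunk_id for chunk in RESULT]
```

Here the legal chunk "c" never reaches the model for a finance-only user, and "a" ranks first because both retrievers place it at the top. Filtering inside the database query instead of in application code would be a reasonable alternative.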

Business impact

Quantal teams now get accurate, cited answers with a median response time under 500 ms, with measured quality and controlled cost.

99.2%

Eval pass rate

<500ms

Median response

4M+

Documents indexed

4 mo

To production

Gallery

Inside the product

Assistant with citations
Evaluation dashboard
Cost monitoring
Admin controls

"Their AI work is the real thing: evaluated, guard-railed and in production."
Sofia Martín, VP Product, Quantal
