How It Works

Two-corpus RAG with a live Box integration. Here's what's actually happening under the hood.

The pipeline

01

Scrape FDA enforcement corpus

870+ CDER/CBER warning letters (2019–present) scraped from FDA.gov, categorized into 10 violation areas using Claude, chunked, and embedded into Pinecone.

02

Connect to Box via JWT

Internal quality documents live in a Box folder. The server-to-server JWT connector downloads files on demand — no migration, no export.
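
For readers curious what the server-to-server handshake involves, here is a minimal sketch of the claim set that Box's JWT authentication expects. The function name `build_box_jwt_claims` and the placeholder IDs are hypothetical and not taken from the Copilot's code. Signing the claims with the app's RSA key and exchanging the assertion for an access token are left to a JWT library or the Box SDK.

```python
import secrets
import time

BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def build_box_jwt_claims(client_id: str, enterprise_id: str, ttl_seconds: int = 45) -> dict:
    """Build the claim set for a Box server-to-server JWT assertion.

    The claims still have to be signed (RS256) with the app's private key
    and posted to BOX_TOKEN_URL; that step is intentionally omitted.
    """
    if not 0 < ttl_seconds <= 60:
        # Box rejects assertions that expire more than 60 seconds out.
        raise ValueError("ttl_seconds must be between 1 and 60")
    return {
        "iss": client_id,               # the Box app's client ID
        "sub": enterprise_id,           # enterprise that owns the service account
        "box_sub_type": "enterprise",
        "aud": BOX_TOKEN_URL,
        "jti": secrets.token_hex(32),   # unique ID per assertion (64 hex chars)
        "exp": int(time.time()) + ttl_seconds,
    }
```

Because the token is requested by the application itself, no user has to log in or export files before the connector can read the folder.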

03

Embed internal documents

Each document is chunked with section context preserved and embedded into a separate Pinecone namespace. Box webhooks trigger automatic re-ingestion when files change.
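
One way to preserve section context while chunking is sketched below. It is an illustration under stated assumptions, not the Copilot's actual chunker: headings are assumed to be lines starting with `#`, and the names `Chunk`, `chunk_by_section`, and the `max_chars` budget are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    section: str   # heading the text appeared under
    text: str


def chunk_by_section(doc_id: str, body: str, max_chars: int = 800) -> list[Chunk]:
    """Split a document into chunks that each remember their section heading.

    Assumption: a heading is any line starting with '#'. Real SOPs would
    need a format-specific heading detector.
    """
    chunks: list[Chunk] = []
    section = "Preamble"
    buffer: list[str] = []

    def flush() -> None:
        text = " ".join(buffer).strip()
        if text:
            chunks.append(Chunk(doc_id, section, text))
        buffer.clear()

    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            flush()  # close the previous section before switching headings
            section = stripped.lstrip("#").strip()
        elif stripped:
            # Start a new chunk when this line would exceed the size budget.
            if buffer and sum(len(s) + 1 for s in buffer) + len(stripped) > max_chars:
                flush()
            buffer.append(stripped)
    flush()
    return chunks
```

Carrying the heading on every chunk means a retrieved passage can be shown, and embedded, together with the section it came from, even when a long section is split into several pieces.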

04

Cross-corpus retrieval

For each violation category, semantic search runs against both corpora in parallel — retrieving the most relevant warning letter passages and internal document sections.
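
The parallel query pattern can be sketched with `asyncio.gather`. The in-memory `INDEX`, the two-dimensional vectors, the document IDs, and the `search` helper below are hypothetical stand-ins for real embeddings and a real vector-store client; only the two namespace names come from this page.

```python
import asyncio
import math

# Toy in-memory index standing in for the two Pinecone namespaces.
# Vectors are hand-made 2-D examples, not real embeddings.
INDEX = {
    "fda-warning-letters": [
        ("wl-001", [0.9, 0.1], "Failure to investigate out-of-specification results"),
        ("wl-002", [0.1, 0.9], "Inadequate cleaning validation"),
    ],
    "internal-docs": [
        ("sop-007", [0.8, 0.2], "OOS investigation procedure"),
        ("sop-012", [0.2, 0.8], "Equipment cleaning SOP"),
    ],
}


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


async def search(namespace: str, query_vec: list[float], top_k: int = 1):
    """Hypothetical stand-in for a vector-store query against one namespace."""
    await asyncio.sleep(0)  # yield control, as a network call would
    ranked = sorted(INDEX[namespace], key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [(doc_id, text) for doc_id, _, text in ranked[:top_k]]


async def cross_corpus_retrieve(query_vec: list[float]) -> dict:
    """Query both namespaces concurrently and return the paired results."""
    external, internal = await asyncio.gather(
        search("fda-warning-letters", query_vec),
        search("internal-docs", query_vec),
    )
    return {"enforcement": external, "internal": internal}
```

Calling `asyncio.run(cross_corpus_retrieve([1.0, 0.0]))` on this toy data pairs the out-of-specification warning letter with the OOS investigation SOP, which is the kind of external-to-internal match the scan is looking for.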

05

Risk signal generation

Claude analyzes the enforcement patterns and document evidence to produce a structured signal: enforcement frequency, document coverage assessment, and a specific review prompt for the team.
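
A structured signal is easier to trust when the model's reply is validated before it reaches the UI. The sketch below assumes the model is asked to answer in JSON. The field names mirror the three parts described above, but the `RiskSignal` class, the `parse_signal` helper, and the coverage labels are hypothetical.

```python
import json
from dataclasses import dataclass

# Assumed label set; the real system may use different wording.
ALLOWED_COVERAGE = {"covered", "partial", "gap"}


@dataclass(frozen=True)
class RiskSignal:
    category: str
    enforcement_frequency: int   # e.g. number of letters citing this area
    coverage_assessment: str     # one of ALLOWED_COVERAGE
    review_prompt: str           # question for the quality team to review


def parse_signal(raw: str) -> RiskSignal:
    """Parse and validate a model's JSON reply into a RiskSignal."""
    data = json.loads(raw)
    signal = RiskSignal(
        category=str(data["category"]),
        enforcement_frequency=int(data["enforcement_frequency"]),
        coverage_assessment=str(data["coverage_assessment"]).lower(),
        review_prompt=str(data["review_prompt"]).strip(),
    )
    if signal.coverage_assessment not in ALLOWED_COVERAGE:
        raise ValueError(f"unexpected coverage label: {signal.coverage_assessment!r}")
    if signal.enforcement_frequency < 0:
        raise ValueError("enforcement_frequency cannot be negative")
    if not signal.review_prompt:
        raise ValueError("review_prompt must not be empty")
    return signal
```

Rejecting malformed replies at this boundary keeps a single bad generation from surfacing as a misleading signal.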

06

Stream results in real time

Signals appear as they complete — 10 categories processed in parallel batches, streamed via SSE so users see results progressively rather than waiting for the full scan.
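
The batching and streaming behaviour can be sketched as an async generator that emits one Server-Sent Events frame per finished category. The batch size of 3, the event names, and the `analyze` callback are assumptions made for illustration; the page does not state the real values.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable


def sse_frame(event: str, payload: dict) -> str:
    """Format one SSE frame: event line, data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


async def stream_signals(
    categories: list[str],
    analyze: Callable[[str], Awaitable[dict]],
    batch_size: int = 3,  # assumed; the real batch size is not documented
) -> AsyncIterator[str]:
    """Run analyses in batches and yield each result as soon as it finishes."""
    for start in range(0, len(categories), batch_size):
        batch = categories[start:start + batch_size]
        tasks = [asyncio.create_task(analyze(c)) for c in batch]
        # as_completed yields in finish order, so fast categories appear first.
        for finished in asyncio.as_completed(tasks):
            yield sse_frame("signal", await finished)
    yield sse_frame("done", {"count": len(categories)})
```

A web framework's streaming response can consume this generator directly, as long as it sets the `text/event-stream` content type so the browser's `EventSource` treats each frame as a separate event.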

Architecture

Corpus 1: FDA Warning Letters
870+ letters · CDER/CBER · 2019–2025
Pinecone namespace: fda-warning-letters

Corpus 2: Internal Quality Docs
Box JWT · 8 SOPs & policies
Pinecone namespace: internal-docs

Engine: Cross-Corpus Analysis
Claude Sonnet · 10 categories
Parallel batches · SSE streaming

Who uses it and why

Role: VP Quality
Need: Know what FDA is currently focused on before an inspection
Value: Trend Q&A over real enforcement data, not analyst summaries

Role: QA Director
Need: Identify procedure gaps relative to enforcement patterns
Value: Cross-corpus scan maps your SOPs to active citation areas

Role: Regulatory Affairs
Need: Understand the enforcement context for a filing decision
Value: Ask specific questions: 'What FDA language appears around stability trend analysis?'

Same engine. Different use cases.

The two-corpus RAG architecture adapts to any domain where external reference data needs to be cross-referenced against internal documents.

| | Pharma Intelligence | Compliance Copilot | Rules Expert |
| Document corpus | FDA warning letters + quality SOPs | 21 CFR Part 11 + policy documents | 2023 Rules of Golf |
| Retrieval | Two-namespace cross-corpus | Single-namespace requirement matching | Hybrid vector + BM25 |
| Output | Risk signals with coverage assessment | Gap analysis with requirement status | Cited rule answers |
| External integration | Box JWT connector | Static document upload | None |
| Streaming | SSE (signal-by-signal) | SSE (requirement-by-requirement) | UI message stream |

Frequently asked questions

What does the Pharma Intelligence Copilot do?

It analyzes FDA warning letters and cross-references them against a company's internal quality documents, surfacing where the issues regulators are citing at other companies might apply to yours, along with the supporting citations and enforcement-trend context.

What is two-corpus RAG?

Most retrieval systems search one body of documents. This system reasons across two at once — the external corpus of FDA warning letters and your internal quality documents — so it can connect an external enforcement pattern to the specific internal document it bears on. Both sides of every finding are cited.

Can this be adapted to our documents and regulators?

Yes. The demo uses FDA warning letters and a sample company's documents, but the same two-corpus architecture applies to other regulators, standards bodies, or external sources cross-referenced against your internal material. A discovery phase scopes the corpora and integrations.

How does it connect to where our documents live?

The demo integrates with Box to read documents from a real document store. We build against the systems your documents actually live in rather than requiring you to export and upload everything by hand.

Ready to explore?

Start with enforcement trends or run a full risk scan against Meridian's documents.