Planning Consultancy

Planit AI, semantic search across a UK planning corpus

Planit Consulting

The challenge

Planit Consulting is a UK planning consultancy with an archive of 18,429 .docx documents: planning statements, design & access, heritage, appeal, pre-app, cover letters and conditions. Consultants had to search this archive manually for precedents and relevant passages, which meant slow, error-prone Ctrl+F work that cost hours per research task.

The analysis

The corpus is heterogeneous: 14 document types spread across 3 tiers, from primary to metadata-only. Not every type calls for the same chunking strategy.

Document classification was step one. Without reliable typing, retrieval could never become relevant. Of the 18,429 documents, 514 remained unclassified and were handled separately.
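As an illustration of the typing step, here is a minimal rule-based sketch. The keyword rules, the tier numbers and the `classify` helper are assumptions made for this example; the source does not describe the actual rules, and only the seven types named above are covered rather than all 14.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword rules. The real pipeline's 14 types and its
# matching logic are not described in the case study.
TYPE_KEYWORDS = {
    "planning_statement": ("planning statement",),
    "design_and_access": ("design and access", "design & access"),
    "heritage": ("heritage",),
    "appeal": ("appeal",),
    "pre_app": ("pre-app", "pre app"),
    "cover_letter": ("cover letter",),
    "conditions": ("conditions",),
}

# Assumed tier assignment (1 = primary, 3 = metadata-only), illustration only.
TYPE_TIER = {
    "planning_statement": 1,
    "design_and_access": 1,
    "heritage": 1,
    "appeal": 2,
    "pre_app": 2,
    "cover_letter": 3,
    "conditions": 3,
}


@dataclass
class Classification:
    doc_type: str
    tier: Optional[int]


def classify(filename: str, first_page: str = "") -> Classification:
    """Return the first type whose keyword appears in the filename or first page."""
    haystack = f"{filename} {first_page}".lower()
    for doc_type, keywords in TYPE_KEYWORDS.items():
        if any(keyword in haystack for keyword in keywords):
            return Classification(doc_type, TYPE_TIER[doc_type])
    # Anything that matches no rule is set aside for separate handling.
    return Classification("unclassified", None)
```

Documents that fall through every rule end up in an explicit "unclassified" bucket instead of being forced into a type, which mirrors the separate handling of unclassified documents described above.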

Embedding costs at 95,000+ chunks are substantial. We chose OpenAI text-embedding-3-large (3072 dimensions) over cheaper alternatives for precision, with cost tracking at the chunk level.
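Chunk-level cost tracking can be as simple as the sketch below. The `EmbeddingCostTracker` class is hypothetical, and the price per million tokens is deliberately a parameter: it should be taken from the provider's current pricing page, not from this example. Token counts are assumed to come from the tokenizer or from the usage figures in the API response.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddingCostTracker:
    # Assumption: the caller supplies the current rate for the chosen model.
    usd_per_million_tokens: float
    per_chunk: dict = field(default_factory=dict)

    def record(self, chunk_id: str, token_count: int) -> float:
        """Store and return the cost of embedding one chunk."""
        cost = token_count / 1_000_000 * self.usd_per_million_tokens
        self.per_chunk[chunk_id] = cost
        return cost

    @property
    def total(self) -> float:
        """Sum of all recorded chunk costs."""
        return sum(self.per_chunk.values())
```

Keeping the cost per chunk rather than only a running total makes it possible to see which document types or oversized chunks drive the bill.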

RAG without conversation history is unusable for consultants who iterate on questions. SQLite persistence for conversations and messages was part of the architecture from day one.
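A minimal sketch of such a persistence layer, using Python's built-in `sqlite3` module. The table layout, column names and helper functions are assumptions for illustration; the case study only states that conversations and messages are stored in SQLite.

```python
import sqlite3

# Assumed schema: one row per conversation, one row per message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS conversations (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS messages (
    id              INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL
                    REFERENCES conversations(id) ON DELETE CASCADE,
    role            TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
    content         TEXT NOT NULL,
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def create_conversation(conn: sqlite3.Connection, title: str) -> int:
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO conversations (title) VALUES (?)", (title,))
    return cur.lastrowid


def add_message(conn: sqlite3.Connection, conversation_id: int, role: str, content: str) -> None:
    with conn:
        conn.execute(
            "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
            (conversation_id, role, content),
        )


def history(conn: sqlite3.Connection, conversation_id: int) -> list:
    """Return the messages of one conversation in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    )
    return [{"role": role, "content": content} for role, content in rows]
```

The list returned by `history` can be passed along with each follow-up question, so an iterated query is answered in the context of the earlier turns. Deleting a conversation removes its messages through the cascading foreign key.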

Our solution

  • Document classification into 14 types across 3 tiers via a Python pipeline
  • Semantic chunking with section detection: 95,592 chunks from 15,801 docs
  • Embedding via OpenAI text-embedding-3-large with separate handling for oversized chunks
  • ChromaDB vector store with persistent storage
  • FastAPI backend with REST endpoints and CORS
  • React TypeScript chat UI (Planit AI) with sidebar, conversation grouping, rename and delete
  • RAG via Claude with source citations back to original documents
  • SQLite persistence for conversations and messages
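The chunking step, including the separate handling of oversized sections, could look roughly like the sketch below. The heading pattern, the `max_chars` limit and both function names are assumptions for this example; the source does not specify how sections are detected or where the size limit sits.

```python
import re

# Assumed heading pattern: numbered headings ("1. Introduction", "2.3 Access")
# or lines in capitals. A body line that starts with a number would also match,
# so a real pipeline would need a stricter rule (for example .docx heading styles).
HEADING = re.compile(r"^(?:\d+(?:\.\d+)*\.?\s+\S.*|[A-Z][A-Z &/-]{3,})$")


def split_sections(text: str) -> list:
    """Return (heading, body) pairs; text before the first heading gets heading ''."""
    sections, heading, body = [], "", []
    for line in text.splitlines():
        stripped = line.strip()
        if HEADING.match(stripped):
            if body:
                sections.append((heading, "\n".join(body)))
            heading, body = stripped, []
        elif stripped:
            body.append(stripped)
    if body:
        sections.append((heading, "\n".join(body)))
    return sections


def chunk_document(text: str, max_chars: int = 2000) -> list:
    """One chunk per section; sections over max_chars are split on line boundaries."""
    chunks = []
    for heading, body in split_sections(text):
        current = ""
        for para in body.split("\n"):
            if current and len(current) + len(para) + 1 > max_chars:
                chunks.append({"section": heading, "text": current})
                current = para
            else:
                current = f"{current}\n{para}" if current else para
        # Note: a single line longer than max_chars is kept whole in this sketch.
        if current:
            chunks.append({"section": heading, "text": current})
    return chunks
```

Each chunk keeps the heading of the section it came from, which can be stored as metadata next to the embedding and shown later as part of a source citation.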

The Blueprint

01 Classification
02 Chunking
03 Embedding
04 Vector DB
05 RAG Chat
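For the final stage, one way to make answers traceable is to number the retrieved excerpts before they reach the language model and to return the matching list of source files. The sketch below shows only that prompt assembly step; `build_prompt`, the dictionary keys and the instruction wording are assumptions for illustration, and the vector search and the model call are left out.

```python
def build_prompt(question: str, retrieved: list) -> tuple:
    """Build a prompt with numbered excerpts and return it with the source list.

    `retrieved` is assumed to be a list of dicts with a 'text' key (the chunk)
    and a 'source' key (path of the original .docx).
    """
    sources = []  # position in this list + 1 is the citation number
    blocks = []
    for hit in retrieved:
        if hit["source"] not in sources:
            sources.append(hit["source"])
        number = sources.index(hit["source"]) + 1
        blocks.append(f"[{number}] {hit['text']}")
    context = "\n\n".join(blocks)
    prompt = (
        "Answer using only the numbered excerpts below and cite them as [n].\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return prompt, sources
```

Chunks from the same document share one citation number, so the UI can map every `[n]` in an answer back to a single original file.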

Results

Search time: seconds
From Ctrl+F through loose .docx files to semantic retrieval

Coverage: 15,801 docs
95,592 searchable chunks with metadata

Source citations: per answer
Consultants can verify back to the original document

We are genuinely amazed at how well this works. Roughly 40% of our daily consultant work is now automated. Finding relevant precedents that used to take hours now takes seconds, with source citations back to the original document so we can always verify.

Janet Long · Founder, Planit Consulting
