Kashif Ullah
2025 Sole developer

AI Document Intelligence System

A FastAPI service that ingests PDFs, chunks them, and runs LangChain pipelines for summarization, key-insight extraction, and translation.

The problem

Knowledge workers drown in PDFs — reports, contracts, papers. They need focused summaries and the ability to ask questions of the document, in their own language.

The approach

  • PDF parsing with layout awareness (so tables and headings survive).
  • Recursive chunking sized to the model’s context, with overlap to keep cross-paragraph references intact.
  • Three composable LangChain chains: summarize, extract_insights, translate(target_lang).
  • Single FastAPI surface so the same backend powers a web UI, a CLI, and an internal Slack bot.
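
The chunking step above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: it measures size in characters rather than model tokens, and the names split_recursive, chunk_with_overlap, SEPARATORS, max_chars, and overlap are all hypothetical. Splitting drops the separators themselves, which a production splitter would likely preserve.

```python
from typing import List, Sequence

# Separators tried from coarse to fine: paragraph, line, sentence, word.
# These defaults are an assumption for illustration.
SEPARATORS: Sequence[str] = ("\n\n", "\n", ". ", " ")


def split_recursive(text: str, max_chars: int,
                    separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split text into pieces of at most max_chars, preferring coarse breaks."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    sep, finer = separators[0], separators[1:]
    pieces: List[str] = []
    for part in text.split(sep):
        pieces.extend(split_recursive(part, max_chars, finer))
    return pieces


def chunk_with_overlap(text: str, max_chars: int = 1000,
                       overlap: int = 200) -> List[str]:
    """Merge small pieces into chunks; each chunk repeats the previous tail."""
    piece_limit = max_chars - overlap - 1  # room for the tail and a joining space
    if overlap < 0 or piece_limit < 1:
        raise ValueError("need 0 <= overlap < max_chars - 1")
    chunks: List[str] = []
    current = ""
    for piece in split_recursive(text, piece_limit):
        piece = piece.strip()
        if not piece:
            continue
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            # Carry the last `overlap` characters forward (may cut mid-word).
            tail = current[-overlap:] if overlap else ""
            current = f"{tail} {piece}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Calling chunk_with_overlap(document_text, max_chars=1000, overlap=200) returns a list of strings, each at most 1000 characters, where every chunk after the first begins with the closing characters of the one before it. A token-based length function would be the natural replacement for len when sizing chunks to a specific model.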

Outcome

  • Average 95% reduction in reading time for the target users.
  • Translation pipeline supports any language the underlying model handles — particularly valuable for Urdu/Pashto documents that desktop tools mangle.
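
Because the target language is just a parameter, composing translation with the other chains is straightforward. The sketch below shows the idea with plain Python functions and a stub model. It is a stand-in for illustration, not LangChain's API and not the project's code: Model, Step, compose, echo_model, and the prompt wording are all hypothetical.

```python
from typing import Callable

# Hypothetical model interface: takes a prompt, returns generated text.
Model = Callable[[str], str]
Step = Callable[[str], str]


def summarize(model: Model) -> Step:
    """Build a step that asks the model for a summary."""
    return lambda text: model(f"Summarize the following text:\n\n{text}")


def translate(model: Model, target_lang: str) -> Step:
    """Build a step that asks the model to translate into target_lang."""
    return lambda text: model(
        f"Translate the following text into {target_lang}:\n\n{text}"
    )


def compose(*steps: Step) -> Step:
    """Run steps left to right, feeding each output into the next step."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run


def echo_model(prompt: str) -> str:
    """Stub in place of a real LLM call: tags the body with its instruction."""
    instruction, _, body = prompt.partition("\n\n")
    return f"[{instruction}] {body}"


# Summarize first, then translate the summary into Urdu.
pipeline = compose(summarize(echo_model), translate(echo_model, "Urdu"))
result = pipeline("Quarterly report text")
```

Swapping "Urdu" for "Pashto" changes only the argument to translate; the rest of the pipeline is untouched, which is the property the bullet above relies on.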
