AI Engineer
Vypar TaxOne
Chartered accountants deal with thousands of pages of GST circulars, notifications, and legal amendments — documents that are dense, frequently updated, and genuinely hard to navigate quickly. I was brought in to build an AI system that could answer compliance questions directly from this corpus.
- Built the GST Compliance Assistant: a domain-specific RAG pipeline with long-term memory via MongoDB, deployed serverlessly on Google Cloud Run with auto-scaling for unpredictable query loads.
- Rebuilt the OCR-to-LLM pipeline from scratch after poor OCR quality on scanned GST documents made chunking and embedding unreliable, raising extraction accuracy from 40% to 80%.
- Benchmarked four newer RAG architectures (HyDE, CAG, KAG, and Agentic RAG) against our baseline to identify which retrieval strategies held up on legal text.