About

I’m an AI Engineer at ProRata AI, where I work on the Inference Engine that powers gist.ai and Gist Answers. My work focuses on building scalable Retrieval-Augmented Generation (RAG) and LLM inference systems that make enterprise knowledge search faster, more accurate, and contextually grounded.
Core focus areas include:
• Optimizing RAG and semantic retrieval architectures to improve search relevance and query understanding.
• Designing strategies for multi-turn grounding, context augmentation, and hallucination mitigation.
• Addressing production constraints around latency, cost, and effective context management.
• Building and deploying resilient, production-grade RESTful inference services.
Previously, I worked on Machine Translation, Natural Language Understanding, Multimodal Learning, and LLM Interpretability.
I’m passionate about bridging deep learning research and real-world deployment, translating modeling advances into measurable user impact at scale.
Always open to conversations about RAG optimization, inference scaling, and LLM productization.