Building the infrastructure that makes AI run fast, cheaply, and reliably at scale.
I'm a Senior MLOps / AI Platform Engineer with 5+ years shipping production systems across two deep specializations:
- LLM Inference Infrastructure: vLLM, SGLang, MCP-based agents, RAG architectures.
- Computer Vision Pipelines: real-time object detection, multi-object tracking, and segmentation at millions of frames per week.
I've led teams of 4+ engineers, contributed to the open-source Clarifai Python SDK as its downloads grew 10x, and tuned models for high-throughput inference.
| Achievement | Detail |
|---|---|
| LLM Inference Optimization | Low-latency, high-throughput models |
| 10x SDK Growth | Contributed to the Clarifai Python SDK, driving a 10x increase in downloads |
| 0.97 mAP | Car dent detection and segmentation for an insurance client |
| 3x Latency Reduction | Multi-modal RAG platform over a knowledge base of 1M+ documents |
| NVIDIA Hackathon | Smart City Hackathon (Asia-Pacific): pothole detection with RT-DETR |
| Up to 80% GPU Memory Savings | LoRA/PEFT adapters enabling cost-efficient production fine-tuning |
Senior MLOps Engineer, Clarifai (Mar 2024 - Present)
Built and optimized Clarifai's LLM inference engine to world-class benchmark performance.
- Worked hands-on with vLLM, SGLang, MCP-based agents, and RAG architectures
- Architected MCP-based agent serving with LLM-as-a-judge evaluation and HITL feedback loops
- Built sports analytics pipelines (Object Detection, MOT) achieving 90% MOTA at millions of frames/week
- Implemented LoRA/PEFT reducing GPU memory by up to 80% vs full-parameter fine-tuning
- Drove 10x SDK download growth via Clarifai Python SDK & CLI improvements
MLOps Engineer, Clarifai (Apr 2023 - Feb 2024)
- Designed an end-to-end multi-modal RAG platform (LLMs + VLMs) for document Q&A, achieving a 3x latency reduction over a knowledge base of 1M+ documents
- Built scalable vector search pipelines with Qdrant for semantic retrieval at production scale
Built the Kandula.ai cognitive computer vision SaaS platform from the ground up.
| Project | Result |
|---|---|
| Car Dent Detection & Segmentation | 0.97 mAP using YOLO-based models for an insurance client |
| Fire Detection Droid | Real-time YOLOv3 deployment |
| Pothole Detection | 95% accuracy on a 100k-image dataset with RT-DETR; NVIDIA Hackathon finalist |
| Gaze Tracking | R&D on SOTA algorithms; heatmap data products for retail analytics |
| Crowd Detection | Hospital infection rate reduction: threshold alerting system |
| Model Deployment | Exported 50+ models to ONNX/TensorRT with INT8/FP16 quantization |
| Degree | Institution | Year | Grade |
|---|---|---|---|
| M.Sc. Data Science | Loyola College, Chennai | 2019-2021 | 8.3 CGPA, First Class with Distinction |
| B.Sc. Mathematics | Vivekananda College, Chennai | 2016-2019 | 6.1 CGPA |
- Machine Learning, Coursera
- Python Data Structures, Coursera
- Building Real-Time Video AI Applications, NVIDIA
- Mastering LLMs, Analytics Vidhya
| Project | Description | Stack |
|---|---|---|
| Docwhisper | Document question answering using RAG | Python, FastAPI, Ollama |
| PyTorch Object Detect & Track | Real-time multi-object detection and tracking in video | Python, PyTorch, YOLO |
Most production work lives in private/org repos at Clarifai; benchmarks and results are linked in the experience above.


