sanjaychelliah

Senior ML Engineer · LLM Infrastructure · AI Platform

🧠 About Me

Building the infrastructure that makes AI run fast, cheap, and reliably at scale.

I'm a Senior MLOps / AI Platform Engineer with 5+ years shipping production systems across two deep specializations:

🚀 LLM Inference Infrastructure — vLLM, SGLang, MCP-based agents, RAG architectures.
👁️ Computer Vision Pipelines — Real-time object detection, multi-object tracking, segmentation at millions of frames per week.

I've led teams of 4+ engineers, contributed to the open-source Clarifai Python SDK as its downloads grew 10x, and tuned model serving for low latency and high throughput.

⚡ Career Highlights

Achievement	Detail
🏆 LLM Inference Optimization	Low Latency, High Throughput Models
📦 10x SDK Growth	Contributed to Clarifai Python SDK driving 10x download increase
🎯 0.97 mAP	Car Dent Detection & Segmentation for insurance client
🧠 3x Latency Reduction	Multi-modal RAG platform over 1M+ document knowledge base
🏅 NVIDIA Hackathon	Smart City Hackathon (Asia-Pacific) — Pothole Detection with RT-DETR
⚙️ 80% GPU Memory Savings	LoRA/PEFT adapters enabling cost-efficient production fine-tuning

🛠️ Tech Stack

LLM & GenAI · Models I've Worked With · Computer Vision · MLOps & Infrastructure · Data & Storage

💼 Experience

🔷 Senior MLOps Engineer — Clarifai (Mar 2024 – Present)

Built and optimized Clarifai's LLM inference engine to world-class benchmark performance.

Worked hands-on with vLLM, SGLang, MCP-based agents, and RAG architectures
Architected MCP-based agent serving with LLM-as-a-judge evaluation and human-in-the-loop (HITL) feedback loops
Built sports analytics pipelines (object detection, multi-object tracking) achieving 90% MOTA while processing millions of frames per week
Implemented LoRA/PEFT fine-tuning, reducing GPU memory use by up to 80% compared with full-parameter fine-tuning
Drove 10x SDK download growth via Clarifai Python SDK & CLI improvements

🔷 MLOps Engineer — Clarifai (Apr 2023 – Feb 2024)

Designed an end-to-end multi-modal RAG platform (LLMs + VLMs) for document Q&A, cutting latency 3x over a knowledge base of 1M+ documents
Built scalable vector search pipelines with Qdrant for semantic retrieval at production scale

🔷 AI Engineer — Pavo & Tusker Innovations (Jun 2021 – Mar 2023)

Built the Kandula.ai cognitive computer vision SaaS platform from the ground up.

Project	Result
🚗 Car Dent Detection & Segmentation	0.97 mAP using YOLO-based models for insurance client
🔥 Fire Detection Droid	Real-time YOLOv3 deployment
🕳️ Pothole Detection	95% accuracy on 100k-image dataset with RT-DETR; NVIDIA Hackathon finalist
👁️ Gaze Tracking	Researched state-of-the-art algorithms and built heatmap data products for retail analytics
🏥 Crowd Detection	Threshold-based crowding alerts aimed at reducing hospital infection rates
⚙️ Model Deployment	Exported 50+ models to ONNX/TensorRT with INT8/FP16 quantization

🎓 Education

Degree	Institution	Year	Grade
M.Sc. Data Science	Loyola College, Chennai	2019–2021	8.3 CGPA — First Class with Distinction
B.Sc. Mathematics	Vivekananda College, Chennai	2016–2019	6.1 CGPA

📜 Certifications

🎓 Machine Learning — Coursera
🐍 Python Data Structures — Coursera
🎥 Building Real-Time Video AI Applications — NVIDIA
🤖 Mastering LLMs — Analytics Vidhya

🚀 Featured Projects

Project	Description	Stack
Docwhisper	Document question answering using RAG	Python, FastAPI, Ollama
PyTorch Object Detect & Track	Real-time multi-object detection and tracking in video	Python, PyTorch, YOLO

🔒 Most production work lives in private/org repos at Clarifai — benchmarks and results linked in experience above.

💬 Let's Build Something

Open to collaborations on LLM infrastructure, computer vision systems, and AI platform engineering.
