Generative AI Engineer with 2+ years of hands-on experience designing, developing, and deploying production-grade GenAI solutions using Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) pipelines. Builds scalable AI-powered backend services with strong expertise in Python, FastAPI, Google Cloud Platform (Vertex AI), vector databases, and agentic workflows. Proven track record of delivering enterprise-grade AI applications with a focus on performance optimization, observability, and security.
- LLM Integration: Claude, Gemini, OpenAI API
- RAG Pipelines: Embedding models, document chunking, indexing, retrieval optimization
- AI Frameworks: LangChain, LlamaIndex, Hugging Face Transformers, CrewAI, Agno
- Agentic Systems: Multi-agent workflows, tool integration, autonomous reasoning
- Prompt Engineering: Chain-of-thought, few-shot learning, output optimization
- Backend Development: Python, FastAPI, Django REST Framework, Flask
- APIs & Integration: REST APIs, Webhooks, OAuth2, JWT, Microservices
- Cloud Platforms: Google Cloud (Vertex AI, Cloud Run, Cloud Storage), AWS
- Databases: PostgreSQL, MySQL, MongoDB, Cassandra, Redis
- Vector Databases: FAISS, Chroma, Pinecone, OpenSearch
- Observability: Logging, monitoring, evaluation metrics (RAGAS)
- Performance: Latency optimization, caching, batch processing, cost reduction
- Security: Enterprise security, data governance, HIPAA compliance, IAM
- Deployment: Docker, Kubernetes, CI/CD, Cloud Run, serverless
Nuvae.ai | Aug 2025 – Feb 2026 | Remote
- Designed and deployed production-grade GenAI solutions using GPT-4, Claude, and Gemini for healthcare automation workflows
- Built end-to-end RAG pipelines with advanced embedding models, document chunking strategies, and optimized retrieval techniques
- Deployed scalable applications on GCP using Vertex AI, Cloud Run, and Cloud Storage, handling 50,000+ daily requests at sub-200ms latency
- Integrated vector databases (FAISS, Chroma) with hybrid retrieval, improving search accuracy by 35% and query performance by 50% (retrieval pattern sketched below)
- Developed agent-based workflows, tool integrations, and multi-step reasoning systems with LangChain and LlamaIndex
- Established ML evaluation metrics (RAGAS) for retrieval accuracy, response quality, and hallucination detection
- Applied prompt engineering techniques including chain-of-thought reasoning and systematic output optimization
- Optimized cloud costs by 40% through efficient model selection, caching strategies, and batch processing
- Implemented enterprise security and data governance with IAM policies, encryption, and privacy-compliant data handling
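A minimal sketch of the hybrid retrieval pattern referenced above (dense FAISS similarity blended with keyword overlap). The embed() function, toy document set, and blending weight are illustrative placeholders, not the production embedding model or corpus.

```python
# Hybrid retrieval sketch: dense FAISS similarity blended with keyword overlap.
# embed() is a placeholder for the real embedding model; docs is a toy corpus.
import numpy as np
import faiss

def embed(texts: list[str], dim: int = 384) -> np.ndarray:
    # Placeholder embeddings (random but deterministic per input); a real
    # pipeline would call an embedding model here.
    rng = np.random.default_rng(abs(hash(tuple(texts))) % (2**32))
    return rng.random((len(texts), dim), dtype=np.float32)

docs = ["prior authorization workflow", "patient eligibility check", "claims processing"]
doc_vecs = embed(docs)
faiss.normalize_L2(doc_vecs)                      # cosine similarity via inner product

index = faiss.IndexFlatIP(doc_vecs.shape[1])
index.add(doc_vecs)

def hybrid_search(query: str, k: int = 3, alpha: float = 0.7) -> list[tuple[str, float]]:
    """Blend dense (semantic) and sparse (keyword-overlap) scores."""
    q_vec = embed([query])
    faiss.normalize_L2(q_vec)
    dense_scores, ids = index.search(q_vec, k)

    q_terms = set(query.lower().split())
    results = []
    for score, i in zip(dense_scores[0], ids[0]):
        d_terms = set(docs[i].lower().split())
        keyword_score = len(q_terms & d_terms) / max(len(q_terms), 1)
        results.append((docs[i], alpha * float(score) + (1 - alpha) * keyword_score))
    return sorted(results, key=lambda r: r[1], reverse=True)

print(hybrid_search("eligibility workflow"))
```

Normalizing vectors and using an inner-product index makes the dense score a cosine similarity, so it combines cleanly with the 0–1 keyword-overlap score.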
AOT Technologies | Feb 2024 – Aug 2025 | Thiruvananthapuram, India
- Developed RAG pipelines for document intelligence platforms with embedding models, chunking algorithms, and semantic retrieval
- Built LLM-based applications using OpenAI and Gemini APIs serving 10,000+ daily users in production
- Integrated vector databases (FAISS, Chroma) with efficient indexing and retrieval for large-scale document collections
- Created scalable GenAI APIs using FastAPI with authentication, rate limiting, and comprehensive error handling (service pattern sketched below)
- Implemented prompt engineering strategies reducing hallucinations by 40% and improving response consistency
- Applied performance optimization including model quantization, caching, and async processing
- Established ML observability with logging, monitoring, and alerting ensuring high availability
- Worked with GCP services including Cloud Storage and Cloud Functions for serverless processing
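An illustrative outline of the FastAPI service pattern described above: API-key checks, a naive in-memory rate limit, and error handling around a stubbed LLM call. llm_answer() and the key store are stand-ins for the real model call and secret management, not the production code.

```python
# Illustrative FastAPI service shape: API-key auth, naive in-memory rate limit,
# and error handling around a stubbed LLM call.
import time
from collections import defaultdict

from fastapi import Depends, FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()
RATE_LIMIT, WINDOW_S = 30, 60            # 30 requests per rolling minute per key
VALID_KEYS = {"demo-key"}                # stand-in for a real secret store
_hits: dict[str, list[float]] = defaultdict(list)

class AskRequest(BaseModel):
    question: str

def check_key(x_api_key: str = Header(...)) -> str:
    if x_api_key not in VALID_KEYS:
        raise HTTPException(status_code=401, detail="invalid API key")
    now = time.monotonic()
    _hits[x_api_key] = [t for t in _hits[x_api_key] if now - t < WINDOW_S]
    if len(_hits[x_api_key]) >= RATE_LIMIT:
        raise HTTPException(status_code=429, detail="rate limit exceeded")
    _hits[x_api_key].append(now)
    return x_api_key

def llm_answer(question: str) -> str:
    # Placeholder for the actual LLM / RAG call.
    return f"stub answer for: {question}"

@app.post("/ask")
def ask(req: AskRequest, _key: str = Depends(check_key)) -> dict:
    try:
        return {"answer": llm_answer(req.question)}
    except Exception as exc:             # surface provider failures as a 502
        raise HTTPException(status_code=502, detail=str(exc)) from exc
```

In production the in-memory counter would be replaced by a shared store such as Redis so the limit holds across instances.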
Tech: Python, LangChain, Vertex AI, Cloud Run, FAISS, FastAPI
- Architected comprehensive RAG pipeline on GCP with Vertex AI for LLM serving and Cloud Storage for document management
- Implemented advanced chunking strategies (recursive, semantic) and embedding models (text-embedding-ada-002, Vertex AI embeddings), with the chunking approach sketched below
- Built vector database integration with FAISS and Chroma using hybrid retrieval (semantic + keyword matching)
- Developed scalable API deployment on Cloud Run with auto-scaling, load balancing, and IAM-based security
- Applied prompt engineering with context optimization and evaluation using RAGAS metrics
- Integrated LangChain for agent orchestration, tool calling, and multi-step reasoning with memory management
- Achieved 45% cost reduction through batch processing, caching strategies, and optimized model selection
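A simplified, from-scratch version of the recursive chunking strategy named above, written only to illustrate the idea: split on the coarsest separator first and recurse into pieces that are still too large. It is not the production splitter, and the sample text is invented.

```python
# Simplified recursive chunker: coarse separators first, recurse on oversized pieces.
def recursive_chunk(text: str, max_chars: int = 500,
                    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " ")) -> list[str]:
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    for sep in separators:
        parts = text.split(sep)
        if len(parts) == 1:
            continue                      # separator not present; try a finer one
        chunks: list[str] = []
        buf = ""
        for part in parts:
            candidate = f"{buf}{sep}{part}" if buf else part
            if len(candidate) <= max_chars:
                buf = candidate
            else:
                if buf:
                    chunks.append(buf)
                if len(part) > max_chars:
                    chunks.extend(recursive_chunk(part, max_chars, separators))
                    buf = ""
                else:
                    buf = part
        if buf:
            chunks.append(buf)
        return chunks
    # No separator helped: hard-split as a last resort.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

sample = "Intro paragraph.\n\n" + "Eligibility rules apply to every claim. " * 30
for i, chunk in enumerate(recursive_chunk(sample)):
    print(i, len(chunk))
```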
Tech: LangChain, Vertex AI, Cloud Functions, Pinecone, FastAPI
- Developed GenAI application using Vertex AI for model deployment and Cloud Functions for event-driven processing
- Built RAG system with Pinecone vector database and advanced retrieval techniques for medical data
- Created LangChain-based agents with tool integration enabling autonomous decision-making workflows (agent loop sketched below)
- Implemented performance optimization achieving sub-500ms response times
- Established HIPAA-compliant data governance ensuring privacy and secure handling of healthcare information
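A bare-bones tool-calling loop showing the agent pattern described above. fake_llm() and check_eligibility() are hypothetical stand-ins for the real model and tools; the project itself used LangChain's agent tooling rather than this hand-rolled loop.

```python
# Bare-bones tool-calling loop: the model either requests a tool or answers.
import json

def check_eligibility(member_id: str) -> dict:
    # Placeholder tool: a real version would query an eligibility service.
    return {"member_id": member_id, "eligible": True, "plan": "PPO"}

TOOLS = {"check_eligibility": check_eligibility}

def fake_llm(messages: list[dict]) -> dict:
    # Stand-in for an LLM that either requests a tool call or answers directly.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "check_eligibility", "args": {"member_id": "M123"}}
    return {"answer": "Member M123 is eligible under a PPO plan."}

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_llm(messages)
        if "answer" in decision:
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])   # execute the requested tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Step limit reached without a final answer."

print(run_agent("Is member M123 eligible for coverage?"))
```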
Tech: Python, FastAPI, LLMs, REST APIs
- Developed backend service orchestrating multi-step eligibility and pre-auth workflows using AI-assisted decision logic
- Integrated multiple LLM providers with proper authentication, rate limiting, and fallback strategies (fallback pattern sketched below)
- Focused on correctness, validation, and observability in production healthcare environments
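A sketch of the multi-provider fallback pattern mentioned above: try providers in order of preference, retry with exponential backoff, then fall back. The provider call functions are placeholders rather than real SDK clients.

```python
# Multi-provider fallback sketch: retry each provider with backoff, then fall back.
import time

class ProviderError(Exception):
    pass

def call_primary(prompt: str) -> str:
    raise ProviderError("primary provider unavailable")   # simulate an outage

def call_secondary(prompt: str) -> str:
    return f"secondary-provider answer for: {prompt}"

PROVIDERS = [call_primary, call_secondary]                 # ordered by preference

def generate_with_fallback(prompt: str, retries_per_provider: int = 2) -> str:
    last_error = None
    for call in PROVIDERS:
        for attempt in range(retries_per_provider):
            try:
                return call(prompt)
            except ProviderError as exc:
                last_error = exc
                time.sleep(0.1 * (2 ** attempt))           # exponential backoff before retry
    raise RuntimeError(f"all providers failed: {last_error}")

print(generate_with_fallback("Summarize the pre-auth requirements."))
```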
Amity University | 2024 – 2027
B.Sc. – Data Analysis
Central Polytechnic College | 2021 – 2024
Diploma – Computer Engineering
- LangGraph for Agentic Workflows – Advanced AI agent development with state management
- Practical Multi-Agent Systems (CrewAI) – Building collaborative AI workflows
- Develop GenAI Apps with Gemini – Google Cloud Vertex AI and Gemini integration
- Google Cloud Platform – Vertex AI, Cloud Run, Cloud Storage, IAM
- Data Science with Python – ML evaluation and performance optimization
Building GenAI solutions? Let's collaborate on RAG pipelines, agentic workflows, or LLM applications.


