Results-oriented Data Scientist with 10+ years of experience driving enterprise-scale AI and data solutions and strong expertise in Machine Learning, Generative AI (GenAI), and advanced analytics. Proven ability to lead the full lifecycle of AI/ML projects, from design and development to deployment and optimization, across cloud-native ecosystems.
- Led end-to-end GenAI solution development using Databricks, OpenAI, LangChain, and vector databases (Pinecone/FAISS), improving document search accuracy and reducing retrieval latency.
- Built scalable Retrieval-Augmented Generation (RAG) pipelines integrated with Delta Lake and Unity Catalog to support secure, governed enterprise LLM use cases.
- Designed and deployed MLOps pipelines using MLflow, Databricks Workflows, and GitHub Actions, streamlining retraining cycles and enhancing model versioning.
- Migrated large-scale data platforms from Snowflake to Databricks Lakehouse, reducing cloud costs and improving data pipeline efficiency for Fortune 500 clients.
- Spearheaded the “GenAI POC in a Month” accelerator, converting proofs of concept into production-ready AI solutions and accelerating client adoption.
- Developed CI/CD automation for ML models and data applications using GitLab, Jenkins, and Terraform across hybrid cloud platforms (AWS, Azure, GCP).
- Optimized batch and real-time processing pipelines through extensive hands-on work with Spark, Kafka, PostgreSQL, MongoDB, and data lakes.
- Collaborated across engineering, DevOps, and analytics teams to deliver robust, reproducible ML workflows with high operational reliability.
- Implemented governance and observability best practices using Prometheus, ELK Stack, and CloudWatch to ensure traceability, compliance, and performance.
- Mentored junior data engineers and cross-functional teams, contributing to multiple elite recognitions as a Databricks service partner.