Vasanth Krishna

Summary

Senior Data Engineer with 8+ years of experience across the full SDLC, specializing in Big Data, Cloud, and GenAI solutions. Expert in building scalable data pipelines using Python, PySpark, Airflow, and Spark on AWS, Azure, and Hadoop ecosystems. Proven success in deploying multimodal GenAI apps (e.g., CLIP, BLIP, LLMs) using FastAPI, Docker, and CI/CD. Skilled in synthetic data generation, model evaluation (BLEU, ROUGE, SHAP), and responsible AI deployment. Hands-on experience with SQL across Redshift, Oracle, PostgreSQL, and Salesforce for high-performance ETL. Proficient in MLOps (SageMaker, Azure ML, MLflow), NoSQL (MongoDB, Cassandra), and Terraform for infrastructure automation. Strong background in data integration, quality, governance, and automation across cloud-native platforms.

Overview

8 years of professional experience

Work History

Gen AI Engineer

Honeywell International
Charlotte
01.2024 - Current
  • Designed and fine-tuned enterprise-grade LLMs (GPT-2/3, LLaMA, Falcon) using PEFT, LoRA, and Hugging Face to support RAG, Q&A, and summarization use cases.
  • Built and deployed intelligent GenAI assistants via FastAPI, LangChain, and vector DBs (FAISS, Pinecone) integrated with OpenAI and Anthropic APIs.
  • Engineered multimodal AI systems (CLIP, BLIP), synthetic data pipelines (GANs, VAEs), and RLHF tuning for robust and aligned model outputs.
  • Developed safety layers with zero-shot moderation and delivered stakeholder-facing apps using Streamlit and Gradio.
  • Led reproducible GenAI experiments using DVC and W&B, and drove cross-functional adoption into digital products and copilots.

Gen AI Engineer

Amazon
Seattle
09.2022 - 12.2023
  • Built secure, chat-based GenAI assistants with OpenAI/GPT, integrating memory, tool use, and document retrieval for enterprise Q&A.
  • Led private LLM deployments with API gateways and RAG pipelines (LangChain, Haystack, LlamaIndex) for grounded, compliant outputs.
  • Developed function-calling agents and fine-tuned models via LoRA and prompt chaining for automation, summarization, and reasoning.
  • Integrated CLIP/BLIP for multimodal intelligence and engineered safety layers to ensure responsible LLM usage.
  • Enabled GenAI features in SaaS tools (legal, HR, compliance) and built internal evaluation frameworks for alignment and explainability.

Data Engineer

Goldman Sachs Group
New York
03.2019 - 08.2021
  • Developed real-time ingestion pipelines using Kinesis and Kafka to process patient vitals and telehealth interactions.
  • Built ML models (PyTorch, Scikit-learn) on SageMaker to predict medication adherence and improve patient outcomes.
  • Optimized batch/streaming workflows on EMR with PySpark, Hive, and Spark SQL for large-scale healthcare data.
  • Automated CI/CD and infrastructure using CodePipeline, Jenkins, and Terraform; ensured compliance with secure encryption.
  • Delivered dashboards (QuickSight) and OLAP insights via Redshift Spectrum, collaborating with cross-functional Agile teams.

Data Engineer

IBM
Hyderabad
06.2017 - 11.2019
  • Led end-to-end SDLC for AWS-Hadoop applications, from requirements to deployment of scalable data pipelines.
  • Automated executive reporting via Python, Airflow, Tableau REST APIs, and AWS CodeBuild.
  • Built real-time and batch ETL workflows using PySpark, Hive, Kafka, NiFi, and EMR for structured/unstructured data.
  • Deployed ML models (K-Means, Spark MLlib) via SageMaker and Kubernetes; automated infra with Terraform.
  • Visualized insights across patient care and CRM using Tableau and Power BI, and integrated Salesforce with live external data.

Skills

  • Programming & Scripting:
    Python, SQL, Scala, Shell Scripting
  • Big Data & ETL:
    PySpark, Hive, Airflow, Spark Streaming, Kafka, Redshift, Snowflake, Databricks, EMR
  • Cloud & Infrastructure:
    AWS (EC2, RDS, S3, Lambda, Glue), Azure, Docker, Kubernetes, Terraform
  • Databases:
    PostgreSQL, MySQL, DynamoDB, MongoDB
  • GenAI & NLP:
    GPT-2 / GPT-3, LLaMA, Falcon, Hugging Face Transformers, LangChain, LlamaIndex, Sentence-BERT
  • Machine Learning & MLOps:
    PyTorch, Scikit-learn, MLflow, DVC, Weights & Biases
  • Dashboards & Visualization:
    Tableau, Power BI
  • Frameworks & APIs:
    FastAPI, Flask
