Yashwanth Reddy G

Hyderabad, TG

Summary

  • AI Engineer & Data Scientist with 4.5+ years of experience developing end-to-end Machine Learning, NLP, and Generative AI solutions across Banking, Retail, Healthcare, and E-Commerce. Experienced in building and fine-tuning LLMs and transformer models (BERT, RoBERTa, GPT-3.5/4, Mistral, LLaMA, T5), as well as developing production-grade RAG architectures using vector stores like FAISS, Pinecone, ChromaDB, and Weaviate.
  • Strong practical expertise in Python, TensorFlow, PyTorch, Scikit-learn, FastAPI, Flask, and data engineering frameworks such as Spark, Hadoop, Hive, Delta Lake, and Databricks. Skilled in MLOps with MLflow, Airflow, Docker, Kubernetes, CI/CD pipelines, Git, and model deployment on AWS (SageMaker, Lambda), Azure (Data Factory, Synapse, Databricks), and GCP (Vertex AI, BigQuery).
  • Experienced in SQL & NoSQL databases (PostgreSQL, MySQL, MongoDB, Cassandra), cloud data pipelines, and distributed model training. Knowledgeable in prompt engineering, evaluation frameworks, LLM optimization, and vector embeddings.
  • A collaborative communicator known for mentoring junior engineers, partnering with cross-functional teams, and delivering high-impact, scalable AI systems that drive measurable business outcomes. Highly motivated employee with a strong work ethic, adaptability, and exceptional interpersonal skills. Adept at working effectively unsupervised and quickly mastering new skills.

Overview

5 years of professional experience

Work History

Gen AI Engineer & Data Scientist

Wayfair
Boston, MA
01.2024 - Current
  • Developed end-to-end machine learning models for fraud detection, anomaly detection, and customer analytics using Python, SQL, XGBoost, LightGBM, and Random Forest, improving prediction accuracy and business decision quality.
  • Built NLP pipelines using BERT, RoBERTa, and Transformer-based architectures for text classification, sentiment analysis, and summarization, reducing manual analysis effort by 40%.
  • Designed and maintained distributed data pipelines using Spark, PySpark, Databricks, and Snowflake for large-scale data processing, feature engineering, and automated ETL/ELT workflows.
  • Created deep learning solutions with TensorFlow and PyTorch (including LSTMs, CNNs, and Autoencoders) for time series modeling, IoT signal analysis, and predictive maintenance.
  • Developed and deployed LLM-powered components such as text summarizers and semantic search using Hugging Face models, FAISS/Pinecone vector indexing, and retrieval-based techniques.
  • Implemented production ML pipelines using MLflow, Airflow, and Docker, enabling reproducible experiments, model tracking, and automated deployment workflows.
  • Built REST APIs using FastAPI and Flask to serve ML models in real time, ensuring low-latency performance under high data loads.
  • Conducted large-scale data engineering activities using Hadoop, Hive, Kafka, and Snowflake to support analytics and modeling use cases across business teams.
  • Created dashboards and visual reports using Tableau, Power BI, and Matplotlib, helping stakeholders understand model outputs, KPIs, and operational metrics.
  • Collaborated with cross-functional teams and mentored interns on ML fundamentals, data preprocessing, model evaluation, and best practices for production deployments.

Data Scientist and Machine Learning Engineer

Dell Technologies
Hyderabad, TG
02.2021 - 07.2023
  • Built end-to-end fraud detection, anomaly detection, and recommendation models using XGBoost, LightGBM, and Autoencoders, improving overall prediction accuracy by 22% across enterprise datasets.
  • Designed Transformer-based NLP pipelines using BERT, RoBERTa, and GPT-based embeddings for text classification, sentiment analysis, and summarization, reducing manual review workload by 40%.
  • Engineered distributed ML and data pipelines using Apache Spark, PySpark, Databricks, and Snowflake, enabling large-scale data processing and analytics across business-critical workloads.
  • Developed deep learning workflows with TensorFlow and Keras to integrate AI-driven features into Dell XPS 13 applications, achieving 200ms inference latency and enhancing user experience for Dell Mobile Connect.
  • Implemented edge ML solutions using Spark and Flask for near real-time IoT analytics, processing up to 1M events/sec and achieving an F1 score of 0.90 for factory efficiency monitoring.
  • Deployed scalable ML models on AWS SageMaker, Azure ML Studio, and serverless AWS Lambda, ensuring sub-50ms inference latency under high traffic.
  • Automated ETL and ML workflows using Airflow and Azure Data Factory, reducing data preparation and model refresh cycles by 25%.
  • Built interactive dashboards using Tableau and Matplotlib, and tracked 150+ ML experiments through TensorBoard to improve reproducibility and experimentation.
  • Worked with multi-cloud environments (AWS, Azure, GCP) to support model deployment, experimentation, and data pipelines for cross-team analytics.
  • Mentored interns on machine learning fundamentals, feature engineering, model evaluation, and best practices for production-ready ML systems.

Education

Master's - Science and Technology Management

Lindsey Wilson College
Columbia, KY
01-2023

Bachelor's - Computer Science

St Martins Engineering College
Hyderabad
01-2021

Skills

Programming & Scripting

Python, SQL, PySpark, Bash

Machine Learning

Regression, Classification, XGBoost, LightGBM, Random Forest, Clustering, PCA, Time Series Modeling

Deep Learning

TensorFlow, PyTorch, CNNs, RNNs, LSTM, Autoencoders, Transformer Architectures

Large Language Models (LLMs) & Generative AI

BERT, RoBERTa, GPT-3.5/4, Mistral, LLaMA 2, Hugging Face, LoRA, PEFT, Prompt Engineering, RAG Pipelines

Natural Language Processing

Tokenization, NER, Summarization, Topic Modeling, Sentiment Analysis, Conversational AI

Vector Databases & LLM Tooling

FAISS, Pinecone, PGVector, LangChain, LlamaIndex, OpenAI API

Cloud Platforms

AWS (S3, EC2, SageMaker), Azure (Databricks, ML Studio, Data Factory), GCP (Vertex AI, BigQuery)

MLOps & DevOps

MLflow, Airflow, Docker, Kubernetes, GitHub Actions, GitLab CI/CD

Big Data & Data Engineering

Spark, Databricks, Hadoop, Hive, Kafka, Snowflake, ETL/ELT Pipelines

Databases

PostgreSQL, MySQL, SQL Server, MongoDB, Vector DBs (Pinecone, PGVector)

Visualization & Application Development

Tableau, Power BI, Matplotlib, Seaborn, Streamlit, Flask

Timeline

Gen AI Engineer & Data Scientist

Wayfair
01.2024 - Current

Data Scientist and Machine Learning Engineer

Dell Technologies
02.2021 - 07.2023

Master's - Science and Technology Management

Lindsey Wilson College

Bachelor's - Computer Science

St Martins Engineering College