
Chaitra Tadaga Prabhukumara

Thousand Oaks, CA

Summary

Accomplished AI/ML Engineer with 6+ years of experience designing and deploying end-to-end Generative AI and Machine Learning solutions across pharmaceutical, fintech, and enterprise domains. Proven expertise building Custom GPTs with Actions, RAG systems, and LLM-powered analytics on Amazon Bedrock, SageMaker, Azure OpenAI, and Azure AI Foundry, alongside production ML models using XGBoost, Random Forest, LSTM, and BERT. Track record of measurable business impact: 15% churn reduction, 20% engagement lift, 85% model accuracy, and 14% query performance gains on migrated data platforms.

Overview

11 years of professional experience

Work History

Associate IS Business System Analyst

AMGEN
Thousand Oaks
10.2023 - Current
  • Designed and deployed a Custom GPT with Actions integrated with the EDF enterprise database, enabling 50+ non-technical business users to query structured data in natural language and reducing average analyst turnaround time on ad-hoc data pulls by approximately 60%.
  • Engineered GPT Action schemas (OpenAPI specifications) exposing 15+ curated database operations to the LLM, enforcing role-based access controls, row-level security, and audit logging for pharmaceutical compliance.
  • Integrated the Custom GPT with internal analytics and reporting tools, automating dashboard generation and formatted output reports and eliminating ~20 hours/week of manual reporting effort across DQMS teams.
  • Authored and iteratively refined prompt templates and system instructions to optimize response accuracy, consistency, and domain-specific tone across pharmaceutical data workflows.
  • Performed data mapping between Trackwise and Veeva systems across multiple integration tracks, ensuring 100% data integrity, lineage, and traceability during DQMS integration initiatives.
  • Gathered requirements, developed use cases, and performed system analysis to translate business problems into technical specifications.
  • Skills: Veeva QMS, OpenAI, Databricks, MySQL, AI Workbench

Data Engineer

VISA
Palo Alto
02.2022 - 07.2023
  • Designed and implemented a real-time data pipeline using PySpark on distributed Spark clusters, processing semi-structured data from 5+ source systems and producing analytics-ready datasets for downstream product and risk teams.
  • Led the migration from SQL to PostgreSQL using the Tusker migration tool, improving query performance by 14% and reducing average report generation latency across downstream analytics workloads.
  • Configured spark-submit with speculative execution and tuned shuffle, memory, executor, and partition parameters, stabilizing 100+ production Spark jobs handling high-volume payment transaction data.
  • Automated ingestion and transformation workflows using Python and Unix shell scripting, replacing manual processes and reducing operational toil by an estimated 30% for the data platform team.
  • Tuned ETL component performance and optimized Teradata queries to consistently meet SLAs on high-volume batch processing.
  • Built Tableau dashboards and pivot-table reports for business stakeholders, translating raw transaction data into actionable insights for product and risk teams.
  • Wrote scripts and processes for data integration, data validation, and bug fixes across multiple production environments; authored Hive queries on large analytical datasets.
  • Skills: Spark, Java, Scala, Python, JSON, SQL, Linux Shell Scripting, Parquet, Avro, AWS, Tableau

Data Scientist

Agile data Inc
Katy, TX
06.2021 - 12.2021
  • Built predictive churn models using random forest and gradient boosting on 1M+ customer records, achieving 85% model accuracy and driving a 15% reduction in customer churn rate across target segments.
  • Developed an LSTM and BERT deep-learning pipeline for topic classification on a large customer feedback dataset, benchmarked against classical models (SVC, Logistic Regression, Linear SVC, XGBoost, Logistic Regression CV) across accuracy, latency, and interpretability.
  • Contributed to a recommendation system that drove a 20% increase in customer engagement through personalized content delivery.
  • Conducted large-scale data cleaning, preprocessing, and feature engineering to ensure model input quality and maximize predictive power.
  • Performed EDA to identify trends, patterns, and actionable signals for marketing and product teams.
  • Created Tableau visualizations to present model outputs and insights to executive stakeholders, improving data-driven decision-making processes.
  • Skills: Python, Tableau, Scikit-learn, Linear SVC, XGBoost, Logistic Regression, LSTM, BERT, TensorFlow, PyTorch

Data Scientist

SM Search System LLP
Bangalore, India
10.2015 - 12.2019
  • Improved data quality and performance by 40% by analyzing, verifying, and modifying SAS and Python scripts.
  • Conducted statistical analysis on customer data to identify patterns and trends, leading to a 15% reduction in churn rate.
  • Analyzed large-scale data sets using Python and SQL to extract actionable insights and identify trends.
  • Used AWS EMR to transform and move large volumes of data across AWS data stores including Amazon S3 and DynamoDB, and designed partitioning and bucketing schemas enabling faster analytical query retrieval.
  • Developed predictive models using machine learning algorithms to optimize marketing campaigns, resulting in a 20% increase in customer acquisition.
  • Developed and maintained dashboards and reports using Tableau for real-time monitoring of key performance indicators.
  • Analyzed high-volume log data using Splunk SPL to extract insights and support operational decision-making.
  • Performed end-to-end architecture and implementation assessments across AWS services including EMR, EC2, Redshift, RDS, Lambda, and S3; configured auto-scaling through the AWS CLI and set up CloudWatch monitoring, alerting, and operational dashboards.
  • Built Kibana dashboards on Logstash data and integrated source and target systems into Elasticsearch for near real-time log analysis and end-to-end transaction monitoring.
  • Continuously monitored and managed Hadoop clusters through Cloudera Manager; worked with multiple file formats (Parquet, Avro, DAT, JSON) and compression codecs (Gzip).
  • Implemented UNIX scripts to define use case workflows, process data files, and automate recurring jobs; delivered agreed user stories on time every sprint as part of a Scrum team.
  • Skills: Python, Spark RDD, Spark SQL, AWS S3, EMR, Redshift, RDS, Lambda, Hadoop, Cloudera, Elasticsearch, Kibana, Logstash, UNIX

Education

Doctor of Business Administration

Westcliff University
01.2028

Master of Science - Data Science

University of New Haven
01.2021

Bachelor of Engineering - Computer Science & Engineering

Visvesvaraya Technological University (VTU)
01.2015

Skills

  • Generative AI & LLMs
  • Custom GPTs
  • GPT Actions
  • OpenAPI Schemas
  • Prompt Engineering
  • RAG
  • LangChain
  • Vector Embeddings
  • Foundation Models
  • Claude
  • Titan
  • GPT-4
  • Amazon Bedrock
  • Bedrock Knowledge Bases
  • Bedrock Guardrails
  • Bedrock Agents
  • Machine Learning
  • XGBoost
  • Gradient Boosting
  • Random Forest
  • Logistic Regression
  • SVC
  • Linear SVC
  • Classification
  • Regression
  • Clustering
  • Feature Engineering
  • Hyperparameter Tuning
  • Cross-Validation
  • Deep Learning & NLP
  • LSTM
  • BERT
  • Transformers
  • Neural Networks
  • Transfer Learning
  • Text Classification
  • Topic Modeling
  • Sentiment Analysis
  • Tokenization
  • AWS AI/ML
  • SageMaker
  • SageMaker Studio
  • SageMaker Pipelines
  • Textract
  • Comprehend
  • Polly
  • Transcribe
  • Rekognition
  • Lambda
  • S3
  • EMR
  • Redshift
  • RDS
  • DynamoDB
  • EC2
  • CloudWatch
  • IAM
  • Data Engineering
  • Apache Spark
  • PySpark
  • Spark SQL
  • Spark Streaming
  • Kafka
  • Kubernetes
  • Hadoop
  • Hive
  • Cloudera
  • Parquet
  • Avro
  • JSON
  • ETL
  • Real-Time Pipelines
  • Snowflake
  • Teradata
  • MLOps & DevOps
  • Model Deployment
  • Model Versioning
  • CI/CD
  • Git
  • SVN
  • Docker
  • Workflow Automation
  • Unix Shell Scripting
  • Agile
  • Scrum
  • Programming
  • Python
  • SQL
  • R
  • Java
  • Scala
  • HTML
  • CSS
  • ML Libraries
  • PyTorch
  • TensorFlow
  • Keras
  • Scikit-learn
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Databases
  • PostgreSQL
  • MySQL
  • SQL Server
  • MongoDB
  • Elasticsearch
  • Analytics & BI
  • EDA
  • Hypothesis Testing
  • A/B Testing
  • Time Series Analysis
  • Tableau
  • Kibana
  • Power BI
  • Logstash

Timeline

Associate IS Business System Analyst

AMGEN
10.2023 - Current

Data Engineer

VISA
02.2022 - 07.2023

Data Scientist

Agile data Inc
06.2021 - 12.2021

Data Scientist

SM Search System LLP
10.2015 - 12.2019

Doctor of Business Administration

Westcliff University

Master of Science - Data Science

University of New Haven

Bachelor of Engineering - Computer Science & Engineering

Visvesvaraya Technological University (VTU)