Driving innovation at the intersection of AI/ML and Data Engineering to deliver scalable and impactful solutions.
Deployed and maintained scalable data pipelines and cloud-based AI/ML solutions on AWS, leveraging DevOps and MLOps best practices for efficient model training and deployment.
Led a team of 4 to build an efficient data preprocessing pipeline for online training and inference of a novel BERT-based channel affinity model, increasing successful student engagement by 22%.
Migrated and transformed over 18 legacy ETL and ELT pipelines from an on-prem Airflow server to AWS Glue, reducing monthly cost by 26%.
Built a robust CI/CD pipeline using AWS CodeBuild and CodePipeline to efficiently upload large files (over 1 GB) to data lakes.
Collaborated with stakeholders to create a cloud framework for fine-tuning LLMs using RLHF, ensuring the LLaMA-7B model's compliance with company messaging policies.
Data Engineer Intern
University of Phoenix
05.2023 - 08.2023
Ingested, processed, and transformed data from many sources into centralized data repositories
Scaled ETL operations for processing extensive student data, contributing to improved data manipulation efficiency
Integrated LLMs and a Transformers pipeline to explore potential learning aids for students
Conducted load tests on custom APIs handling data streaming, improving performance by 30%
Collaborated on the design of a graph database and initiated an ETL process using Apache Spark, successfully migrating 1 TB of data from an SQL database
Data Systems and Strategy Intern
University of Phoenix
06.2022 - 01.2023
Collaborated with the team in an agile environment to develop and deploy ETL jobs to stream data from disparate data sources to AWS Redshift (SQL Database)
Developed an API to publish custom data-ingestion metrics to CloudWatch for better analytical insight, contributing to an 18% reduction in data loss
Implemented and maintained Apache Spark ETL jobs for backfilling historical data from the Neptune graph database
Assisted in designing the ML model deployment architecture on AWS SageMaker and created an API endpoint for real-time inference, predicting student risk factors
Optimized model inference using Numba, reducing inference time by 46%
Education
Master of Science - Computer Science
Rochester Institute of Technology
Rochester, NY
Bachelor of Technology - Computer Science and Engineering