Results-driven Data Engineer with 7+ years of experience building scalable data pipelines, cloud architectures (GCP, AWS), and real-time processing solutions. Proficient in orchestration (Airflow), advanced analytics, and CI/CD automation. Experienced in applying GenAI (Vertex AI Gemini, OpenAI) to ETL modernization and SQL generation to enhance data accessibility, accuracy, and business impact.
LLM-Powered Automation and GenAI Projects:
Data Migration and Ingestion Frameworks:
Platform Automation, Machine Learning, and Orchestration:
Thesis: Evaluation of Machine Learning Models in Prediction of 5-Year Cancer Survivability.
NSCLC Report De-identification:
REDCap PHI Integration:
Investigation of the Utility of Features in a Clinical De-identification Model: A Demonstration Using EHR Pathology Reports for Advanced NSCLC Patients