Experienced Data Engineer with 3+ years of expertise in architecting and developing robust data pipelines and scalable solutions. Skilled in using a broad stack of programming languages, frameworks, and cloud services to optimize data processing, storage, and analytics in healthcare and enterprise environments. Reduced processing times by 10% by implementing optimized ETL processes and real-time data streaming solutions. Worked with cross-functional teams, including data scientists, analysts, and stakeholders, to deliver actionable insights and drive business outcomes.
Overview
5 years of professional experience
Work History
Data Engineer
Molina Healthcare, United States
01.2023 - Current
Architected and built high-throughput data pipelines with Apache Spark for healthcare data, reducing processing time and facilitating effective population health management and cost reduction analysis
Developed scalable data warehouses on AWS Redshift for secure, accessible storage, resulting in a 15% increase in data accessibility for analysts and actuaries to generate reports and conduct risk assessments
Collaborated with healthcare data scientists and analysts to design data solutions improving member risk profiling, care coordination, and fraud detection accuracy
Implemented automated data quality checks and cleansing routines to improve data accuracy, ensuring reliable insights for decision-making
Used Git for version control, reducing development cycles by 20% and enabling seamless collaboration on data engineering projects
Optimized data storage costs and scalability with AWS Redshift, lowering storage expenditures while accommodating growing data volumes
Implemented real-time data streaming solutions using Apache Kafka for healthcare systems, ensuring near-instantaneous data updates and improving patient care responsiveness
Conducted performance tuning and optimization of ETL processes, resulting in a 30% reduction in processing time and enhanced overall efficiency of data pipelines
Data Engineer
iSparrow, India
08.2019 - 07.2021
Designed and implemented ETL pipelines, reducing data processing time by 30% and improving data accuracy through data quality checks
Collaborated with stakeholders to define key performance indicators (KPIs) and developed data models that enhanced reporting efficiency, resulting in faster report generation
Optimized database performance, achieving a 25% reduction in query response times and maintaining uptime for critical systems
Conducted regular data quality assessments, achieving a data accuracy improvement of 15% by identifying and resolving inconsistencies in datasets
Facilitated the development of data governance policies, ensuring compliance with regulatory standards and adherence to data privacy guidelines
Led a team in migrating legacy ETL processes to cloud-based solutions, resulting in a 50% reduction in infrastructure costs and increased scalability for future data growth