Designed and implemented scalable ETL processes for healthcare data integration across two major projects at Cognizant and Healthfirst. In the Data Integration for Healthcare Industry project, I used Informatica PowerCenter, SQL, and Oracle to integrate, transform, and clean large-scale healthcare data, producing high-quality datasets for downstream analytics and strategic decision-making. In the EVIC project at Healthfirst, I led the migration of SAS-based processes to AWS, using AWS Glue and PySpark to optimize ETL workflows, improve data quality, and streamline deployments through CI/CD pipelines integrated with GitHub. Across both projects, I preprocessed data, maintained data integrity, and used Power BI and Pandas to generate actionable insights that supported data-driven strategy in the healthcare sector.
· Strong proficiency in PySpark, SQL, and data processing
· Developed a local AWS Glue setup, resulting in significant cost savings for the client
· Experience with AWS Cloud Services, including Glue and S3
· Skilled in ETL development and data quality assurance
· Excellent problem-solving and communication skills
· Demonstrated ability to work in cross-functional teams
· Statistical tests and probability theory
· Feature engineering
· Deploying models on AWS