
AWS Certified Cloud Practitioner with over 9 years of hands-on experience in data and quality engineering, specializing in on-premises and big data technologies. Expertise includes designing dimensional and fact models, developing data acquisition and data quality frameworks, and executing cloud data migrations across the commercial insurance, entertainment and media, and digital advertising sectors. Proven track record as the primary contact for analytics and business teams, driving data quality initiatives and supporting senior management in making informed, data-driven decisions. Committed to fostering a collaborative environment by strengthening co-workers' business acumen and understanding of data, and to following agile methodologies for effective product and program delivery.
● Designed, implemented, and maintained data pipelines for efficient data ingestion, transformation, and loading using PySpark on EC2 instances of configured Amazon EMR clusters.
● Developed an automated data validation framework to identify and resolve data quality issues during and after ingestion, before the data is delivered to business users.
● Created and read DynamoDB configuration parameters to drive ETL and automation script execution.
● Monitored AWS infrastructure and pipelines to ensure smooth and reliable data flow, promptly addressing any issues or bottlenecks.
● Validated ETL data pipelines to ensure accuracy, completeness, and adherence to business requirements and data standards.
● Created event-driven data pipelines using AWS Lambda.
● Created data models and data mappings for different stages of data within the data warehouse, including the creation of dimensional and fact tables using SQL and Python.
● Proficient in working with AWS Glue, Step Functions, EC2, and data pipelines for efficient data processing and management.
● Generated ad hoc reports using AWS Glue to meet the data needs of end users, providing them with actionable insights.
● Ensured data validation across all layers of the data ecosystem to maintain data integrity and consistency.
● Collaborated with teams outside the Commercial Data Platform (CDP) to support end-to-end testing, from application systems through Tableau system integration.
● Played a key role in the architectural design of the CDP data model, providing expertise and insights for optimal data management and performance.
● Prepared a comprehensive list of scenarios covering all layers of data to support thorough testing and validation processes.
● Conducted report validations in Tableau, ensuring the accuracy and integrity of the visualized data.
● Created and published live data sources and extracts in Tableau, using custom SQL and data tables to meet end users' reporting requirements.