
Over 5 years of experience as a Data Engineer with expertise in Python, ETL, Informatica, Spark, the Hadoop ecosystem, AWS, and Snowflake. Extensive experience deploying cloud-based applications using Amazon Web Services, including EC2, S3, RDS, IAM, Auto Scaling, CloudWatch, SNS, Athena, Glue, Kinesis, Lambda, EMR, Redshift, and DynamoDB. Experienced in developing, implementing, and optimizing data pipelines and ETL processes. Worked on ETL migration by developing and deploying AWS Lambda functions to build a serverless data pipeline whose output is written to the Glue Catalog and can be queried from Athena. Several years of experience in Python application development. Wrote ETL scripts in Python using pandas, NumPy, and SQLAlchemy. Experience analyzing data using HiveQL, HBase, and custom MapReduce programs in Python. Good knowledge of join, grouping, and aggregation concepts; resolved performance issues in Hive and Spark. Experience with data modelling (dimensional and relational) concepts such as Star-Schema Modelling, Schema Modelling, and fact and dimension tables. Experience working with NoSQL data stores such as HBase and DynamoDB. Experience writing test cases, performing static code analysis, and running CI/CD processes using Git and Jenkins. Experience in Object-Oriented Analysis and Design (OOAD) and development.