3+ years of experience as a Data Engineer in the fields of Database Development, Data Warehousing and Big Data Technologies.
Engaged in the development of an enterprise-level solution using batch processing with Apache Hive and Apache Spark.
Understanding of Big Data and algorithms using Hadoop, HDFS, MapReduce, HiveQL and Apache Spark (PySpark).
Strong knowledge of Data Modeling (Facts and Dimensions, Star/Snowflake schemas), Data Migration, Data Cleansing, Data Transformation, ETL Processes and Strategic Data Architecture Designs.
Extensive experience in Amazon Web Services (AWS) Cloud services such as EC2, S3 and Redshift.
Utilized Databricks on Apache Spark to design and optimize data processing workflows.
Analyzed various reports and dashboards using Tableau and Power BI visualizations.
Developed and implemented ETL processes to ensure efficient data extraction, transformation, and loading.
Implemented CI/CD pipelines on AWS using Terraform, Jenkins and Shell Scripting for automated builds.
Proficient with version control systems such as GitHub.