Experienced Data Engineer with 6 years of experience developing, optimizing, and automating complex ETL/ELT pipelines on AWS, Azure, Spark, Hadoop, and Snowflake. Skilled in Python and Scala for advanced data transformation and analysis. AWS Certified Solutions Architect – Associate with hands-on experience in S3, DynamoDB, Glue, EMR, ECS, IAM, EC2, and Lambda. Proficient in data warehousing with Snowflake and Redshift, and in relational (MySQL, PostgreSQL) and NoSQL (MongoDB, DynamoDB, HBase) databases. Expert in ETL tools such as Talend and Informatica, as well as real-time data streaming and processing with Apache Kafka and Spark Streaming. Experienced in deploying containerized applications with Docker and Kubernetes and managing infrastructure as code with Terraform. Strong background in project management and SDLC methodologies, using JIRA, Git, and Jenkins for CI/CD within Agile practices and innovation strategies.
Environment: Apache Spark, Spark SQL, Databricks, Scala, MapReduce, Azure, Tableau, Power BI, Python, Apache Airflow, Apache Kafka, Docker, Hive, Git, Jira, SQL, MongoDB, Agile.
Environment: Scala, Apache Spark, Python, S3, Hive, PySpark, Spark SQL, RDD, MapReduce, HDFS, Azure Data Factory, Azure Data Lake, Azure Functions, Hadoop, Kafka, Apache Airflow.
Environment: Python, Amazon Elastic Kubernetes Service, Informatica, ETL, Power BI, Tableau, AWS, Snowflake, RESTful, Docker, AWS Glue Data Catalog, MongoDB, SQL, AWS ECS.
Environment: AWS Kinesis, AWS S3, AWS EMR, AWS Lambda, Redshift, DynamoDB, Spark, Windows, Hive, Hadoop, Pig, Oracle, Microsoft SQL Server, MySQL, Tableau, Power BI, Terraform, Agile.