Location: Toronto, ON

Seasoned Data Engineer with over five years of diverse IT experience, including the development and implementation of applications across big data and mainframe systems. Results-focused, resourceful, and an effective problem solver with a record of delivering on tight release schedules.

PROFILE SUMMARY:
- 5+ years of comprehensive experience as a Data Engineer and Hadoop/Big Data & Analytics developer.
- Proficient in Hadoop architecture and its ecosystem components, including HDFS, MapReduce, Pig, Hive, Sqoop, and Flume.
- Thorough understanding of Hadoop daemons (JobTracker, TaskTracker, NameNode, DataNode) as well as MRv1 and YARN architecture.
- Experienced in installing, configuring, managing, supporting, and monitoring Hadoop clusters across distributions including Apache Hadoop, Cloudera, and Hortonworks, and on cloud platforms such as AWS and GCP.
- Executed a one-time migration of multi-state data from SQL Server to Snowflake using Python and SnowSQL.
- Created Docker images to run Airflow in a local environment for testing ingestion and ETL pipelines.
- Proficient in installing and configuring components of the Hadoop stack, including MapReduce, HDFS, Hive, Pig, Sqoop, Flume, and ZooKeeper.
- Assessed the impact of changes on existing ETL/ELT processes to ensure timely completion and availability of data in the data warehouse for reporting.
- Experienced in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming data to uncover insights into customer usage patterns.
- Extensive experience writing and implementing complex test plans and designing, developing, and executing test scripts for system, data integration, user acceptance (UAT), and regression testing.
- Strong foundation in JCL (Job Control Language); developed, maintained, and optimized numerous applications, ensuring reliable and efficient performance across a range of IT environments.
- Deep expertise in JCL, COBOL, CICS, and DB2, enabling delivery of complete, integrated solutions.
- Worked with source version control tools such as Subversion (SVN), TFS, and Git.
- Designed and executed ATDD/BDD features using Selenium and Cucumber; skilled in developing automation scripts in BDD format with Cucumber and in writing scenarios in Gherkin format.
- Ample knowledge of Apache Kafka and Apache Storm for building data platforms, pipelines, and storage systems, and of search technologies such as Elasticsearch.
- Good knowledge of data marts, OLAP, and dimensional data modeling with the Ralph Kimball methodology (star schema and snowflake modeling for fact and dimension tables) using Analysis Services.
- Expert in writing custom Kafka consumer code and modifying existing producer code in Python to push data to Spark Streaming jobs.
- Skilled in systems analysis, E-R/dimensional data modeling, database design, and implementing RDBMS-specific features.
- Demonstrated resilience by adapting to shifting project needs, resolving technical difficulties, and sustaining high performance under pressure, contributing to project success and continuity.
- Extensive experience in data validation: designed and implemented dependable processes to ensure data reliability, precision, and integrity across a variety of projects, with development, execution, and reporting experience in all phases of validation.