Project: Stericycle UK | 02/2023 - Present | Sr Cloud Data Engineer | Team size: 7
Environment: Azure Cloud, Spark, SQL, Synapse Analytics
Stericycle is a compliance company that specializes in collecting and disposing of regulated waste, such as medical waste and sharps, pharmaceuticals and hazardous waste, and in providing services for recalled and expired goods. It also provides related education, training and patient communication services.
Responsibilities:
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in Azure.
- Designed and architected the layers (Raw, Cleansed and Curated) of the Data Lake using Azure ADLS Gen2.
- Provided Azure technical expertise, including strategic design and architectural mentorship, assessments and POCs, in support of the overall product development lifecycle.
- Architected solutions using Azure data technologies.
- Analysed client data (AX, PROACTICS) using Python, Spark and Spark SQL, and presented the end-to-end data lake design to the team.
- Wrote PySpark programs for Spark transformations in Synapse Notebooks.
- Copied multiple tables in bulk using the Lookup and ForEach activities in Azure Data Factory.

Project: SI (simplified invoices) | Client: Maersk | 03/2021 - 02/2023 | Sr Cloud Data Engineer | Team size: 20
Environment: Azure Cloud, Databricks, Spark
Maersk is a Danish integrated shipping company, active in ocean and inland freight transportation and associated services such as supply chain management and port operation. Maersk has been the largest container shipping line and vessel operator in the world.
Responsibilities:
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in Azure.
- Designed and architected the layers of the Data Lake using Azure ADLS Gen2.
- Applied data quality checks on Delta Live Tables, such as dropping invalid records, retaining invalid records and validating row counts across tables.
- Provided Azure technical expertise, including strategic design and architectural mentorship, assessments and POCs, in support of the overall invoice lifecycle for the product development.
- Architected solutions using Azure data technologies.
- Analysed client data using Python, Spark and Spark SQL, and presented the end-to-end data lake design to the team.
- Wrote PySpark programs for Spark transformations in Databricks.
- Copied multiple tables in bulk using the Lookup and ForEach activities in Data Factory.
- Performed near-real-time data analysis.

Project: CRM RDP | Client: Shell | 03/2020 - 02/2021 | Cloud Data Engineer | Team size: 5
Environment: Azure Cloud, Databricks, Spark
Shell is an Anglo-Dutch multinational oil and gas company headquartered in the Netherlands and incorporated in England. It is one of the oil and gas 'supermajors' and the third-largest company in the world measured by 2018 revenues.
Responsibilities:
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in Azure.
- Designed and architected the layers of the Data Lake using Azure Data Lake Storage Gen2.
- Analysed client data using Python, Spark and Spark SQL, and presented the end-to-end data lake design to the team.
- Applied data quality checks on Delta Live Tables, such as dropping invalid records, retaining invalid records and validating row counts across tables (illustrated in the sketch below).
- Wrote PySpark programs for Spark transformations in Databricks.
- Copied multiple tables in bulk using the Lookup and ForEach activities in Azure Data Factory.
- Performed near-real-time data analysis using PowerShell.
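The Maersk and Shell projects above both describe drop/retain/row-count quality checks on Delta Live Tables. A minimal sketch of how such checks look in the DLT Python expectations API, assuming hypothetical table and column names (invoices_raw, invoice_id, amount); this only runs inside a Databricks DLT pipeline:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Cleansed invoices with basic quality rules applied.")
@dlt.expect_or_drop("valid_invoice_id", "invoice_id IS NOT NULL")  # drop invalid records
@dlt.expect("non_negative_amount", "amount >= 0")                  # retain invalid records, but track violations
def invoices_cleansed():
    return dlt.read("invoices_raw")  # assumed upstream table

@dlt.table(comment="Row-count reconciliation between raw and cleansed tables.")
@dlt.expect_or_fail("row_counts_match", "source_count = target_count")
def invoices_count_check():
    # Compare total row counts across the two tables in a single-row result.
    source = dlt.read("invoices_raw").agg(F.count("*").alias("source_count"))
    target = dlt.read("invoices_cleansed").agg(F.count("*").alias("target_count"))
    return source.crossJoin(target)
```

The three expectation decorators map directly to the three behaviours listed in the bullets: expect_or_drop removes failing rows, expect keeps them while recording the violation metrics, and expect_or_fail stops the update when the reconciliation fails.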
Project: Ecom Big Data | Client: US Foods | 08/2019 - 03/2020 | Cloud Data Engineer and Machine Learning with Spark | Team size: 5
Environment: Azure Cloud, Databricks, Spark
US Foods offers more than 350,000 national-brand products and its own 'exclusive brand' items, ranging from fresh meats and produce to pre-packaged and frozen foods. It provides food and related products to more than 250,000 customers, including independent and multi-unit restaurants, healthcare and hospitality entities, and government and educational institutions.
Responsibilities:
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in Azure.
- Designed and architected the layers of the Data Lake using Azure Data Lake Storage Gen2.
- Analysed client data using Python, Spark and Spark SQL, and presented the end-to-end data lake design to the team.
- Loaded data into Synapse Analytics using HDInsight, Databricks, Spark, Scala and PySpark.
- Wrote PySpark programs for Spark transformations in Databricks.
- Performed data quality checks, such as null-value checks on data frames at the cleaned layer.
- Migrated data to the Azure Synapse data warehouse using Azure Data Factory.
- Copied multiple tables in bulk using the Lookup and ForEach activities in Azure Data Factory.
- Performed near-real-time log analysis.
- Built product recommendations using Spark ML.

Project: GFR Finance Implementation | Client: Tesco | 01/2018 - 08/2019 | Data Engineer and Spark Developer | Team size: 6
Environment: Azure Cloud, SQL, Spark, Databricks
Tesco is a British multinational groceries and general merchandise retailer with headquarters in Welwyn Garden City, Hertfordshire, England, United Kingdom. It has shops in seven countries across Asia and Europe, and is the market leader for groceries in the UK (where it has a market share of around 28.4%), Ireland, Hungary and Thailand.
Responsibilities:
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in Azure.
- Designed and architected the layers of the Data Lake using Azure Data Lake Storage Gen2.
- Analysed client data using Python, Spark and Spark SQL, and presented the end-to-end data lake design to the team.
- Performed data quality checks, such as null-value checks on data frames at the cleaned layer (illustrated in the sketch below).
- Loaded data into Synapse Analytics using HDInsight, Databricks, Spark, Scala and PySpark.
- Wrote PySpark programs for Spark transformations in Databricks.
- Migrated data to the Azure Synapse data warehouse using Azure Data Factory.
- Copied multiple tables in bulk using the Lookup and ForEach activities in Azure Data Factory.
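The US Foods and Tesco projects both mention null-value checks on DataFrames at the cleaned layer. A minimal PySpark sketch of one common way to do this; the storage path, column names and failure policy are assumptions for illustration:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cleaned-layer-quality-checks").getOrCreate()

# Read the cleaned-layer dataset (path is a placeholder for illustration).
df = spark.read.parquet("abfss://cleaned@<storage-account>.dfs.core.windows.net/orders/")

# Count the nulls in every column in a single pass over the data.
null_counts = df.select(
    [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in df.columns]
).first().asDict()

# Fail the run if any mandatory column contains nulls (assumed key columns).
mandatory = ["order_id", "customer_id"]
bad = {c: n for c, n in null_counts.items() if c in mandatory and n > 0}
if bad:
    raise ValueError(f"Null values found in mandatory columns: {bad}")
```

Collecting all per-column null counts in one select keeps the check to a single scan of the cleaned data rather than one job per column.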
Project: Retail RMS, RPM | Client: FedEx | 01/2017 - 12/2017 | Cloud Data Engineer | Team size: 6
Environment: Azure Cloud Platform, Spark, Kafka
FedEx Corporation is an American multinational delivery services company. The company is known for its overnight shipping service and for pioneering a system that could track packages and provide real-time updates on package location, a feature that has since been adopted by most other carrier services. FedEx is also one of the top contractors of the US government.
Responsibilities:
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in Azure.
- Designed and architected the layers of the Data Lake using Azure Data Lake Storage Gen2.
- Analysed client data using Python, Spark and Spark SQL, and presented the end-to-end data lake design to the team.
- Performed data quality checks, such as null-value checks on data frames.
- Loaded data into Synapse Analytics using HDInsight, Databricks, Spark, Scala and PySpark.
- Wrote PySpark programs for Spark transformations in Databricks.
- Migrated data to the Azure Synapse data warehouse using Azure Data Factory.
- Copied multiple tables in bulk using the Lookup and ForEach activities in Azure Data Factory.
- Implemented the Kafka Connect API (FileStream source connector, Elasticsearch sink connector).
- Performed near-real-time log analysis.
- Built product recommendations using Spark ML.

Project: Home Serve USA | Client: Home Serve USA | 06/2015 - 12/2016 | Spark Developer | Team size: 5
Home Serve USA Corp (Home Serve) is a leading provider of home repair solutions serving over 3 million customers across the US and Canada under the Home Serve, Home Emergency Insurance Solutions, Service Line Warranties of America (SLWA) and Service Line Warranties of Canada (SLWC) names. Home Serve protects homeowners against the expense and inconvenience of water, sewer, electrical, HVAC and other home repair emergencies by providing affordable repair service plans and quality local service through employed technicians and a network of independent contractors. Home Serve is dedicated to being a customer-focused company supplying repair plans and other services to consumers directly and through over 450 leading municipal, utility and association partners. On this project, the data is stored in the Hadoop file system and processed using Pig, Hive and Spark; ingestion and acquisition of data are done through Sqoop.
Responsibilities:
- Read files from HDFS using Spark, converted them into DataFrames and performed operations on them per the business requirements.
- Stored the processed DataFrames in Hive tables (see the sketch below).
- Imported and exported data into and out of HDFS using Sqoop.
- Created external and internal (managed) tables in Hive and loaded the data into them.
- Partitioned the Hive tables to improve the performance of Hive queries.
- Improved the performance of Sqoop imports and exports using different methodologies.
- Exported the analysed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
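A minimal sketch, under assumed paths, schema and table names, of the Home Serve flow described above: reading raw files from HDFS into a DataFrame, applying a transformation, and storing the result as a partitioned Hive table:

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("homeserve-claims-etl")
    .enableHiveSupport()  # lets Spark read from and write to the Hive metastore
    .getOrCreate()
)

# Read the raw files from HDFS into a DataFrame (path is illustrative).
claims = spark.read.option("header", "true").csv("hdfs:///data/raw/claims/")

# Apply the business transformations (illustrative example: derive a year column).
claims = claims.withColumn("claim_year", F.year(F.to_date("claim_date")))

# Store the processed DataFrame as a partitioned Hive table, so queries that
# filter on claim_year prune partitions instead of scanning the whole table.
(claims.write
    .mode("overwrite")
    .partitionBy("claim_year")
    .format("parquet")
    .saveAsTable("analytics.claims"))
```

Partitioning by a column that appears in most query predicates (here, the assumed claim_year) is what delivers the Hive query speed-up mentioned in the bullets.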