Credit Lead, Zions Bancorporation, Midvale, UT
Data Engineer | 09/2023 - Present
Environment: DataStage, Hive, Linux, ServiceNow, ADF, Azure DevOps, Git, Control-M
Zions Bancorporation is a national bank headquartered in Salt Lake City, Utah. It operates as a national bank rather than as a bank holding company and does business under seven brands: Zions Bank, Amegy Bank of Texas, California Bank and Trust, National Bank of Arizona, Nevada State Bank, Vectra Bank Colorado, and The Commerce Bank of Washington.
- Extracted data from heterogeneous sources such as Oracle, Teradata, Greenplum, and PostgreSQL and loaded it into target databases.
- Created Hive tables, loaded data, and wrote Hive queries.
- Worked on branching, tagging, and release activities with version control tools such as Git and Azure Repos.
- Designed and deployed pipelines in Azure Data Factory and debugged pipeline runs for errors.
- Troubleshot DataStage jobs and addressed production issues such as performance tuning and enhancements.
- Passed parameters to Control-M jobs and created job dependencies and alerts per application-team and business requirements.
- Developed and implemented software release management strategies for various applications in an agile environment.

MIDAS Optimization, Staples Inc., Massachusetts, USA
Data Engineer | 07/2022 - 02/2023
Environment: DataStage, Databricks, Python, PySpark, GitHub, Jenkins, Hadoop, Azure DevOps
Staples Inc. is an American retail company. The project migrated on-premises data to the Azure cloud in an Agile methodology using Azure DevOps, recreating the existing application logic and functionality in ADF, ADLS, the database, and the data warehouse.
- Extensively used Azure services such as ADF and Logic Apps for ETL to move data between databases and Blob Storage, working with CSV, JSON, and Parquet formats.
- Provisioned Hadoop and Spark clusters to build an on-demand data warehouse and provided data to data scientists. Processed HDFS data and created external Hive tables to analyze daily visitors, page views, and most purchased products (an illustrative sketch follows this entry).
- Created activity dependencies in ADF, created stored procedures and scheduled them in the Azure environment, and wrote Python scripts to automate deployments. Used ADF as the orchestration tool to integrate data from upstream to downstream systems.
- Used Azure DevOps services such as Azure Repos, Azure Boards, and Azure Test Plans to plan work, collaborate on code development, and build and deploy applications. Used Jenkins to schedule jobs as required, with monitoring and notification functionality to report success or failure.
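A minimal, illustrative PySpark sketch of the pattern described in the Staples entry above: registering an external Hive table over HDFS data and aggregating daily visitors, page views, and top purchased products. All table, column, and path names are hypothetical and not taken from the project.

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical names and paths, for illustration only.
    spark = (SparkSession.builder
             .appName("clickstream-daily-metrics")
             .enableHiveSupport()
             .getOrCreate())

    # External Hive table over existing HDFS files (schema and location are assumptions).
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS web_clickstream (
            visitor_id STRING,
            page_url   STRING,
            product_id STRING,
            event_type STRING,
            event_ts   TIMESTAMP
        )
        STORED AS PARQUET
        LOCATION 'hdfs:///data/raw/clickstream/'
    """)

    clicks = spark.table("web_clickstream")

    # Visitors and page views per day.
    daily_metrics = (clicks
                     .groupBy(F.to_date("event_ts").alias("event_date"))
                     .agg(F.countDistinct("visitor_id").alias("visitors"),
                          F.count("*").alias("page_views")))

    # Most purchased products, assuming purchases are flagged by event_type.
    top_products = (clicks
                    .filter(F.col("event_type") == "purchase")
                    .groupBy("product_id")
                    .agg(F.count("*").alias("purchases"))
                    .orderBy(F.desc("purchases"))
                    .limit(10))

    daily_metrics.show()
    top_products.show()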
Involved in build & Deployment of function apps from Visual Studio., Created low-level design documents based High-level design documents and delivered clear, well-communicated, and completed design documents., Worked on configuring Git branching Strategy to support the software development cycle to include processes, tools, and automation efforts., Integrated Active directory authentication to every database request. Performed Unit Testing, system Integration testing and User acceptance testing., Used Python and PySpark SQL to convert Hive/SQL native queries into Spark DF transformations in Apache Spark. Created, provisioned different Data Bricks cluster needed for batch and continuous streaming data processing and installed the required libraries for clusters. LEAD Program, Caterpillar Inc., Deerfield, Illinois, U.S., Data Engineer, DataStage, ADF, Teradata, Snowflake, GIT, Python, PySpark, Azure DevOps, 06/2020 - 12/2021, Caterpillar is the world's largest construction equipment manufacturer. We used Azure DevOps boards in Agile Scrum environment. We have been involved as ETL team from migrating Teradata objects into Snowflake (SaaS) environment by using DataStage and ADF., Played key role in migrating Teradata objects into Snowflake environment. I did reverse engineering and understood existing DataStage jobs and created solution documents., Created low-level design document based on high-level document. Based on the mapping sheet created DataStage jobs., Created ADF pipelines based on solution documents and performed unit testing and extensive experience in diagnosing and resolving the UAT and Production issues., Building the pipelines to copy the data from source to destination in ADF and taking backups of pipeline codes and scheduling the pipelines. Created Logic Apps with different triggers, connectors for integrating the data from workday to different destinations., Involved in creating Hive tables, loading with data and writing Hive queries which will run internally in MapReduce way. I worked on creating dependencies of activities in ADF and creating stored procedures and scheduled them in Azure Environment. Phoenix HUB Clustering, HSBC, ETL Developer, DataStage, ADF, ADLS, Databricks, Oracle, Teradata, Linux, Control M, ServiceNow, JIRA, 05/2016 - 05/2020, HUB Clustering is one of the initiatives that has been launched under a program called Phoenix. The instance of Hub running across 37 countries will be merged into 7 Cluster, using the Multi country HUB (MCH) feature. My project is modernizing the existing system, migrating on-premises data into Azure cloud, and replacing DataStage ETL jobs with Azure data factory., Involved in Interacting with client to understand the existing system, requirement, and business logic., Worked on Teradata utilities like Fast Load, Multi Load, TPump and BTEQ scripts., Successfully implemented pipelines and partitioning techniques and ensured load balancing of data., Worked on SCDs to populate Type I and Type II slowly changing dimensions tables from several operational source files. Generated surrogate keys by using surrogate stage and Transformer stage., Created before and after routines and subroutines and Transform functions used across the project. Involved in creating table definitions, indexes, sequences, views, materialized views., Involved in ongoing production support and process improvements. 
Phoenix HUB Clustering, HSBC
ETL Developer | 05/2016 - 05/2020
Environment: DataStage, ADF, ADLS, Databricks, Oracle, Teradata, Linux, Control-M, ServiceNow, JIRA
HUB Clustering is one of the initiatives launched under a program called Phoenix. The HUB instances running across 37 countries are being merged into 7 clusters using the Multi Country HUB (MCH) feature. The project modernizes the existing system by migrating on-premises data into the Azure cloud and replacing DataStage ETL jobs with Azure Data Factory.
- Interacted with the client to understand the existing system, requirements, and business logic.
- Worked with Teradata utilities such as FastLoad, MultiLoad, TPump, and BTEQ scripts.
- Implemented pipeline and partitioning techniques and ensured balanced data loads.
- Populated Type I and Type II slowly changing dimension (SCD) tables from several operational source files and generated surrogate keys using the Surrogate Key and Transformer stages.
- Created before/after routines, subroutines, and transform functions used across the project. Created table definitions, indexes, sequences, views, and materialized views.
- Involved in ongoing production support and process improvements.
- Ran DataStage jobs through the third-party scheduler Control-M.
- Developed Spark applications using Spark SQL, PySpark, and Delta Lake in Databricks to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns (see the illustrative sketch at the end of this section).
- Implemented Azure Logic Apps, Azure Functions, Azure Storage, and Service Bus queues for large enterprise-level ERP integration systems.

Integra EDW Rebuild, Integra, Plainsboro Township, NJ
ETL Developer / Data Engineer | 06/2014 - 04/2016
Environment: DataStage, Oracle, SQL Server, Unix, Control-M, ServiceNow
Integra LifeSciences manages sales through an Oracle ERP system; all orders and invoices are generated through Oracle ERP. Whenever a new lead, order, or invoice is created, the data is pushed into the Oracle database. As the ETL team, we pulled data from the Oracle database and pushed it into the SQL Server database using DataStage.
- Extracted, transformed, and loaded data from heterogeneous source systems such as flat files and databases into different target systems.
- Used most DataStage stages, including Sequential File, Data Set, Filter, Change Capture, Copy, Remove Duplicates, Sort, Aggregator, Lookup, Join, Funnel, and Transformer. Extensively worked on job sequences to control job flow execution using various activities and triggers.
- Created local and shared containers to simplify job maintenance and reuse. Worked with sequence activities such as Job Activity, Wait For File, Email Notification, Sequencer, Exception Handler, and Execute Command. Used DataStage Director and Control-M to run and monitor DataStage jobs.
- Performed performance tuning of DataStage jobs by creating reusable transformations with shared containers.
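An illustrative PySpark and Delta Lake sketch of the multi-format extraction and aggregation pattern mentioned in the HSBC entry (Databricks). File paths, schemas, and column names are assumptions made for the example, not project artifacts.

    from pyspark.sql import SparkSession, functions as F

    # On Databricks a SparkSession already exists; this line is only for a standalone sketch.
    spark = SparkSession.builder.appName("usage-patterns").getOrCreate()

    # Extract from multiple file formats (paths and columns are hypothetical).
    csv_events = (spark.read.option("header", "true").csv("/mnt/raw/events_csv/")
                  .withColumn("amount", F.col("amount").cast("double"))
                  .withColumn("event_ts", F.col("event_ts").cast("timestamp")))
    parquet_events = spark.read.parquet("/mnt/raw/events_parquet/")

    # Align the two sources on a common set of columns before combining them.
    columns = ["customer_id", "channel", "event_ts", "amount"]
    events = csv_events.select(*columns).unionByName(parquet_events.select(*columns))

    # Persist the combined data as a Delta table for downstream use.
    (events.write
           .format("delta")
           .mode("overwrite")
           .save("/mnt/curated/customer_events"))

    # Aggregate usage per customer and channel to surface usage patterns.
    usage = (spark.read.format("delta").load("/mnt/curated/customer_events")
             .groupBy("customer_id", "channel")
             .agg(F.count("*").alias("events"),
                  F.sum("amount").alias("total_amount")))

    usage.show()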