Around 8 years of professional experience as an Azure Data Engineer, involved in developing, implementing, and configuring Hadoop ecosystem components in Linux environments, and in developing and maintaining various applications using Python.
Developed strategic methods for deploying Big Data technologies to efficiently solve Big Data processing requirements.
Good knowledge of the Python programming language.
Good experience with Azure Databricks, Azure Data Lake, Azure Data Factory, GCP, Azure Synapse Analytics, Azure storage services such as Blob Storage, and Azure Key Vault; experience using Azure credential management with Identity and Access Management to protect access to applications and resources across the corporate data center and into the cloud.
Experience with Hadoop ecosystem components such as HDFS, MapReduce, Pig, Hive, and Sqoop.
Good understanding of processing real-time data using Spark.
Hands-on experience importing and exporting data between databases such as MySQL, Oracle, and Teradata and HDFS using Sqoop.
Experience working closely with various departments, conducting requirements workshops, documenting requirements specifications, developing complex ETL logic to load Management Information marts, building cubes for data analysis, and building business-critical reports with Power BI.
Experience working with complex data sets, carrying out data analysis to understand relationships, anomalies, and patterns, and providing robust, valuable insights as well as data integration and reporting solutions to various departments in the business.
Successfully designed and delivered multiple projects that involved working with large data volumes on a variety of platforms, such as SQL Server, Hadoop, and Teradata.
Proficient in writing complex SQL queries, building ETL packages with SSIS, and building ETL pipelines with Azure Data Factory, Azure Databricks, and Python.
Experience working with non-structured data such as XML and JSON (a minimal ingestion sketch follows below). Worked in Agile and Waterfall delivery processes.
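A minimal PySpark sketch of the kind of semi-structured ingestion described above, reading JSON from a Data Lake container and landing a flattened, curated copy; the storage account, container paths, and column names are illustrative assumptions rather than actual project values.

    # Illustrative sketch only: flatten a semi-structured JSON feed and land it in the lake.
    # Storage account, container names, and columns are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("json_ingest_sketch").getOrCreate()

    # Read raw JSON files from the landing zone (ADLS Gen2 path shown as an example).
    raw_df = spark.read.option("multiLine", True).json(
        "abfss://raw@examplestorage.dfs.core.windows.net/orders/*.json"
    )

    # Project nested attributes into a tabular shape suitable for downstream marts.
    orders_df = raw_df.select(
        col("order.id").alias("order_id"),
        col("order.customer.name").alias("customer_name"),
        col("order.total").cast("decimal(18,2)").alias("order_total"),
    )

    # Write the curated output back to the lake in Parquet format.
    orders_df.write.mode("overwrite").parquet(
        "abfss://curated@examplestorage.dfs.core.windows.net/orders/"
    )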
· Primarily involved in data migration using SQL, Azure SQL, Azure Data Lake, Azure Data Factory, and GCP.
· Supported the architect in designing, administering, and building analytic tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
· Worked with the source team to extract data and load it into ADLS, creating linked services for source and target connectivity based on the requirement.
· Once created, pipelines and datasets are triggered based on the load operation (history/delta).
· Depending on the size of the source data, loaded files are processed in Azure Databricks by applying Spark SQL operations, orchestrated through Azure Data Factory pipelines (a minimal Spark SQL sketch follows this list).
· Involved in deploying solutions to DEV, QA, and PROD through the Azure DevOps environment, connecting to Azure via PowerShell.
· Proficient in creating data warehouses, designing extraction and data-loading functions, testing designs, data modeling, and ensuring the smooth running of applications.
· Responsible for extracting data from OLTP and OLAP systems into the Data Lake using Azure Data Factory and Databricks.
· Used Azure Databricks notebooks to extract data from the Data Lake and load it into Azure and on-premises SQL databases.
· Worked with end-to-end architecture, large data sets, and a high-capacity big data processing platform across SQL and data warehouse projects, including Azure Synapse Analytics.
· Developed pipelines that extract data from various sources and merge it into single-source datasets in the Data Lake using Databricks.
· Generated and requested certificates from trusted certificate authorities (CAs) or Azure Key Vault.
· Ensured that each certificate's purpose (e.g., SSL/TLS, authentication) and key type (RSA, ECDSA) aligned with data engineering needs.
· Stored certificates securely in Azure Key Vault, a centralized service for managing certificates and cryptographic keys, to prevent unauthorized access (a Key Vault access sketch follows this list).
· Deployed certificates to relevant Azure resources, such as virtual machines, Azure Kubernetes Service (AKS) clusters, and Azure App Service instances.
· Utilized Azure Automation and infrastructure-as-code tools (such as Azure Resource Manager templates) for consistent and repeatable deployments.
· Utilized Azure role-based access control (RBAC) to grant the minimum required permissions to users and applications.
· Utilized Azure Private Link and private endpoints to ensure that communication between services and resources within the Azure environment remains within the Azure network and is not exposed to the public internet.
· Implemented multi-factor authentication for access to Azure Key Vault to enhance the security of certificate management.
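A minimal sketch of retrieving a certificate from Azure Key Vault with the Python SDK (azure-identity and azure-keyvault-certificates), as referenced in the certificate management items above; the vault URL and certificate name are placeholders, and access is assumed to be granted through Azure RBAC rather than embedded credentials.

    # Illustrative sketch: read certificate metadata from Azure Key Vault.
    # Vault URL and certificate name are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.certificates import CertificateClient

    # DefaultAzureCredential picks up a managed identity, Azure CLI login, etc.
    credential = DefaultAzureCredential()
    client = CertificateClient(
        vault_url="https://example-kv.vault.azure.net/", credential=credential
    )

    # Fetch the certificate and inspect its expiry, e.g. for rotation checks.
    certificate = client.get_certificate("etl-tls-cert")
    print(certificate.name, certificate.properties.expires_on)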
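A minimal sketch of the Databricks-side Spark SQL processing referenced earlier in this list, branching on the load type (history vs. delta); the paths, view, table, and column names are assumptions for illustration, not the actual project objects.

    # Illustrative Databricks notebook sketch: process landed files with Spark SQL,
    # branching on the load type. Paths, table, and column names are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    load_type = "delta"  # would normally arrive as an Azure Data Factory pipeline parameter
    source_path = "abfss://raw@examplestorage.dfs.core.windows.net/sales/"

    # Stage the landed files as a temporary view for Spark SQL.
    spark.read.parquet(source_path).createOrReplaceTempView("stg_sales")

    if load_type == "history":
        # Full (history) load: rebuild the curated table from the complete staged set.
        spark.sql("CREATE OR REPLACE TABLE curated.sales AS SELECT * FROM stg_sales")
    else:
        # Delta load: append only records newer than the current high-water mark.
        spark.sql("""
            INSERT INTO curated.sales
            SELECT s.* FROM stg_sales s
            WHERE s.modified_date > (SELECT MAX(modified_date) FROM curated.sales)
        """)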