Data engineering professional with a solid history of creating and managing efficient data systems. Known for delivering impactful solutions through a collaborative, results-driven approach. Recognized for expertise in data warehousing and ETL processes, along with adaptability to evolving project needs.
Brings a history of driving successful projects and delivering innovative solutions, with strong team collaboration and adaptability to evolving requirements that ensure reliable, efficient results. Proficient in advanced coding practices and project management.
- Extensive background in leading end-to-end project architecture, design, and development for data migration, data warehousing, and data analytics projects.
- Experience using JIRA and ServiceNow for the release management process.
- Experienced technical lead and individual contributor on complex big data engineering projects.
- Good knowledge of Microsoft Fabric; used Fabric for a POC migrating on-premises data to a dedicated SQL pool in Azure Synapse, and explored the feasibility of migrating products to Microsoft Fabric to reduce licensing costs.
- Developed cloud reference architectures, governance policies, security models, and best practices.
- Implemented lakehouse architecture using Databricks for multiple clients.
- Implemented role-based access controls to restrict user access to PII (see the dynamic-view sketch below).
- Extensive experience developing ELT workflows and logic using PySpark and cloud technologies such as Azure Databricks, Azure Data Factory, and Azure DevOps.
- Strong experience building Azure Data Factory pipelines using mapping data flows.
- Strong experience in functional design, data warehouse design, data integration, and reporting and analytics.
- Experience loading data with ADF from sources such as SQL Server, MySQL, JSON, XML, CSV, SFTP, Salesforce, and HTTP endpoints.
- Experience using Azure Blob Storage and Azure Data Lake Store.
- Experience using Azure Databricks notebooks with PySpark to process and transform massive volumes of data (see the first sketch below).
- Designed and implemented scalable data pipelines using Azure Databricks, enabling efficient data ingestion, transformation, and analysis for high-volume datasets.
- Leveraged Azure Databricks and Apache Spark to process and analyze large-scale data, reducing data processing time by 30% and improving overall performance.
- Collaborated with cross-functional teams to develop and deploy machine learning models on Azure Databricks, resulting in improved predictive analytics and actionable insights for business stakeholders.
- Conducted data profiling and quality assessments, identifying and resolving data anomalies and inconsistencies to ensure data integrity and accuracy.
- Integrated Azure Databricks with Azure Data Lake Storage and Azure SQL Database for seamless data exchange and efficient data storage and retrieval.
- Optimized Spark job execution and cluster performance through partitioning, caching, and resource allocation tuning, cutting processing costs by 20%.
- Designed and implemented reporting solutions using Power BI; worked with table and matrix visuals and with report-level, page-level, and visual-level filters.
- Used Azure Purview as the data governance solution.
- Created row-level security in Power BI Desktop and integrated it with the Power BI Service portal.
- Provided technical guidance and support to team members, conducting training sessions on Azure Databricks and promoting best practices for data engineering and analytics.
- Practical knowledge of Azure Synapse Analytics (Azure SQL Data Warehouse with massively parallel processing), including loading data into Synapse using PolyBase.
- Experience using Azure Key Vault to store connection strings and secrets (see the Key Vault sketch below).
- Experience designing, developing, and deploying end-to-end ETL/ELT solutions using Azure Data Factory, SQL Server Integration Services (SSIS), and SQL Server.
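A minimal PySpark sketch of the ingest-transform-write pattern described above, including the repartitioning and caching optimizations named in the Spark tuning bullet. All paths, table names, and columns are hypothetical placeholders, not the actual client implementation, and the Delta write assumes a Databricks or other Delta-enabled cluster.

```python
# Sketch: ingest raw CSV, cleanse, and persist as a partitioned Delta table.
# Paths, tables, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-elt").getOrCreate()

# Ingest raw CSV from the lake (hypothetical ADLS path).
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("abfss://raw@examplelake.dfs.core.windows.net/orders/"))

# Basic cleansing and type normalization.
clean = (raw
         .dropDuplicates(["order_id"])
         .withColumn("order_date", F.to_date("order_date"))
         .withColumn("amount", F.col("amount").cast("double")))

# Cache because this DataFrame is reused for both the write and the aggregate.
clean.cache()

# Repartition on the write key to avoid skewed tasks, then persist as a
# partitioned Delta table (lakehouse pattern; requires Delta Lake support).
(clean.repartition("order_date")
      .write.format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("silver.orders"))

# Reuse of the cached DataFrame for a daily rollup.
daily = clean.groupBy("order_date").agg(F.sum("amount").alias("daily_total"))
daily.show()
```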
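A sketch of one common Databricks pattern for the role-based PII controls mentioned above: a dynamic view that masks sensitive columns unless the caller belongs to an authorized group. The group, table, and column names are hypothetical; is_member() is a Databricks SQL function that resolves per query for the calling user.

```python
# Sketch: dynamic view masking PII columns for users outside an authorized
# group. Group, table, and column names are hypothetical placeholders.
spark.sql("""
  CREATE OR REPLACE VIEW silver.customers_masked AS
  SELECT
    customer_id,
    CASE WHEN is_member('pii_readers') THEN email ELSE '***REDACTED***' END AS email,
    CASE WHEN is_member('pii_readers') THEN ssn   ELSE '***REDACTED***' END AS ssn
  FROM silver.customers
""")
```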
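A sketch of reading a connection string from Azure Key Vault inside a Databricks notebook, assuming a Key Vault-backed secret scope has already been created. The scope name, secret key, and table are hypothetical placeholders, and `dbutils` is available only in the Databricks runtime.

```python
# Sketch: fetch a JDBC connection string from a Key Vault-backed secret scope
# and use it with the Spark JDBC reader. Scope, key, and table are hypothetical.
jdbc_url = dbutils.secrets.get(scope="kv-backed-scope", key="sqlserver-jdbc-url")

df = (spark.read
      .format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.customers")
      .load())

df.limit(10).show()
```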
Excellent understanding of data engineering and data modeling principles, ensuring the development of robust and scalable data solutions.