Ravi Adapaka

Frisco, Texas

Summary

Data engineering professional with a solid history of creating and managing efficient data systems. Recognized for expertise in data warehousing and ETL processes, for delivering impactful solutions through collaboration and a results-driven approach, and for adaptability to evolving project requirements. Proficient in advanced coding practices and project management, with a track record of driving successful projects and providing innovative solutions.

  • Extensive background in leading end-to-end project architecture, design, and development for data migration, data warehousing, and data analytics projects
  • Experience using JIRA and ServiceNow for the release management process
  • Experience as a skilled technical lead and individual contributor on complex big data engineering projects
  • Good knowledge of Microsoft Fabric; used Microsoft Fabric in a POC to migrate on-premises data to a dedicated SQL pool in Synapse and explored the feasibility of migrating products to Microsoft to reduce licensing costs
  • Developed cloud reference architectures, governance policies, security models, and best practices
  • Implemented lakehouse architecture using Databricks for different clients
  • Implemented role-based access controls to restrict user access to PII information
  • Extensive experience developing ELT workflows and logic using PySpark and cloud technologies such as Azure Databricks, Azure Data Factory, and Azure DevOps
  • Strong experience building Azure Data Factory pipelines using mapping data flows
  • Strong experience in functional design, data warehouse design, data integration, and reporting and analytics
  • Experience loading data using ADF from different sources such as SQL Server, MySQL, JSON, XML, CSV, SFTP, Salesforce, and HTTP
  • Experience using Azure Blob Storage and Azure Data Lake Store
  • Experience using Azure Databricks notebooks with PySpark to process and transform massive amounts of data
  • Designed and implemented scalable data pipelines using Azure Databricks, enabling efficient data ingestion, transformation, and analysis for high-volume datasets
  • Leveraged Azure Databricks and Apache Spark to process and analyze large-scale data, reducing data processing time by 30% and improving overall performance
  • Collaborated with cross-functional teams to develop and deploy machine learning models on Azure Databricks, resulting in improved predictive analytics and actionable insights for business stakeholders
  • Conducted data profiling and quality assessments, identifying and resolving data anomalies and inconsistencies to ensure data integrity and accuracy
  • Integrated Azure Databricks with Azure Data Lake Storage and Azure SQL Database to enable seamless data exchange and efficient data storage and retrieval
  • Optimized Spark job execution and cluster performance using techniques such as partitioning, caching, and resource allocation tuning, resulting in a 20% reduction in processing costs (a minimal sketch follows this list)
  • Designed and implemented reporting solutions using Power BI; used Azure Purview as the data governance solution
  • Created row-level security with Power BI Desktop and integrated it with the Power BI Service portal; worked with table and matrix visuals and with report-level, page-level, and visual-level filters
  • Provided technical guidance and support to team members, conducting training sessions on Azure Databricks and promoting best practices for data engineering and analytics
  • Practical knowledge of Azure Synapse Analytics (Azure SQL Data Warehouse with massively parallel processing) and of loading data into Azure Synapse Analytics using PolyBase
  • Experience using Azure Key Vault to store connection strings and secrets
  • Experience designing, developing, and deploying end-to-end ETL/ELT solutions using Azure Data Factory, SQL Server Integration Services (SSIS), and SQL Server
  • Excellent understanding of data engineering and data modeling principles, ensuring the development of robust and scalable data solutions
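
Illustrative PySpark sketch of the partitioning and caching pattern referenced above; the dataset path and column names (sales_raw, region, amount) are assumed for illustration, not taken from an actual project:

from pyspark.sql import SparkSession

# Illustrative sketch only: paths and column names are hypothetical.
spark = SparkSession.builder.appName("optimization-sketch").getOrCreate()

raw = spark.read.format("delta").load("/mnt/datalake/raw/sales_raw")

# Repartition on the aggregation key to reduce shuffle skew,
# then cache the intermediate result that is reused downstream.
curated = (
    raw.repartition(200, "region")
       .filter("amount IS NOT NULL")
       .cache()
)

# The cached DataFrame serves multiple downstream aggregations without recomputation.
by_region = curated.groupBy("region").sum("amount")
by_region.write.mode("overwrite").format("delta").save("/mnt/datalake/curated/sales_by_region")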


Overview

13 years of professional experience
1 Certification

Work History

Lead Developer

Truist Bank
08.2023 - Current
  • Supported product owners in defining clear requirements for feature enhancements or new functionality requests through active participation during backlog grooming sessions.
  • Designed robust database architecture that supported seamless integration of new datasets and facilitated rapid analysis capabilities.
  • Improved application performance by optimizing code, implementing caching strategies, and conducting regular profiling sessions.
  • Consistently met project objectives by setting realistic goals, breaking down complex tasks into manageable subtasks, and monitoring progress towards key milestones.
  • Implemented proactive monitoring tools for early detection and resolution of potential database issues.
  • Led successful migration projects from legacy systems to modern database platforms while minimizing downtime risks.
  • Enhanced user experience by devising intuitive interfaces for database interaction, enabling easier access to information.
  • Resolved critical production issues by troubleshooting complex problems quickly and efficiently under tight deadlines.

Lead Data Engineer

Reckitt Benckiser
01.2023 - 07.2023
  • Worked on data modelling around business needs by gathering requirements from business stakeholders
  • Implemented a Delta Lakehouse architecture using a metadata-driven framework (see the sketch after this list)
  • Extensive experience developing ELT workflows and logic using PySpark, Python, and cloud technologies such as Azure Databricks, Azure Data Factory, and Azure DevOps
  • Handled data security throughout the framework
  • Incorporated RBAC and MFA using Azure Active Directory
  • Implemented dynamic data masking of PII data
  • Addressed network security, secure data movement, data governance and compliance, and auditing and logging throughout the project
  • Utilized Azure Databricks and Apache Spark for data cleansing, aggregation, and enrichment, ensuring data quality and accuracy for downstream analytics
  • Developed and executed SQL, PySpark, and Python scripts for data manipulation and transformation within Azure Databricks notebooks
  • Implemented cost optimization techniques at the project level
  • Handled code optimization
  • Understood client requirements and interacted with various stakeholders for requirement gathering and analysis in the cloud environment
  • Worked with multiple file formats such as JSON, Parquet, and CSV for data processing and ingestion
  • Built end-to-end pipelines in the Azure environment
  • Performed batch data processing for different domains using Azure Data Factory and Azure Databricks, ingesting data into Azure ADLS Gen2
  • Environment: Azure Data Factory, Azure Databricks, Azure SQL DB, Azure Logic Apps, Python, PySpark, Spark SQL, Power BI, Azure Purview
  • Developed scalable infrastructure capable of handling vast amounts of structured and unstructured data, improving overall system performance.
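
Illustrative sketch of a metadata-driven load with column-level PII masking in PySpark; the control table, its columns, and the hashing approach are assumptions for illustration, not the actual client framework:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical sketch: metadata table and field names are illustrative.
spark = SparkSession.builder.getOrCreate()

# Each metadata row describes one source: format, path, PII columns, target table.
metadata = spark.read.table("control.ingestion_metadata").collect()

for row in metadata:
    df = spark.read.format(row["source_format"]).load(row["source_path"])

    # Mask configured PII columns (here via hashing) before writing to the silver layer.
    for pii_col in (row["pii_columns"] or "").split(","):
        if pii_col:
            df = df.withColumn(pii_col, F.sha2(F.col(pii_col).cast("string"), 256))

    df.write.format("delta").mode("append").saveAsTable(row["target_table"])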

Lead Data Engineer

Mars
08.2022 - 12.2022
  • Worked on architecture design for a multi-state implementation and deployment
  • Processed external source data hosted in a Hadoop environment using Jupyter notebooks
  • Bulk-loaded data from external and internal stages into Snowflake using the COPY command
  • Day-to-day responsibilities included developing ETL pipelines in and out of the data warehouse and developing major regulatory and financial reports using advanced SQL queries in Snowflake
  • Built Docker images to run Airflow in a local environment to test ingestion and ETL pipelines
  • Created Airflow DAGs to schedule ingestions, ETL jobs, and various business reports (see the DAG sketch after this list)
  • Performed data quality issue analysis using SnowSQL by building analytical warehouses on Snowflake
  • Performed troubleshooting, analysis, and resolution of critical issues
  • Supported the production environment and debugged issues
  • Environment: JupyterLab, Python (Pandas), Airflow, Snowflake, Kubernetes, Power BI
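
Minimal Airflow DAG sketch for scheduling a Snowflake COPY load; the connection id, stage, table, and schedule are assumptions for illustration:

from datetime import datetime
from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

# Hypothetical DAG: connection id, stage, and table names are illustrative.
with DAG(
    dag_id="snowflake_copy_ingestion",
    start_date=datetime(2022, 8, 1),
    schedule_interval="0 2 * * *",  # nightly load
    catchup=False,
) as dag:
    load_orders = SnowflakeOperator(
        task_id="copy_orders_from_stage",
        snowflake_conn_id="snowflake_default",
        sql="""
            COPY INTO analytics.orders
            FROM @ext_stage/orders/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
        """,
    )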

Lead Data Engineer/Azure Data Engineer

Fulton Bank
04.2021 - 07.2022
  • Implemented the project architecture
  • Involved in setting up the project environment in Azure
  • Set up a metadata-driven framework for the initial sources and migrated the data to Azure Synapse SQL pools
  • Built automated ETL pipelines and Synapse notebooks to pull data from different sources, move it across data layers in the data lake, and store it in Synapse warehouse tables
  • Developed ETL pipelines to connect to various sources and extract data into the raw data layer in Azure Data Lake
  • Built various pipelines to move data from the raw layer to the processed layer in Azure Data Lake
  • Applied business logic (SCD Type 2) to data in the processed layer and saved the results as Delta tables (see the merge sketch after this list)
  • Created Spark tables on top of Delta tables using the Spark pool for various data marts
  • Created a dedicated SQL pool as an immutable zone, integrated Spark tables from different sources, and created fact and dimension tables
  • Used hash and round-robin distributions on immutable-layer tables wherever necessary
  • Implemented data governance policies, data encryption, and auditing mechanisms in Azure Synapse Analytics
  • Strong problem-solving and troubleshooting skills, with a track record of resolving complex data and analytics challenges using Azure Synapse Analytics
  • Held numerous client interactions for requirement gathering and data analysis
  • Mentored junior resources on technologies and project-related activities and helped balance their workload
  • Helped the client team understand the project architecture framework through training sessions
  • Configured real-time alerts and logging
  • Environment: Azure Synapse Analytics, Azure SQL DB, Azure Logic Apps, Python, PySpark, Spark SQL, SQL, Power BI
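
Illustrative Delta Lake merge sketch for the SCD Type 2 step above; key, hash, and date column names are assumptions, and a complete implementation would also insert the new versions of changed keys in a second pass:

from pyspark.sql import SparkSession
from delta.tables import DeltaTable

# Hypothetical sketch: paths and column names are illustrative.
spark = SparkSession.builder.getOrCreate()

updates = spark.read.format("delta").load("/processed/customers_incoming")
target = DeltaTable.forPath(spark, "/processed/customers_scd2")

# Close out current rows whose attributes changed and insert brand-new keys.
(
    target.alias("t")
    .merge(
        updates.alias("s"),
        "t.customer_id = s.customer_id AND t.is_current = true",
    )
    .whenMatchedUpdate(
        condition="t.row_hash <> s.row_hash",
        set={"is_current": "false", "end_date": "current_date()"},
    )
    .whenNotMatchedInsert(
        values={
            "customer_id": "s.customer_id",
            "row_hash": "s.row_hash",
            "is_current": "true",
            "start_date": "current_date()",
            "end_date": "null",
        }
    )
    .execute()
)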

Senior Data Engineer

CHL (Cornerstone Home Loans)
07.2019 - 03.2021
  • Used Data Factory as an orchestrator and built various Azure Data Factory pipelines to move data across different zones
  • Created table schemas/structures in the Azure SQL Data Warehouse solution
  • Used Azure Databricks to move and transform JSON files into Delta archive tables to maintain SCD Type 2 data
  • Automated the Delta archive process using Databricks with Scala to fulfill the requirement
  • Handled schema changes for existing source tables; SQL scripts automatically applied the changes to the Azure SQL and data warehouse tables
  • Implemented Log Analytics to capture logs from the different Azure services used in the project
  • Wrote Log Analytics queries (Kusto Query Language) to view logs
  • Created technical and architectural documentation
  • Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from different sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, and to write data back
  • Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks
  • Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats to analyze the data and uncover insights into customer usage patterns
  • Environment: Azure Data Factory, Azure Databricks, Azure SQL DB, Python, PySpark, Azure Log Analytics, Azure Logic Apps

Senior Data Engineer

Shearman & Sterling
03.2017 - 06.2019
  • Used Data Factory as an orchestrator and built various Azure Data Factory pipelines to move data across different zones
  • Used Databricks to move CSV data from the landing zone to the refined zone, storing it in Delta tables in the refined zone
  • Used Azure Databricks notebooks for data transformation in the refined zone
  • Used Azure Databricks to flatten JSON sources into a table structure (see the sketch after this list)
  • Created table schemas/structures in Azure SQL (immutable zone)
  • Implemented Log Analytics to capture logs from the different Azure services used in the project
  • Wrote Log Analytics queries (Kusto Query Language) to view logs
  • Created an Azure Automation account to run PowerShell runbooks that fetch storage account logs, convert them, and send them to Log Analytics
  • Configured alerts on Data Factory pipeline status to send email notifications on failure
  • Created custom alerts on storage accounts
  • Worked on technical documentation and made it available in the DevOps wiki
  • Environment: Azure Data Factory, Azure Databricks, Azure SQL DB, Python, PySpark, Azure Log Analytics, Azure Logic Apps, Azure Automation
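
Illustrative PySpark sketch of flattening a nested JSON source into a tabular structure; the field names and paths are assumptions for illustration:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical sketch: paths and nested field names are illustrative.
spark = SparkSession.builder.getOrCreate()

raw = spark.read.option("multiLine", True).json("/mnt/landing/source/*.json")

# Explode the nested array and promote nested struct fields to top-level columns.
flat = (
    raw.select("id", F.explode("line_items").alias("item"))
       .select(
           "id",
           F.col("item.sku").alias("sku"),
           F.col("item.amount").alias("amount"),
       )
)

flat.write.format("delta").mode("overwrite").save("/mnt/refined/line_items")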

Data Engineer

Fidelity National Financial
01.2016 - 02.2017
  • Prepared case studies, understood client requirements, and provided cost-effective solutions
  • Gathered requirements, performed data analysis, and prepared functional specification documents for the system
  • Estimated project timelines and planned resources
  • Prepared detailed technical design documents for the system
  • Designed and developed ETL/reporting solutions for analytical reporting capabilities
  • Designed and developed Azure Data Factory artifacts such as pipelines, datasets, linked services, and integration runtimes; lifted and shifted packages using the SSIS integration runtime; built templates in JSON and deployed them using PowerShell scripts
  • Designed and developed Azure infrastructure such as resource groups, virtual machines, storage accounts, and SQL databases, with JSON templates and PowerShell scripts for deployment
  • Analyzed the root cause of issues and resolved them
  • Resolved multiple data quality issues by tracing them from the reporting layers back to the ETL and identifying areas of resolution
  • Developed Power BI dashboards and reports, implemented row-level security for various workspaces, and installed the Power BI gateway to refresh dashboards and reports with the latest data
  • Environment: MS SQL Server 2012, MSBI (SSIS, SSRS), Azure Data Factory, Azure Infrastructure, Power BI, Crystal Reports

Data Engineer

Philip Morris International
09.2013 - 12.2015
  • Created ETL (SSIS) processes to transfer data from OLTP files to the data warehouse and monitored jobs
  • Identified and resolved errors raised by job failures
  • Performed QA testing of stored procedures to validate the CRM (Customer Relationship Management) system supported by MINT
  • Used control flow tasks such as For Loop Container, Foreach Loop Container, Sequence Container, Execute SQL Task, and Data Flow Task
  • Created SSIS packages using data transformations such as Derived Column, Lookup, Conditional Split, Merge Join, Sort, and Execute SQL Task to load data into the database
  • Worked on various enhancements at the database, ETL, and cube levels
  • Performed unit testing of stored procedure changes against various conditions
  • Analyzed requirements to implement assigned tasks and changes
  • Created SSRS reports with features such as document maps and lookup functions
  • Implemented dynamic SSRS reports for job statuses
  • Created test packages in the DEV and QAS environments to check data correctness
  • Modified stored procedures as per requirements
  • Involved in data transformation
  • Environment: SQL Server 2008 R2, SSIS, SSAS, SSRS

Software Developer

Dick’s Sporting Goods
05.2012 - 08.2013
  • Extracted data from multiple sources such as flat files and SQL Server; transformed, cleansed, and validated data; and loaded it into the destination
  • Used SQL Server constraints (primary keys, foreign keys, defaults, check, unique, etc.) and generated Transact-SQL (T-SQL) queries and subqueries
  • Created stored procedures, joins, triggers, and indexes on tables based on requirements
  • Extracted data from sources and transformed it using transformations such as Data Conversion, Derived Column, Lookup, Conditional Split, Aggregate, Union All, Merge Join, and Multicast
  • Created event handlers for the packages using the Event Handlers tab for error handling
  • Created configurations to change package properties dynamically
  • Maintained log information in a SQL table to track errors and record package execution status
  • Tuned the performance of long-running packages by analyzing tables, indexes, and hints
  • Generated subreports, drill-down reports, drill-through reports, parameterized reports, and linked reports from queries in SSRS
  • Environment: SSIS, SSRS and SQL SERVER 2008

Education

Bachelor’s - Computer Science

JNTUK
India
01.2011

Skills

  • Azure Data Factory
  • Azure Databricks
  • Airflow
  • Azure Synapse Analytics
  • SSIS
  • Dremio
  • Azure Data Lake Storage
  • Azure Log Analytics
  • Azure Key Vault
  • Azure Logic Apps
  • Azure DevOps
  • Azure Alerts
  • GitHub
  • Power BI
  • DAX
  • SSRS
  • T-SQL
  • PySpark
  • Python
  • SparkSQL
  • KQL
  • MSSQL Server
  • Data Modelling
  • MDM
  • Azure SQL
  • Snowflake
  • Agile, Scrum
  • JIRA
  • ServiceNow
  • Microsoft Fabric

Certification

  • Data Engineering on Microsoft Azure (DP-203)
  • Microsoft Certified Professional (MCP), Querying Microsoft SQL Server (70-461)
  • Fundamentals of Databricks Lakehouse Accreditation
