Venkata Subba Reddy Bovilla

Chattanooga, TN

Summary

Databricks Data Engineer experienced in designing and optimizing end-to-end data pipelines on Apache Spark and the Databricks platform. Proficient in scalable data processing, real-time analytics, and data modeling for efficient insight extraction. Collaborative team player with strong communication skills, adept at translating business requirements into technical solutions. Committed to continuous learning and to staying current with emerging data engineering technologies, and dedicated to leveraging data to drive business growth and innovation.

Work History

Sr Cloud Data Engineer

DirecTV
02.2024 - Current
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the Raw, Cleansed, and Curated layers of the data lake on Azure ADLS Gen2
  • Provided Azure technical expertise, including strategic design and architectural mentorship, assessments, and POCs, in support of the overall product development lifecycle
  • Architected solutions using Azure data technologies
  • Analyzed client data (AX, PROACTICS) using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Wrote PySpark programs for Spark transformations in Synapse Notebooks (a sketch follows this list)
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
  • Designed, implemented, and optimized data pipelines using Apache Spark and Databricks for efficient data processing and analytics
  • Conducted real-time analytics and implemented streaming solutions to extract insights from live data streams
  • Developed and maintained data models to support business intelligence and reporting requirements
  • Collaborated with cross-functional teams to understand business needs and translate them into technical solutions
  • Actively participated in continuous learning initiatives to stay current with the latest advancements in data engineering
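
A minimal sketch of the kind of Raw-to-Curated PySpark transformation written in Synapse Notebooks. The storage paths, container layout, and column names are hypothetical placeholders, not the actual pipeline:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical ADLS Gen2 paths for the Raw and Curated layers.
raw_path = "abfss://raw@examplelake.dfs.core.windows.net/orders/"
curated_path = "abfss://curated@examplelake.dfs.core.windows.net/orders/"

# Read raw CSV drops, standardize types, and derive a partition column.
orders = (
    spark.read.option("header", "true").csv(raw_path)
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
    .withColumn("order_year", F.year("order_date"))
    .dropDuplicates(["order_id"])
)

# Write the cleansed result to the Curated layer as partitioned Parquet.
orders.write.mode("overwrite").partitionBy("order_year").parquet(curated_path)
```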

Cloud Data Engineer

AT&T
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure ADLS Gen2
  • Applied data quality checks to Delta Live Tables (DLT), such as dropping invalid records, retaining invalid records for review, and validating row counts across tables (see the sketch after this list)
  • Provided Azure technical expertise, including strategic design and architectural mentorship, assessments, and POCs, in support of the overall invoicing lifecycle for product development
  • Architected solutions using Azure data technologies
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Wrote PySpark programs for Spark transformations in Databricks
  • Copied multiple tables in bulk using Lookup and ForEach activities in Data Factory
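
A minimal sketch of DLT expectations of the kind described above, assuming a hypothetical raw_invoices source table in the same pipeline: @dlt.expect_or_drop drops failing records, while @dlt.expect retains them and reports the violations in pipeline metrics.

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Cleansed invoices with basic quality gates")
# Drop records with a missing key outright.
@dlt.expect_or_drop("valid_invoice_id", "invoice_id IS NOT NULL")
# Retain records that fail this check, but count the violations.
@dlt.expect("positive_amount", "amount > 0")
def cleansed_invoices():
    # raw_invoices is a hypothetical upstream table in the same DLT pipeline.
    return dlt.read_stream("raw_invoices").withColumn(
        "ingested_at", F.current_timestamp()
    )
```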

Data Engineer

TCS India
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure Data Lake Storage Gen2
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Applied data quality checks to Delta Live Tables (DLT), such as dropping invalid records, retaining invalid records for review, and validating row counts across tables
  • Wrote PySpark programs for Spark transformations in Databricks
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory (an analogous sketch follows this list)
  • Performed near-real-time PowerShell data analysis
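
Azure Data Factory's Lookup-plus-ForEach bulk-copy pattern is pipeline configuration rather than code, but the control flow it expresses can be sketched in PySpark: look up the list of tables, then copy each one. The JDBC source, the control table, and the lake path are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical JDBC source and target lake path.
jdbc_url = "jdbc:sqlserver://example-server:1433;database=sales"
target_root = "abfss://raw@examplelake.dfs.core.windows.net/"

# "Lookup": read the list of tables to copy from a control table.
tables = [
    row.table_name
    for row in spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "etl.control_table_list")
    .load()
    .collect()
]

# "ForEach": copy each table in turn into the raw layer.
for table_name in tables:
    df = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", table_name)
        .load()
    )
    df.write.mode("overwrite").parquet(f"{target_root}{table_name}/")
```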

Education

Master of Technology (M.Tech)

National Institute of Technology (NIT)
07.2009

Skills

  • Apache Spark
  • Databricks
  • Data Pipeline Development
  • Scalable Data Processing
  • Real-time Analytics
  • Data Modeling
  • Spark SQL
  • PySpark
  • Windows and Linux
  • Jenkins, Ansible, Docker, Kubernetes, Azure DevOps
  • 13 years of total IT experience in development, currently working as a Sr. Data Engineer
  • Hands-on with Azure cloud services: Azure Data Lake Storage Gen2, Azure Data Factory, Azure Databricks, Azure Event Hubs, Azure Synapse Analytics, and Azure Key Vault
  • Hands-on experience architecting ETL transformation layers and writing Spark jobs for processing
  • Applied data quality checks to DLT Live tables, such as dropping invalid records and validating row counts across tables
  • Strong programming experience with Python and SQL
  • Expert knowledge of big data technologies, mainly Hadoop HDFS, Hive, Sqoop, Pig, Kafka, and Spark
  • Developed end-to-end Hive queries to parse raw data, populated external and internal tables, and stored refined data in partitioned external tables (a sketch follows this list)
  • Imported and exported RDBMS tables using Sqoop
  • Team player able to meet tight deadlines and work under pressure, with a systematic approach to assignments, good analytical thinking, reasoning ability, and multitasking
  • Work experience with Databricks Spark and Delta Lake
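
A minimal sketch of the Hive pattern named above: refined data stored in a partitioned external table. The database, schema, and HDFS location are hypothetical; the original work ran as Hive queries, shown here through Spark SQL.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Hypothetical external table over a partitioned HDFS location.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS sales.refined_orders (
        order_id STRING,
        amount DOUBLE
    )
    PARTITIONED BY (order_year INT)
    STORED AS PARQUET
    LOCATION 'hdfs:///data/refined/orders'
""")

# Populate one partition from a hypothetical cleansed staging table.
spark.sql("""
    INSERT OVERWRITE TABLE sales.refined_orders PARTITION (order_year = 2023)
    SELECT order_id, amount
    FROM sales.cleansed_orders
    WHERE year(order_date) = 2023
""")
```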

Certification

  • Microsoft Certified: Azure Fundamentals (AZ-900)
  • Databricks Certified Data Engineer Associate
  • Microsoft Certified: Azure Data Engineer Associate
  • Databricks Certified Data Engineer Professional

Projects

Stericycle UK - Sr Cloud Data Engineer
02/2023 - Present
Azure Cloud, Spark, SQL, Synapse Analytics
Stericycle is a compliance company that specializes in collecting and disposing of regulated medical waste, such as sharps, pharmaceuticals, and hazardous waste, and in providing services for recalled and expired goods, along with related education, training, and patient communication services.
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the Raw, Cleansed, and Curated layers of the data lake on Azure ADLS Gen2
  • Provided Azure technical expertise, including strategic design and architectural mentorship, assessments, and POCs, in support of the overall product development lifecycle
  • Architected solutions using Azure data technologies
  • Analyzed client data (AX, PROACTICS) using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Wrote PySpark programs for Spark transformations in Synapse Notebooks
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory

SI (Simplified Invoices), Maersk - Sr Cloud Data Engineer
03/2021 - 02/2023
Azure Cloud, Databricks, Spark
Maersk is a Danish integrated shipping company, active in ocean and inland freight transportation and associated services such as supply chain management and port operation, and has been the largest container shipping line and vessel operator in the world.
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure ADLS Gen2
  • Applied data quality checks to Delta Live Tables (DLT), such as dropping invalid records, retaining invalid records for review, and validating row counts across tables
  • Provided Azure technical expertise, including strategic design and architectural mentorship, assessments, and POCs, in support of the overall invoicing lifecycle for product development
  • Architected solutions using Azure data technologies
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Wrote PySpark programs for Spark transformations in Databricks
  • Copied multiple tables in bulk using Lookup and ForEach activities in Data Factory
  • Performed near-real-time data analysis

CRM RDP, Shell - Cloud Data Engineer
03/2020 - 02/2021
Azure Cloud, Databricks, Spark
Shell is an Anglo-Dutch multinational oil and gas company headquartered in the Netherlands and incorporated in England. It is one of the oil and gas 'supermajors' and the third-largest company in the world measured by 2018 revenues.
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure Data Lake Storage Gen2
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Applied data quality checks to Delta Live Tables (DLT), such as dropping invalid records, retaining invalid records for review, and validating row counts across tables
  • Wrote PySpark programs for Spark transformations in Databricks
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory
  • Performed near-real-time PowerShell data analysis

Ecom Big Data, US Foods - Cloud Data Engineer and Machine Learning with Spark
08/2019 - 03/2020
Azure Cloud, Databricks, Spark
US Foods offers more than 350,000 national-brand products and its own 'exclusive brand' items, ranging from fresh meats and produce to pre-packaged and frozen foods, and provides food and related products to more than 250,000 customers, including independent and multi-unit restaurants, healthcare and hospitality entities, and government and educational institutions.
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure Data Lake Storage Gen2
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Loaded data to Synapse Analytics using HDInsight, Databricks, Spark, Scala, and PySpark
  • Wrote PySpark programs for Spark transformations in Databricks
  • Performed data quality checks, such as null-value checks on DataFrames, at the cleansed layer
  • Migrated data to the Azure Synapse data warehouse using Azure Data Factory
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory
  • Performed near-real-time log analysis
  • Built product recommendations using Spark ML (see the sketch after the project list)

GFR Finance Implementation, Tesco - Data Engineer and Spark Developer
01/2018 - 08/2019
Azure Cloud, SQL, Spark, Databricks
Tesco is a British multinational groceries and general merchandise retailer headquartered in Welwyn Garden City, Hertfordshire, England, United Kingdom. It has shops in seven countries across Asia and Europe and is the market leader for groceries in the UK (where it has a market share of around 28.4%), Ireland, Hungary, and Thailand.
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure Data Lake Storage Gen2
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Performed data quality checks, such as null-value checks on DataFrames, at the cleansed layer
  • Loaded data to Synapse Analytics using HDInsight, Databricks, Spark, Scala, and PySpark
  • Wrote PySpark programs for Spark transformations in Databricks
  • Migrated data to the Azure Synapse data warehouse using Azure Data Factory
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory

Retail RMS, RPM, FedEx - Cloud Data Engineer
01/2017 - 12/2017
Azure Cloud Platform, Spark, Kafka
FedEx Corporation is an American multinational delivery services company, known for its overnight shipping service and for pioneering a system that could track packages and provide real-time updates on package location, a feature that has since been implemented by most other carrier services. FedEx is also one of the top contractors of the US government.
  • Built and architected multiple data pipelines covering the end-to-end ETL process for data ingestion and transformation in Azure
  • Designed and architected the layers of the data lake on Azure Data Lake Storage Gen2
  • Analyzed client data using Python, Spark, and Spark SQL, and presented the end-to-end data lake design to the team
  • Performed data quality checks, such as null-value checks on DataFrames
  • Loaded data to Synapse Analytics using HDInsight, Databricks, Spark, Scala, and PySpark
  • Wrote PySpark programs for Spark transformations in Databricks
  • Migrated data to the Azure Synapse data warehouse using Azure Data Factory
  • Copied multiple tables in bulk using Lookup and ForEach activities in Azure Data Factory
  • Implemented the Kafka Connect API (FileStream source connector, Elasticsearch sink connector)
  • Performed near-real-time log analysis
  • Built product recommendations using Spark ML

Home Serve USA - Spark Developer
06/2015 - 12/2016
Hadoop, Pig, Hive, Spark, Sqoop
Home Serve USA Corp (Home Serve) is a leading provider of home repair solutions serving over 3 million customers across the US and Canada under the Home Serve, Home Emergency Insurance Solutions, Service Line Warranties of America (SLWA), and Service Line Warranties of Canada (SLWC) names. Home Serve protects homeowners against the expense and inconvenience of water, sewer, electrical, HVAC, and other home repair emergencies by providing affordable repair service plans and quality local service through employed technicians and a network of independent contractors, supplying repair plans and other services to consumers directly and through over 450 leading municipal, utility, and association partners. Data was stored in the Hadoop file system and processed using Pig, Hive, and Spark, with ingestion through Sqoop.
  • Read files from HDFS using Spark, converted them into DataFrames, and performed operations per business requirements
  • Stored the processed DataFrames in Hive tables
  • Imported and exported data into HDFS using Sqoop
  • Created external and internal Hive tables and loaded data into them
  • Partitioned Hive tables to improve the performance of Hive queries
  • Improved the performance of Sqoop imports and exports using different methodologies
  • Exported the analyzed data to relational databases using Sqoop for visualization and for generating reports for the BI team
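
The product recommendation work above used Spark ML; a minimal collaborative-filtering sketch with ALS follows, assuming a hypothetical ratings DataFrame with user_id, product_id, and rating columns.

```python
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

spark = SparkSession.builder.getOrCreate()

# Hypothetical feedback data: user, product, rating.
ratings = spark.createDataFrame(
    [(1, 10, 4.0), (1, 11, 2.0), (2, 10, 5.0), (2, 12, 3.0)],
    ["user_id", "product_id", "rating"],
)

# Train a collaborative-filtering model; drop users/items unseen at fit time.
als = ALS(
    userCol="user_id",
    itemCol="product_id",
    ratingCol="rating",
    coldStartStrategy="drop",
)
model = als.fit(ratings)

# Top-5 product recommendations per user.
model.recommendForAllUsers(5).show(truncate=False)
```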
