
Praveen Kumar Bhavirisetti

Austin, TX

Summary

A Data Engineer and ETL Developer with over 15 years of experience in the analysis, design, development, and support of complex data warehousing and ETL solutions. I have contributed to numerous data integration and migration projects, with deep technical proficiency in ETL tools (Informatica PowerCenter, IICS, Apache Airflow), complex SQL across databases (Snowflake, Teradata, Oracle, SQL Server, BigQuery, Hive), cloud technologies (Google Cloud Platform), and big data technologies (Hadoop, PySpark). I have worked across multiple industries, including healthcare, telecommunications, and technology, building expertise in optimizing data workflows and supporting business-critical applications.

Overview

17 years of professional experience
1 certification

Work History

Data Engineer

Wipro Ltd.
02.2023 - Current
  • Client: Apple, Austin, TX
  • As a Data Engineer at Wipro, played a key role in developing and optimizing a sales data application used by cross-functional teams; the application supported business decision-making through dynamic data aggregation, materialization, and sales hierarchy management.
  • Requirement Gathering and Analysis: Engaged with business stakeholders to define and refine data requirements for reporting and analytics related to sales hierarchies, product forecasting, and route-to-market changes, enabling tailored reporting frameworks and custom hierarchies within the application.
  • ETL Development and Optimization: Transformed and optimized Teradata stored procedures for Snowflake, ensuring efficient data processing and aggregation; leveraged Apache Airflow to orchestrate these processes with reliable scheduling, error handling, and logging.
  • Integrated SQL queries within Python scripts, orchestrating them through Keystone Framework jobs for efficient data processing and automation.
  • Data Quality and Testing: Implemented rigorous testing practices, including MINUS queries for data validation between source and target systems, and used Snowflake Time Travel queries to troubleshoot data discrepancies after data refresh cycles.
  • Deployment and Version Control: Led code deployment activities following Git best practices; coordinated deployment plans, feature branches, and pull requests to ensure seamless production rollouts.
  • Tool Expertise: Snowflake SQL, Python, Apache Airflow, BAP upload tool, Keystone framework, Teradata BTEQ
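The validation techniques named above (MINUS comparisons and Snowflake Time Travel) can be sketched as query builders; the table names, column lists, and offsets here are illustrative placeholders, not details from the engagement:

```python
# Illustrative sketch only -- table/column names are hypothetical.

def minus_validation_sql(source: str, target: str, columns: str = "*") -> str:
    """Build a MINUS query returning rows present in source but missing
    from target; an empty result means the tables match in that direction."""
    return (
        f"SELECT {columns} FROM {source}\n"
        f"MINUS\n"
        f"SELECT {columns} FROM {target};"
    )

def time_travel_sql(table: str, hours_back: int) -> str:
    """Build a Snowflake Time Travel query that reads a table as it existed
    a given number of hours ago (OFFSET is in seconds, negative = past)."""
    return f"SELECT * FROM {table} AT(OFFSET => -60 * 60 * {hours_back});"

print(minus_validation_sql("sales_stage", "sales_target", "order_id, amount"))
print(time_travel_sql("sales_target", 24))
```

In practice the MINUS check is run in both directions (source-minus-target and target-minus-source) so that extra rows on either side are caught.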

Senior ETL Developer/Data Engineer

MasTech Digital
09.2021 - 01.2023
  • Client: Verizon, TX
  • Responsible for developing and migrating ETL processes in a cloud environment for Verizon's digital analytics module, transitioning data processing from on-premise Hadoop to Google Cloud Platform (GCP) BigQuery and PySpark on Dataproc.
  • ETL Migration and Development: Designed and implemented the migration of critical ETL jobs from Hadoop Hive to GCP BigQuery, using PySpark for distributed data transformations; ensured data consistency and integrity through robust testing and validation scripts.
  • Data Orchestration: Developed and optimized Composer DAGs to automate and manage job scheduling in Google Cloud Composer, ensuring that critical data pipelines ran efficiently and met SLAs.
  • Data Validation and Quality Assurance: Performed detailed data validations, comparing data between the old (Hadoop-based) and new (GCP-based) environments to ensure consistency and accuracy; wrote complex BigQuery SQL queries for metadata comparison and job validation.
  • Cloud and Big Data Tools: BigQuery, Google Cloud Composer, Dataproc, Apache Airflow, PySpark, Teradata utilities (BTEQ, MLOAD), Informatica PowerCenter, IICS
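A cross-environment validation like the one described above often reduces to comparing per-table metadata (row counts, sums) collected from each side; a minimal sketch, with hard-coded counts standing in for real Hive and BigQuery query results:

```python
# Illustrative sketch -- table names and counts are hypothetical stand-ins
# for results gathered from the legacy (Hive) and migrated (BigQuery) systems.

def compare_counts(hive_counts: dict, bq_counts: dict) -> list:
    """Return (table, hive_count, bq_count) tuples for any table whose
    row counts differ, or that exists on only one side (None = missing)."""
    mismatches = []
    for table in sorted(set(hive_counts) | set(bq_counts)):
        h = hive_counts.get(table)
        b = bq_counts.get(table)
        if h != b:
            mismatches.append((table, h, b))
    return mismatches

hive = {"clickstream": 1_204_331, "sessions": 88_410}
bq = {"clickstream": 1_204_331, "sessions": 88_409, "events": 17}
print(compare_counts(hive, bq))
# -> [('events', None, 17), ('sessions', 88410, 88409)]
```

An empty result signals that both environments agree at the row-count level; column-level checksums would be the natural next step.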

Senior ETL Developer/Data Engineer

Accenture
07.2016 - 09.2021
  • Client: Independence Blue Cross, Philadelphia, PA
  • Led the development and enhancement of data integration solutions for Independence Blue Cross (IBC), focusing on Informatica PowerCenter and Teradata, in support of multiple initiatives including healthcare campaigns and member onboarding.
  • ETL Design and Development: Built complex Informatica PowerCenter workflows and Teradata jobs for data extraction, transformation, and loading into the data warehouse, moving data from multiple healthcare platforms into a centralized repository for reporting and analytics.
  • Onboarding and Campaign Management: Designed and implemented an ETL process for onboarding new healthcare members, including data transfers via SFTP and file gateway systems, ensuring timely and accurate data for new-enrollee processes.
  • Agile Methodology: Actively participated in Agile Scrum ceremonies, including daily stand-ups, sprint planning, and retrospectives, ensuring that all data-related tasks were completed on time.
  • Automation and Optimization: Used shell scripting and Informatica scheduling to automate data processing tasks, improving resource utilization and minimizing manual intervention.
  • Tools Used: Informatica PowerCenter, Teradata, Unix Shell Scripting, IBM Tivoli Workload Scheduler, Git/Stash, Jenkins

Senior ETL Developer/Data Engineer

Accenture
04.2013 - 06.2016
  • Client: Independence Blue Cross, Philadelphia, PA
  • Worked on complex data integration solutions for Independence Blue Cross, focusing on data migration, automation, and campaign support for the healthcare provider.
  • Data Migration and ETL Development: Designed and developed ETL jobs to move healthcare data from various legacy systems into a production-ready Teradata EDW, supporting downstream reporting and analytics.
  • Campaign Data Support: Assisted marketing teams by designing data pipelines for tracking campaign performance and providing data-driven insights for healthcare-related campaigns.
  • Data Validation and Testing: Developed robust MLOAD scripts for fast data loading and validation, ensuring the quality and completeness of data after migration.
  • Tools Used: Informatica PowerCenter, Teradata BTEQ, Shell Scripting, Oracle DB, Unix

ETL Developer

Cognizant
02.2011 - 04.2013
  • Client: The McGraw-Hill Company
  • Contributed to projects involving data extraction and transformation using Informatica PowerCenter for the McGraw-Hill Company.
  • Led data synchronization efforts between MS SQL Server and Oracle DB, ensuring seamless integration between systems.

ETL Developer

Hewlett Packard
08.2008 - 01.2011
  • Designed and developed ETL mappings as part of the data warehouse initiative, handling large-scale data transformation and loading into the CDRW DB

Education

Master of Technology (M.Tech) - Computer Integrated Manufacturing

National Institute of Technology
01.2008

Bachelor of Engineering (B.E.) - Mechanical Engineering

Swami Ramanand Teerth Marathwada University
01.2006

Skills

  • ETL Development, Data Warehousing, Data Analysis, and Data Profiling
  • Databases: Snowflake, Oracle, Teradata (including utilities such as BTEQ, FastLoad, MultiLoad)
  • Google Cloud Platform (GCP) services: BigQuery, Cloud Storage, Cloud Composer, Dataproc
  • Big Data/Hadoop: Hive SQL, HDFS
  • Programming Languages: Python and Shell Scripting
  • Data Pipeline (Keystone)
  • ETL Tools: Informatica PowerCenter and IICS
  • UNIX and Shell Scripting
  • PySpark
  • Deployment: GitHub, Jenkins, Git/Stash
  • Other Tools: PuTTY, WinSCP, Jupyter Notebooks, Teradata SQL Assistant
  • Schedulers: IBM Tivoli Workload Scheduler, Informatica Scheduler, Cron Jobs, Apache Airflow, ESP (Elastic Scheduling Platform)

Certification

  • Google Cloud Certified - Associate Cloud Engineer, 02/01/25
  • Informatica Certified Developer, 2014

Additional Skills And Activities

  • Extensive experience in data profiling, data migration, and data quality assurance across multiple platforms.
  • Proficient in creating custom reports and ad-hoc data queries for internal stakeholders.
  • Expertise in implementing automated data pipelines for both batch and real-time processing environments.
  • Active participant in code reviews and test case development.

Technical Summary

Informatica PowerCenter, Informatica IICS, Teradata, Oracle, SQL Server, Hive, BigQuery, Snowflake SQL, Shell scripting, Python, Teradata BTEQ, TPT scripting, MLOAD, FASTLOAD, Google Cloud Platform (GCP), Cloud Composer, Dataproc, Dataflow, HDFS, PySpark, GitHub, Stash, Jenkins, TFS, PuTTY, WinSCP, SSHCA, Jupyter Notebooks, IBM Tivoli Workload Scheduler, Cron Jobs, Apache Oozie, ESP (Elastic Scheduling Platform)
