Sai Bhavana Gulla

Seattle, WA

Summary

Recent Master’s graduate with strong hands-on experience in Big Data technologies, specializing in PySpark and AWS cloud services. Skilled in designing and deploying scalable data pipelines, debugging production workloads, and optimizing Spark performance for large-scale data processing.

Overview

4 years of professional experience

Work History

Big Data Engineer

T-Mobile
12.2024 - 05.2025
  • Built data ingestion from Oracle and MySQL into S3, with the data queryable through Hive and Spark SQL tables.
  • Worked on Sqoop jobs for ingesting data from MySQL into Amazon S3.
  • Created the source and destination. Validated the source and final output data.
  • Used Hive external tables for querying the data.
  • Used Spark DataFrame APIs to ingest Oracle data into S3 and store it in Redshift. Wrote a script to load RDBMS data into Redshift.
  • Optimized Hive and Spark performance and identified errors using logs.
  • Automatically scaled up the EMR instances based on the data. Applied transformation rules on top of DataFrames.
  • Ran and scheduled the Spark script in EMR pipes. Processed Hive, CSV, JSON, and Oracle data at the same time (POC).

Big Data Engineer

IBM
07.2022 - 11.2023
  • Built data ingestion from Oracle and MySQL into S3, with the data queryable through Hive and Spark SQL tables.
  • Prepared Kafka producer code to get data from web logs. Installed and managed a Kafka cluster in the dev environment.
  • Used the Kafka Consumer APIs to read data from Kafka and processed it using Spark Streaming.
  • Ran SQL queries on top of Spark Streaming.
  • Built Spark Structured Streaming processes using PySpark.
  • Created regular expressions to clean data properly.
  • Created UDFs to clean data and fulfill client requirements.
  • Integrated Spark and Cassandra with proper version compatibility.
  • Automated all these activities using Oozie in AWS environment.

Data Analyst Intern

Virtusa
08.2021 - 06.2022


  • Analyzed data sets to identify trends and support decision-making processes and performed data cleaning over 50+ tables to improve quality and accuracy.
  • Developed 100+ SQL queries to extract, manipulate and analyze datasets for reporting purposes.
  • Automated reporting using Power Query and advanced Excel functions in Power BI.

Education

Master of Science - Computer Science

University of Illinois At Springfield
Springfield, IL
05-2025

Computer Science

Jawaharlal Nehru Technological University
India
07-2018

Skills

  • Tools/Frameworks: Spark, Hadoop, Hive, Sqoop, Oozie
  • Languages: Python, HiveQL, and CQL
  • IDE: IntelliJ IDEA, VSCode, PyCharm
  • Cloud: AWS, Azure
  • RDBMS: MySQL, Oracle, PostgreSQL
  • Power BI, Tableau, Excel, DAX Functions
  • Git, Linux, Data governance

Languages

English
Full Professional
Hindi
Full Professional
Telugu
Full Professional
