Sai Bhavana Gulla

Seattle, WA

Summary

Recent Master’s graduate with strong hands-on experience in Big Data technologies, specializing in PySpark and AWS cloud services. Skilled in designing and deploying scalable data pipelines, debugging production workloads, and optimizing Spark performance for large-scale data processing.

Overview

years of professional experience

Work History

Big Data Engineer

T-Mobile

Seattle, WA

12.2024 - 05.2025

Built data ingestion from Oracle and MySQL into Amazon S3, where the data can be queried through Hive and Spark SQL tables.
Worked on Sqoop jobs for ingesting data from MySQL into Amazon S3.
Created the source and destination, and validated the source and final output data.
Used Hive external tables for querying the data.
Used Spark DataFrame APIs to ingest Oracle data into S3 and store it in Redshift. Wrote a script to load RDBMS data into Redshift.
Optimized Hive and Spark performance and identified errors using logs.
Configured EMR instances to scale up automatically based on the data volume. Applied transformation rules on top of DataFrames.
Ran and scheduled the Spark script in EMR pipelines. Processed Hive, CSV, JSON, and Oracle data at the same time (POC).

Big Data Engineer

IBM

Bangalore

07.2022 - 11.2023

Built data ingestion from Oracle and MySQL into Amazon S3, where the data can be queried through Hive and Spark SQL tables.
Wrote Kafka producer code to collect data from web logs. Installed and managed a Kafka cluster in the dev environment.
Used the Kafka Consumer API to read data from Kafka and process it with Spark Streaming.
Ran SQL queries on top of Spark Streaming.
Built Spark Structured Streaming processes using PySpark.
Created regular expressions to clean data.
Created UDFs to clean data and fulfill client requirements.
Integrated Spark with Cassandra, ensuring version compatibility.
Automated these activities using Oozie in an AWS environment.

Data Analyst Intern

Virtusa

Hyderabad

08.2021 - 06.2022

Analyzed datasets to identify trends and support decision-making. Cleaned data across 50+ tables to improve quality and accuracy.
Developed 100+ SQL queries to extract, manipulate and analyze datasets for reporting purposes.
Automated reporting using Power Query and advanced Excel functions in Power BI.

Education

Master of Science - Computer Science

University of Illinois at Springfield

Springfield, IL

05-2025

Computer Science

Jawaharlal Nehru Technological University

India

07-2018

Skills

Tools/Frameworks: Spark, Hadoop, Hive, Sqoop, Oozie
Languages: Python, HiveQL, and CQL
IDE: IntelliJ IDEA, VSCode, PyCharm
Cloud: AWS, Azure

RDBMS: MySQL, Oracle, PostgreSQL
Power BI, Tableau, Excel, DAX Functions
Git, Linux, Data governance

Languages

English

Full Professional

Hindi

Full Professional

Telugu

Full Professional

Timeline

Big Data Engineer

T-Mobile

12.2024 - 05.2025

Big Data Engineer

IBM

07.2022 - 11.2023

Data Analyst Intern

Virtusa

08.2021 - 06.2022

Master of Science - Computer Science

University of Illinois at Springfield

Computer Science

Jawaharlal Nehru Technological University