As a Senior Data Engineer, I currently lead the design and development of end-to-end data pipelines on Databricks using PySpark, integrating AWS services such as S3, Lambda, and Redshift, and orchestrating complex workflows with Airflow to ensure efficient, scalable data processing. I have led cross-functional teams in implementing advanced data solutions on Snowflake and AWS, improving data accessibility and performance through optimized ELT workflows and robust data modeling practices.

With prior experience as both a Data Engineer and a Data Analyst, I specialize in building data pipelines with the Hadoop ecosystem (Spark, Hive, HDFS, MapReduce, YARN, Sqoop, Kafka, and Oozie) as well as Teradata, and I work with cloud platforms including AWS and Google Cloud. I have deep technical expertise in developing Spark applications using PySpark and Spark SQL, creating Hive tables with custom UDFs, and building reports in visualization tools such as Tableau and Amazon QuickSight. I have used cluster monitoring tools from Cloudera and Hortonworks and have hands-on experience with real-time data streaming via Kafka. My skill set includes advanced data manipulation with partitions, joins, and window functions, along with designing, testing, and maintaining data management systems using Spark, Hadoop, AWS, and shell scripting. I am proficient in Python, Core Java, SQL, and object-oriented design, with a strong background in creating stored procedures, triggers, and views for reliable data operations.

I also work closely with business users, product owners, and engineering teams in Agile environments to deliver data-driven features, translating technical outcomes into actionable business insights and aligning data strategies with organizational goals.