Raghavendra Rajarao Gari

Springfield, MO

Summary

  • Data Engineer with over 3 years of experience in designing, developing, and implementing robust data pipelines, ETL processes, and data architecture.
  • Experienced in building ETL pipelines using DBT and Informatica, ensuring efficient data transformation and integration processes.
  • Skilled in designing and optimizing cloud-based data warehouses with Snowflake and Amazon Redshift to support scalable and high-performance analytics.
  • Proficient in PySpark for big data processing, including large-scale transformations, Spark SQL operations, and data ingestion workflows.
  • Strong expertise in relational databases like MySQL, SQL Server, and PostgreSQL, along with NoSQL technologies such as MongoDB and Cassandra for flexible data modeling and storage.
  • Hands-on experience developing robust data pipelines and analytics solutions with Azure Data Factory, Synapse Analytics, Databricks, Cosmos DB, Azure Data Lake, and Azure DevOps.
  • Well-versed in big data ecosystem components, including HDFS, Apache Kafka, Apache Spark, and Zookeeper, for end-to-end large-scale data management solutions.
  • Solid understanding of data warehousing concepts, including dimensional modeling, data migration, data validation, data cleansing, data profiling, and complex ETL processes.
  • Proficient in AWS services such as EC2, S3, RDS, EMR, IAM, CloudFront, CloudWatch, SNS, SES, and Redshift, supporting cloud-based data engineering and analytics initiatives.

Overview

3 years of professional experience

Work History

Data Engineer Intern

Accenture
Bangalore, Karnataka, India
01.2023 - 04.2023
  • Used Python and Spark to build pipelines for processing payment data, handling about 500,000 records per day for a risk assessment task.
  • Built basic Airflow schedules to transfer files from AWS S3 into a reporting database, saving the team about 15 hours weekly on routine updates (a minimal sketch follows this list).
  • Worked on Kafka setups for a customer alert system, keeping data flowing in near real time for a small pilot project.
  • Arranged tables in AWS Redshift for a sales tracking report, making data pulls 25% quicker for the reporting group.
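
The S3-to-database transfer described in the second bullet above can be sketched as a small Airflow DAG. This is an illustrative assumption of how such a job is typically wired, not the original code: the bucket, key, connection IDs, and staging table (example-bucket, aws_default, reporting_db, staging.payments) are hypothetical placeholders.

    # Hypothetical sketch: all names (bucket, key, connections, table) are placeholders.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from airflow.providers.amazon.aws.hooks.s3 import S3Hook
    from airflow.providers.postgres.hooks.postgres import PostgresHook


    def load_s3_file_to_db(**context):
        # Download the daily extract from S3, then bulk-load it into Postgres.
        s3 = S3Hook(aws_conn_id="aws_default")
        local_path = s3.download_file(key="daily/payments.csv", bucket_name="example-bucket")
        pg = PostgresHook(postgres_conn_id="reporting_db")
        pg.copy_expert("COPY staging.payments FROM STDIN WITH CSV HEADER", filename=local_path)


    with DAG(
        dag_id="s3_to_reporting_db",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        PythonOperator(task_id="load_s3_file_to_db", python_callable=load_s3_file_to_db)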

Data Security Analyst - L2

Wipro Technologies
Bangalore, Karnataka, India
09.2020 - 09.2022
  • Monitored and analyzed data movement across various channels, including Exchange (Office 365) and endpoint security solutions (McAfee), ensuring compliance with security policies.
  • Assisted in querying and analyzing security event logs using SQL, extracting insights from structured data to identify anomalies and trends.
  • Developed basic Python scripts to automate data extraction and preprocessing, improving efficiency in security incident analysis.
  • Gained hands-on experience in ETL processes, helping to structure and transform security logs for reporting and visualization.
  • Worked with data pipelines to centralize incident logs, supporting real-time monitoring and alerting mechanisms.
  • Collaborated with teams to implement data access policies and ensure secure handling of sensitive information.
  • Escalated security incidents and performed data-driven investigations, ensuring logical closure and mitigating potential data leaks.

Education

Master of Science - Computer Science

Missouri State University
Springfield, MO

Skills

Programming & Scripting: Python, SQL, PL/SQL, JavaScript, C, and Shell Script

Databases & Data Warehousing: Snowflake, Redshift, PostgreSQL, SQL Server, MySQL, MongoDB, Cassandra

Big Data & Distributed Systems: Apache Spark, Hadoop, Hive, Kafka, Databricks, Apache Flink, Spark SQL, Spark Core, MapReduce, Spark Streaming

ETL Data Pipelines: Informatica, DBT

Cloud Platforms: AWS, Azure

Scheduler Tools & APIs: Apache Airflow, REST APIs

Data Visualization & Analytics: Power BI, Tableau, Matplotlib, Pandas, NumPy

DevOps & CI/CD Tools: Docker, Kubernetes, Terraform, GIT, GitHub

Project Management and Operating Systems: Jira, ServiceNow, Confluence, Windows, Linux, Unix, MacOS

Projects

Real-Time Streaming Analytics on E-Commerce Transactions

Developed a real-time data pipeline to process e-commerce transactions using Kafka for streaming and PySpark for data transformation. Stored raw data in AWS S3 and performed analysis using Redshift and SQL to derive customer purchase trends. Built interactive visualizations in Matplotlib to identify high-demand products and sales patterns, improving business decision-making.
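
A minimal PySpark Structured Streaming sketch of this pipeline is shown below, under assumed names: the Kafka topic (transactions), broker address, event schema, and S3 paths are illustrative placeholders rather than the project's actual configuration.

    # Illustrative sketch; topic, schema, and S3 paths are assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

    spark = SparkSession.builder.appName("ecommerce-stream").getOrCreate()

    schema = StructType([
        StructField("order_id", StringType()),
        StructField("product_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_time", TimestampType()),
    ])

    # Read the transaction topic as a stream and parse the JSON payload.
    orders = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "transactions")
        .load()
        .select(from_json(col("value").cast("string"), schema).alias("order"))
        .select("order.*")
    )

    # Land raw events in S3 as Parquet; Redshift and SQL analysis run on this data downstream.
    (
        orders.writeStream.format("parquet")
        .option("path", "s3a://example-bucket/raw/transactions/")
        .option("checkpointLocation", "s3a://example-bucket/checkpoints/transactions/")
        .trigger(processingTime="1 minute")
        .start()
    )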

Sales Data Pipeline & Analysis Using PySpark and AWS

Built an end-to-end ETL pipeline to process and analyze sales data for a retail business. Ingested raw sales data from CSV files stored in AWS S3 and used AWS Glue to clean and transform data. Loaded the processed data into AWS Redshift for further analysis. Used PySpark and SQL to extract insights such as top-selling products, revenue trends, and customer behavior. Created visual reports using Matplotlib to present key findings, helping businesses make data-driven decisions.
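
The analysis step of this pipeline can be sketched in PySpark as below; the bucket paths and column names (order_id, product_id, revenue, order_date) are assumptions for illustration, and in the actual project the cleaning was handled by AWS Glue rather than this script.

    # Illustrative sketch; paths and column names are assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sales-etl").getOrCreate()

    # Ingest the raw CSV extract from S3 and apply basic cleaning.
    sales = (
        spark.read.option("header", True).option("inferSchema", True)
        .csv("s3a://example-bucket/raw/sales/")
        .dropDuplicates(["order_id"])
        .na.drop(subset=["product_id", "revenue"])
    )

    # Top-selling products by revenue.
    top_products = (
        sales.groupBy("product_id")
        .agg(F.sum("revenue").alias("total_revenue"), F.count("*").alias("orders"))
        .orderBy(F.desc("total_revenue"))
        .limit(10)
    )

    # Monthly revenue trend.
    monthly_revenue = (
        sales.withColumn("month", F.date_trunc("month", F.col("order_date")))
        .groupBy("month")
        .agg(F.sum("revenue").alias("revenue"))
        .orderBy("month")
    )

    # Write curated outputs back to S3 as Parquet for loading into Redshift (e.g. via COPY).
    top_products.write.mode("overwrite").parquet("s3a://example-bucket/curated/top_products/")
    monthly_revenue.write.mode("overwrite").parquet("s3a://example-bucket/curated/monthly_revenue/")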

Certifications

  • Data Science Certification - Missouri State University

Timeline

Data Engineer Intern

Accenture
01.2023 - 04.2023

Data Security Analyst - L2

Wipro Technologies
09.2020 - 09.2022

Master of Science - Computer Science

Missouri State University