Avinash Vamshi Kanthreegala

Richardson, Texas

Overview

3 years of professional experience
1 certification

Work History

Data Engineer

Firstzen
Hyderabad, Telangana
09.2021 - 07.2022
  • Designed and implemented ETL processes with Apache Spark and Apache Airflow, loading data from 25 sources into the Data Warehouse over 10 months.
  • Utilized Business Objects, QlikView, and Power BI to design and develop more than 80 comprehensive reports.
  • Developed and deployed 15 Python scripts to extract data from web services APIs and load it into databases over 10 months.
  • Developed and deployed around 10 SSIS packages to streamline ETL processes, loading data from multiple sources into a consolidated data warehouse.
  • Collaborated with solution architects to define database and analytics engagement strategies for operational territories and key accounts, supporting strategy implementations across multiple projects during a 10-month tenure.
  • Designed, implemented, and maintained Hadoop applications using Java, Pig, Hive, and MapReduce to process and analyze large-scale datasets efficiently.

SQL Developer

Firstzen
Hyderabad, Telangana
06.2019 - 08.2021
  • Developed approximately 200 stored procedures, functions, and triggers to address various application requirements over two years.
  • Proactively identified and resolved slow-running queries and deadlocks, minimizing disruptions and improving system performance.
  • Conducted unit testing for every database object, including tables, views, stored procedures, functions, and triggers, achieving a 100% coverage rate before deployment to production.
  • Developed over 30 intricate reports in SQL Server Reporting Services (SSRS), tailored to diverse business requirements.
  • Designed and implemented 5 multidimensional or tabular models in Microsoft SQL Server Analysis Services (SSAS) over two years, enhancing data analysis and reporting in the organization's business intelligence environment.
  • Generated logical and physical database descriptions, incorporating database identifiers, to communicate database structures to management systems.

Education

Master of Science - Business Analytics

The University of Texas At Dallas
05.2024

Skills

Languages

  • Python
  • SQL

Databases

  • RDBMS (MySQL, SQL Server)
  • NoSQL (MongoDB)

Big Data/Cloud

  • Apache Spark
  • Apache Airflow
  • Hadoop
  • AWS Redshift

Tools

  • QlikView
  • Power BI
  • Microsoft SQL Server Analysis Services (SSAS)
  • SSRS
  • SSIS
  • Docker
  • Kubernetes

Projects

Sparkify Music Data ETL Pipeline with Apache Airflow

  • Designed and implemented an ETL pipeline with Apache Airflow to ingest Sparkify's music data into an AWS Redshift Data Warehouse on an hourly basis.
  • Used Python scripting for data extraction, transformation, and loading, employing a star schema design to facilitate efficient querying and aggregation.
  • Enabled Sparkify to analyze user behavior, song popularity, and artist preferences, contributing to data-driven decision-making and enhancing user experience.

Big Data Analytics with Hadoop: Baseball Statistics Analysis

  • Analyzed and processed over 10 years of play-by-play baseball statistics data from Retrosheet using Hadoop ecosystem tools, totaling more than 100,000 records, highlighting proficiency in handling large-scale datasets.
  • Developed and executed Pig scripts within the Hadoop environment to accurately count the number of games represented and establish relationships between player IDs and names, contributing to comprehensive data analysis and insights generation.
  • Demonstrated proficiency in Hadoop ecosystem tools by successfully storing and processing data on HDFS, showcasing ability to work with distributed computing frameworks for big data projects.

Bankruptcy Prediction Model

  • Built a bankruptcy classification model in Python using gradient boosting to predict financial distress.
  • Employed SMOTE to oversample the minority class in the imbalanced data and achieved ~0.93 AUC.

Marketing Analytics for Conagra Brands

  • Used pricing analytics and consumer behavior insights to identify optimal price points for Conagra's product portfolio across diverse market segments, accounting for consumer demographics and product attributes, leading to an 11% increase in profits the following quarter.

Railway Control System Database

  • Designed, implemented and populated a relational database schema for a Railway Control System in MySQL.
  • Analyzed the railway control system database to extract actionable insights, including identifying operational inefficiencies, understanding passenger behavior, and optimizing revenue generation strategies.

Certification

  • AWS Certified Solutions Architect - Associate

Accomplishments

  • Received the Team Player award in 2021 in recognition of exceptional teamwork and collaboration within the DBA team.
  • Served as head of the Technical Workshops Organizers for one year during undergraduate studies, improving student participation by 40% compared to the previous academic year.

Timeline

Data Engineer

Firstzen
09.2021 - 07.2022

SQL Developer

Firstzen
06.2019 - 08.2021

Master of Science - Business Analytics

The University of Texas At Dallas