Experienced professional with a strong emphasis on design, development, and implementation of data-centric applications with a combination of technical, process, and business-oriented skills
Extensive Data Engineering experience in implementing complex data pipelines to ingest, process, load, and transform large structured, unstructured, and semi-structured data sets at scale
Skilled in processing and analyzing data with Apache Spark components (Spark Core, Spark SQL) in Python on Databricks
Background includes data mining, warehousing, and analytics. Proficient in machine learning and deep learning. Quality-driven and hardworking with excellent communication and project management skills.
Created configuration-based pipelines using Spark batch processing and Delta storage with PySpark in the new Databricks environment
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
Identified and offloaded historical data and periodic deltas from Netezza to Azure Data Lake using PySpark scripts on Databricks
Efficiently designed and implemented complex data loads such as changing fact loads, data pivoting loads, recursive hierarchies, and data archival processes
Designed downstream databases and created data dictionaries
Ingested data into the data lake using an open-source Hadoop distribution to process structured, semi-structured, and unstructured datasets
Worked with Netezza for data import/export operations from different data marts
Worked extensively with data migration, data cleansing, data profiling, and ETL processes for data warehouses
Hands-on experience with Azure Cloud: stored data in Azure Data Lake Storage (ADLS) and worked with App Services, Databricks clusters for running jobs, Azure Synapse, Virtual Machines, Azure AD, Azure Search, and Notification Hubs
Designed, configured, and deployed Microsoft Azure for a multitude of applications utilizing Azure Data Factory (including Compute, Blobs, Resource Groups, Azure SQL, and Cloud Services), focusing on high availability, fault tolerance, and auto-scaling.
Developed, implemented, and maintained data analytics protocols, standards, and documentation.
IT Tech Analyst
California Lutheran University
Thousand Oaks, CA
10.2019 - 05.2021
Performed inventory analysis and tested dynamic forms
Assisted the IT department in troubleshooting minor technical issues and setting up A/V requests for events and meetings.
Applied knowledge of IT best practices to tackle new challenges and make educated decisions.
Data Mining and Machine Learning Intern
XpertReview Software Solution
Thousand Oaks, CA
05.2020 - 08.2020
Designed, developed, and delivered machine learning-enabled solutions to a data extraction problem in HR
Streamlined the use of machine learning techniques by applying NLTK, clustering, and classification in Python for data analysis.
Created customized applications to make critical predictions, automate reasoning and decisions, and run optimization algorithms.
Identified new problem areas and researched technical details to build innovative products and solutions.
SQL Developer/Analyst
Appstek Solutions Pvt Ltd
Hyderabad, TG
07.2017 - 06.2019
Client: Fidelity National Financial
Built Tableau dashboards for the team, facilitating immediate reaction to failure points in the product pipeline and reducing the failure rate by 15%
Created data modeling standards and procedures
Developed numerous custom reporting solutions using SQL to deliver unique, innovative reports to executive leadership
Developed customized functions, packages, and triggers based on business requirements
Performed detailed data validation spanning several different international projects.
Participated in software field testing to verify the performance of developed projects.
Education
Master of Science - Information Technology-Data Analytics Conc
California Lutheran University
Thousand Oaks, CA
09.2019 - 05.2021
Bachelor of Technology - Computer Science
JNTU Hyderabad
09.2013 - 05.2017
Skills
TECHNICAL EXPERIENCE AND SOFTWARE KNOWLEDGE
Accomplishments
Orchestrated CI/CD pipelines to provision Azure resources and deploy code using Azure CI/CD and GitLab
Ingested, parsed, and processed data in various formats, including JSON, CSV, and Parquet, received through various modes, including REST APIs, using Python
Data Analysis | Ventura County Fire Department:
Description: The goal of the project is to use Tableau to analyze the number and types of accidents occurring in Ventura County over a period of time, broken down by type of emergency request
Involved in analyzing the data produced by Ventura County Fire Department
Responsible for finding the performance KPIs of fire captains and using geo-analytics to find insights about stations and accident locations.
Certification
AWS Certified Developer – Associate
Timeline
AWS Certified Data Analytics - Specialty
11-2022
SnowPro Core Certification
12-2021
Data Engineer
Apexon
06.2021 - Current
AWS Certified Developer – Associate
05-2021
Data Mining and Machine Learning Intern
XpertReview Software Solution
05.2020 - 08.2020
IT Tech Analyst
California Lutheran University
10.2019 - 05.2021
Master of Science - Information Technology-Data Analytics Conc