Bhanusree Alaparthy

Allentown

Summary

Meticulous Data Engineer accomplished in compiling, transforming, and analyzing complex datasets. Expert in machine learning and large-scale data management. Demonstrated success in identifying relationships in data and building solutions to business problems.

Overview

8 years of professional experience

Work History

Data Analyst

Red Hot
01.2024 - 09.2024
  • Designed, developed, and implemented analytics solutions on Google Cloud Platform (GCP) to extract actionable insights from diverse datasets
  • Collaborated with stakeholders to understand business requirements and translate them into analytical solutions using GCP services such as BigQuery, Data Studio, and Dataflow
  • Developed and maintained data pipelines for data extraction, transformation, and loading (ETL) to ensure data quality, accuracy, and availability
  • Designed and optimized data models for efficient storage and retrieval, ensuring scalability and performance
  • Enhanced the data migration process to GCP using Cloud Dataflow and migrated multiple sources to BigQuery
  • Implemented Airbyte to create data pipelines between sources such as Facebook Marketing, YouTube, and Konnective and BigQuery
  • Developed and executed numerous SQL queries for data extraction, transformation, and loading
  • Strong proficiency in working with structured and unstructured data, including data extraction, transformation, and loading (ETL) processes and data manipulation for analytics
  • Extensive experience with SQL for complex query building, performance optimization, joins, indexing, and writing stored procedures across relational databases
  • Developed complex SQL queries to extract, transform, and analyze data, supporting business decision-making and ensuring data integrity across various projects
  • Built tools using Looker to allow internal and external teams to visualize and extract insights from big data platforms
  • Built and maintained dynamic Tableau dashboards, enabling real-time data visualization and empowering stakeholders with actionable insights
  • Designed, built, and deployed a set of Python modeling APIs for customer analysis, integrating multiple machine learning techniques for user behavior prediction
  • Developed and maintained Python-based applications and software systems
  • Implemented and enforced data governance frameworks, ensuring data accuracy, consistency, and integrity across enterprise-wide systems
  • Established and maintained data quality standards, performing regular audits and resolving data discrepancies to support business decisions
  • Designed and implemented leading-edge analytics and reporting systems to translate complex business challenges into actionable insights, driving strategic decisions
  • Identified growth opportunities and operational trends through advanced analysis of complex datasets, enabling data-driven initiatives
  • Developed automated workflows and tools to reduce manual processes, increasing operational efficiency and saving valuable time
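
A data-quality audit like the ones described above can be sketched in plain Python; the function, field, and check names here are illustrative assumptions, not taken from the actual projects:

```python
from collections import Counter

def audit_rows(rows, required_fields, key_field):
    """Basic data-quality audit: missing required fields and duplicate keys.

    Illustrative sketch only; real audits ran against warehouse tables.
    """
    issues = {"missing": [], "duplicates": []}
    # Flag rows whose required fields are empty or absent
    for i, row in enumerate(rows):
        for field in required_fields:
            if not row.get(field):
                issues["missing"].append((i, field))
    # Flag key values that occur more than once
    counts = Counter(row.get(key_field) for row in rows)
    issues["duplicates"] = [k for k, n in counts.items() if n > 1]
    return issues

rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 1, "email": "c@example.com"},
]
report = audit_rows(rows, required_fields=["email"], key_field="id")
```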

Data Python Analyst

Truist Financial
05.2023 - 12.2023
  • Managed the UI/UX team for the front-end development of the project
  • Created data mapping and data dictionary documentation for ETL and application support, metadata, and DML as required
  • Responsible for extracting, transforming, and loading (ETL) data from various sources into the organization's data systems, working with different data formats, APIs, databases, and data integration tools to ensure data accuracy and consistency
  • Leveraged Excel's features and functions to improve efficiency and accuracy in data management and analysis tasks
  • Built tools using Tableau to allow internal and external teams to visualize and extract insights from big data platforms
  • Responsible for expanding and optimizing data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams
  • Developed Spark jobs using Spark SQL in Databricks for extraction and aggregation from multiple file formats, analyzing and transforming data into customer usage patterns
  • Expert in SQL with a deep understanding of complex queries, database optimization, and data modeling across both relational (SQL) and non-relational (NoSQL) databases
  • Hands-on experience with Database Management for relational databases like MySQL and PostgreSQL, as well as NoSQL databases like MongoDB
  • Collaborated with cross-functional teams to define and document data policies, standards, and procedures, aligning with organizational and regulatory requirements
  • Designed, scheduled, and monitored batch job processing using JCL and CA7 on IBM Mainframe, ensuring efficient workflow execution and reducing processing errors by 20%
  • Processed millions of daily financial transactions by developing and maintaining batch programs on IBM Mainframe, adhering to strict compliance and regulatory standards
  • Designed and implemented disaster recovery strategies for IBM Mainframe systems, achieving seamless data restoration and minimal downtime during system outages
  • Monitored compliance with data governance policies, providing actionable insights to improve adherence and reduce risk
  • Developed and maintained data lineage documentation to enhance transparency in data flow and usage across various business processes
  • Followed the Agile development with Jira as a development management and issue-tracking tool
  • Created a confluence page for developing best practices and project documentation
  • Automated data quality alerts to a Slack channel and email using Databricks, Python, and HTML, notifying users of any issues with the data
  • Collaborated with cross-functional teams to design, implement, and deploy cloud-based microservices on AWS, ensuring scalability and high availability
  • Performed data comparison between SDP (Streaming Data Platform) real-time data, AWS S3 data, and Snowflake data using Databricks, Spark SQL, and Python
  • Developed and maintained complex software systems utilizing object-oriented principles and advanced Python techniques to model business processes and improve efficiency
  • Environment: Power BI, R, Python, AWS, RSDS Teradata SQL Assistant, ServiceNow, Google Cloud Platform, MySQL, XML, Postman, Spark 2.4, Spark SQL, Kafka 2.3.0, Apache Airflow 1.10.4, Snowflake, Databricks
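
The Databricks-to-Slack alerting described above might be sketched roughly like this; the table name, check messages, and webhook URL are hypothetical, with Slack's standard incoming-webhook endpoint supplied at runtime:

```python
import json
import urllib.request

def build_alert(table, failed_checks):
    """Format a data-quality alert message for Slack or email."""
    lines = [f"Data quality issues detected in `{table}`:"]
    lines += [f"- {check}" for check in failed_checks]
    return "\n".join(lines)

def send_slack_alert(webhook_url, text):
    """POST the alert to a Slack incoming webhook (needs a real webhook URL)."""
    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        webhook_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example message only; send_slack_alert would run from the scheduled job
msg = build_alert("orders", ["null customer_id in 12 rows", "duplicate order_id values"])
```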

Insights and Data Analyst

CAPGEMINI - ANZ Bank
06.2021 - 10.2022
  • Company Overview: Australia
  • Followed Agile Software Development Methodology to build the application iteratively and incrementally
  • Participated in scrum related activities and daily scrum meetings
  • Developed and implemented quantitative models (e.g., Value-at-Risk, Monte Carlo simulations) for measuring and managing financial risk, improving predictive accuracy
  • Worked with the Group Technology team on removing risk dependencies, analyzing which systems and applications were involved
  • Designed and maintained risk management dashboards and reports for senior executives, providing real-time insights on risk exposure, capital adequacy, and key financial risk metrics
  • Built and optimized SQL queries in Teradata for data extraction, transformation, and loading (ETL) in a high-volume financial data environment
  • Designed and executed Teradata SQL scripts to analyze customer behavior and generate actionable insights for marketing campaigns
  • Analyzed and monitored credit risk exposures for corporate clients, developing risk metrics and policies to maintain credit quality and avoid default incidents
  • Documented and analyzed all expenses and maintained and managed relevant databases
  • Maintained cash logs, produced detailed reports, and processed all payments
  • Worked on customer clustering using machine learning and statistical modeling, including building predictive models
  • Developed data models that streamlined data processing pipelines in the Azure environment, resulting in an increase of 30% in productivity
  • Wrote Python scripts to parse JSON documents and load the data into the database
  • Worked on Google Cloud Platform to train and test datasets, and created a pipeline with the help of BigQuery
  • Installed, configured, and hosted the Tomcat app servers and MySQL database servers on physical servers (Linux, Windows), and Amazon AWS virtual servers (Linux)
  • Documented all the data transformations, validations, downstream impacts, and API's and provided guidance to the engineers
  • Performed SQL queries using the RSDS Teradata SQL Assistant in order to analyze the daily feeds of the banking systems
  • Used AWS services like EC2 for deployments, S3 for storage and SES, SQS for sending notifications
  • Created platform infrastructure with AWS (EC2, RDS, ELB) and used Jenkins to run the automated deployments
  • Implemented business models with Tableau and Power BI and used DAX expressions to effectively communicate business insights
  • Designed, built, and deployed a set of Python modeling APIs for customer analysis, integrating multiple machine learning techniques for user behavior prediction
  • Performed exploratory data analysis using R, and generated various graphs and charts for analyzing data using Python libraries
  • Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user referring
  • Applied machine learning algorithms with standalone Spark using R/Python
  • Environment: Power BI, R, Python, Tableau, AWS, RSDS Teradata SQL Assistant, Google Cloud Platform, MySQL, XML, Postman, Outlook, Spark framework, MS Access, MS Excel, Jupyter Notebooks, Oracle, Jenkins, Linux, Windows
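
The JSON-parsing scripts noted above could look like this, with SQLite standing in for the production database (table and column names are invented for the example):

```python
import json
import sqlite3

def load_json_records(db, json_text):
    """Parse a JSON array of records and insert them into a SQLite table."""
    records = json.loads(json_text)
    db.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER PRIMARY KEY, name TEXT)")
    # Named placeholders map each dict's keys onto the insert columns
    db.executemany("INSERT INTO customers (id, name) VALUES (:id, :name)", records)
    db.commit()
    return len(records)

conn = sqlite3.connect(":memory:")
doc = '[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]'
n = load_json_records(conn, doc)
```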

Data Engineer

Aussie Broadband
11.2019 - 06.2021
  • Company Overview: Melbourne, Australia
  • Improved overall user experience through support, training, troubleshooting, improvements, and communication of system changes
  • Performed statistical data processing techniques such as sampling, time series, estimation, correlation, and regression using R
  • Applied data mining techniques including LR, classification, and clustering
  • Used Jupyter Notebooks (NumPy, seaborn, pandas, SciPy) and Spark (PySpark, MLlib) to develop a variety of models and algorithms for analytic purposes
  • Collaborated with data engineers to implement ETL processing and optimized SQL queries to perform data extraction to fit the analytical requirements
  • Utilized NLP (Natural Language Processing) techniques to improve customer satisfaction
  • Implemented the framework to migrate Relational data to non-relational data stores and to run performance tests against different NoSQL vendors
  • Designed rich data visualizations to model data into human-readable form with Tableau and Power BI
  • Performed numerous SQL queries in Teradata SQL workbench to prepare the right datasets for Tableau dashboards; queries involved retrieving data from multiple tables using various join conditions, enabling optimized data extraction for Tableau workbooks
  • Worked on data cleaning and ensured data quality, consistency, and integrity using Pandas and NumPy
  • Environment: R, AWS, NoSQL, MySQL, Jupyter Notebooks, Python, Power BI, Business Intelligence, Wi-Fi routers and switches, MapReduce, Hadoop, Tableau, NLP, Teradata, Git, Agile/Scrum, Hive, Pig, Oracle, Tomcat Server
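
A minimal version of the Pandas/NumPy cleaning step listed above, with a DataFrame and column names invented for illustration: drop duplicate rows, then impute missing numeric values with the column median.

```python
import numpy as np
import pandas as pd

def clean_frame(df):
    """Drop duplicate rows, then fill missing numeric values with the column median."""
    df = df.drop_duplicates()
    for col in df.select_dtypes(include=[np.number]).columns:
        df[col] = df[col].fillna(df[col].median())
    return df

raw = pd.DataFrame({
    "user": ["a", "b", "b", "c"],
    "usage_gb": [10.0, np.nan, np.nan, 30.0],
})
clean = clean_frame(raw)
```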

Data Engineer

Glo project
11.2018 - 04.2019
  • Company Overview: Melbourne, Australia
  • Synthesized current business intelligence data to produce reports and polished presentations, highlighting findings and recommending changes
  • Worked on XPLUR, a website that helps students, and developed all the requirements given by the client
  • Performed data scraping using the tool Octoparse to scrape LinkedIn and Seek data for loading into the database
  • Used the third-party application Caspio for the backend to manage the data
  • Analyzed SAP transactions to build logical business intelligence model for real-time reporting needs
  • Environment: Excel, XPLUR, Octoparse, Caspio, SAP

Data Engineer

Tech Mahindra
01.2017 - 04.2018
  • Company Overview: India
  • Used SSIS to create ETL packages to validate, extract, transform, and load data into the data warehouse and data marts
  • Created views, table-valued functions, joins, and complex subqueries to provide reporting solutions
  • Optimized query performance by modifying T-SQL queries, removing unnecessary columns and redundant data, normalizing tables, establishing joins, and creating indexes
  • Created SSIS packages using Pivot transformations, fuzzy lookups, derived columns, conditional splits, and data flow tasks
  • Wrote SQL queries and PL/SQL procedures, functions, triggers, sequences, cursors, etc.
  • Migrated data from the SAS environment to SQL Server 2008 via SQL Server Integration Services (SSIS)
  • Built REST APIs to easily add new analyses or issuers to the model
  • Developed dynamic performance and finance reports (profit and loss statements, funding reports, profitability and gross margin, etc.) using SSRS, ran the reports monthly, and distributed them to the respective departments through mail server subscriptions
  • Used SAS/SQL to pull data from databases and aggregate it to provide detailed reporting based on user requirements
  • Environment: SSIS, SQL, Java, PL/SQL, T-SQL, SAS, REST APIs, SSRS, Agile/Scrum, SharePoint 2010, Visual Studio 2010, DB2, SQL Server Management Studio, Oracle
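
The views, indexes, and reporting queries above can be illustrated with SQLite standing in for SQL Server; the schema and figures are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    -- Index to speed up region lookups, as with the T-SQL index tuning above
    CREATE INDEX idx_sales_region ON sales(region);
    -- Reporting view, analogous to the views built for SSRS reports
    CREATE VIEW region_totals AS
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region;
""")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 100.0), ("south", 50.0), ("north", 25.0)],
)
totals = dict(conn.execute("SELECT region, total FROM region_totals ORDER BY region"))
```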

Education

Master of Science - Data Analytics

Swinburne University of Technology
Melbourne, Australia
04-2020

Bachelor of Science - Engineering

Amrita University of Technology
Coimbatore, India
04-2017

Skills

Programming Languages

Java, Python, SQL, T-SQL, HTML, CSS

Databases

Oracle, Snowflake, MS Access, MySQL, RSDS Teradata SQL Assistant, MongoDB, Hadoop, Caspio, Apache Honeycomb

Tools

Power BI, Tableau, SQL Developer, Excel, Microsoft Office, Jupyter Notebooks, IBM SPSS, Jira, Microsoft Azure, AWS, Balsamiq, Visio, PowerDesigner

Version Control

GitHub, SVN, Mockito

Platforms

Windows, UNIX, Linux, Ubuntu

Cloud Technologies

Amazon AWS (EC2, EBS, EMR, RDS, IAM, Glue), Google Cloud Platform
