Over 14 years of experience across diverse data disciplines, including Machine Learning/Deep Learning Engineering, Data Engineering, Data Science, Data Analysis, Database Architecture, Development, and Performance DBA roles.
Expertise in designing, creating, and deploying AI-driven software solutions using data engineering, machine learning, deep learning, and statistical methodologies.
Proven ability to design and implement scalable data engineering solutions on Azure, AWS, and custom platforms.
Proficient in deploying machine learning solutions in production using microservices architecture with Docker and Kubernetes.
Skilled in building user-friendly interfaces for data analytics solutions using Angular, Node.js, Flask, and Java.
Proficiency in data preprocessing and summarization using AWS, Azure, Databricks, SQL, PL/SQL, Python, and Node.js.
Key Technical Expertise:
1. **Machine Learning/Deep Learning:**
- **Regression Analysis:** Linear, Polynomial, Decision Tree Regression, Random Forest, SVR, Lasso, Ridge, and Elastic Net.
- **Classification Models:** Logistic Regression, K-NN, SVM, Naive Bayes, Decision Tree, and Random Forest.
- **Natural Language Processing:** Sentiment analysis, custom analyzers, entity extraction, and word embeddings.
- **Clustering:** K-Means, Hierarchical Clustering.
- **Association Analysis:** Apriori, Eclat, and Crosstab.
- **Neural Networks:** ANN and RNN.
2. **Data Engineering:**
- Advanced expertise in database administration, data modeling, ETL pipeline creation, and performance optimization.
- Extensive experience with SQL/PL-SQL and Python-based custom ETL processes.
- Expertise in streaming data solutions using Kafka and Golden Gate.
- Proficient in setting up and managing environments for real-time data processing using Databricks, Spark, and Azure Synapse Analytics.
- Skilled in integrating cloud storage solutions like Azure ADLS, BLOB, AWS S3, and Redshift.
3. **Cloud Platforms & Services:**
- **Azure:** Databricks, ADLS, Synapse Analytics, Data Factory, Key Vault, and Stream Analytics.
- **AWS:** S3, Lambda, Glue, Redshift, Kinesis, Athena, SageMaker, and Lake Formation.
4. **Visualization & Text Mining:**
- Proficient in data visualization with D3/DC, Matplotlib, and Tableau.
- Experience in text mining with NLTK for sentiment analysis and keyword extraction.
5. **Database and Performance Optimization:**
- Expertise in database tuning and optimization using RAT tools such as DB Replay and SPA.
- Strong background in optimizing data processing jobs and enhancing query performance.
6. **Certifications:**
- Oracle 10g OCA Certified.
- AWS Solutions Architect Professional Certified.
**GitHub Portfolio:**
- Developed a Flask-based DC web application: [GitHub Repository](https://github.com/EswaraRaoBudha/Flask_DC_WebApplication.git).
Overview
14 years of professional experience
1 Certification
Work History
Associate Director of Data Engineering
Merck
06.2022 - Current
Leading the global Merck Vaccine use case, which spans the USA, Japan, AP, and EU regions.
Led teams of up to 10 personnel, supervising daily performance as well as training and improvement plans.
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
Modeled predictions with feature selection algorithms.
Leveraged mathematical techniques to develop engineering and scientific solutions.
Analyzed large datasets to identify trends and patterns in customer behaviors.
Improved data collection methods by designing surveys, polls and other instruments.
Devised and deployed predictive models using machine learning algorithms to drive business decisions.
Implemented randomized sampling techniques for optimized surveys.
Developed polished visualizations to share results of data analyses.
Compiled, cleaned and manipulated data for proper handling.
Ran statistical analyses within software to process large datasets.
Applied loss functions and variance explanation techniques to compare performance metrics.
Organized and detail-oriented with a strong work ethic.
Worked flexible hours across night, weekend, and holiday shifts.
Identified issues, analyzed information and provided solutions to problems.
Manager Data Architect
Tech Mahindra Ltd.
01.2011 - Current
Playing a key role in designing, creating, and implementing data engineering solutions, multi-cloud data engineering solutions, and AI solutions using machine learning, deep learning, and statistical methods
Working closely with business leads to understand the business data
Built data engineering solutions such as custom data pipelines using Python, SQL, PL/SQL, and Node.js
Creating new architectural designs for business needs, setting up POCs, and productionizing the new designs
Created and implemented data models for both SQL and NoSQL databases (Oracle, MS SQL Server, PostgreSQL, and Cassandra)
Built an analytical platform for multipurpose needs such as reporting, visualization, and ML model building
Strong experience working with both on-prem and multi-cloud services
Worked on setting up Azure Databricks clusters for ML environments
Created data models in Azure Delta Lake to store high-volume application-call data
Created Databricks clusters and notebooks to process millions of real-time application calls for anomaly detection using Event Hubs and Databricks
Created Spark code to process real-time and analytics data using PySpark and SQL
Created a POC with Azure Stream Analytics
Configured Databricks secrets with Azure Key Vault
Mapped ADLS to Delta Lake
Provided different storage solutions using Blob Storage and ADLS (hot, cool, and archive tiers)
Created scheduler processes for day-to-day data processing, including historical and near-real-time data processing for rep-metrics dashboards
Generated ML feature data for training ML models
Processed text data for sentiment identification
Processed millions of records to generate accessory-vs-device recommendations
Created monitoring and notification services for tracking failures
Created application endpoints to fetch data for business needs in real time
Deployed custom pipelines using CI/CD methods (Docker and Kubernetes)
Worked with Azure Analysis Services (cube database) to run analytics using Power BI
Wrote SQL/PL-SQL code on top of SQL Server and Oracle DB to aggregate data
Streamed large volumes of data using Python, Node.js, Kafka, and GoldenGate
Analyzed large amounts of data to discover trends and patterns and performed statistical analysis using Pandas/Matplotlib, Power BI, and D3/DC
Built many different ML models for Cricket Wireless, such as customer churn prediction (TensorFlow ANN), a recommender engine for accessory and multi-product recommendations (collaborative filtering/SVD/cross-tabulation), retail store sales prediction (TensorFlow ANN), predicting line additions for existing accounts (ML classification models), application error log clustering (NLTK), customer segmentation (RFM), and sentiment word-cloud building for an internal social media platform
Integrated Azure Cognitive Services to identify sentiment scores for the rep chat application
Built an analytics dashboard for rep performance using descriptive statistics techniques (D3/DC)
Built a rule-based ML suggestion feature in the transaction flow
Built and deployed models in production using Node.js and Flask
Built a rep performance dashboard for store managers and reps, which helped significantly improve store performance
Developed and implemented machine learning/deep learning models for several business problems on the Cricket MS project, deriving business insights from various data streams such as retailer data, authorized retailer data, and call center data
Fine-tuned machine learning and deep learning solutions using hyperparameter-tuning techniques
Expertise in creating design diagrams for end-to-end ML solutions
Strong experience in designing and building user interface solutions using Angular, Node.js, Flask, and Java
Expertise in designing and implementing data engineering and predictive solutions for various business problems
Lead Data Engineer
Tech Mahindra Ltd.
10.2014 - 02.2020
Played a key role as Data Engineer and Data Scientist across streams such as data modeling, data streaming, data analysis, and statistical data modeling for the AT&T RETAIL project
Created custom data pipelines using Python, shell scripts, SQL, and PL/SQL
Created new architectural designs for business needs, set up POCs, and productionized the new designs
Set up environments such as Apache Kafka, Airflow, and Oracle GoldenGate for Big Data
Implemented large-scale data streaming by building Kafka producers and consumers
Migrated jobs from standard servers to Airflow
Designed and built data streaming within Oracle and from Oracle to Cassandra using Oracle GoldenGate
Created a streaming tool to stream data from Oracle to Cassandra without breaking source normalization
Created custom scripts to process large-scale application logs using Python and shell scripting
Created database code following performance standards
Extensive experience in data modeling in Oracle/Cassandra for large-scale datasets
Created many custom PL/SQL scripts for aggregating business data and ran them through the scheduler for visualizing business reports
Created and implemented database optimization techniques, using SQL tuning and DB tuning, to improve the performance of database processing jobs
Created monitoring tools and notification services for scheduler alerting
Worked on AWS migration POCs
Hands-on experience with AWS storage solutions using S3, S3 Glacier, and EFS
Created custom event functions using AWS Lambda to process data ETLs
Experience with external tables using the Glue Data Catalog
Created jobs and workflows using AWS Glue
Integrated CloudWatch services for notifications
Worked with AWS relational database services such as Amazon RDS and Amazon Redshift
Performed data streaming with the AWS Kinesis and Firehose services
Good knowledge of AWS SageMaker for ML deployments
Analyzed queries with Athena and SQL Workbench
Strong knowledge of AWS services such as Amazon Athena, Kinesis Data Streams, and AWS Lake Formation
Built many ML models for AT&T Retail, such as iPhone recommendations (Apriori and cross-tab models), customer wait-time prediction (ML classification models), an intelligent ticket-tracking tool built using NLTK, application event clustering (ML clustering algorithms), predicting customer visit counts over the next three days, and identifying rep fraud activity
Developed and implemented machine learning/deep learning models for several business problems on the AT&T Retail project, deriving business insights from various data streams such as retailer data, authorized retailer data, and call center data
Analyzed large amounts of data to discover trends and patterns and performed statistical analysis using Tableau, Matplotlib, and Pandas (Python)
Implemented user interfaces with Flask, D3, and DC to visualize various business insights and ML predictions on dashboards such as the rep potential analysis dashboard, iPhone sales dashboard, overall sales dashboard, and application ticketing dashboard (error clustering)
Implemented and enhanced full-stack applications using Flask (Python) and Java
Specialized expertise in Python programming, SQL, PL/SQL, predictive modeling, machine learning, deep learning, statistics, Hadoop, Hive, Spark, risk management, databases, data mining, data cleansing, and data analysis
Performance Database Administrator
Tech Mahindra Ltd.
01.2011 - 10.2014
Company Overview: The client is BT (British Telecom); this is BT's Wholesale business project
Worked on SQL query tuning and core database administration
Created and implemented monitoring for DB health checks (poorly performing SQLs and space growth)
Generated SQL execution plans using Explain Plan, Autotrace, cursor plans, and AWR plans
Expertise in analyzing wait events, AWR/Statspack reports, SQL advisory reports, and session tracing (events 10046 and 10053) to identify problematic SQLs
Provided solutions for poorly performing SQLs using index creation, SQL plan baselines, SQL profiles, and statistics gathering on tables/indexes
Monitored CPU and memory utilization using UCPC, prstat, vmstat, and OSWatCHER
Proved the performance of the RAC (Real Application Clusters) servers with live load replication analysis, successfully migrated RAC into live, and received much appreciation from the client
Played a key role in the Oracle 11g upgrade (10.2.0.5 to 11.2.0.3) with a focus on performance and configuration
Worked on RAT (Real Application Testing) to prove the performance of Oracle 11g
Performed detailed analysis of SQL behavior by comparing Oracle 10g SQLs with 11g using the SPA (SQL Performance Analyzer) tool
Created SPBs (SQL plan baselines) for problematic SQLs in Oracle 11g to maintain stable plans
Worked on GoldenGate implementation and identified the transactional tables to configure GG
Education
Master of Computer Applications
Andhra University
Kakinada, AP, India
01.2010
Bachelor of Science - Mathematics
Andhra University
Prathipadu, AP, India
01.2007
Skills
AWS, Azure, and GCP (ML & DE services)
Analytical databases (Redshift, Snowflake, Databricks, and Google BigQuery)