SIRISHA VODURU

Fairview, PA

Summary

  • More than 11 years of professional IT experience across multiple technologies, with demonstrated expertise in data engineering, design, and application development.
  • Experience in big data, data warehousing, data modeling, and automation for projects involving data ingestion, transformation, and processing.
  • Excellent programming skills using Scala and Python.
  • Experience in data processing with Hadoop ecosystem tools such as HDFS, Hive, Sqoop, and Spark with Scala, and with ETL tools such as DataStage 9.x and Snowflake, alongside SQL Server, Oracle, Teradata, and Unix.
  • Worked on script automation using shell scripting and Python; experience with data manipulation using Python libraries such as Pandas, PySpark, and NumPy.
  • Experience handling large datasets using partitioning, Spark in-memory capabilities, and effective, efficient joins and transformations during the ingestion process itself.
  • Developed REST APIs using Scala and the Akka framework.
  • Experience in Machine Learning, Deep Learning and Artificial Intelligence for user analytics.
  • Experience in advanced SQL with RDBMSs (SQL Server, Oracle, and Teradata) and in developing Hive scripts using Hive UDTFs and HQL for data processing and end-user analytics.
  • Well-versed with importing and exporting data using Sqoop from HDFS to Relational Database Management Systems (RDBMS) and vice-versa.
  • Worked on continuous integration and continuous delivery/deployment tools such as TeamCity/Ops Logic and GitHub.
  • Worked extensively with AWS technologies such as Redshift, S3, CloudWatch, Athena, Glue, DynamoDB, Lambda, ECS, EKS, EMR, Flink, Kinesis, RDS, and Kafka.
  • Involved in business/client meetings for design, development, and requirements discussions, providing solutions for complex scenarios.
  • Involved in all phases of unit testing, SIT, UAT, and support.
  • Experience interacting with business users/stakeholders to analyze business rules and requirements in banking and other domains.
  • Strong knowledge of Spark architecture, pair RDDs, and the Spark DataFrame API, including Adaptive Query Execution, with deep experience using UDFs and Spark SQL functions to transform raw data into meaningful data for visualization; worked extensively with PySpark and Scala.
  • Experienced in developing ETL data pipelines in AWS Glue: transformed data using Glue DynamicFrames with PySpark, cataloged the transformed data using crawlers, and scheduled jobs and crawlers using the Glue workflow feature (see the sketch after this list).
  • Expertise in data engineering and in developing data warehousing applications; experienced in dimensional modeling (star schema, snowflake schema), transactional modeling, and slowly changing dimensions (SCD).
  • Well-versed in ingesting data from different sources into HDFS using Sqoop and managing Sqoop jobs with incremental loads to populate Hive external tables.
  • Worked with Hive data warehouse infrastructure: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HQL queries.
  • Experience with file formats such as ORC, Parquet, Avro, JSON, and XML.
  • Developed automation scripts using shell scripting and Python for projects involving data ingestion, transformation, and processing.
  • Experienced in advanced SQL (views, stored procedures, indexes), with hands-on experience handling database issues and connections across SQL databases such as Oracle, Teradata, SQL Server, and MySQL, and NoSQL databases such as HBase, MongoDB, and DynamoDB.
  • Created and maintained CI/CD (continuous integration and deployment) pipelines using tools such as TeamCity/Ops Logic and GitHub/Bitbucket.
  • Expert in designing ETL data flows: creating mappings/workflows from heterogeneous source systems and transforming the data.
  • Experience creating, debugging, scheduling, and monitoring jobs using orchestration tools such as AutoSys, Control-M, and Oozie.
  • Worked extensively with Agile methodology, including iteration planning, sprints, retrospectives, and backlog planning.
  • Experience interacting with business users/stakeholders to analyze business rules and requirements and perform source-to-target data mapping; prepared LLDs and technical specification documents and provided solutions for complex scenarios.
  • Efficient cloud engineer with years of experience assembling cloud infrastructure; negotiates with vendors, coordinates tasks with other IT members, and implements best practices to create cloud functions, applications, and databases.
  • Experienced Snowflake data engineer adept at integrating Snowflake's cloud-native data warehousing capabilities with AWS services; proficient in configuring secure data pipelines and optimizing storage solutions to ensure seamless data access and analysis in the AWS environment.
  • Experienced in designing and optimizing data warehousing solutions in Snowflake; proficient in creating scalable data pipelines, tuning query performance, and ensuring data integrity within the platform.
  • Basic knowledge of GCP, Ab Initio, and Apache Airflow.
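
A minimal sketch of the AWS Glue pattern described in the list above: read a cataloged table as a DynamicFrame, transform it with PySpark, and write Parquet back to S3 for a downstream crawler. The database, table, bucket, and column names are hypothetical placeholders, not values from an actual project.

    import sys
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read a table cataloged by a crawler as a Glue DynamicFrame.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="customers"  # hypothetical names
    )

    # Rename and retype columns with ApplyMapping, a standard Glue transform.
    mapped = ApplyMapping.apply(
        frame=source,
        mappings=[
            ("cust_id", "string", "customer_id", "long"),
            ("cust_nm", "string", "customer_name", "string"),
        ],
    )

    # Write the transformed data to S3 as Parquet for the next crawler run.
    glue_context.write_dynamic_frame.from_options(
        frame=mapped,
        connection_type="s3",
        connection_options={"path": "s3://example-bucket/refined/customers/"},
        format="parquet",
    )
    job.commit()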

Overview

12 years of professional experience
1 certification

Work History

Technical Lead | Senior Data Engineer

HCL Technologies Pvt. Ltd
09.2020 - 08.2023
  • Package Assurance is a CBA-initiated project whose objective is to benefit both the customer and the bank by offering interest-rate subsidies to customers applying for a personal loan, home loan, or credit card
  • Worked as senior data engineer/squad lead, coordinating Joint Application Development (JAD) sessions with solution designers, business analysts, and business stakeholders to perform data analysis and gather business requirements
  • Performed end-to-end architecture and implementation assessment of various AWS services like AWS EMR, Redshift, S3 and AWS Glue
  • Installed, configured and managed Hadoop Clusters and Data Science tools using AWS EMR
  • Worked on setting up the High-Availability for Hadoop Clusters components and Edge nodes
  • Designed and developed a Python parser to auto-convert HiveQL code into equivalent PySpark (Spark SQL) jobs to leverage Spark capabilities on AWS EMR, reducing conversion time by over 90%
  • Designed services for seamless monitoring, such as tracking active EMR clusters running across all regions
  • Used the Boto3 library and deployed the solution on Lambda (see the sketch at the end of this section)
  • Configured business notifications via SES and scheduled them via CloudWatch
  • Worked on a framework for orchestration and monitoring of our core EMR cluster using Lambda, CloudWatch, and SNS
  • Fetched data from source systems such as SAP, HLS, CC, and COMSSEE by building data pipelines using PySpark
  • Created PySpark data frames to profile, clean, and transform data from CSV files in an Amazon S3 bucket
  • Designed, developed, and implemented ETL pipelines using python API (PySpark) of Apache Spark using AWS Glue
  • Performance tuning of PySpark scripts in AWS Glue
  • Collected batch files from customers and extracted, unzipped, and loaded the files to S3 buckets
  • Moved the final refined tables from S3 to AWS Redshift and wrote various data normalization jobs for data newly ingested into Redshift
  • Worked on CI/CD to facilitate seamless integration and deployment, using GitHub, TeamCity, and a control framework called Workflow Tables
  • Worked in an Agile methodology, participating in daily standups, technical discussions with business counterparts, sprint planning, and scrum meetings, adhering to Agile principles and delivering quality code
  • Won CEO Award for completion of the project on time and with great accuracy.
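
A minimal sketch of the EMR-monitoring Lambda referenced above: it uses Boto3 to list active EMR clusters in every available region and publishes a summary through SNS. The SNS topic ARN and account number are hypothetical placeholders.

    import boto3

    ACTIVE_STATES = ["STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING"]
    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:emr-monitor"  # hypothetical

    def lambda_handler(event, context):
        summary = []
        # Check every region where the EMR service is available.
        for region in boto3.session.Session().get_available_regions("emr"):
            emr = boto3.client("emr", region_name=region)
            clusters = emr.list_clusters(ClusterStates=ACTIVE_STATES)["Clusters"]
            for cluster in clusters:
                summary.append(
                    f"{region}: {cluster['Name']} ({cluster['Status']['State']})"
                )
        if summary:
            # Notify the business distribution list via SNS.
            boto3.client("sns").publish(
                TopicArn=TOPIC_ARN,
                Subject="Active EMR clusters",
                Message="\n".join(summary),
            )
        return {"active_clusters": len(summary)}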

Senior Software Engineer | Hadoop Developer

TVS NEXT Pvt. Ltd
08.2019 - 04.2020
  • Developed software modules for customer alerts and meter functioning for an energy client later acquired by Ormat Technologies
  • Performed data ingestion using Sqoop to ingest tables from sources including SQL Server and Oracle
  • Developed Hive tables on top of the resultant flattened data, storing data as Parquet files to enable quick read times
  • Created HQL queries to apply business rules and structural transformations and to ensure conformance to the refined database
  • Implemented Hive partitioning and bucketing techniques as part of code optimization (see the sketch at the end of this section)
  • Created shell scripts to run HQL (HiveQL) scripts, capture reported errors, log error situations, and report them to the calling scripts
  • Named Best Performer for completing user module before deadline.
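
A minimal sketch of the Hive partitioning and bucketing pattern referenced above, written with PySpark against a Hive-enabled SparkSession. The table and column names are hypothetical placeholders.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("meter-refined-load")  # hypothetical app name
        .enableHiveSupport()
        .getOrCreate()
    )

    # Hypothetical staging table holding flattened meter readings.
    df = spark.table("staging.meter_readings")

    # Partition by load date and bucket by meter_id so point lookups and
    # joins on meter_id scan fewer files.
    (
        df.write
        .partitionBy("load_date")
        .bucketBy(8, "meter_id")
        .sortBy("meter_id")
        .format("parquet")
        .mode("overwrite")
        .saveAsTable("refined.meter_readings")
    )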

Systems Analyst | Hadoop Developer

UST Global
10.2018 - 08.2019
  • Designed and modified Customer Alert application for Client Equifax
  • Designed and implemented application in Apache Spark with Scala
  • Served as module lead for end-to-end delivery to customer, including testing
  • Involved in loading data between SQL Server and HDFS in both directions, creating partitions in Hive, and loading data into Hive.

Senior Product Analyst | ETL Developer

Standard Chartered
05.2011 - 04.2017
  • Developed mobile-banking applications in DataStage for the UK’s Standard Chartered Bank
  • Nearly eliminated customers’ one-minute post-transaction delay in receiving alerts by removing multiple joins and modifying API services on the existing application
  • Designed source-to-target mappings, assisted in designing selection criteria document, and developed technical specifications of the ETL process flow to proceed with development
  • Created mappings using various transformations like Aggregator, Expression, Filter, Router, Joiner, Lookup
  • Designed and documented validation rules, error handling and unit test strategy of ETL process
  • Tuned performance of mappings and sessions by optimizing sources
  • Involved in writing UNIX shell scripts to run and schedule batch jobs.

Education

Bachelor of Technology in Electrical and Electronics Engineering

Jawaharlal Nehru Technological University
05.2006

Master of Science in Business Analytics - Information Technology

University of Louisville
Louisville, KY
07.2024

Skills

  • Hadoop
  • HDFS
  • Sqoop
  • NiFi
  • Hive
  • Oozie
  • Kafka
  • Zookeeper
  • YARN
  • Apache Spark
  • Cloudera
  • HBase
  • Oracle
  • MySQL
  • SQL Server
  • Teradata
  • Python
  • PySpark
  • Scala
  • R
  • Shell Script
  • SQL
  • Deep Learning
  • Machine Learning
  • Artificial Intelligence
  • SAS Viya
  • AWS
  • GCP
  • REST API
  • PyCharm
  • Eclipse
  • IntelliJ
  • Visual Studio
  • SQL*Plus
  • SQL Developer
  • TOAD
  • SQL Navigator
  • Query Analyzer
  • SQL Server Management Studio
  • SQL Assistant
  • Hue
  • Git
  • GitHub
  • Bitbucket
  • Linux - Ubuntu
  • Windows
  • Kerberos
  • Dimensional Modeling
  • ER Modeling
  • Star Schema Modeling
  • Snowflake Schema Modeling
  • Erwin
  • Visio
  • Apache Airflow
  • AutoSys
  • Control-M
  • Tivoli
  • Tableau
  • Power BI
  • TeamCity
  • Ops Logic
  • Jenkins
  • Octopus
  • DataStage
  • Snowflake
  • Ab Initio
  • Putty
  • WinSCP
  • FileZilla
  • Git Bash
  • Zeppelin
  • Jupyter
  • MongoDB
  • Cassandra
  • DynamoDB


Certification

AWS Certified Solutions Architect - Associate

Snowflake SnowPro
