Thirupathi Gurunatham

Ashburn, VA

Summary

  • Over 8 years of professional IT experience in requirements gathering, design, development, implementation, and testing of multi-tier, distributed, and web-based applications using Big Data and Java/J2EE technologies
  • Strong experience in all phases of the Software Development Life Cycle (SDLC): requirements gathering, modeling, analysis, architecture design, development, testing, and implementation
  • Strong experience designing Big Data pipelines covering data ingestion, data processing (transformations, enrichment, and aggregations), and reporting
  • Strong experience developing jobs with Apache Spark; extensive programming with DataFrames and Resilient Distributed Datasets (RDDs)
  • Strong experience submitting Spark applications to different cluster managers, including Spark Standalone and Hadoop YARN
  • Strong knowledge of Python libraries such as NumPy and Pandas
  • Strong knowledge of machine learning algorithms including KNN, Naive Bayes, Logistic Regression, Linear Regression, SVM, Decision Trees, Random Forests, and Gradient Boosted Decision Trees; experienced in applying machine learning and deep learning techniques to build models and analyze large-scale data
  • Profound experience implementing real-time data streaming solutions using Spark Streaming and Kafka
  • Good knowledge of Amazon Web Services (AWS) such as S3, EC2, Redshift, ECS, EMR, VPC, RDS, SQS, and ELB
  • Experience tuning and improving the performance of Spark jobs by exploring various options
  • Strong experience migrating data with Sqoop between HDFS and relational database systems, in both directions
  • Strong experience developing workflows with the Apache Oozie framework to automate tasks
  • Experience writing MapReduce programs on Apache Hadoop to analyze large data sets efficiently; strong experience with core Hadoop components (HDFS, YARN, and MapReduce)
  • Strong experience with the Cloudera Hadoop distribution and Cloudera Manager
  • Experience launching Spark applications with Kerberos authentication
  • Strong knowledge of developing Spark applications in Scala
  • Good understanding of NoSQL databases such as HBase and Cassandra
  • Good experience performing and supporting unit testing, integration testing, QAT, UAT, and production support for issues raised by application users
  • Experience using design patterns: Singleton, DAO, and MVC
  • Experienced in generating logging with Log4j to identify errors in production and test environments
  • Efficient in developing Java applications in IDEs such as Eclipse, MyEclipse, and RAD
  • Experience deploying applications on IBM WebSphere and Apache Tomcat
  • Good experience with version control tools such as Git, SVN, and ClearCase; hands-on experience setting up repositories with SBT, Maven, and Ant
  • Outstanding skills in design and technical documentation, along with strong interpersonal, analytical, and organizational skills
  • Experience developing applications on Windows, UNIX, and Linux platforms
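Of the machine learning algorithms listed above, K-Nearest Neighbors is the simplest to illustrate. The following stand-alone Python sketch (toy data and hypothetical names, not production code) shows the core idea: classify a point by majority vote among its k closest training points.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points.

    `train` is a list of (features, label) pairs; distances are Euclidean.
    """
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy example: one cluster around (0, 0) labeled "a", one around (5, 5) labeled "b".
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))   # near the "a" cluster -> "a"
print(knn_predict(train, (5.5, 5.5)))   # near the "b" cluster -> "b"
```

In practice a library implementation (e.g. scikit-learn's) would be used; this sketch just makes the distance-and-vote mechanism concrete.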

Overview

13 years of professional experience
1 Certification

Work History

Senior Data Engineer

Red Ventures
02.2023 - Current
  • CDM is analytics software that uses ETL techniques to extract data in its raw form, standardize it into meaningful information, and build staging and reporting layers in a database
  • This resulting information can then be easily queried or used by a data visualization tool, such as Tableau, to expose KPIs for a business
  • It enables standardization of user activity on a website to help new businesses visualize key insights using out-of-the-box reports and dashboards for audience insights, page metrics, paid search reporting, product analysis, and revenue reporting
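The raw → staging → reporting flow described above can be sketched in plain Python. All field names and values here are hypothetical; a real CDM-style pipeline would land these layers in database tables queried by a tool like Tableau.

```python
from collections import defaultdict

# Raw clickstream events as they might arrive from a website (hypothetical shape).
raw_events = [
    {"url": "/Home?ref=ad", "ts": "2023-05-01T10:00:00Z", "revenue": "0"},
    {"url": "/checkout",    "ts": "2023-05-01T10:05:00Z", "revenue": "19.50"},
    {"url": "/checkout",    "ts": "2023-05-02T11:00:00Z", "revenue": "5.25"},
]

def to_staging(event):
    """Standardize one raw event: normalize the URL and cast types."""
    return {
        "page": event["url"].split("?")[0].lower(),
        "date": event["ts"][:10],
        "revenue": float(event["revenue"]),
    }

staging = [to_staging(e) for e in raw_events]

# Reporting layer: revenue per page, the kind of KPI a dashboard would query.
report = defaultdict(float)
for row in staging:
    report[row["page"]] += row["revenue"]

print(dict(report))  # {'/home': 0.0, '/checkout': 24.75}
```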

Big Data Engineer

Bank of America
10.2018 - 01.2023
  • Credit Risk Platform - Consumer Cards

Big Data Engineer

Deloitte Consulting
05.2014 - 09.2018
  • Interacted with product owners and business analysts to gather requirements
  • Read source data from Kafka topics using Spring Boot Kafka consumers and wrote it into the HDFS landing zone
  • Imported data from RDBMS sources into the HDFS landing zone
  • Created solutions using Java, Scala, HDFS, Spark Core, Spark Streaming, and Hadoop tools
  • Orchestrated Spark jobs using customized schedulers
  • Copied data from the landing zone to the processing zone using HDFS scripts
  • Loaded data from the processing zone into Spark as DataFrames
  • Transformed the DataFrames using user-specified transformation rules
  • Filtered the data using inclusion rules
  • Aggregated the DataFrames using group-by columns
  • Enriched the DataFrames by joining them with reference data tables
  • Wrote the data to the conformed zone in Parquet format
  • Created external Hive tables on top of the Parquet data for reporting
  • Explored Spark features to improve the performance and optimization of existing Hadoop algorithms
  • Ran Spark applications in YARN deployment mode
  • Trained Java developers from other teams on Big Data technologies
  • Developed the POC for the Apache Hadoop framework initiative
  • Used Sqoop to import data from RDBMS into HDFS, and to export analyzed data back to relational databases for reporting by business intelligence tools
  • Developed several MapReduce jobs in Java to analyze and transform log data into a structured form to extract user information
  • Created Oozie workflows and coordinators to automate Sqoop jobs on weekly and monthly schedules
  • Used the Cloudera distribution for the Hadoop ecosystem
  • Developed custom InputFormat and RecordReader classes for reading and processing binary formats in MapReduce
  • Involved in creating Hive tables, loading data, and writing Hive queries
  • Developed custom Writable classes for Hadoop serialization and deserialization of time-series tuples
  • Developed Java utility classes needed for MapReduce functionality.
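The transform sequence described above (filter on inclusion rules, group-by aggregation, enrichment against reference data) maps directly onto DataFrame operations. A plain-Python stand-in for those three steps, with hypothetical column names and toy data:

```python
from collections import defaultdict

# Records as they might sit in the processing zone (hypothetical columns).
rows = [
    {"region": "east", "product": "card", "amount": 120},
    {"region": "west", "product": "card", "amount": 80},
    {"region": "east", "product": "loan", "amount": 50},
    {"region": "east", "product": "card", "amount": 30},
]
# Reference table used for enrichment (would be a broadcast-joined table in Spark).
reference = {"east": "Eastern Division", "west": "Western Division"}

# 1. Filter: keep only rows matching an inclusion rule (here: product == "card").
included = [r for r in rows if r["product"] == "card"]

# 2. Aggregate: sum amounts per region, like df.groupBy("region").sum("amount").
totals = defaultdict(int)
for r in included:
    totals[r["region"]] += r["amount"]

# 3. Enrich: join the aggregate with the reference table.
result = [{"region": k, "total": v, "division": reference[k]}
          for k, v in totals.items()]
print(result)
```

In Spark the same pipeline would be a `filter`, a `groupBy().agg()`, and a `join`, with the result written out as Parquet.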

Java developer

RMN Infotech
05.2011 - 12.2012
  • Responsible for analysis, design, development, and integration of UI components with the backend J2EE layer
  • Developed REST-based web services
  • Implemented web service calls using JAX-WS and SOAP
  • Used JAXB to generate classes corresponding to an XML schema, and to produce XML corresponding to those classes
  • Used SoapUI Pro to perform functional and load testing of web services
  • Developed code for consuming SOAP and RESTful web services
  • Supported maintenance of web services developed with IBM WebSphere Message Broker
  • Developed and debugged ESQL and Java code using the Message Broker Toolkit
  • Implemented a rules engine using IBM WODM (JRules)
  • Used JPA/Hibernate and Spring JDBC to interact with an Oracle 10g database
  • Used the MyEclipse IDE to develop applications
  • Developed user interfaces using JSP, JavaScript, HTML, and CSS
  • Developed SQL queries and stored procedures for retrieving data
  • Used Ant as the build tool
  • Used Log4j logging to trace errors
  • Deployed applications and web services on IBM WebSphere Application Server
  • Assisted the production support team in resolving application issues.

Education

Master of Science in Computer Science

University of Central Missouri
Warrensburg, MO

Bachelor of Technology in Computer Science

Jawaharlal Nehru Technological University
Hyderabad, AP

Skills

  • Big data ecosystem: Apache Spark, Spark SQL, Hadoop, HDFS, YARN, MapReduce, Sqoop, Hive, Impala, Spark Streaming, Kafka, Oozie, HBase
  • Big data platforms: Databricks, Cloudera, Amazon EMR
  • Languages: Scala, Python, Java, HTML, JavaScript, C
  • Machine learning techniques: K-Nearest Neighbors, Naive Bayes, Logistic Regression, Linear Regression, Support Vector Machines, Decision Trees, Random Forests, Gradient Boosted Decision Trees, Stacking Classifiers, Cascading Models, K-Means Clustering, Hierarchical and DBSCAN Clustering, PCA, t-SNE
  • Amazon Web Services: S3, EC2, Redshift, ECS, EMR
  • CI/CD: CircleCI, Jenkins

Certification

AWS Certified Solutions Architect - Associate
