Pallavolu Jayakumar

Charlotte, NC

Summary

Highly skilled Big Data Engineer with 9+ years of experience developing, implementing, and optimizing data pipelines and ETL processes, with an in-depth understanding of business and IT requirements to streamline administration and internal processes, resulting in enhanced automation and operational efficiency. Collaborated with data analysts and leads to develop data pipelines that increase data availability in the Hadoop environment. Strong analytical, leadership, and communication skills, with a commitment to excellence.

Overview

12 years of professional experience

Work History

Java/Big Data Developer

Mitchell Martin Inc.
08.2023 - Current
  • Successfully implemented a POC to migrate the TIAA enterprise applications, consisting of 2 major projects built on the Jenkins Groot Controller, to a federated CI pipeline
  • Assisted the team with cloud optimization by scaling Spark dynamic allocation of executors up and down
  • Set up an execution pipeline in ElectricFlow to test the automatic deployment and execution of jobs
  • Created validation scripts to test connectivity between the Hadoop cloud environment on AWS and Snowflake, pulling data with Sqoop
  • Supported the team with validation in the Prod-fix environment during the migration from on-premises to the AWS cloud environment
  • Developed reports from Hive tables and presented the results in Tableau for visualization (see the sketch below)
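
A minimal sketch of the Hive-to-Tableau reporting pattern described above, assuming a Hive-enabled Spark session; the table and column names are illustrative, not from the actual project:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive-enabled session; assumes Spark is configured against the cluster metastore.
spark = (SparkSession.builder
         .appName("tableau-report-extract")
         .enableHiveSupport()
         .getOrCreate())

# Aggregate a hypothetical Hive table into a small report dataset.
report = (spark.table("analytics.transactions")
          .groupBy("region", "product")
          .agg(F.sum("amount").alias("total_amount"),
               F.count("*").alias("txn_count")))

# Write a single CSV extract with a header that Tableau can use as a data source.
(report.coalesce(1)
       .write.mode("overwrite")
       .option("header", True)
       .csv("/data/exports/tableau/transactions_report"))
```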

Application Architect

Mitchell Martin Inc.
03.2022 - 08.2023
  • Developed Scala/Spark code for Spark SQL transformations on Hive tables and optimized their performance (a sketch of this pattern follows the list)
  • Partnered with Line of Business (LOB) contacts to create the flow of data from source systems to the Strategy Decision Engine (SDE), the brain of the Collections and Recovery module
  • Participated in designing data pipelines in the Hadoop ecosystem, along with job scheduling (AutoSys) and ETL (IBM DataStage) tools, to support rapidly growing business processes for report generation and predictive/prescriptive modeling for campaign decision engines
  • Worked extensively with AutoSys for workflow scheduling and action triggering
  • Leveraged the Sqoop ingestion framework to read data from RDBMS sources and load it into Hive tables
  • Developed and implemented data pipelines to improve data quality, resulting in an increase in data accuracy
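
The Spark SQL work above was done in Scala; a minimal PySpark sketch of the same transformation pattern, with hypothetical collections tables standing in for the real sources:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("sde-feed-transform")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical source tables; names are illustrative only.
spark.table("collections.accounts").createOrReplaceTempView("accounts")
spark.table("collections.payments").createOrReplaceTempView("payments")

# Spark SQL transformation joining account and payment data for the decision engine.
sde_feed = spark.sql("""
    SELECT a.account_id,
           a.delinquency_bucket,
           SUM(p.amount) AS paid_last_90d
    FROM accounts a
    LEFT JOIN payments p
      ON p.account_id = a.account_id
     AND p.payment_date >= date_sub(current_date(), 90)
    GROUP BY a.account_id, a.delinquency_bucket
""")

# Persist back to Hive, partitioned to keep downstream scans cheap.
(sde_feed.write.mode("overwrite")
         .partitionBy("delinquency_bucket")
         .saveAsTable("collections.sde_feed"))
```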

Application Architect

Mitchell Martin Inc.
10.2021 - 02.2022
  • Developed Scala/Spark code to read from a healthcare EDC adapter and download clinical research data in XML format through web API calls
  • Performed data validation on the downloaded XML files using XSD to ensure the attributes matched between the XML and the database
  • Flattened the downloaded XML data using the required schema files and appropriate data types in Spark and stored it in Hive tables (see the sketch below)
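
A sketch of the XML flattening step, assuming the spark-xml package (com.databricks:spark-xml) and a hypothetical ODM-like document structure; shown in PySpark rather than the original Scala:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Assumes spark-xml is on the classpath,
# e.g. --packages com.databricks:spark-xml_2.12:0.17.0
spark = (SparkSession.builder
         .appName("edc-xml-flatten")
         .enableHiveSupport()
         .getOrCreate())

# Read the downloaded clinical XML; the rowTag is hypothetical.
raw = (spark.read.format("xml")
       .option("rowTag", "SubjectData")
       .load("/data/landing/edc/*.xml"))

# Flatten nested elements into typed columns before the Hive load.
# Repeating child elements would need explode() before selecting fields.
flat = raw.select(
    F.col("_SubjectKey").alias("subject_key"),
    F.col("StudyEventData._StudyEventOID").alias("study_event_oid"),
    F.col("StudyEventData.FormData._FormOID").alias("form_oid"),
)

flat.write.mode("append").saveAsTable("clinical.edc_subject_data")
```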

Application Architect

Mitchell Martin Inc.
01.2021 - 10.2021
  • Enhanced a generic file export job in Spark/Java code to read Hive tables and export them to different platforms (a sketch of the pattern follows this list)
  • Partnered with LOB contacts on data flow design and participated in designing data pipelines in the Hadoop ecosystem
  • Set up Zaloni registration files to pull data from RDBMS sources (Oracle and SQL Server) into Hadoop Data Lake Hive tables
  • Designed DataStage job sequence flows to read files from the landing zone and load them into Oracle tables
  • Developed JUnit test classes to check data quality for the heavy transformations performed during the stage table load in Hive
  • Built DataStage jobs to extract data from SQL Server and create flat files for ingestion into Hadoop
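
The original export job was written in Spark/Java; a compact PySpark sketch of the same generic pattern, with all names and defaults illustrative:

```python
import argparse

from pyspark.sql import SparkSession

# Generic Hive-table export: the table, format, and target are passed in,
# so one deployment of the job can serve many feeds.
def export_table(table: str, fmt: str, target: str, delimiter: str = "|") -> None:
    spark = (SparkSession.builder
             .appName(f"export-{table}")
             .enableHiveSupport()
             .getOrCreate())
    writer = spark.table(table).write.mode("overwrite")
    if fmt == "csv":
        writer = writer.option("header", True).option("sep", delimiter)
    writer.format(fmt).save(target)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--table", required=True)
    parser.add_argument("--format", default="csv")
    parser.add_argument("--target", required=True)
    args = parser.parse_args()
    export_table(args.table, args.format, args.target)
```

Keeping the table, format, and target as runtime arguments is what makes the job generic: new feeds need only a new invocation, not new code.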

Big Data Engineer

Populus Group
12.2019 - 06.2020
  • Hands-on experience migrating applications between Hadoop clusters
  • Developed and automated Spark jobs with Oozie actions (see the sketch below)
  • Scheduled Oozie workflows and coordinators for Sqoop imports/exports from various sources
  • Experienced in handling Hadoop jobs in Yahoo's native cluster
  • Identified and executed process improvements related to data processes
  • Able to understand and interpret machine learning code for data analytics
  • Exposure to linear regression and classification models
  • Worked extensively with Screwdriver, Yahoo's CI/CD pipeline tool, for continuous delivery with YAML file support
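
A minimal sketch of a Spark job written to be driven from an Oozie action, assuming the coordinator passes the nominal run date as a command-line argument; paths and table names are hypothetical:

```python
import sys

from pyspark.sql import SparkSession

# Oozie coordinators typically materialize a nominal date and pass it to the
# action as an argument; the job uses it to select one daily partition.
def main(run_date: str) -> None:
    spark = (SparkSession.builder
             .appName(f"daily-load-{run_date}")
             .enableHiveSupport()
             .getOrCreate())
    daily = spark.table("staging.events").where(f"ds = '{run_date}'")
    daily.write.mode("overwrite").parquet(f"/data/curated/events/ds={run_date}")

if __name__ == "__main__":
    main(sys.argv[1])  # e.g. 2020-01-15, supplied by the Oozie coordinator
```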

Hadoop Admin and Support Lead

Eniac Systems Inc.
06.2018 - 12.2019
  • Used Sqoop to ingest data from RDBMS sources into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components
  • Worked on historical data ingestion and incremental load approaches for the daily batch processes using Hadoop utilities (a sketch of an incremental Sqoop import follows this list)
  • Responsible for supporting all test environments and issues during the warranty period post-implementation
  • Involved in creating generic components leveraging the existing capabilities of the IBM IIS suite
  • Used tools/technologies such as Hadoop, Hive, Impala, Sqoop, IBM InfoSphere Information Server (DataStage), Teradata, Oracle, StarTeam, AutoSys, and Unix/Linux scripting
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions
  • Assisted application teams in setting up Prod-fix and Disaster Recovery Hadoop environments for critical applications
  • Implemented version control for all code/scripts so that all SDLC environments stay in sync
  • Implemented real-time data ingestion via Kafka into HBase, streamlined through NiFi
  • Automated the data validation process for critical data loads before handing data to downstream applications
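
A sketch of the incremental-load approach using standard Sqoop CLI flags driven from a Python wrapper; the connection string, table, and check column are placeholders:

```python
import subprocess

# Incremental Sqoop import using the standard --incremental append flags;
# connection details and table names are placeholders, not real endpoints.
def incremental_import(last_value: str) -> None:
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
        "--username", "etl_user",
        "--password-file", "/user/etl/.pw",  # keep credentials off the command line
        "--table", "ORDERS",
        "--target-dir", "/data/raw/orders",
        "--incremental", "append",
        "--check-column", "ORDER_ID",
        "--last-value", last_value,          # high-water mark from the previous run
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    incremental_import("1000000")
```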

Hadoop/Spark Developer

Collabera
11.2017 - 06.2018
  • Published HDFS/Hive table data to an external system using a custom Kafka producer for continuous updates (see the sketch below)
  • Developed Spark RDD transformations, actions, DataFrames, case classes, and Datasets for the required input data and performed the data transformations using Spark Core
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs with Scala, and worked with SparkContext, Spark SQL, DataFrames, pair RDDs, and Datasets
  • Imported data from several relational databases to HDFS and exported data from HDFS to RDBMS using Sqoop
  • Created Parquet Hive tables with Snappy compression, loaded data, and wrote Hive queries that invoke MapReduce tasks in the backend
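
The producer in this project was custom-built; a minimal stand-in using kafka-python, publishing a hypothetical Hive table as JSON messages:

```python
from kafka import KafkaProducer
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-to-kafka")
         .enableHiveSupport()
         .getOrCreate())

# Serialize each row of a hypothetical Hive table as a JSON string.
rows = spark.table("exports.customer_updates").toJSON().collect()

producer = KafkaProducer(
    bootstrap_servers=["broker1:9092"],      # placeholder broker
    value_serializer=lambda v: v.encode("utf-8"),
)
for row in rows:
    producer.send("customer-updates", value=row)  # topic name is illustrative
producer.flush()
```

Collecting to the driver is fine for modest tables; a large table would publish from the executors via foreachPartition instead.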

Hadoop Developer

Nemo IT Solutions, Inc.
01.2017 - 09.2017
  • Developed PySpark code to read data from Hive, group the fields, and generate XML files (see the sketch below)
  • Enhanced the PySpark code to write the generated XML files to a directory and zip them into CDAs
  • Implemented a REST call to submit the generated CDAs to the vendor website
  • Implemented Impyla to support JDBC/ODBC connections to HiveServer2
  • Enhanced the PySpark code to replace Spark with Impyla
  • Built a data validation dashboard in Solr to display the message records
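
A condensed sketch of the Hive-to-XML-to-REST flow, with a hypothetical observations table and a placeholder vendor endpoint (the real job also zipped the files into CDAs, omitted here):

```python
import os
import xml.etree.ElementTree as ET

import requests
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("cda-xml-export")
         .enableHiveSupport()
         .getOrCreate())

os.makedirs("/tmp/cda", exist_ok=True)

# Group a hypothetical Hive table by patient and emit one XML file per group.
for patient_id, records in (spark.table("clinical.observations")
                            .rdd.groupBy(lambda r: r.patient_id)
                            .toLocalIterator()):
    root = ET.Element("ClinicalDocument", attrib={"patientId": str(patient_id)})
    for r in records:
        obs = ET.SubElement(root, "observation")
        obs.set("code", r.code)
        obs.set("value", str(r.value))
    path = f"/tmp/cda/{patient_id}.xml"
    ET.ElementTree(root).write(path, xml_declaration=True, encoding="utf-8")

    # Submit the generated document to the (placeholder) vendor endpoint.
    with open(path, "rb") as fh:
        resp = requests.post("https://vendor.example.com/cda/upload",
                             files={"file": fh}, timeout=60)
    resp.raise_for_status()
```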

Hadoop Developer

Nemo IT Solutions, Inc.
01.2016 - 12.2016
  • Evaluated Spark's performance against Impala on transactional data
  • Used Spark transformations and aggregations to compute min, max, and average on transactional data (see the sketch below)
  • Experienced in migrating data from HiveQL to Spark SQL
  • Loaded data into Spark DataFrames for analysis
  • Used Java to develop a RESTful API for a database utility project
  • Designed a data model in Cassandra (POC) for storing server performance data
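
A minimal sketch of the min/max/average aggregations, assuming a hypothetical transactions table; the same aggregates can be expressed in Impala SQL for the performance comparison:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("txn-aggregates")
         .enableHiveSupport()
         .getOrCreate())

# Min, max, and average amounts per account over a hypothetical table.
stats = (spark.table("finance.transactions")
         .groupBy("account_id")
         .agg(F.min("amount").alias("min_amount"),
              F.max("amount").alias("max_amount"),
              F.avg("amount").alias("avg_amount")))

stats.show(20, truncate=False)
```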

IT Analyst

Serco Global Services
12.2011 - 07.2014
  • Analyzed and prepared detailed specifications and test requirements
  • Coordinated with business analysts and business users to understand project requirements and scope the test strategy
  • Executed test cases based on the BRD and SDD and uploaded results to Quality Center
  • Involved in test case execution and bug creation using the Chromium tool
  • Performed functionality testing per Google testing standards
  • Followed up in real time and continuously with global support teams for critical incident resolution

Education

Master of Science in Computer Science

University of Illinois Springfield
Springfield, IL
07.2015

Skills

  • Big Data/Hadoop Framework: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Cassandra, Spark, Impala, Impyla, StreamSets, NiFi, Kafka, Zaloni
  • Hadoop Distributions: Cloudera, Hortonworks
  • Languages: Java, Scala, Python
  • BI/Data Visualization Tools: Tableau, QlikView, Power BI, MicroStrategy
  • Enterprise Applications: MS Office Suite
  • Databases: MS SQL Server 2005/2000/7.0, Oracle 9i/10g, Netezza, Teradata
  • Enterprise Data Warehouses: EDW, RMW, ESP
  • Operating Systems: Windows XP/7/8/10, Ubuntu, RHEL
  • Development Tools: Eclipse, IntelliJ, Visual Studio
  • Cloud Computing: Microsoft Azure, AWS
  • CI/CD and Deployment Tools: YAML, Screwdriver, Ansible, Jenkins, Git, Bitbucket
  • File Transfer: SFTP, FTP, NDM, DTS
  • Other Tools: IBM InfoSphere DataStage, AutoSys, Snowflake, ElectricFlow

References

Available upon request
