Subramanian Karuppiah

Summary

  • Twenty years of extensive experience in the software industry, including nine years in Data and Cloud platforms
  • Built many custom ETL and data analytical metric tools
  • Extensive experience in mapping of data sources, data movement, interfaces and analytics.
  • Expertise in Data Governance, Data Quality and Integration
  • Worked as a mentor and collaborated with project managers and team leaders
  • Expertise in various open source frameworks and tools like Spring, Quarkus, Hibernate and Log4J.
  • Expertise in Hadoop Big Data (Storm, Spark, Kafka, Kinesis Streams, Flume, Hbase, Hive, Sqoop and the Hadoop ecosystem)
  • Expertise in cloud-based storage systems (AWS, Azure and GCloud)

Overview

22 years of professional experience

Work History

01.2022 - Current
  • Clients: Elevance Health
  • Project Name: Carelon Pharmacy Data Lake
  • Working as an Enterprise Architect / Solutions Architect to support various initiatives within Elevance Health
  • Built a custom data pipeline tool to load data into Salesforce
  • Responsible for Supporting Carelon Pharmacy Data Lake
  • Responsible for building many reporting layers for the customers
  • Supporting the microservice architecture for various live calls within the organization
  • Working on a migration effort from AWS to Azure
  • Elevance Health, formerly known as Anthem Inc., is an insurance provider whose name brings together the concepts of "elevate" and "advance" in improving health programs. Its core purpose is to address physical, behavioral and social needs to improve health, affordability and quality for individuals and communities.
  • Java
  • Microservices
  • Python
  • Spark
  • Kafka
  • Hive
  • Postgres
  • Oracle
  • Shell Script
  • Kubernetes
  • Tableau
  • Git
  • Confluence
  • Jenkins
  • AWS S3
  • Lambda
  • Glue
  • AWS Cluster

Atlanta, GA
08.2020 - 12.2021
  • Clients: Equifax
  • Project Name: DataPrep Streaming
  • Worked as a GCP Cloud Architect to support data prep team.
  • Worked on various use cases to build the streaming applications for data processing.
  • Responsible for Data Security and Data Governance activities.
  • DataPrep Streaming is part of the data ingestion process within Equifax and streamlines incoming data from various source systems. Its primary focus is to group, index, standardize, cleanse, validate and toll-date raw contributor data to provide data hygiene and quality assurance.
  • Java
  • Google Cloud
  • BigQuery
  • BigTable
  • Apache Beam
  • DataFlow
  • Spring
  • Microservices
  • Jira
  • Confluence
  • Jenkins

Minneapolis
01.2019 - 07.2020
  • Clients: Express Scripts
  • Project Name: Specialty Transformation
  • Worked as a Lead Architect, responsible for leading the team and for the design and implementation of specialty products, including support for the Data Science team.
  • Responsible for building streaming applications using Scala, Kafka, Spark, Phoenix, Hbase and Hive
  • Responsible for production support, cluster upgrades and CI activities.
  • Responsible for a POC on AWS cluster migration
  • Specialty Transformation is a process within Express Scripts, run with the specialty team, to migrate data from traditional storage systems to Hadoop systems. The Hadoop platform serves as a single point of contact for downstream teams to process and populate data for their own business needs.
  • Java
  • Scala
  • Spark
  • Hbase
  • Phoenix
  • Hive
  • HDFS
  • Kafka
  • Cloudera
  • Teradata
  • Oracle
  • Jenkins
  • Git

Atlanta
05.2018 - 01.2019
  • Clients: Macys Technologies
  • Project Name: DaaS (Data as Service)
  • Worked as a Big Data Lead / Architect, responsible for managing the migration and transformation of various data sets.
  • Part of the core team managing the Google Cloud data migration process, including the initial data load from various source systems to Google Cloud using the Infoworks ETL tool.
  • Built streaming as well as batch applications using Kafka, Spark, Hbase and Hive
  • DaaS is responsible for acquiring any data that has business value to create a single version of truth for the enterprise, combining online and store data. The main objective is to provide common data services to the enterprise and democratize this data through frictionless access, facilitating the speed and agility required by various data consumers.
  • Java
  • Spark
  • Hive
  • Elasticsearch
  • Kibana
  • HDFS
  • Kafka
  • Google Cloud
  • BigQuery
  • Infoworks
  • Column propagation
  • Hortonworks Cluster
  • Azure Data Lake Service

New York
01.2014 - 04.2018
  • Clients: Altice USA
  • Project Name: EDP APP Monitoring System
  • Worked as an Architect and Lead to build this product.
  • Built many APIs to connect to the cluster and retrieve monitoring metrics.
  • Developed Kafka Consumer, Storm UI Server, Hbase and Hive metrics to capture the data flow across the Hadoop ecosystem
  • Developed an alert mechanism that emails stakeholders when data flow rises above the high threshold or falls below the low threshold.
  • The purpose of this product is to provide a monitoring framework for the Altice Hadoop distributed system. The framework captures metrics for each component as data passes through the stages of the Hadoop ecosystem, such as Kafka, Storm, Hbase and Hive, and collects them in a Postgres database. A threshold level is calculated from the metric counts, and an alert is sent when the count for a particular data set is too high or too low. New data sets or projects can be added to the existing monitoring system through external configuration.
  • Java
  • Kafka
  • Storm
  • Hbase
  • Hive
  • Postgres Database
  • Shell script
  • Jenkins
  • Eclipse

New York
  • Clients: Altice USA
  • Project Name: WIFI DataSets Ingestion
  • Worked as a Lead Developer
  • Set up Kafka MirrorMakers to pull data from the source Kafka cluster
  • Built Storm streaming applications to process 1.2 billion records per day
  • Exposed the data in Hive and other relational SQL databases for downstream use.
  • The purpose of this project is to capture Wi-Fi data from source systems, process and cleanse the data sets according to the given business rules, and persist them in Hbase and Hive tables. The source is Kafka and the data processing layer is Storm.
  • JDK 8.0
  • Kafka
  • Storm
  • Hbase
  • Hive
  • AWS
  • Redshift
  • S3
  • Hortonworks Cluster

New York
  • Clients: Altice USA
  • Project Name: DataSets Migration
  • Worked as a Lead Developer and led the team
  • Built many Kafka Producers and Consumers to load data into HDFS using Confluent Connectors
  • Built Kafka Consumers to load data in S3 and Redshift
  • Built Streaming applications using Kafka and Storm to persist data in Hbase and Hive.
  • This project migrates existing data sets from traditional systems to the Hadoop ecosystem. Because the Hadoop cluster is secured with Kerberos, the data migration has to run over secured layers, including Kafka, Flume, Storm, Hbase and Hive.
  • JDK
  • Hortonworks Cluster
  • Hadoop Ecosystem
  • Apache Storm
  • Spark
  • Kafka
  • Flume
  • Oozie
  • R
  • Python
  • Hive
  • Pivotal Cluster
  • Oracle
  • Netezza
  • Redshift
  • AWS
  • Redhat Linux

New York
  • Clients: Altice USA
  • Project Name: EDP Security Framework
  • Worked as an Architect and Developer to build this product.
  • This framework was developed for data security and governance at Altice. All data inflow and outflow is controlled by the framework to ensure that the required fields are secured with an encryption algorithm.
  • Java
  • Shell script
  • Cipher Security Framework
  • Oracle
  • Redshift
  • Hive

New York
  • Clients: Altice USA
  • Project Name: Enterprise DataLake Platform (EDP)
  • Worked as a Lead Big Data engineer.
  • Responsible for setting up a 51-node Hadoop cluster
  • Responsible for building Hadoop transport components to persist data in Hadoop and AWS environments
  • The purpose of EDP is to act as a repository of facts, deriving data from different source systems. The aim is to provide long-term storage, computing capabilities, data security and data graduation to the actual DWH environment. This environment uses the Pivotal Hadoop ecosystem and is meant to act primarily as a data storage and retrieval area. While it has the capability and capacity for computing, it will be used minimally for that purpose, given the enormous amount of data it has to capture and retain over time and the graduation process it must enable for moving data to either AWS Redshift or other traditional systems.
  • Java
  • Pivotal Cluster
  • Hawq Database
  • Apache Storm
  • Hbase
  • Hadoop
  • Kafka
  • Oracle
  • Netezza
  • Spring
  • Hibernate
  • Restful web services
  • Eclipse
  • Maven
  • SVN Repository
  • JIRA
  • Agile
  • Redhat Linux

Orlando, FL
10.2013 - 12.2013
  • Clients: Capco / Fannie Mae
  • Project Name: LDNG Loan Delivery Next Generation
  • Worked with business team and product owners to prepare a POC for Fannie Mae loan
  • Worked on a Test Harness Suite
  • This application is primarily used by lenders to deliver mortgage loan data in MISMO XML format to Fannie Mae. The application validates the loan data and notifies the lenders of any errors. The lenders then correct the loan data and revalidate it. Once the loan data is free of errors, the lenders can submit the loans for downstream processing, where the loans are acquired and managed.
  • Java
  • Spring 3.0
  • Spring Integration
  • Spring Tool Suite (STS)
  • Oracle 11g
  • MongoDB
  • TOAD 7.1
  • Tomcat
  • Maven
  • SVN Repository
  • Rally
  • Agile
  • Windows 7

IA
08.2011 - 08.2013
  • Clients: Wells Fargo
  • Project Name: Core Sales & Fulfillment Mortgage System
  • Worked as a Lead Developer, responsible for managing the Prepaid Mortgage application.
  • This application manages a loan application from initiation through funding, reducing the logistics and manual processes involved in product sales and fulfillment.
  • JDK 6.0
  • GWT
  • Google Guice
  • Spring 3.0
  • Struts
  • EJB3.x
  • EMF (Eclipse Modeling Framework)
  • Hibernate 3.2
  • SOA Web Services
  • WSDL
  • SOAP
  • XML
  • XSD
  • AJAX
  • CSS
  • JavaScript
  • JQuery
  • IBM Lombardi Teamworks BPM Tool
  • Oracle 11g
  • TOAD 7.1
  • JBoss 7.0
  • Eclipse
  • Maven
  • TestNG
  • SVN Repository
  • JIRA
  • Agile
  • Web Builder
  • Windows 7

Philadelphia, PA
03.2009 - 08.2011
  • Clients: GlaxoSmithKline
  • Project Name: GlaxoSmithKline (GSK) Vaccines Direct
  • Managed and coordinated with a twenty-member offshore team
  • Responsible for preparing functional and technical specification documents
  • Responsible for product enhancements to the Sterling Web and Multichannel Sterling Fulfillment processes.
  • GSK's Vaccines Direct is an e-commerce application that facilitates customer registrations, pre-orders for flu vaccines and direct online orders for non-flu vaccines. It is built upon the IBM / Comergent product and integrates seamlessly with other GSK systems such as the Vaccine Order Entry system and the Fulfillment System.
  • Sterling Comergent Framework(Sterling OMS and Fulfillment)
  • Spring
  • Java
  • JDBC
  • JNDI
  • XML
  • HTML
  • Servlets
  • JSP
  • JQuery
  • Hibernate
  • Tomcat
  • Jasper Reports
  • MQ Messaging
  • Web Services
  • DOJO
  • Windows
  • Oracle
  • Eclipse
  • CVS
  • Log4j
  • Microsoft Visio
  • IBM Test Director
  • JUnit
  • TDD

Jakarta, Indonesia
05.2008 - 03.2009
  • Clients: Bank Central Asia (BCA)
  • Project Name: Client Trade App
  • Key involvement in developing a POC for the Client Trade App using OpenLaszlo.
  • Key involvement in developing a POC for B2C (iBank initiation, execution and JMS) using Java CAPS.
  • Bank Central Asia is Asia's largest private bank, and we developed three POCs using the Java CAPS SOA approach. BCA has a number of products running in its existing EAI (Enterprise Application Integration) layer, and the idea is to replace the existing EAI with an ESB (Enterprise Service Bus) using Java CAPS.
  • Java
  • JDBC
  • OpenLaszlo
  • HTML
  • Servlets
  • JSP
  • JQuery
  • JBoss
  • Tomcat
  • Java CAPS SOA
  • Sun ESB
  • SOAPUI
  • PL/SQL
  • Oracle
  • NetBeans
  • Windows
  • Maven
  • Apache ANT
  • Log4j
  • TDD
  • JAX RPC
  • SOAP

03.2008 - 05.2008
  • Clients: ICICI Bank, India
  • Project Name: Sun Messaging Service
  • Developed UWC for Sun Messaging Platform to replicate the existing Outlook Messaging Service in the organization.
  • Sun JMS for ICICI Bank is a customization of the messaging server infrastructure. It provides robust mail services across the organization and includes the migration of more than one lakh (100,000) users from the existing service.
  • Core Java
  • J2EE-Servlet/JSP
  • JMS
  • XML
  • HTML
  • LDAP
  • Sun App Server
  • Windows

12.2007 - 03.2008
  • Clients: Schawk, India
  • Project Name: SchawkOne
  • Actively involved in developing enhancements to the MVC-based web application using various design patterns.
  • The goal of SchawkOne is 'To create a secure central repository of consistent accurate information that is available globally to Schawk, its clients, associated businesses, and vendors in a convenient easy to access format that will be able to support Schawk in implementing its 2020 vision.'
  • Java
  • J2EE-Servlets/JSP
  • Spring MVC
  • JavaScript
  • JQuery
  • Hibernate
  • MS-SQL
  • NetBeans
  • JUnit
  • TDD
  • Tomcat
  • Windows

Chennai, India
01.2007 - 12.2007
  • Clients: Schawk
  • Project Name: GDMS (Global Data Management System)
  • Worked at offshore location.
  • Communicated with the onsite client to gather inputs.
  • Developed MVC layers using Spring and Hibernate.
  • The Global Data Management System (GDMS) project has been instituted to centralize all the global data that is currently available in individual systems across different operating units of Schawk.
  • Core Java
  • Java J2EE
  • Spring Framework
  • Hibernate
  • JSF
  • MS-SQL
  • Microsoft Visio
  • NetBeans
  • JUnit
  • TDD
  • Tomcat server
  • Windows

Chennai, India
12.2005 - 01.2007
  • Clients: Unison
  • Project Name: Unison Harcourt
  • Responsible for design and development of Students module across the application.
  • Responsible for preparing technical design documents.
  • The Unison platform was designed to provide a web-enabled delivery system for formative instructional assessments in mathematics and reading, customized around state standards. The platform provides both online and paper-based testing, giving educators a state-by-state, custom-built and field-tested formative instructional assessment system for grades 3-8.
  • Core Java
  • J2EE-Servlet
  • JSP
  • JQuery
  • Struts
  • Ant Build
  • Log4j
  • TDD
  • Oracle
  • WebSphere RAD
  • Windows XP

IN
06.2005 - 12.2005
  • Clients: WellPoint Inc
  • Project Name: UPI (Usability Portal and Implementation)
  • Responsible for design and development of a health care portal application using portlets.
  • Created flow charts for the application using UML class diagrams and sequence diagrams.
  • UPI (Usability Portal and Implementation) is an application for WellPoint Inc., USA, a health benefits company that uses a number of portals across its business areas to cater to its customer base. The application enhances usability across external and internal web sites through the development of enterprise-wide usability standards, maximizing economies of scale with the new enterprise structure of portals.
  • Core Java
  • IBM Websphere Portal
  • Struts
  • Ant Build
  • Log4j
  • DB2
  • WebSphere RAD
  • Windows XP

01.2005 - 05.2005
  • Clients: India
  • Project Name: HRIS (Human Resource Information System)
  • Developed a search engine for the recruitment management team
  • Actively participated in unit testing, integration testing and acceptance testing for the application.
  • HRIS (Human Resource Information System) is a redesign project for developing and modeling the portal with JFC functionality, incorporating features such as a search engine, resume management and recruitment activities.
  • Core Java
  • J2EE-Servlet/JSP
  • XML
  • MS-SQL
  • Oracle JDeveloper 10g
  • Linux

Tokyo, Japan
04.2004 - 12.2004
  • Clients: NEC
  • Project Name: JRF (Japan Railway Freight)
  • Developed Stateless session beans to interact with business services.
  • Worked on the SCM Control system, which involves handling goods for transportation.
  • Involved in offshore communication.
  • JRF (Japan Railway Freight) is a railway freight management system in Japan. The system is built on a Container Management System, which involves handling goods in SCM Control. It supports clean, reliable and efficient mass transportation services.
  • Core Java
  • J2EE-Servlet/JSP
  • EJB
  • XML
  • Oracle
  • Watool
  • Windows

Education

Bachelor of Engineering (B.E) - Electronics and Communication

Skills

  • UNIX
  • Linux
  • Windows
  • Mac OS
  • J2EE
  • Web Services
  • SOAP
  • REST
  • Spring
  • Quarkus
  • Oracle
  • SQL Server
  • DB2
  • NoSQL DBs
  • Java
  • SQL
  • Python
  • UNIX Shell Scripting
  • Hadoop Big Data
  • Kafka
  • Flume
  • Apache Spark
  • Storm
  • Cloudera cluster
  • Oozie
  • Confluent
  • Pivotal Hawq
  • Kerberos
  • Ranger
  • AWS
  • Redshift
  • HDFS
  • Sqoop
  • HIVE
  • Azure
  • BigQuery
  • GCP
  • GCloud
  • Sterling Comergent 71v
  • Apache Struts
  • Tiles
  • Hibernate
  • J2EE Design Patterns
  • Infoworks
  • AWS Glue
  • Eclipse
  • IntelliJ
  • TestNG
  • JUnit
  • Git
  • Bitbucket
