
Jayaprakash Subramani

Cloud, AI

Summary

Leveraging 17+ years of IT experience spanning diverse technologies across ETL, cloud, ML, and Gen AI, I spearheaded PwC’s Gen AI Factory. Directed and scaled the Gen AI Factory pod teams in developing strategic Retrieval-Augmented Generation (RAG) pipelines, optimizing knowledge ingestion processes. Championed the integration of GraphRAG technology and custom plugins to automate code generation and test case creation, delivering significant improvements in efficiency and accuracy for AI-driven workflows.

Overview

17 years of professional experience
1 Certification

Work History

PwC / Gen AI Factory
10.2023 - Current
  • Leadership in Gen AI: Led the Gen AI Factory, directing pod teams to design and implement strategic RAG pipelines for multiple clients, optimizing knowledge ingestion, and enhancing AI workflow efficiency and accuracy
  • GraphRAG Integration: Spearheaded GraphRAG integration, utilizing a transformer-based architecture for RAG knowledge retrieval
  • Exploration of Gen AI/LLM Frameworks: Championed the exploration of frameworks such as Azure Semantic Kernel, AWS Bedrock, Langchain, and LangGraph, leading to informed decision-making for the technology stack
  • Development of Reusable Plugins: Developed reusable Gen AI plugins for automated code generation and test case creation, leveraging transfer learning and NLP techniques
  • Training and Mentorship: Provided training and mentorship to team members on new technologies and methodologies, fostering a culture of continuous learning
  • Impact: Core plugins now integral to Gen AI Factory’s offerings, ensuring consistent knowledge ingestion and improving AI development lifecycles for diverse clients.

PwC
06.2022 - 09.2023
  • Major Financial Services Client

Data Management

PwC
04.2021 - 05.2022
  • Solution: Led a team to design and implement a robust data management solution using AWS and Azure, significantly enhancing data accessibility and scalability for a major financial services client
  • Cloud Integration: Oversaw the seamless migration of on-premise data warehouses to cloud-based solutions, resulting in improved data storage, retrieval, and analysis capabilities
  • Advanced Microservices Architecture: Developed a microservices architecture with AWS Lambda and Azure Functions, streamlining ETL processes and improving overall data processing efficiency
  • Performance Optimization: Applied advanced performance tuning techniques and cloud resource optimization, achieving a 5X improvement in data processing speed and substantial cost reductions
  • Data Governance Framework: Established a comprehensive data governance framework to ensure data quality, security, and compliance with industry standards
  • Strategic Impact: Delivered a scalable, high-performance data management solution that significantly improved data quality and operational efficiency, empowering the client to make faster, data-driven decisions
  • For a Major Bank:
  • AWS Solution Development: Led a diverse team of onshore and offshore data engineers to develop and implement a comprehensive AWS solution for a major bank's data management challenges
  • Cloud Migration: Migrated the bank's Spark/Big Data applications to the cloud, significantly boosting processing speed and data management efficiency
  • Innovative Accelerators: Spearheaded the development of innovative accelerators, including Generative AI for automated data quality detection and a Metadata Driven ETL Framework
  • ML-Based Performance Tuning: Implemented ML-based performance tuning to optimize ETL pipelines, leading to enhanced performance and reduced operational costs
  • Automation of Data Workflows: Automated key data workflows, reducing manual intervention and minimizing the risk of errors
  • Mainframe Technologies Transformation: Led a transformative initiative utilizing extensive knowledge of mainframe technologies
  • Custom Accelerators Development: Developed custom accelerators for converting EBCDIC data to ASCII and vice versa, ensuring data integrity during the transition
  • Cloud Migration: Migrated data management processes to a robust AWS environment, leveraging cloud computing for faster processing speeds
  • Legacy System Modernization: Successfully modernized legacy systems, enabling the bank to leverage modern technologies and improve overall operational efficiency.

Travelers / Lead Cloud & Spark Data Engineer

LTI
03.2018 - 04.2021
  • BI&A and Data Engineering Strategy: Key contributor to developing and executing BI&A and Data Engineering strategy in AWS Stack
  • Cloud Migration: Architected and strategized migration of on-prem Spark applications to AWS and Databricks platforms, boosting processing speed and efficiency
  • Core Analytic Data Products: Designed, developed, and delivered core analytic data products to support BI R&D, Actuarial, Product Management, and business analytics consumers
  • Resilient Applications: Designed and implemented resilient, cost-effective, highly available applications in AWS Stack
  • ETL Design and RDF XML Processing: Led solution and overall ETL design for processing RDF XML messaging data using Spark
  • Reusable Transformation Models: Built reusable transformation rules and repeatable data conversion models, reducing development effort by 30%
  • Data Lake Pipeline: Implemented a data lake pipeline, orchestrating 20 data sources, applying ETL, and creating a single Hive table with 1200 attributes and over a billion rows in under 2 hours
  • Best Practices in Big Data: Established best practices, standards, principles, guidelines, and knowledge management in the big data space
  • Deep Neural Networks and NLP: Worked on Deep Neural Networks (ANN) and NLP algorithms for text mining
  • Anomaly Detection Models: Implemented Random Cut Forest, Isolation Forest, and Deep Auto Encoder models for anomaly detection in batch and streaming data
  • Spark Programs Development: Developed Spark programs for data ingestion and transformation from DB2, Teradata, and JSON files
  • SAS to Spark/Hive Modules: Converted SAS modules to Spark/Hive, creating a unified data entity for data scientist exploration
  • Performance Tuning: Extensively worked on performance tuning of Spark/Hive components
  • Real-time Data Pipelines: Implemented Kafka-Spark streaming for real-time data pipelines
  • AWS Tools: Utilized AWS EMR and Lambda for specific data processing requirements, managing source data in S3
  • Project Management: Used Kanban, Git, and GitHub for project management and version control as project lead for Workers Compensation data products
  • Automation and Infrastructure Management: Implemented bash scripts for multithreading and automation, deployed, and managed cloud infrastructure using Jenkins and Terraform
  • Technical Leadership: Played a technical leadership and mentoring role for onshore and offshore teams
  • Data Quality Frameworks: Designed and implemented data quality frameworks to ensure accuracy and consistency across data pipelines
  • Advanced Analytics Solutions: Developed advanced analytics solutions integrating machine learning models for predictive analytics and decision support
  • Scalability and Optimization: Enhanced scalability and optimized resource allocation in cloud environments to handle increasing data volumes
  • Collaboration with Stakeholders: Collaborated with cross-functional teams and stakeholders to align data engineering solutions with business objectives
  • Compliance and Security: Ensured compliance with industry standards and implemented robust security measures to protect sensitive data.

Tata Consultancy Services / JPMC
08.2016 - 02.2018
  • Data Lake Analytics: Extensively worked on solution architecture and design for Data Lake analytics implementations using Informatica and Oracle
  • Data Profiling and Analysis: Performed extensive data profiling on data sources and conducted source data analysis to create source-to-target mapping sheets
  • Complex SQL Reporting: Built complex SQL reports to assist data modelers and BI developers in solving use cases
  • Stakeholder Coordination: Coordinated with business users and stakeholders to gather business requirements and manage customer relationships
  • Product Roadmap and Workshops: Managed customer relationships, conducted product workshops, and facilitated solutions architecture reviews
  • POC Implementations: Conducted POCs of Informatica products on Azure and AWS platforms using Snowflake, Redshift, and Google Cloud Dataflow
  • Management Meetings: Facilitated weekly deep dive and status meetings with C-level management to report project progress
  • Data Loading and Streaming: Extensively worked on loading data into Hive tables using Spark and implemented Kafka-Spark streaming for unbounded API data
  • Streaming and Batch Pipelines: Leveraged NiFi to configure streaming and batch sources for pipelining into Kafka/HDFS sinks
  • Hive to Spark Transformation: Converted Hive/HQL queries into Spark transformations using Spark RDD, Scala, and Python
  • Data Ingestion Utilities: Developed SQOOP import utility to load data from various RDBMS sources and developed data pipelines using Flume and Spark
  • Cloud and Big Data Tools: Proficient in Cloudera/Hortonworks distributions, AWS (S3, EC2, EMR), Microsoft Azure, and Google Cloud Dataflow
  • Web Log Analytics: Implemented web log analytics using SPLUNK, Elasticsearch/Kibana, and Grafana
  • Data Analytics: Performed data analytics using SPARK with Scala and Python APIs.

Tata Consultancy Services / Silicon Valley Bank
05.2014 - 07.2016
  • SPARK Data Analysis: Worked extensively on SPARK using both Python and Scala for data analysis
  • Web Log Analytics: Gained expertise in web log analytics using SPLUNK
  • PIG Script Development: Developed and tested optimized PIG Latin scripts for data processing
  • Data Export and Visualization: Exported analyzed data to relational databases using Sqoop for visualization and report generation
  • Data Workflow Automation: Automated data extraction from warehouses and weblogs into HIVE tables using Oozie workflows and coordinator jobs
  • Data Collection with Flume: Used Flume to collect web logs from online ad-servers and push data into HDFS
  • Data Transformation and Analysis: Loaded and transformed large datasets using Hive to compute metrics for reporting
  • Oozie Workflow Development: Developed workflows in Oozie to automate data loading and processing tasks
  • Hive Table Management: Created and managed Hive tables for data analysis to meet business requirements
  • Scrum Master Role: Facilitated Sprint Planning, Daily Scrums, Sprint Reviews, and Retrospective Meetings
  • Sprint Management: Created Task Boards and Sprint Burn Down Charts, and managed team commitments and impediments
  • Team Mentorship: Served as a coach and mentor, assisting with story selection, sizing, task definition, and adherence to best practices.

Scrum Master, Business Analyst, and Technical Lead

Tata Consultancy Services / Nielsen
04.2011 - 04.2014
  • Performed roles as Scrum Master, Business Analyst, and Technical Lead

Developer

Computer Sciences Corporation, GLIC
03.2010 - 04.2011
  • Requirement Gathering and Design: Formulated requirements into design specs, prepared system specifications, and tracked project progress
  • Mainframe Tools Expertise: Proficient in TSO, ISPF/SDSF, VAGEN, Panvalet, Endeavor, Xpeditor, Abend-Aid
  • JCL Development: Created JCL and JCL PROCs using utilities such as DFSORT, FILEAID, IEBCOPY, IEBGENER, IEBCOMPR, and ICETOOL
  • High-Level Design Documentation: Created High-Level Design, Detailed Design, and Functional Requirement documents
  • DB2 Tools Proficiency: Experienced in SPUFI, File Manager for Db2, and QMF
  • Debugging Tools: Skilled in using XPEDITOR (CICS/Batch), Debugger, CEDF, and Trace Master for troubleshooting
  • CICS Transaction Processing: Strong experience with CICS transaction processing and DB2 application integration
  • Manual and Automated Testing: Advanced knowledge of manual, automated, and performance testing
  • IBM Mainframes: In-depth knowledge of IBM mainframes (MVS, COBOL, JCL, VSAM, CICS, and DB2) and extensive experience with IBM Mainframe tools and techniques
  • Test Strategy and Traceability: Involved in preparing Test Strategy and Traceability Matrix documents
  • Test Case Preparation: Prepared detailed test cases for batch jobs and CICS screens based on code and database analysis
  • Functional and Regression Testing: Performed functional, regression, integration, end-to-end, and system testing
  • DB2 Application Development: Developed DB2 applications using cursors (Declare, Open, Fetch), SQL query optimization, and cursor pointer functionality
  • Cobol-VSAM Development: Developed Cobol-VSAM applications with KSDS and ESDS clusters
  • CICS Web Services: Developed new inbound/outbound programs in a CICS Web Services environment using CICS Transaction Server 3.1.

Test

Infosys / BNSF
10.2008 - 03.2010
  • Test Case Preparation: Prepared detailed test cases for batch jobs and CICS screens based on code and database analysis
  • Functional and Regression Testing: Performed functional, regression, integration, end-to-end, and system testing
  • CICS Transaction Processing: Strong experience with CICS transaction processing and DB2 application integration
  • DB2 Application Development: Developed DB2 applications using cursors (Declare, Open, Fetch), SQL query optimization, and cursor pointer functionality
  • Cobol-VSAM Development: Developed Cobol-VSAM applications with KSDS and ESDS clusters
  • Manual and Automated Testing: Advanced knowledge of manual, automated, and performance testing
  • IBM Mainframes: In-depth knowledge of IBM mainframes (MVS, COBOL, JCL, VSAM, CICS, and DB2) and extensive experience with IBM Mainframe tools and techniques
  • Test Strategy and Traceability: Involved in preparing Test Strategy and Traceability Matrix documents
  • CICS Web Services: Developed new inbound/outbound programs in a CICS Web Services environment using CICS Transaction Server 3.1.

Infosys / DHL
01.2007 - 10.2008
  • Manual and Automated Testing: Advanced knowledge of manual, automated, and performance testing
  • IBM Mainframes: In-depth knowledge of IBM mainframes (MVS, COBOL, JCL, VSAM, CICS, and DB2) and extensive experience with IBM Mainframe tools and techniques
  • Test Strategy and Traceability: Involved in preparing Test Strategy and Traceability Matrix documents
  • Test Case Preparation: Prepared detailed test cases for batch jobs and CICS screens based on code and database analysis
  • DB2 Application Development: Developed DB2 applications using cursors (Declare, Open, Fetch), SQL query optimization, and cursor pointer functionality
  • Cobol-VSAM Development: Developed Cobol-VSAM applications with KSDS and ESDS clusters
  • CICS Web Services: Developed new inbound/outbound programs in a CICS Web Services environment using CICS Transaction Server 3.1
  • Tableau Reports: Collaborated with business users to gather requirements for building Tableau reports.

Education

Bachelor of Engineering (B.E) - Electronics and Instrumentation

Anna University
2006

Skills

  • Multiple ML Certifications by Coursera
  • Technology Expertise
  • Cloud Platforms:
  • Databricks - Spark/ML/LLM/DLT
  • AWS - EMR, Glue, Kinesis, Lambda, Redshift, Sagemaker
  • Azure Machine Learning Studio, Azure AI Studio, Azure Data Fabric, Azure Functions, Azure Service Bus, Azure Event Hub
  • Gen AI / LLM Frameworks:
  • Azure Semantic Kernel
  • AWS Bedrock
  • Langchain, LangGraph, GraphRag
  • Haystack, CrewAI
  • Data Engineering / Data Science:
  • Spark (Python and Scala)
  • Various Cloud SDKs
  • Python - Multiple Python Packages
  • ML packages in Spark / Python
  • Hive / HQL, SQL, Sqoop, Shell Scripting
  • Databases:
  • Various RDBMS Databases
  • Various NoSQL Databases
  • AWS RDS / Aurora
  • Azure SQL, Redshift, Snowflake
  • Neo4J

Accomplishments

  • Gen AI + ML Innovations:
  • Advanced Accelerators: Developed sophisticated accelerators to enhance efficiency and innovation in Generative AI, improving data processing speed and accuracy
  • Automated Code Generation: Created reusable Gen AI plugins for automated code generation and test case creation, using transfer learning and NLP techniques
  • Data Lineage and Quality: Implemented data lineage detection systems and integrated Databricks Delta Live Tables for reliable data quality and governance
  • GraphRAG Integration: Led GraphRAG integration for optimized RAG knowledge retrieval, boosting AI workflow efficiency
  • Gen AI Frameworks: Evaluated frameworks such as Azure Semantic Kernel, AWS Bedrock, Langchain, and LangGraph to develop tailored Gen AI solutions for clients
  • Migration Success: Led the successful migration of large-scale on-premise Spark/Big Data applications to AWS and Azure, transferring terabytes of data from diverse on-premise sources and managing extensive ETL pipelines
  • Microservices Architecture: Implemented a robust microservices architecture using Azure Service Bus, ensuring reliable message delivery, facilitating communication between distributed systems, and significantly enhancing overall application efficiency
  • Spark Pipelines Expertise: Implemented and orchestrated Spark pipelines in AWS EMR, Databricks, and Azure HDInsight with a strong focus on cost efficiency, achieving a 5X performance improvement and substantial cost savings for large ETL pipelines through performance tuning that leveraged advanced Spark statistical techniques and Azure’s scalable resources.

Certification

Azure Data Scientist Associate
AWS Solutions Architect Associate
Neo4J Certified Professional
