Sthitaprajna Sahoo

Technology leader. Big data & cloud-native solutions architect. AI/ML engineering manager. Delivery lead.
Tampa, FL

Summary

Experienced data and ML engineering manager with a proven record of delivering petabyte-scale data and machine learning projects. 16 years of experience leading teams that develop and manage MLOps, data warehousing, and ETL solutions on big data platforms. Collaborates with data scientists, software engineers, and stakeholders to keep delivery aligned with business objectives. Deep expertise in machine learning algorithms and in model development, deployment, and monitoring to drive business value through data-driven insights. Strong focus on data quality and accuracy, MLOps best practices, data governance, database management, and ETL development. Skilled in leading agile teams, optimizing system performance, and implementing data security and compliance measures. Known for analytical and problem-solving skills and for building high-performing teams that consistently meet business goals. Strategic thinker with a strong focus on business outcomes.

Overview

19 years of professional experience

Work History

MLOps Engineering Manager

JP Morgan Chase
03.2021 - Current
  • Company Overview: Wholesale Payment - Payment Validation Service, a blockchain-based platform for transactions and information persistence that provided automated, decentralized services for validation and transaction support activities
  • Led a team of 6 MLOps engineers, fostering a culture of collaboration and continuous improvement
  • Oversaw end-to-end deployment of machine learning models into production environments, increasing deployment speed and reducing downtime
  • Implemented best practices for version control, automated testing, and monitoring, resulting in a 60% improvement in model stability and reliability
  • Optimized data pipelines, improving data quality and reducing preprocessing time by 70%
  • Managed cloud infrastructure on AWS, achieving cost savings through optimized resource utilization and automation
  • Provided technical leadership and mentorship to team members, facilitating their professional growth and skill development
  • Designed and developed end-to-end model-serving APIs using FastAPI and Kubernetes (see the sketch after this list)
  • Designed, developed, and deployed neural-network-based solutions
  • Performed hyperparameter tuning and distributed training in Spark
  • Implemented model building, versioning, auditing, and governance controls
  • Collaborated with product managers and consumers of the microservice
  • Monitored and optimized system performance, ensuring that data solutions remain scalable and efficient
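
A minimal sketch of the FastAPI model-serving pattern referenced above, assuming a scikit-learn-style classifier serialized to disk; the service name, feature schema, and model path are hypothetical, not taken from the original project:

    # Hypothetical FastAPI scoring service; feature names and model path
    # are illustrative, not from the original project.
    import pickle

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI(title="payment-validation-scorer")

    # Load a serialized model once at startup; in a Kubernetes deployment
    # this file is typically baked into the image or mounted as a volume.
    with open("model.pkl", "rb") as f:
        model = pickle.load(f)

    class PaymentFeatures(BaseModel):
        amount: float
        route_code: int

    @app.post("/predict")
    def predict(features: PaymentFeatures) -> dict:
        # Assemble the 2-D feature array a scikit-learn model expects.
        x = [[features.amount, features.route_code]]
        return {"valid_probability": float(model.predict_proba(x)[0][1])}

In a Kubernetes setup, a service like this would typically run under Uvicorn workers in a Deployment, with the /predict route exposed through a Service.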

Engineering Manager

JP Morgan Chase
02.2019 - 03.2021
  • Company Overview: Wholesale Payment - Cash Flow Intelligence (CFI)
  • The Wholesale Payment business moves 20% of US dollars around the globe every day through clients' payments and receivables
  • The data science product team specializes in designing new products that leverage the huge volume of data on the big data platform
  • Developed monitoring and alerting systems to proactively detect and resolve issues in production models, leading to a 40% decrease in incidents
  • Collaborated with product managers, data scientists, engineers, end users, and other stakeholders to integrate data discoveries and processes into operational capabilities
  • Collaborated with data scientists to containerize and deploy models using Docker and Kubernetes, improving scalability and reproducibility
  • Designed and set up data science infrastructure for rapid feature engineering and model development
  • Analyzed complex, high-volume, high-dimensionality data from varying sources
  • Built end-to-end feature engineering pipelines on a distributed data platform
  • Built data assets to aid model development and the frontend UI
  • Developed data science models in both batch and online modes for the CFI application
  • Designed and developed APIs using Flask and Gunicorn
  • Deployed models and APIs to a private cloud environment
  • Developed ensemble models (Prophet, kNN, ARIMA, etc.) for cash flow forecasting (see the sketch after this list)
  • Performed hyperparameter tuning and distributed training
  • Designed and implemented CI/CD pipelines for machine learning models
  • Proactively solved problems, took ownership of issues, and addressed performance and data issues in a timely manner
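
A minimal sketch of the ensemble idea behind the cash flow forecasting bullet, assuming daily history in a pandas DataFrame with Prophet's expected ds/y columns; the ARIMA order and the equal weighting are placeholders for values that hyperparameter tuning would select:

    # Hypothetical Prophet + ARIMA ensemble for cash flow forecasting.
    import pandas as pd
    from prophet import Prophet
    from statsmodels.tsa.arima.model import ARIMA

    def ensemble_forecast(history: pd.DataFrame, horizon: int = 30) -> pd.Series:
        """history: columns ds (date) and y (daily cash flow)."""
        # Prophet component: fit the full history, predict the horizon.
        m = Prophet()
        m.fit(history)
        future = m.make_future_dataframe(periods=horizon)
        prophet_pred = m.predict(future)["yhat"].tail(horizon).to_numpy()

        # ARIMA component on the raw series; order (1, 1, 1) is a
        # placeholder for a tuned order.
        arima = ARIMA(history["y"], order=(1, 1, 1)).fit()
        arima_pred = arima.forecast(steps=horizon).to_numpy()

        # Equal-weight average; a blend weighted on a validation window
        # (or a kNN component, as in the original ensemble) is a natural
        # extension.
        return pd.Series((prophet_pred + arima_pred) / 2.0, name="forecast")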

Big Data Architect

JP Morgan Chase
08.2016 - 01.2019
  • Company Overview: JPMIS - MAP, ClickFox, and Data Discovery
  • The JPMIS team builds a marketing analytics platform and a discovery environment for both analytical and operational needs
  • Designed and developed an end-to-end data ingestion and ETL tool called 'PIF', written in Python
  • Built an execution and orchestration engine from scratch to support complex ETL needs
  • Created and implemented the data layer for the multi-tenancy topology in which ClickFox, TDD (Technical Data Discovery), Data Operations, and MAP reside as tenants
  • Integrated the ingestion tool with external data sources such as RDBMS databases (Oracle, Sybase, MySQL, etc.), FTP sites, Teradata, and Greenplum
  • Automated data ingestion from various internal and external sources into the discovery environment for data science needs
  • Developed data quality tools in Spark to monitor and flag issues between sources and refined destinations (see the sketch after this list)
  • Integrated Voltage encryption into Hadoop environments for data security needs
  • Wrote data-equality routines in Spark to confirm that data after decryption remained identical to the data before encryption
  • Designed the security framework of the project in the multi-tenant CDH cluster
  • Supported business deliverables by working with stakeholders
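
A sketch of the kind of Spark data quality check described above, comparing a source table against its refined destination; the table names and join key are hypothetical:

    # Hypothetical PySpark DQ check: source vs. refined table.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dq-check").getOrCreate()

    source = spark.table("raw.transactions")
    refined = spark.table("refined.transactions")

    # Check 1: row counts should match end to end.
    src_count, ref_count = source.count(), refined.count()

    # Check 2: rows present in the source but missing from the refined
    # destination, matched on a business key.
    missing = source.join(refined, on="txn_id", how="left_anti").count()

    # Check 3: null rate per refined column, to catch transform regressions.
    null_rates = refined.select([
        F.avg(F.col(c).isNull().cast("int")).alias(c) for c in refined.columns
    ])

    if src_count != ref_count or missing > 0:
        # The real tool would raise an alert / persist a DQ report here.
        print(f"DQ FAIL: src={src_count} ref={ref_count} missing={missing}")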

Big Data Architect

Citadel Information Services / Capital One
06.2015 - 07.2016
  • Company Overview: Capital One Financial Corporation is an American bank holding company specializing in credit cards, loans, banking, and savings products
  • Defined and designed the architectural framework, data strategy, tools, and technologies for the bank's marketing platform in the big data space
  • Architected, designed, and developed the code deployment and automation process on AWS using Chef scripts
  • Automated infrastructure and application code using CloudFormation templates (JSON-based CFTs), Chef, Jenkins, and uDeploy on the AWS platform
  • Merged legacy ING Direct bank data with Capital One data and pushed it to the Hadoop data lake by setting up ETL pipelines
  • Extracted data from different sources (Teradata, Oracle, Google DoubleClick, etc.), transformed it according to the business use case, and loaded it into Hadoop
  • Developed Hive scripts, Pig scripts, Unix shell scripts, and Spark programs in Scala for all ETL loading processes, converting files to Parquet in the Hadoop file system
  • Developed various ETL transformation scripts using Hive to create refined datasets for analytics use cases
  • Automated workflows using shell scripts and Control-M jobs to pull data from various databases into Hadoop
  • Developed Spark scripts to move data between Cassandra, Teradata, and Hadoop
  • Created RDDs/DataFrames in Spark using Scala and applied transformation logic to prepare data for Cassandra loads (see the sketch after this list)
  • Participated in story-driven agile development and daily scrum meetings
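
The production transformations for the Cassandra loads were written in Scala; the PySpark sketch below shows the general shape of that prepare-and-load step. The paths, column names, and keyspace/table are hypothetical, and it assumes the Spark-Cassandra connector package is on the classpath:

    # Hypothetical PySpark version of a conform-then-load-to-Cassandra job.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("cassandra-prep").getOrCreate()

    # Read raw extracts (e.g., landed from Teradata or Oracle) as Parquet.
    raw = spark.read.parquet("/data/raw/accounts")

    prepared = (
        raw.withColumn("open_date", F.to_date("open_date", "yyyy-MM-dd"))
           .withColumn("balance", F.col("balance").cast("decimal(18,2)"))
           .filter(F.col("account_id").isNotNull())
           .select("account_id", "open_date", "balance")
    )

    # Write through the Spark-Cassandra connector.
    (prepared.write
        .format("org.apache.spark.sql.cassandra")
        .options(keyspace="bank", table="accounts")
        .mode("append")
        .save())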

Manager/Application Architect

Cognizant / Comcast
11.2014 - 04.2015
  • Company Overview: A Hadoop-based solution for set-top box (STB) data analysis and management
  • Designed the project architecture and defined the technology stack
  • Installed and configured a multi-node Hadoop environment (HDP 2.2) on AWS
  • Loaded STB data into HDFS using Flume from an S3 bucket in AWS
  • Imported and exported data into HDFS and Hive using Sqoop
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data
  • Wrote a custom MapReduce program to load data from HDFS into HBase
  • Integrated HBase with Hive as per the requirements
  • Developed custom Hive UDFs to meet requirements
  • Created SchemaRDDs to access Hive tables via Spark for better query efficiency (see the sketch after this list)
  • Tuned Hadoop and Spark jobs through better memory management, serialization, and efficient staging
  • Automated all jobs that pull data from the FTP server into Hive tables using Oozie workflows
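
The SchemaRDD work above used Spark 1.x's HiveContext; the sketch below shows the equivalent with a modern SparkSession, querying a Hive table through Spark SQL rather than HiveQL-on-MapReduce, which is where the query-efficiency gain came from. The table and column names are hypothetical:

    # Hypothetical Spark-over-Hive query for STB event data.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("stb-hive-query")
             .enableHiveSupport()   # resolve tables from the Hive metastore
             .getOrCreate())

    events = spark.sql("""
        SELECT device_id, event_type, COUNT(*) AS n
        FROM stb.events
        GROUP BY device_id, event_type
    """)

    events.write.mode("overwrite").parquet("/data/refined/stb_event_counts")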

Project Lead

SumTotal Systems
04.2012 - 10.2014
  • Company Overview: Accero Cyborg Payroll and Analytics
  • The Accero Cyborg system is a payroll engine that helps customers process payroll at a variety of frequencies
  • Planned, monitored, and tracked the progress of work items across modules
  • Handled application maintenance and enhancements, coordinating with development and QA teams
  • Coordinated releases with the PM team for smooth and timely production releases
  • Addressed customer escalations and ensured customer satisfaction
  • Reported status and led various project items to effective closure
  • Coordinated with business partners
  • Built POCs on social media data for sentiment analysis, community detection, and customer purchasing power
  • Technologies Used: Apache Spark, Hadoop MapReduce, Hive, Oracle, SQL, D3.js, Python, R, Impala, Core Java, COBOL, DB2, Mainframes, CSL RG, UNIX

Senior Consultant (Kroger, OH)

Computer Science Corporation
05.2011 - 04.2012
  • Company Overview: The client is one of the largest retail food companies in the United States as measured by total annual sales
  • Handled application maintenance, enhancements, and 24x7 production support
  • Prepared high-level and detailed design documents
  • Interacted with the client on business decisions and issues
  • Reviewed client deliverables
  • Documented existing processes in the system
  • Performed abend analysis and suggested permanent fixes to reduce emergencies
  • Tracked deliverables against timelines
  • Ensured early availability of financials for reporting to stakeholders
  • Technologies Used: COBOL, PL/1, VSAM, DB2, IMS-DB, JCL, CICS

System Analyst (AT&T, OH)

Convergys
05.2006 - 05.2011
  • Company Overview: AT&T (American Telephone and Telegraph Company), founded in 1983, is a leading provider of both local and long-distance telephone services
  • Handled development, enhancements, and application maintenance
  • Served as SME for USO (Universal Service Order), RBE (Rating and Billing Engine), and MPS (Message Processing System), key subsystems of the billing system
  • Served as the primary analyst for ATTOMECS, a subsystem that generates revenue from error usage
  • Performed requirements gathering and walkthroughs, design documentation, test planning, development, unit test plans and results, reviews, and implementation planning
  • Technologies Used: COBOL, VSAM, DB2, IMS-DB, JCL, CICS, Easytrieve, Stored Procedures, XML

Education

Bachelor of Engineering - Electrical Engineering

Master of Science - AI/ML

Skills

Databricks
