
SARAT KUMAR SETHY

Data Science Manager, Hyderabad

Summary

Highly competent Data Science/AI Senior Manager with over 16 years of cross-functional and technical experience in designing and implementing business engineering, intelligence, and machine learning solutions. This includes 5+ years of Agile project management in the advertisement, banking, and insurance domains within the data and analytics technology space, 6+ years providing business solutions across data analysis, data science, and machine learning platforms, and 5+ years in AI/ML engineering with a strong focus on LLMs and NLP. Expertise in Python, Spark, Scala, and ML/DL algorithms for solving complex business problems, with extensive experience in data testing for banking and insurance products. Seeking an enthusiastic, creative, and competitive environment that offers opportunities to enhance and apply skills while contributing to organizational growth.

Overview

15 years of professional experience

Work History

Senior Manager – Data Science and Analytics

Cognizant
08.2018 - Current
  • Proficient in technology leadership, architecture, design, building teams from scratch, strategy planning, project management, and client/customer management, along with budget planning and CP handling.
  • Managing multiple managers, leaders, architects, engineers, and testers.
  • Crafting the technology strategy, vision, and roadmap, and leading the data team to deliver solutions leveraging on-prem, cloud, and open-source technologies.
  • Managing and driving direction from strategy and planning through prototype to execution of data analytics work, including data scoping, analysis, preprocessing, applying relations, deriving ML models, optimizing model performance and accuracy, visualization, and delivering actionable insights for banking and insurance products.
  • Worked on and managed multiple parallel RFPs/proposals for data-related projects for banking, communication, and tech clients.
  • Working with the product team and serving as Scrum Master for multiple projects.
  • Acting as an individual contributor on some of the ML and AI projects.
  • Managed large-scale projects and introduced new systems, tools, and processes to achieve challenging objectives.
  • Held monthly meetings to create business plans and workshops to drive successful business.
  • Reviewed and analyzed reports, records and directives to obtain data required for planning department activities.
  • Executed appropriate staffing and budgetary plans to align with business forecasts.
  • Evaluated employee performance and conveyed constructive feedback to improve skills.
  • Planned, created, tested and deployed system life cycle methodology to produce high quality systems to meet and exceed customer expectations.

Senior lead – Data and Analytics

Wipro Technology
01.2017 - 08.2018

Tech lead – Data and Analytics

Honeywell R&D
11.2015 - 12.2016

Tech lead – Data

IBM
11.2009 - 11.2015

Education

B.Tech - Computer Science

Utkal University

Skills

  • Machine learning expertise
  • AI technology expertise
  • Proficient in Pandas library
  • Proficient in NumPy
  • Data visualization with Matplotlib
  • Keras framework proficiency
  • Experience with Scikit-learn library
  • Experienced with TensorFlow frameworks
  • Natural language processing (NLP) proficiency
  • Supervised learning
  • Unsupervised learning
  • Deep learning
  • Neural Networks
  • Text Analytics
  • Deep Reinforcement learning
  • PyTorch
  • Boosting Algorithm
  • Markov Chain Model
  • LLM
  • ChatGPT
  • Apache Spark development
  • Hadoop
  • Azure
  • Proficient in Databricks
  • Snowflake
  • Python
  • SQL
  • PySpark
  • Unix
  • Probability
  • Statistics
  • Hypothesis Testing
  • Integrals & Derivatives
  • Linear Algebra
  • Exploratory Analysis
  • Gaussian Mixture & Hidden Markov Models
  • Agile/Waterfall
  • JIRA
  • Confluence
  • ALM
  • Cognos
  • Tableau
  • Teradata/SQL Server
  • Voldemort
  • Spark
  • Scala programming

Current Project Engagement

  • ChatDnB, Confidential, USA, D&B

ChatDnB enhances the search mechanism, letting users ask questions and get meaningful insights relevant to those questions.

Tech Stack – Python, LLM, ChatGPT, BigQuery, GCP, 4

Role – Individual Contributor.

Responsibilities:

Developed an integration test suite using pytest in Python for the ChatDnB framework.

Integrated the test suite into GitHub Actions for automated execution on git commits and checkouts.

Implemented caching and memory usage optimizations.

Tuned performance at the application level.


  • Financials Analytics enablement of Next-Gen User Experience, Confidential, USA

This is an end-to-end data and analytics platform, with new advanced analytics tools and a suite of modular front-end solutions that help clients derive key insights, build new strategies, and take customer-level actions that can be easily integrated into their workflows.

Tech Stack – Azure DW (Redshift), AWS S3, AWS SageMaker, Matrix Profiler, Bayesian Networks, AWS DynamoDB, T5, NLP, ThoughtSpot, PowerBI


Role – Data Engineering Manager.

Responsibilities:

Led 60+ engineers to implement, develop, design, and test the Azure cloud-based data modernization applications, and worked with the data science team to formalize AI-based decision-making products on the financial data.

Re-engineered and re-developed the fraud detection algorithm, introducing new attributes that decreased false positives by 90%.

Used Amazon SageMaker for data pre-processing such as data merging, cleaning, and missing value treatment.

Leveraged the Matrix Profile algorithm to scan the multivariate time series data and identify anomalies in the accelerated dataset.

Extracted industry events using NLP and classified the text based on the window-based outlier.

Analyzed the KPI data store to discover inconsistent data characteristics and patterns caused by missing key driving field values.

Managed and worked closely with the user experience team on causal graphs in D3.js and Python instances.

Owned the project planning, delivery, work breakdowns, resource planning, and day-to-day project management using Jira, Confluence, and Scrum.

Partnered with engineering and product teams to identify product and technical requirements, and led the team through dependencies and delivery milestones.

Worked as an individual contributor developing ML models in Python and building data pipelines with Spark and Scala.
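The Matrix Profile anomaly scan mentioned in the responsibilities above can be illustrated with a minimal sketch. This is not the project's implementation: it is a naive, single-variable version written with the Python standard library only, and the function names (`znorm`, `matrix_profile`, `top_discord`), the exclusion zone of half a window, and the demo series are all assumptions made for illustration. The idea is that each window's profile value is the distance to its nearest neighbour elsewhere in the series, so the window with the largest value (the "discord") is the most anomalous.

```python
import math


def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Naive O(n^2 * m) matrix profile for windows of length m.

    For each window, returns the z-normalised Euclidean distance to its
    nearest neighbour outside a small exclusion zone around itself.
    """
    n = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n)]
    exclusion = m // 2  # assumed zone that skips trivial self-matches
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            best = min(best, dist)
        profile.append(best)
    return profile


def top_discord(series, m):
    """Start index of the window that is farthest from every other window."""
    profile = matrix_profile(series, m)
    return max(range(len(profile)), key=profile.__getitem__)


# Hypothetical demo data: a repeating pattern with one injected spike.
demo_series = [0, 1, 0, -1] * 10
demo_series[20] = 5
print("most anomalous window starts at index", top_discord(demo_series, 4))
```

For multivariate data, one simple option is to compute a profile per variable and combine them; production work would normally use an optimised library rather than this quadratic loop.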



  • AI enabled Order Management of MDM Product, Confidential, USA


The main aim of this AI-enabled solution is to minimize order fallout and predict the root cause of fallout incidents based on the Address Product. An efficient order fallout management system ensures that order failures are detected and corrected early for prompt provisioning of customer service. This can be achieved by developing robust AI-driven process control, a process that evaluates and monitors performance using data collected over time. Fallout results in customer churn, degradation of service offerings, and a diminished customer experience. Two AI/ML-based use cases were considered to predict and resolve fallouts with minimum turnaround time.

Root cause Analysis based on Historical Data:

Goal: When an order fallout happens, the time taken to narrow down the root cause is a significant factor in determining resolution times. It requires expert help, usually from the developers of the software or the product vendor. AI can help here by looking at the symptoms and predicting the root causes, which helps ITOps get down to fixing the root cause quickly. Classification is the machine learning problem of identifying which of a set of categories a new observation belongs to. Applicable algorithms include simple decision trees, naive Bayes, random forests, support vector machines, and deep learning.

Predicting Root Causes with Keras:

When a new incident happens, we first identify the symptoms of the incident and populate the related feature variables, such as error codes. We then pass these as an array to the model's class-prediction function, which returns a numeric value for the root cause. We translate the numeric value into a label using the inverse transform function on the encoder. The same model can be used to predict root causes for a batch of incidents.
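The prediction flow described above (symptom features in, numeric class out, label recovered through the encoder) can be sketched as follows. This is an illustration under stated assumptions, not the project code: to keep it dependency-free, a tiny hand-written `LabelEncoder` mirrors the fit/transform/inverse_transform idea, and a 1-nearest-neighbour `NearestNeighbourModel` stands in for the trained Keras classifier. The feature names and root-cause labels are hypothetical.

```python
class LabelEncoder:
    """Maps string labels to integer ids and back."""

    def fit(self, labels):
        self.classes_ = sorted(set(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        return [self._index[label] for label in labels]

    def inverse_transform(self, ids):
        return [self.classes_[i] for i in ids]


class NearestNeighbourModel:
    """Stand-in for the trained classifier: 1-NN on Hamming distance."""

    def fit(self, features, class_ids):
        self._rows = list(zip(features, class_ids))
        return self

    def predict(self, batch):
        def closest(row):
            nearest = min(
                self._rows,
                key=lambda pair: sum(a != b for a, b in zip(pair[0], row)),
            )
            return nearest[1]

        return [closest(row) for row in batch]


def predict_root_causes(model, encoder, incidents):
    """Predict numeric classes for a batch, then decode them to labels."""
    class_ids = model.predict(incidents)
    return encoder.inverse_transform(class_ids)


# Hypothetical symptom flags per incident:
# (error_timeout, error_invalid_address, error_db_lock)
history = [
    ((1, 0, 0), "NETWORK_DELAY"),
    ((0, 1, 0), "ADDRESS_MISMATCH"),
    ((0, 0, 1), "DATABASE_ISSUE"),
    ((1, 0, 1), "DATABASE_ISSUE"),
]

encoder = LabelEncoder().fit([cause for _, cause in history])
model = NearestNeighbourModel().fit(
    [features for features, _ in history],
    encoder.transform([cause for _, cause in history]),
)

new_incidents = [(0, 1, 0), (1, 0, 0)]
print(predict_root_causes(model, encoder, new_incidents))
```

With a real Keras model, the `predict` step would typically return class probabilities, and the numeric class would be taken as the index of the highest probability before decoding.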

Tech Stack – GCP, BigQuery, Informatica, CoreLogic Data APIs for Address Products, Google Address Verifier APIs, Classification Models (Random Forest, Survival Analysis, Multinomial Logistic Regression), Keras, PowerBI

Role – Data Science Manager.

Responsibilities:

Led 30+ engineers to implement, develop, design, and test the GCP cloud-based MDM applications, and worked with the data science team to reduce address fallout risk using an ML-based solution.

Built AI-driven anomaly detection to predict errors in service order management, in which the ML model gauges service order trends on time series data of service order requests to determine whether an item is an error that should raise an alert.

Used Python packages such as scikit-learn for data pre-processing activities like transforming values into an appropriate format.

Analyzed the KPI data store to discover inconsistent data characteristics and patterns caused by missing key driving field values.

Managed and worked closely with the user experience team on causal graphs in D3.js and Python instances.

Owned the project planning, delivery, work breakdowns, resource planning, and day-to-day project management using Jira, Confluence, and Scrum.

Partnered with engineering and product teams to identify product and technical requirements, and led the team through dependencies and delivery milestones.

Architected the entire solution and designed a single 360-degree polygon view of their MDM address products.

Worked closely with the testing team to define the automation strategy for testing the data, APIs, reports, and AI/ML models and their accuracy.


  • Ad Platform (App Recommendation), Confidential, USA

A privacy-friendly implementation of the item-item recommender system (an algorithm that learns “similarity” scores for two apps). The production model is trained on the Apple Media Product server and transmits only the app-to-app similarity scores, which are user-agnostic, to Ad Platform servers. These app-to-app similarity scores are calculated and transmitted for all ad-enabled storefronts on a daily/weekly basis. The main goal is to have a user-agnostic (unpersonalised) similarity score that tells us how common it is for a user to interact with one specific app and then another specific app, in cases where both interactions happen.

Tech Stack – AWS, Spark, Scala, Statistical Model, Tableau

Role – Data Science Manager.

Responsibilities:

Built the app recommendation model based on app-to-app similarity using mathematical models.

Performed EDA on user click data using Python data analysis and pre-processing packages.

Built a Markov chain model as well as logarithmic mathematical models to find the probability ratio (p-value).

Worked with the customer to understand the requirements and convert them into a model that yields the similarity score between two apps.
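The user-agnostic app-to-app score and the Markov chain model mentioned above can be illustrated with a minimal sketch. The exact production formula is not described here, so this assumes the simplest reading: a first-order Markov chain whose transition probability P(next app | current app) is estimated from ordered interaction sequences that carry no user identifiers. The function name `transition_probabilities`, the session format, and the app names are hypothetical.

```python
from collections import defaultdict


def transition_probabilities(sessions):
    """Estimate P(next app = b | current app = a) from interaction sequences.

    Each session is an ordered list of app names with no user identifier,
    so the resulting scores are user-agnostic.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for apps in sessions:
        # Count every consecutive (current app, next app) pair.
        for current_app, next_app in zip(apps, apps[1:]):
            counts[current_app][next_app] += 1

    probabilities = {}
    for current_app, followers in counts.items():
        total = sum(followers.values())
        probabilities[current_app] = {
            app: n / total for app, n in followers.items()
        }
    return probabilities


# Hypothetical, anonymised interaction sequences.
demo_sessions = [
    ["maps", "weather", "news"],
    ["maps", "weather"],
    ["maps", "news"],
    ["weather", "news"],
]

scores = transition_probabilities(demo_sessions)
for current_app, followers in scores.items():
    print(current_app, followers)
```

Only the resulting app-to-app table would need to leave the training environment, which is what keeps a design like this privacy-friendly.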


Technical Summary

LLMs and NLP, Python, PyTorch, TensorFlow, LangChain, LlamaIndex, Text Analytics, Ad Analytics, Anomaly Detection, Recommendation Engine, Statistical Modelling, Gen AI and agentic AI

Additional Information

  • 14 years of experience, including 6+ years providing business solutions on different data analysis, data science, and machine learning platforms, and 5+ years in AI/ML engineering with a strong focus on LLMs and NLP, plus data testing for banking and insurance products.
  • Proficiency in Python and AI frameworks such as PyTorch, TensorFlow, LangChain, and LlamaIndex.
  • Experience implementing text analytics, ad analytics, anomaly detection, and recommendation engines using NLP, logistic regression, classification, K-means clustering, and isolation forest.
  • Involved in all phases of the project life cycle, including data acquisition, A/B testing, hypothesis testing, EDA, data cleaning, data imputation (outlier detection, residual analysis, PCA, etc.), data transformation, feature scaling, feature engineering, and statistical modelling for both linear and non-linear datasets, as well as factor analysis, testing and validation using ROC plots, F1-score, and K-fold cross-validation, and data pattern visualization.
  • Exposure to and 1 year of experience with open-source (Llama, Mistral, Falcon, etc.) and proprietary models (GPT-4, Claude, Gemini, etc.) to build state-of-the-art AI applications.
  • Excellent problem solving and data analysis skills, with expertise in developing or applying predictive analytics, statistical modeling, A/B experiments, data mining, and machine learning algorithms.
  • Good experience in Spark, Python, and Scala programming.
