Prasanth DS

Data Scientist / AI-ML Engineer
Denton, TX

Summary

Data Scientist with 8+ years of experience in data science and analytics, including machine learning, deep learning, data mining, and statistical analysis. Experienced in gathering, cleaning, and organizing data for use by technical and non-technical personnel. Advanced understanding of statistical, algebraic, and other analytical techniques. Highly organized, motivated, and diligent, with a significant background in machine learning. Accomplished in compiling, transforming, and analyzing complex information through software, and expert in large dataset management. Demonstrated success in identifying relationships and building solutions to business problems.

Overview

9 years of professional experience

Work History

Data Scientist Lead

Elevance Health
08.2023 - Current
  • Implemented machine learning, computer vision, deep learning, and neural network algorithms using TensorFlow and Keras, and designed prediction models using data mining techniques with Python and libraries such as NumPy, SciPy, Matplotlib, pandas, and scikit-learn
  • Used pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, and NLTK in Python to develop various machine learning algorithms
  • Worked with text feature engineering techniques such as n-grams, TF-IDF, and word2vec
  • Applied support vector machines (SVM) and their kernels, such as polynomial and RBF, to machine learning problems
  • Worked on imbalanced datasets and selected evaluation metrics appropriate for them
  • Worked with deep neural networks, convolutional neural networks (CNNs), and recurrent neural networks (RNNs)
  • Developed low-latency applications and interpretable models using machine learning algorithms
  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization, and performed gap analysis
  • Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and Secondary NameNode, and of MapReduce concepts
  • Programmed a utility in Python that used multiple packages (SciPy, NumPy, pandas)
  • Implemented Classification using supervised algorithms like Logistic Regression, SVM, Decision trees, KNN, Naive Bayes
  • Responsible for the design and development of advanced R/Python programs to prepare, transform, and harmonize data sets for modeling
  • Worked with Data Architects and IT Architects to understand the movement and storage of data, using ER Studio 9.7
  • Worked with different data formats such as JSON and XML, and applied machine learning algorithms in Python
  • Updated Python scripts to match training data with our database stored in AWS CloudSearch, so that each document could be assigned a response label for further classification
  • Imported data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS
  • Implemented Agile Methodology for building an internal application
  • Manipulated and aggregated data from different sources using Nexus, Toad, Business Objects, Power BI, and Smart View
  • Interacted with Business Analysts, SMEs, and other Data Architects to understand business needs and functionality for various project solutions
  • Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for the clients
  • Transformed data from various sources, organized data, and extracted features from raw and stored data.

Data Scientist/ML Engineer

Oncor
01.2020 - 07.2022
  • Performed data profiling to learn student behavior patterns across various features of USMLE examinations, using Tableau, Adobe Analytics, and Python Matplotlib
  • Evaluated models using cross-validation, log loss, and ROC curves, used AUC for feature selection, and worked with Elastic technologies such as Elasticsearch and Kibana
  • Addressed overfitting by applying regularization methods such as L1 and L2, and dropout in neural networks
  • Implemented statistical modeling with XGBoost machine learning software package using Python to determine the predicted probabilities of each model
  • Worked with different performance metrics such as log loss, AUC, confusion matrix, and F1-score for classification, and mean squared error and mean absolute error for regression problems
  • Worked with text feature engineering techniques such as n-grams, TF-IDF, and word2vec
  • Created master data for modeling by combining various tables and derived fields from client data and students' LORs, essays, and various performance metrics
  • Formulated a basis for variable selection and used Grid Search with K-fold cross-validation to find optimal hyperparameters
  • Utilized boosting algorithms to build a model for predictive analysis of the behavior of students who took the USMLE exam and applied for residency
  • Used NumPy, SciPy, pandas, NLTK (Natural Language Toolkit), and Matplotlib to build the model
  • Extracted data from HDFS using Hive and Presto, performed data analysis and feature selection using Spark with Scala, PySpark, and Redshift, and created nonparametric models in Spark
  • Applied various artificial intelligence (AI) and machine learning algorithms and statistical modeling techniques, including decision trees, text analytics, image and text recognition with OCR tools such as ABBYY, natural language processing (NLP), supervised and unsupervised learning, and regression models
  • Used Principal Component Analysis and t-SNE in feature engineering to analyze high-dimensional data
  • Performed data cleaning, feature scaling, and feature engineering using pandas and NumPy in Python, and built models using deep learning frameworks
  • Created deep learning models using TensorFlow and Keras, combining all tests into a single normalized score to predict residency attainment of students
  • Used OneVsRestClassifier to fit one classifier per class against all other classes on multiclass classification problems
  • Applied various machine learning algorithms and statistical modeling techniques, such as decision trees, text analytics, sentiment analysis, Naive Bayes, logistic regression, and linear regression, using Python to determine the accuracy rate of each model
  • Created and designed reports that use gathered metrics to draw logical conclusions about past and future behavior, with cloud-based products such as Azure ML Studio and Dataiku
  • Generated various models using different machine learning and deep learning frameworks and tuned the best-performing model using Signal Hub and AWS SageMaker / Azure Databricks.

Data Scientist

Indeed
10.2017 - 12.2019
  • Collected and preprocessed customer data from CRM systems using Python (Pandas, NumPy) and stored it in AWS S3, ensuring clean, consistent data that reduced missing values and outliers by 15%, improving model accuracy
  • Performed exploratory data analysis using Matplotlib and Seaborn to identify trends such as customer tenure and contract types, which increased the model's interpretability by 20%, allowing for more informed business decisions
  • Engineered features using Scikit-learn and FeatureTools to enhance the model’s prediction performance, resulting in a 10% boost in accuracy, with data processed in AWS S3 for scalable storage and retrieval
  • Built and trained machine learning models (Random Forest, XGBoost, Logistic Regression) in AWS SageMaker, achieving a 5% increase in precision through hyperparameter tuning and cross-validation, with trained models stored for future use
  • Deployed machine learning models in AWS Lambda and exposed them via API Gateway, enabling real-time customer churn predictions, with API Gateway handling up to 1,000 requests per minute
  • Monitored deployed models using AWS CloudWatch and SageMaker Model Monitor to track model performance and identify data drift, reducing performance degradation over time by 25%.

Python Developer

Nagarro
04.2015 - 09.2017
  • Increased team efficiency by 30% through automation of repetitive processes using Python scripts
  • Designed and implemented RESTful APIs using FastAPI, enhancing system integration and communication
  • Built data pipelines for real-time data ingestion and processing, utilizing Python, Pandas, and NumPy
  • Developed machine learning models with scikit-learn and TensorFlow to drive data-driven decision-making
  • Developed and maintained scalable Python applications, handling large-scale data processing and automation tasks
  • Designed and implemented RESTful APIs using FastAPI, enabling seamless integration between systems
  • Developed scalable APIs using FastAPI for efficient handling of asynchronous web services and real-time data processing
  • Implemented data validation and serialization using Pydantic models, ensuring strict adherence to type safety and schema requirements
  • Designed RESTful APIs that efficiently handled CRUD operations, including authentication, data validation, and error handling
  • Integrated with external services and databases, enabling seamless API interactions and third-party data ingestion
  • Optimized API performance by implementing asynchronous requests, minimizing latency, and ensuring responsiveness of the services
  • Deployed FastAPI applications on cloud platforms, ensuring reliable and scalable deployment using Docker and CI/CD pipelines
  • Worked extensively with modern Python frameworks, leveraging Pydantic for high-performance data parsing and validation
  • Utilized FastAPI's dependency injection system to streamline request lifecycle management and enhance testability
  • Designed and implemented error handling and exception management strategies to provide clear and consistent API responses
  • Ensured adherence to REST API standards, improving interoperability and maintainability of services.

Education

Master of Science - Artificial Intelligence

University of North Texas
Denton, Texas

Bachelor of Technology - Computer Science

Amrita Vishwa Vidyapeetham

Skills

Python
