Senior Data Scientist at United Health Group applying machine learning and data visualization to improve patient outcomes. Skilled in SQL and Python; developed AI-driven models and automated ETL pipelines that improved operational efficiency while maintaining HIPAA compliance. Strong collaborator focused on impactful, data-driven solutions.
Overview
8 years of professional experience
Work History
SENIOR DATA SCIENTIST / ML
UNITED HEALTH GROUP
MINNESOTA, UNITED STATES
04.2024 - Current
Developed and deployed AI-driven predictive models for healthcare analytics to improve patient outcomes and operational efficiency.
Utilized machine learning (ML) algorithms such as Random Forest, XGBoost, and Neural Networks for disease prediction and risk stratification.
Designed NLP models for processing and analyzing clinical notes, improving data extraction efficiency from unstructured healthcare data.
Built automated ETL pipelines using Python (Pandas, NumPy, SQLAlchemy) to process large-scale healthcare datasets.
Integrated Azure ML and cloud-based solutions to streamline model training, deployment, and monitoring for scalable AI applications.
Implemented explainable AI (XAI) techniques to enhance model transparency and ensure compliance with healthcare regulations (HIPAA).
Developed AI chatbots using ChatGPT for patient engagement and customer support automation.
Applied reinforcement learning techniques for optimizing hospital resource allocation.
Improved model interpretability using SHAP values and LIME analysis for better stakeholder communication.
DATA SCIENTIST
CITIBANK
CHENNAI, INDIA
06.2020 - 11.2022
Developed and optimized scalable data pipelines using Python (Pandas, PySpark) and SQL to process large-scale financial datasets while ensuring adherence to data governance, data privacy, and compliance standards.
Built and deployed machine learning models for fraud detection, credit risk scoring, and customer segmentation, improving operational decision-making accuracy by 20%.
Designed and implemented ETL workflows for structured and semi-structured banking data using PySpark, SQL, and Apache Airflow, integrating data into centralized analytics environments.
Applied statistical and machine learning algorithms including Logistic Regression, Random Forest, XGBoost, and Time Series Forecasting to model transaction patterns, predict churn, and optimize credit decisions.
Created interactive dashboards using Power BI and Tableau to track real-time KPIs, financial trends, and model performance, enabling proactive business insights.
Applied Natural Language Processing (NLP) techniques using spaCy and NLTK to extract insights from unstructured customer feedback, support tickets, and transaction descriptions.
Built data validation and anomaly detection frameworks to improve data quality, reduce inconsistencies by 25%, and enhance trust in data products across business teams.
Supported the migration and deployment of analytical workflows on cloud platforms including AWS (S3, EC2, SageMaker) and Azure (Blob Storage, Azure ML) to enhance model scalability and reduce infrastructure costs.
Collaborated with cross-functional teams (compliance, product, engineering, and finance) to align ML solutions with regulatory, operational, and strategic business objectives.
Participated in code versioning and CI/CD practices using Git and Azure DevOps, ensuring reproducibility and deployment reliability of data science models.
DATA ANALYST
NAVIGANT
THIRUVANANTHAPURAM, INDIA
01.2019 - 05.2020
Analyzed large volumes of healthcare claims and operational data using Python (Pandas, NumPy) and SQL to identify trends, reduce inefficiencies, and support cost-saving initiatives across care management programs.
Developed and maintained ETL pipelines for structured healthcare datasets using Python and SQL, ensuring accurate data ingestion, transformation, and validation.
Conducted exploratory data analysis (EDA) and statistical analysis to evaluate patient outcomes, treatment effectiveness, and program impact, supporting strategic healthcare planning.
Built dashboards in Tableau and Power BI to visualize key performance indicators (KPIs), cost drivers, and program metrics, enabling leadership to make informed, data-driven decisions.
Automated recurring reporting processes and data extraction tasks, reducing manual effort and increasing reporting speed by 20%.
Wrote and optimized complex SQL queries for real-time reporting, ad hoc analysis, and large-scale data retrieval from clinical and financial databases.
Collaborated with cross-functional teams, including clinical stakeholders and data engineers, to translate business questions into actionable analytical insights.
PYTHON DEVELOPER
IKS HEALTH
HYDERABAD, INDIA
06.2017 - 12.2018
Designed and developed scalable data pipelines and ETL workflows using Python (Pandas, NumPy, PySpark), optimizing data processing efficiency.
Built and deployed machine learning models using Scikit-learn, TensorFlow, and PyTorch, applying feature engineering, hyperparameter tuning, and validation to improve accuracy.
Created and optimized RESTful APIs with Flask/Django to serve ML models and enable seamless integration with business applications.
Developed and maintained SQL and NoSQL database solutions (MySQL, PostgreSQL, MongoDB), optimizing queries for faster data retrieval and analysis.
Deployed ML models and data pipelines on cloud platforms (AWS, Azure) using Docker, Kubernetes, and MLOps best practices for scalability and automation.
Conducted EDA, statistical analysis, A/B testing, and data visualization using Matplotlib, Seaborn, and Power BI to extract actionable insights.
Implemented deep learning architectures for NLP, computer vision, and time series forecasting, improving automation and decision-making.
Collaborated with cross-functional teams for code reviews, model documentation, and stakeholder presentations, ensuring transparency and alignment with business goals.