
Data Scientist with experience delivering machine learning models, data pipelines, and analytics solutions across healthcare and consulting environments. Strong background in Python-based data processing, statistical modeling, feature engineering, and large-scale data systems. Experienced in building classification, regression, and clustering models, working with cloud data warehouses, and supporting production-ready analytics workflows. Adept at collaborating with data engineering, analytics, and operations teams to translate data into measurable business value.
Programming & Data Processing
Python, SQL, Apache Spark, Kafka
Machine Learning & AI
Classification, Regression, Clustering, Feature Engineering, XGBoost, Random Forest, SVM, CNN, Transformers, BERT, Word2Vec, TF-IDF
Data Engineering & Pipelines
ETL/ELT Pipelines, Apache Airflow, AWS Glue, Data Quality Checks
Databases & Warehousing
Snowflake, Azure Synapse, MySQL, PostgreSQL, MongoDB, Cassandra, Oracle
Data Architecture & Formats
Data Lakes, Dimensional Modeling, JSON, Avro, Parquet
Visualization & BI
Tableau, Power BI, Matplotlib
Version Control & Collaboration
GitHub, GitLab
Cloud Platforms
AWS, Azure, GCP
Distributed AI-Powered Personalized Job Recommendation System, 01/2025 – 05/2025
Python, Hadoop (HDFS), Apache Spark, Spark Streaming, BERT, Word2Vec, TF-IDF, XGBoost, SQL
CNN-Based Human Activity Recognition using Wearable Sensors, 09/2024 – 12/2024
Python, TensorFlow, CNN, HAR, MEMS sensors (Accelerometer, Gyroscope)
Efficient Sentiment Analysis using Encoder-Only Transformer, 08/2024 – 12/2024
Python, PyTorch, NLP, SQL
Video Analysis for Weapon Detection and Alerting, 01/2024 – 04/2024
Python, OpenCV, TensorFlow, Keras, CNN, RCNN, YOLO
Detection of Stress in IT Employees using Machine Learning Techniques, 09/2023 – 12/2023
Python, Scikit-learn, SVM, Random Forest, Tableau