VENKAT SAI THANOOJ BALAGIRI

Albany

Summary

Aspiring Data Analyst and Machine Learning Enthusiast with a strong foundation in data analytics, statistical modeling, and predictive analysis. Proficient in Python, SQL, Power BI, and Excel for transforming raw data into actionable insights. Skilled in data preprocessing, feature engineering, A/B testing, cohort analysis, and time-series forecasting. Experienced in applying machine learning and deep learning techniques to solve real-world problems. Adept at using PyTorch Geometric, Scikit-learn, Pandas, NumPy, and NetworkX for efficient data manipulation and analysis. Comfortable with building interactive dashboards, managing databases (PostgreSQL), and generating interpretable AI outputs using LLMs. Strong understanding of model validation, cross-validation, hyperparameter tuning, and communicating data-driven insights effectively.

Work History

Product Analyst Intern

ProYuga Advanced Technologies Ltd.
Hyderabad
05.2023 - 07.2023
  • Analyzed player engagement and sales data from iB Cricket using SQL and Python, identifying trends that improved user retention and revenue growth.
  • Developed Power BI dashboards to track key game metrics, such as match participation, in-game purchases, and session durations, enhancing decision-making for product strategy.
  • Performed cohort analysis to understand player drop-off patterns, leading to feature optimizations that improved retention by 10%.
  • Conducted A/B testing to assess new gameplay mechanics, ensuring data-driven improvements in iB Cricket's user experience.
  • Automated data reporting using Python (Pandas, Matplotlib), reducing manual reporting effort by 30%.

Education

Master of Science - Information Science - Data Analytics

State University of New York-Albany
Albany, NY
12-2025

Bachelor of Technology (B.Tech) - Electronics And Communications Engineering

CMR College of Engineering And Technology
Hyderabad, Telangana
04-2023

Skills

  • Programming: Python, SQL, R, JavaScript (React.js), C, C
  • Data Analysis & Modeling: Pandas, NumPy, Data Cleaning, Statistical Analysis, Predictive Modeling, SPSS
  • Data Visualization: Power BI, SPSS, Tableau, Matplotlib, Seaborn
  • Databases & Data Engineering: MySQL, PostgreSQL, SQL Server, Snowflake, ETL Pipelines, Data Transformation, Real-Time Data Processing
  • Machine Learning: Regression, Decision Trees, Random Forests, SVM, Neural Networks, NLP, K-Means, KNN
  • Statistics Tools/OS: Google Analytics, Mixpanel, Jupyter Notebook, GitHub, Windows, Linux, Mac

Project Experience

Schizophrenia Detection Using Knowledge Graphs, GNNs, and LLMs

  • Constructed a heterogeneous knowledge graph using GWAS genetic data to model gene-disease relationships.
  • Applied Node2Vec for node embeddings and trained GNN models (MedGCN, EdgeCWGCN, EdgePRGNN) for multi-class classification of schizophrenia, bipolar disorder, and depression.
  • Achieved highest performance with MedGCN, improving predictive accuracy through edge-type-aware aggregation.
  • Integrated BioGPT and Mistral to generate natural language explanations, enhancing clinical interpretability.
  • Tools used: PyTorch Geometric, NetworkX, Node2Vec, BioGPT, Mistral, Python.

Predicting Crypto Prices using Machine Learning

  • Built machine learning models (LSTM, XGBoost) to forecast Bitcoin prices using historical and sentiment data.
  • Led end-to-end pipeline including data preprocessing, sentiment analysis, model tuning, and evaluation using Python (Pandas, NumPy, Scikit-learn).
  • Enhanced model accuracy with cross-validation and hyperparameter optimization, with the aim of supporting better-informed trading decisions in volatile crypto markets.

Breast Cancer Image Classification Using Advanced Deep Learning Techniques

  • Trained and compared CNN, MobileNetV3, EfficientNet-B7, ResNet101, and Vision Transformer models to classify breast ultrasound images (BUSI dataset) into benign, malignant, and normal classes.
  • Applied transfer learning, data augmentation, and class-imbalance handling techniques, achieving up to 97% test accuracy and a ROC-AUC of 1.00 for malignant cases.
  • Vision Transformers outperformed traditional CNNs, highlighting their potential in clinical diagnostics.

Retail Sales Forecasting & Analytics Dashboard

  • Analyzed 4 years of global superstore data using Python (pandas, seaborn) to uncover trends and seasonality. Built a time series model (ARIMA/Prophet) achieving ~90% forecast accuracy.
  • Designed a PostgreSQL database for efficient querying and developed an interactive Power BI dashboard showcasing sales trends, regional insights, and 7-day forecasts.

Extracurricular

  • Analyzed donation patterns for a local NGO using SQL and Power BI, helping them optimize fundraising campaigns.
  • Attended PyData NYC 2024, focusing on ML interpretability and real-time analytics.

Certification

  • Google Data Analytics Professional Certificate (Coursera)
  • Microsoft Power BI Certification (PL-300)
  • IBM Data Science Professional Certificate
