...FOCUSED ...DATA-DRIVEN ...ANALYTICAL MINDSET ...CONSISTENT ...PASSIONATE ...PROCESS-DRIVEN (AGILE) ...TEAM PLAYER
Recent M.S. graduate with hands-on experience, eager to apply a strong command of Python, R, SQL, and DAX to analyze and interpret complex data. Skilled in analytics tools such as Power BI and Tableau, with a specialization in statistical modeling, predictive analytics, and big data processing through cloud solutions and programming libraries, including Pandas and TensorFlow. With a 5-year track record in business intelligence, I am poised to deliver innovative solutions and strategic insights in a dynamic, data-driven landscape.
Technical Skills
Programming Languages: Python, R, SQL, DAX
Tools: Power BI, Tableau, SSIS, Alteryx, RStudio, Jupyter Notebook, MS Excel (PivotTables, Power Query, VLOOKUP), PowerPoint
Data Science & Machine Learning: Statistical Modeling, Quantitative Analysis, Predictive Analysis, Time Series Forecasting, Regression Analysis, Classification, Clustering, Bayesian Methods, Decision Trees, SVM, Random Forest, Naïve Bayes, GLM, K-means, Gradient Boosting
Libraries & Packages: NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn, TensorFlow, PySpark, NLTK, dplyr, tidyverse, ggplot2, e1071, rpart
Databases/Cloud Technologies: MSSQL, MySQL, Looker, Google Data Studio, S3, AWS Data Pipeline, AWS Glue, Redshift, Snowflake, Databricks
Big Data: Hadoop, MapReduce, Apache Spark
Strategy Methodologies: SWOT Analysis, Agile Methodology, Scrum
Projects
Hotel Cancellation Prediction Project, R, Tableau, Machine Learning, 10/01/21 - 12/01/21
Used R for data cleaning and analysis and Tableau for sensitivity analysis, identifying key cancellation drivers and customer demand trends that led to a 30% reduction in hotel cancellations. Built machine learning models, including association rule mining, SVM, and decision trees, achieving 87% accuracy in predicting future hotel cancellations.
Heart Disease Prediction, Python, Supervised Learning Models, Feature Scaling, 03/01/22 - 05/01/22
Performed comprehensive data cleansing and exploratory analysis in Python, then applied SMOTE sampling and machine learning techniques such as K-Nearest Neighbors and Random Forest to develop a heart disease prediction model with 90% accuracy.
NBA Player Stats, PySpark, Regression/Clustering, Grid Search, Feature Engineering, 10/01/22 - 12/01/22
Performed linear and random forest regression with hyperparameter tuning in PySpark, and applied feature engineering and logistic regression to predict a player's net rating and offensive capabilities, achieving an AUC of 71.21%.