Data Analyst with 5 years of experience in infrastructure projects and a Master’s in Data Science, ready to transition into the tech industry. Skilled in applying data-driven solutions to improve project planning, scheduling, and resource optimization. Accomplished in predictive modeling, time-series analysis, and machine learning, with a proven ability to reduce project delays and improve resource efficiency. Experienced in developing interactive dashboards for real-time tracking of project progress and resource allocation, enabling proactive decision-making, and in automating reporting processes to reduce manual effort. Adept at creating KPI-driven insights that support better project outcomes and resource utilization.
Predictive Maintenance Classification for Water Wells August 2023
• Designed a classification model to identify the functionality class of wells for informed
allocation of maintenance budgets and determined the top 5 reasons for water well failure.
• Generated predictions, confusion matrix, and map visualizations using the Cartopy library.
• Conducted feature engineering and utilized feature importance functions to automate the
classification modeling and tuning process.
• Evaluated and compared models including K-Nearest Neighbors, Random Forest, and
XGBoost.
• Results: Achieved 80% accuracy with the XGBoost model and a 0.12 error ratio in predicting
well functionality.
Time Series Analysis for Real Estate Investment Optimization
• Conducted a time series analysis to identify the top 5 zip codes for investment by a real
estate company, considering investment period and resilience to unforeseen events like
the 2008 recession.
• Applied the Auto-ARIMA process to optimize return on investment (ROI).
• Identified investment strategies with a risk-to-return ratio (coefficient of variation) below
0.35 and an annual ROI of at least 2.5%.
• Results: Predicted ROI ranging from 2.5% to 14.06% for a 3-year investment period and
from 8% to 14.27% for 5-10 years, per $500k invested.
Bridge Condition Prediction Using Multi-Class Classification
• Completed an external project to identify bridges in critical condition across the US, aiding
in future bridge condition predictions using data from the Department of Transportation
and NASA's MERRA-2 program.
• Collected and analyzed 618k data points of bridges dating back to 1920.
• Followed the CRISP-DM methodology and applied feature engineering, using the Folium library
to map bridge locations and assessing the impact of snow days on
substructure corrosion.
• Applied the SMOTE oversampling technique to address class imbalance (47%, 45%, 7%)
and used multi-class classification metrics to evaluate model performance.
• Results: Achieved an F1 score of 64% with the Random Forest model and a 0.18 error ratio
in predicting bridge conditions.