Business and data analyst with proven expertise in data cleaning, analysis, and visualization using Python, Excel, and Power BI. At the University of Alabama in Huntsville, developed and deployed machine learning models that supported decision-making and operational efficiency. Brings strong analytical skills and Agile experience to workforce development strategy.
Data Cleaning and Organization: Leveraged Excel to clean and structure large datasets, ensuring data integrity by correcting errors and excluding metro areas with fewer than 50,000 workers.
Data Analysis and Visualization: Used Excel to perform in-depth analysis, calculating workforce percentages, ranking job concentrations, and creating visual representations (charts/graphs) to illustrate workforce trends and distributions.
STEM Categorization: Classified occupations into STEM and non-STEM categories using SOC codes, enabling focused analysis of Huntsville’s job market strengths and areas for improvement.
Development of Reporting Tool: Created a user-friendly reporting tool to provide stakeholders with accessible insights, supporting strategic workforce development decisions.
Tools Utilized: Excel (for data cleaning, analysis, and visualization), Reporting Tools (for streamlined presentation of findings), SOC Codes (for occupational classification).
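The workforce steps above (excluding small metro areas, classifying occupations by SOC code, and calculating and ranking workforce percentages) were carried out in Excel. As a rough illustration only, the same logic might look like this in Python. The records, the metro names, and the rule that treats SOC major groups 15, 17, and 19 as STEM are assumptions made for this sketch, not the project's actual data or definition.

```python
# Hypothetical records: (metro area, SOC code, employment). All figures are made up.
records = [
    ("Metro A", "15-1252", 40_000),  # software developers
    ("Metro A", "17-2011", 20_000),  # aerospace engineers
    ("Metro A", "41-2031", 60_000),  # retail salespersons
    ("Metro B", "15-1252", 10_000),
    ("Metro B", "41-2031", 30_000),
]

MIN_WORKERS = 50_000  # metro areas below this total are excluded, as described above

# Assumption: STEM is approximated by three SOC major groups
# (15 computer/math, 17 architecture/engineering, 19 sciences).
STEM_MAJOR_GROUPS = {"15", "17", "19"}


def is_stem(soc_code):
    """Return True if the SOC code's major group is in the assumed STEM set."""
    return soc_code.split("-")[0] in STEM_MAJOR_GROUPS


def stem_share_by_metro(rows, min_workers=MIN_WORKERS):
    """Percentage of STEM employment per metro, skipping metros that are too small."""
    totals, stem = {}, {}
    for metro, soc_code, employment in rows:
        totals[metro] = totals.get(metro, 0) + employment
        if is_stem(soc_code):
            stem[metro] = stem.get(metro, 0) + employment
    return {
        metro: round(100 * stem.get(metro, 0) / total, 1)
        for metro, total in totals.items()
        if total >= min_workers
    }


shares = stem_share_by_metro(records)
# Rank metros from highest to lowest STEM concentration.
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Metro B falls under the 50,000-worker threshold in this made-up data, so only Metro A appears in the ranking.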
Data Collection and Cleaning: Sourced and cleaned transaction data in Python, addressing missing values and outliers to ensure data quality.
Exploratory Data Analysis (EDA): Conducted detailed EDA to uncover patterns and correlations in the data, leveraging Matplotlib and Seaborn for visualization.
Feature Engineering: Engineered new features, including time-based attributes and categorical encoding, to enhance model performance, using Pandas and NumPy.
Model Development: Built and optimized multiple machine learning models, including Logistic Regression, Random Forest, and XGBoost, using scikit-learn and the XGBoost library.
Model Evaluation and Selection: Evaluated models on Accuracy, Precision, and ROC-AUC, tuning hyperparameters with scikit-learn's GridSearchCV and selecting the best-performing model.
Implementation and Testing: Deployed the selected model in a Python-based real-time fraud detection system and tested it extensively for reliability and accuracy.
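The model evaluation and selection step above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual pipeline: the data is synthetic (generated with make_classification as a stand-in for real transactions), only Logistic Regression is tuned, and the parameter grid is an arbitrary assumption.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for transaction data (about 10% positive class).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid: regularization strengths to compare.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# Cross-validated search that scores each candidate by ROC-AUC.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)

# Check the refit best model on held-out data.
test_scores = search.predict_proba(X_test)[:, 1]
test_auc = roc_auc_score(y_test, test_scores)
test_precision = precision_score(y_test, search.predict(X_test), zero_division=0)

print("best params:", search.best_params_)
print("test ROC-AUC:", round(test_auc, 3))
print("test precision:", round(test_precision, 3))
```

Scoring by ROC-AUC rather than Accuracy is a deliberate choice in this sketch: with rare fraud cases, a model that predicts "not fraud" every time can still reach high Accuracy.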
Python
R
Power BI
Tableau
SQL
PL/SQL
UML Diagrams
Excel
Agile Methodology
AWS