Results-driven Data Analyst with 3+ years of experience in transforming complex data into actionable insights across healthcare and financial domains. Skilled in SQL, Python, and R for data extraction, cleaning, and statistical analysis. Proficient in developing ETL workflows, predictive models, and interactive dashboards using Power BI and Tableau. Experienced in collaborating with cross-functional teams to support data-driven decision-making. Strong background in machine learning, data visualization, and regulatory compliance in agile environments.
Overview
4 years of professional experience
2 Certifications
Work History
Data Analyst
McKesson Corporation, TX
06.2024 - Current
Designed and implemented automated ETL pipelines using Python to ingest over 2 million patient records from EMR systems and Komodo Health's claims database, improving data availability for downstream analytics by 65%.
Wrote complex SQL Server queries to standardize and join clinical datasets, including lab results, diagnosis codes, and treatment history, streamlining data preparation and reducing manual processing time by 40%.
Applied Python (pandas, NumPy) to clean and transform high-dimensional data, ensuring consistency across features used in machine learning models that supported early-stage cancer detection.
Developed over 100 engineered features from structured data and implemented correlation filtering and variance thresholding, improving model performance and increasing predictive sensitivity from 76% to 87%.
Built and validated classification models (Random Forest, XGBoost) using scikit-learn, optimizing hyperparameters through cross-validation and reducing false negatives by 21% in cancer risk predictions.
Created interactive Tableau dashboards to present patient risk stratification and geographic trends, enabling faster decision-making and reducing clinical reporting time from 3 days to under 6 hours.
Partnered with clinical, compliance, and product teams to ensure all data processes adhered to HIPAA regulations, contributing to two successful audits and supporting a 20% increase in early cancer referral rates during pilot deployment.
Data Analyst
Cybage Software
01.2021 - 07.2023
Consolidated over 50 million transaction records and 3 million customer profiles into a centralized SQL Server data warehouse, establishing a reliable foundation for fraud analytics and real-time monitoring.
Developed and automated ETL workflows using SQL Server Integration Services (SSIS) and Python to clean, transform, and standardize large datasets from diverse financial systems, reducing data processing time by 40%.
Implemented machine learning models in Python (Isolation Forest, DBSCAN, Local Outlier Factor) to detect anomalous transaction patterns, improving fraud detection accuracy by 28% and minimizing false positives.
Designed real-time Power BI dashboards to track suspicious transactions, high-risk accounts, and regional fraud trends; enabled compliance teams to respond within 10-15 minutes, significantly reducing financial risk.
Translated fraud risk indicators into SQL-based logic and analytical models, enabling early identification of high-risk transactions and contributing to a 35% increase in proactive fraud detection.
Performed historical fraud case analysis using Python (pandas, seaborn, matplotlib) to identify behavioral trends and system vulnerabilities, which led to the discovery of three previously undetected fraud methods.
Implemented security protocols in Power BI and SQL, including role-based access control and data masking, to ensure compliance with GDPR and internal data governance policies during cross-functional reporting.
At McKesson, designed and automated ETL pipelines using Python and SQL to ingest over 2M patient records, improving data availability by 65% and reducing manual processing time by 40%.
At McKesson, enhanced predictive model performance for early-stage cancer detection by engineering 100+ features and applying Random Forest and XGBoost, increasing sensitivity from 76% to 87% and reducing false negatives by 21%.
At Cybage Software, built a centralized SQL Server data warehouse integrating 50M+ transactions and 3M+ customer profiles, enabling real-time fraud analytics and contributing to a 35% rise in proactive fraud detection.
At McKesson and Cybage, developed interactive dashboards using Tableau and Power BI to monitor patient risk and fraud trends, cutting clinical reporting time from 3 days to under 6 hours and enabling fraud response within 10-15 minutes.
At Cybage Software, implemented anomaly detection models (Isolation Forest, DBSCAN, LOF) in Python, improving fraud detection accuracy by 28% and uncovering previously undetected fraud patterns.
At both McKesson and Cybage, ensured data governance aligned with HIPAA and GDPR, contributing to successful audits and enabling secure, compliant cross-functional reporting.