Anthony Arbaiza

Hershey, PA

Summary

Data Scientist familiar with gathering, cleaning and organizing data for use by technical and non-technical personnel. Advanced understanding of statistical, algebraic and other analytical techniques. Highly organized, motivated and diligent with significant background in healthcare.

Overview

13 years of professional experience

Work History

Data Analytics Manager

JMT
2024.06 - Current
  • Led efforts to decommission legacy data warehouses, overseeing report rewrites and optimizing data governance processes to improve data quality and compliance
  • Spearheaded the creation of comprehensive data documentation, including a data dictionary and schema overviews, ensuring transparency and accuracy in data management practices
  • Established release mechanisms and version control systems to facilitate efficient data engineering workflows and enhance collaboration between team members
  • Researched and implemented strategies for improving data governance, focusing on enhancing data security, availability, and accessibility across the organization
  • Managed and mentored a team of data analysts, aligning project goals with the organization's strategic vision to enhance overall productivity and data-driven decision-making
  • Collaborated with cross-functional teams to define project requirements and ensure that data analytics solutions are aligned with business objectives.

Senior Data Scientist

Natural Resources Conservation Service, USDA
2023.09 - 2024.05
  • Initiated and led the annual NPAD data refresh by developing and implementing R scripts, reducing processing time by 42% and improving data quality; collaborated with cross-functional teams to understand data requirements and align data processing with NRCS's strategic objectives
  • Upgraded an existing R script to strengthen the security and compliance of sensitive USDA data by analyzing data requirements, implementing robust data suppression techniques, and testing rigorously to validate the script's effectiveness
  • Developed a novel Python-based solution for processing Conservation Practice Adoption Motivations Survey (CPAMS) data, translating intricate calculation rules into executable data processing steps that streamlined analysis and supported more accurate and timely insights for decision-making
  • Managed and responded to diverse ad-hoc data requests using advanced statistical analysis in R, including aligning datasets with geospatial coordinates for ArcGIS, creating data visualizations in Tableau, and developing custom solutions for varied data analysis needs
  • Contributed to the overall data management and analysis strategy at NRCS, applying R and Python to support data-driven policy and operational improvements in agricultural and environmental conservation.

Senior Data Scientist / Engineer

Penn State University, Hershey Medical
2016.11 - 2023.09
  • Identified a bottleneck in capacity audits and built a centralized data warehouse (PostgreSQL) to connect separate systems (Cerner, PowerScribe), enabling quantification of patient care capacity
  • Managed ETL processes to ensure smooth data transition from disparate sources into our centralized PostgreSQL database, enhancing operational efficiency and data consistency
  • Spearheaded the transition from SAS to R in the initial years, optimizing data processing capabilities and ensuring a seamless transition with no operational downtime
  • Designed and built dashboards (Grafana) to highlight hospital overnight capacity and staffing deficiencies, reducing time to insight from 5+ days to real time and enabling a 23% increase in radiology throughput
  • Developed time-series forecasting models (ARIMA, Python, R) to predict radiology demand and revenue, implemented custom departmental models that enabled strategic decisions on radiology center staffing
  • Collaborated with department heads to determine financial reporting requirements, built pipelines and dashboards (Python, Pandas, SQL, Grafana, R) to track progress toward revenue targets of over $500M
  • Built staff-level reporting (Grafana Dashboards, Tableau) to highlight number of radiology exams completed and capture historic data, generated insights used to inform strategic staffing and organizational decisions
  • Implemented new data record-keeping and audit trails, built reporting and monitoring (pgAdmin) to ensure compliance with health privacy regulations (HIPAA) and requirements for handling PII
  • Acted as technical mentor and instructor to junior analysts / researchers, taught SQL and data visualization skills to enable self-servicing of basic insight generation.
  • Utilized advanced querying, visualization and analytics tools to analyze and process complex data sets.
  • Discovered stories told by data to present information to scientists and business managers.
  • Assessed accuracy and effectiveness of new and existing data sources and data analysis techniques.
  • Created and implemented new forecasting models to increase company productivity.

Database Administrator III / Architect (Top Secret Clearance)

Defense Information Systems Agency (DISA) / DDCITS
2016.02 - 2016.11
  • Architected data models (Microsoft SQL Server) to store records for US Navy Depot
  • Established change management processes in migration from legacy database systems to Microsoft SQL Server, identified system dependencies and designed solutions to minimize system outages
  • Identified query bottlenecks (SQL) that degraded database performance, refactored and deployed optimized queries that reduced query runtimes by 50% or more
  • Built automated ETL pipelines to ingest data from disparate sources into centralized warehouse
  • Utilized PowerBI to design and develop interactive dashboards and reports, providing valuable insights and facilitating data-driven decision-making processes
  • Conducted data audits and updated documentation of all data systems to ensure compliance with military data privacy regulations
  • Reviewed database error logs to determine root cause and design fixes, ensured maximum system uptime.

Database Architect (Secret Clearance)

Lockheed Martin, Air National Guard
2011.11 - 2016.02
  • Conducted end-to-end development of new grading / academic recordkeeping features (SQL, PHP, HTML) to track student graduation progress, reduced time to insight from 1+ week to near real time
  • Deployed new database queries (SQL) and aggregations enabling population level analysis on student cohorts, surfaced insights impacting graduation rates and enabled identification of graduation roadblocks
  • Maintained SQL database (MS SQL Server), coordinated with stakeholders to gather requirements and create ad-hoc custom queries for reporting needs
  • Developed and launched new database update and testing process, designed new unit testing to ensure accurate data, reduced data bugs that could impact student graduation rates
  • Acted as product manager, partnered with stakeholders to define new database features and requirements, constructed end-to-end project plan for development, deployment, and change management
  • Functioned as technical consultant, collaborated with end-users to identify new analytics use cases, designed and architected new technical reporting solutions to solve for the customer.

Education

Master's Degree - Data Science & Applied Statistics

Pennsylvania State University
University Park, PA
05.2023

Bachelor's Degree - Information Sciences & Technology

Pennsylvania State University
University Park, PA
05.2011

Skills

  • Python (Pandas, scikit-learn, NumPy, TensorFlow, Keras, SciPy)
  • R
  • SQL (MS SQL Server, PL/SQL, PostgreSQL)
  • MapReduce
  • Apache Pig / Hadoop
  • Oracle
  • Java
  • Tableau
  • PowerBI
  • Microsoft Word
  • Excel
  • PowerPoint
  • Grafana
  • A/B Testing
  • Consulting
  • Machine Learning (Random Forest, XGBoost, Support Vector Machine, Neural Networks, k-Means Clustering, PCA, Naive Bayes, Decision Trees)
  • Statistical Analysis
  • Modeling
  • Predictive modeling
  • Business Intelligence
  • Data Quality Management
  • Data Warehousing
  • ETL development
  • Data Security
  • Advanced analytics
  • API Integration
  • Report Generation
  • SQL Programming
  • NoSQL Databases
  • Algorithm development
  • Advanced Excel
  • Python Programming
  • Statistical modeling
  • Decision-Making
  • User Support
  • Adaptability and Flexibility
  • Multitasking Abilities
  • Team building
  • Budget Control
  • Problem-solving aptitude
  • IT infrastructure proficiency
  • Department management
  • Goal Setting
  • Relationship Building
  • Effective Communication
  • Time Management
  • Project lifecycle management
  • Staff hiring

Languages

Spanish
Full Professional
