Mobeen Butt

Data Architect | Principal Data Engineer | Big Data Engineer
Buffalo, NY

Summary

Strategic Data Architect with 9+ years of experience designing and scaling cloud-native platforms, big data ecosystems, and real-time streaming systems. Expert in multi-cloud environments (AWS, Azure, GCP), Kafka/Spark/Flink, and modern data stacks (Snowflake, Databricks, dbt, Delta Lake). Skilled in healthcare interoperability standards (FHIR, HL7) and privacy regulations (HIPAA, GDPR), enabling secure, compliant solutions that power AI/ML and enterprise analytics. Adept at leading teams, modernizing legacy systems, and aligning data strategies with business goals, delivering cost savings, performance gains, and mission-critical insights.

Overview

10 years of professional experience
4 Certifications

Work History

Data Architect

ApTask
12.2023 - Current
  • Designed and deployed a cloud-native healthcare data lakehouse on Snowflake + GCP, centralizing clinical, claims, and payer data for 30+ enterprise stakeholders.
  • Integrated HL7/FHIR datasets using NiFi and MuleSoft, ensuring interoperability across EPIC, Cerner, and payer systems.
  • Built real-time ingestion frameworks (Kafka, Flink, Dataflow) reducing alert latency from minutes to
  • Established enterprise-wide data governance with OpenMetadata, dbt tests, and Great Expectations, improving trust and compliance.
  • Partnered with analytics teams to develop semantic data models, enabling unified metrics and self-service BI.
  • Reduced data platform costs by 35% through tiered storage, autoscaling compute, and partitioning strategies.
  • Led architecture reviews and mentored junior engineers to enforce best practices in streaming, governance, and compliance.

Principal Data Engineer

Petra Power
01.2021 - 11.2023
  • Spearheaded multi-cloud migration (AWS + Azure), reducing operating costs by $1M annually through optimization and consolidation.
  • Architected AI-ready pipelines with Delta Lake, MLflow, and Feature Stores for real-time predictive scoring.
  • Directed and mentored a team of 8 engineers; introduced CI/CD with Terraform + GitHub Actions, cutting deployment time by 80%.
  • Delivered real-time streaming solutions with Kafka + Flink, powering fraud detection and real-time decision-making.
  • Created observability dashboards (Grafana, Monte Carlo), improving system reliability and achieving 99.9% SLA compliance.
  • Standardized data contracts & lineage tracking, enabling cross-functional collaboration between product and engineering.
  • Automated infrastructure provisioning with Terraform, achieving consistent zero-downtime releases.

Senior Data Engineer – Team Lead

Flatiron Health
01.2018 - 12.2020
  • Engineered HIPAA-compliant pipelines processing 10M+ healthcare records/day across AWS and GCP.
  • Designed CDC pipelines (Debezium + Kafka) for near real-time synchronization, cutting ETL lag from hours to seconds.
  • Built self-service BI with Looker & Power BI, reducing manual reporting workload by 70%.
  • Enforced security with IAM roles, encryption, and dynamic policy controls, ensuring HIPAA & GDPR compliance.
  • Modernized data pipelines with Databricks Feature Store for real-time ML workflows.
  • Led a team of 5 engineers, improving delivery efficiency by 40% through agile best practices.
  • Automated compliance reporting and audit trails with Unity Catalog, improving regulatory readiness.

Data Engineer

ProCogia
09.2015 - 12.2017
  • Built ETL pipelines with Airflow, Informatica, and Talend, supporting healthcare and retail data workloads.
  • Designed dimensional models (Kimball) and dashboards in Tableau & Power BI, enabling executive-level decision-making.
  • Strengthened governance with automated lineage + validation scripts in dbt and Python, cutting defects by 30%.
  • Optimized SQL & Spark queries, reducing pipeline latency by 50% across large datasets.
  • Delivered A/B testing and predictive models for revenue and user behavior analytics.
  • Developed self-service dashboards embedding KPIs across multiple business units.
  • Collaborated with cross-functional teams to align analytics with business goals.

Education

Bachelor of Science - Computer Science

Skills

Cloud Platforms & Data Services: AWS (EC2, S3, Lambda, Redshift, Kinesis, Glue), Azure (Synapse, Data Factory, Databricks, AKS), GCP (BigQuery, Dataflow, Pub/Sub, Vertex AI), Snowflake, Palantir Foundry, Multi-Cloud Deployments, Cost Optimization

Certification

Google Cloud – Professional Data Engineer
