John Peterson

Santa Clara, CA

Summary

Accomplished Senior Data Engineer with over 10 years of experience designing and optimizing cloud data platforms in healthcare, FinTech, and e-commerce. Specializes in Snowflake data warehousing, ELT architecture, and performance tuning for analytical workloads. Developed ACID-compliant financial pipelines and HIPAA-compliant healthcare data solutions using Snowflake, AWS, and Python, while also leveraging Azure and GCP for hybrid-cloud environments.

Work History

Senior Data Engineer

Cardinal Health

Remote

02.2021 - Current

Architected and enhanced Snowflake-based healthcare and supply chain data platforms supporting clinical, operational, logistics, and reporting workloads, integrating raw data from APIs, flat files, EDI-style interfaces, cloud object storage, and downstream enterprise systems.
Designed ELT pipelines using Snowflake, AWS S3, AWS Glue, Python, and SQL to ingest and transform high-volume healthcare datasets into governed analytical models optimized for regulatory reporting, operational analytics, and executive dashboards.
Built a medallion-style lakehouse pattern for clinical healthcare data domains, organizing raw HL7/FHIR and operational feeds into bronze, silver, and gold layers to improve traceability and data quality for analytics teams.
Developed standardization and transformation pipelines for healthcare interoperability data, including HL7-to-FHIR ingestion and normalization, enabling downstream consumers to utilize modern, analytics-friendly structures.
Implemented PHI de-identification and masking logic using hashing, field suppression, role-aware access patterns, and generalized attribute transformations to support HIPAA-aligned analytics without exposing sensitive patient identity data.
Partnered with business and data stakeholders to improve data provenance and audit trails, capturing source-to-target lineage, transformation history, and operational checkpoints so regulated datasets could be traced back to original systems during audit and compliance reviews.
Built automated healthcare data standardization flows for clinical terminology normalization and reference mapping aligned with ICD-10, SNOMED CT, and LOINC-style reporting needs, improving the consistency of downstream measures and reporting logic.
Designed Snowflake models and transformation logic supporting quality measure aggregation and population-level reporting use cases, enabling business teams to evaluate screening, adherence, utilization, and operational health metrics more efficiently.
Tuned Snowflake workloads through warehouse right-sizing, clustering strategy improvements, query rewrites, pruning-aware design, and workload separation, improving performance for large-scale joins, curated marts, and dashboard refresh jobs.
Established reusable ingestion and transformation patterns for semi-structured data using JSON, VARIANT columns, Snowflake tasks, streams, and metadata-driven orchestration, reducing manual development effort for new data domains.
Enhanced production reliability through implementation of pipeline observability, load validation, data reconciliation checks, and failure alerting, increasing trust in regulated datasets for analysts and operations teams.
Supported secure enterprise analytics by applying role-based data access, controlled sharing models, and audit-conscious warehouse design, balancing performance with privacy and compliance requirements.

Senior Data Engineer

Fiserv

Remote

03.2016 - 01.2021

Designed and scaled Snowflake-centric financial data platforms supporting transaction analytics, reconciliation, risk reporting, audit workflows, and downstream operational data products in a high-security FinTech environment.
Engineered ACID-aware ingestion and transformation patterns for financial datasets, ensuring transaction completeness, deterministic processing, rollback-safe logic, and reconciliation-friendly lineage across sensitive money movement workflows.
Built robust ELT pipelines with Snowflake, Python, SQL, dbt, Kafka, and Spark to ingest high-volume transaction, settlement, account, and customer activity data from batch and event-driven sources.
Developed real-time and near-real-time streaming ingestion patterns for financial event processing using Kafka and Spark Structured Streaming, enabling faster fraud analysis, event enrichment, and operational monitoring of payment activity.
Supported real-time fraud detection scoring workflows by preparing low-latency transaction features and trusted event streams for analytics and downstream decisioning systems.
Implemented PCI-DSS tokenization-oriented data handling patterns, replacing or masking sensitive payment card data before it landed in accessible analytical environments to enhance data security.
Built ledger reconciliation pipelines comparing internal transaction records with external processor, settlement, and downstream reporting feeds to identify mismatches, timing gaps, and duplicate events.
Designed transformation layers to support KYC / customer due diligence and risk-oriented analytics, integrating customer profile, transaction, and reference data into curated datasets used for compliance and operational review.
Contributed to AML-focused data engineering patterns, including entity relationship preparation and graph-friendly data outputs that could support suspicious activity analysis and multi-hop transaction investigation.
Developed parsing and normalization logic for financial messaging and standards-based data exchange, including XML-heavy, institution-oriented interchange formats aligned with ISO 20022-style message processing.
Designed Snowflake dimensional and analytical models for transaction volumes, settlement status, dispute analysis, customer behavior, and operational KPIs, improving reporting performance and streamlining transformation logic in BI layers.
Tuned Snowflake performance using warehouse isolation, clustering, query profile analysis, caching-aware SQL design, pruning optimization, and staged transformation decomposition, improving throughput and lowering compute waste.
Applied dbt-based transformation patterns to improve modularity, testing, documentation, and version control discipline for SQL models used in heavily audited financial reporting pipelines.
Partnered with platform and security teams to strengthen RBAC, encryption-aware design, secrets discipline, and controlled data access, improving compliance posture while maintaining usability for engineering and analytics teams.
Integrated Snowflake with broader cloud services and adjacent enterprise tooling, including AWS S3, Glue, Lambda, and selective Azure/Fabric interoperability, to support hybrid reporting and partner-facing analytics workflows.

Data Engineer

Chewy

Onsite

05.2013 - 02.2016

Built and maintained large-scale analytics and reporting pipelines for e-commerce, integrating clickstream, order, product, customer, fulfillment, and inventory data into analytical models that enabled data-driven decision-making.
Developed customer 360-style data models by merging customer profile, browsing, cart, order, and engagement data across multiple digital touchpoints to improve marketing, personalization, and retention analytics.
Engineered clickstream sessionization pipelines that transformed raw web and app events into structured user sessions, enabling funnel analytics, navigation-path analysis, and conversion measurement.
Built ingestion and transformation flows supporting near-real-time inventory synchronization, improving alignment between warehouse stock availability and customer-facing inventory signals.
Integrated product, competitor, and demand-related data into curated datasets supporting dynamic pricing and merchandising analytics, facilitating more informed pricing strategies and operational optimizations.
Developed foundational datasets for recommendation and product affinity analytics, incorporating customer-product interaction features and market-basket relationships to enhance personalization and cross-sell strategies.
Built pipelines for A/B testing and experimentation analytics, ensuring test/control traffic could be measured cleanly and tied to conversion, basket size, and behavioral metrics.
Implemented SQL and modeling improvements to optimize report performance, reduce duplicate business logic, and improve trust in KPI reporting used by business and operations stakeholders.
Worked with cloud and warehouse technologies that laid the groundwork for later Snowflake-centric patterns, including structured data modeling, secure handling of customer data, and scalable analytics engineering practices.

Education

Bachelor of Science - Computer Science

Florida Institute of Technology

Melbourne, FL

05.2013

Skills

Snowflake architecture
Virtual warehouses
Multi-cluster warehouses
Micro-partitioning
Clustering keys
Automatic clustering
Time Travel
Zero-Copy Cloning
Secure data sharing
Snowpipe
Snowpark
Tasks
Streams
Materialized views
External tables
Semi-structured data
VARIANT
Query profile analysis
Warehouse sizing
Workload isolation
Resource monitors
Result caching
Metadata-driven ELT
Data retention
Fail-safe
Access control
Masking policies
Row access policies
ETL
ELT
Batch pipelines
Streaming pipelines
CDC
Incremental data loads
Full refresh
Backfills
Schema evolution
Data contracts
Data alignment
Orchestration
DAG design
Failure recovery
Idempotent pipelines
Data standardization
Ingestion framework design
Metadata management
Lineage
Observability
SLA monitoring
Data quality validation
Reference data management
Advanced SQL
SQL for Snowflake
Python
PySpark
Spark SQL
Scala
Shell scripting
dbt
Stored procedures
UDFs
Query tuning
Window functions
MERGE/UPSERT patterns
SCD Type 1/2
Partition-aware transformations
Modular pipeline design
AWS S3
AWS Glue
Lambda
Kinesis
EC2
IAM
CloudWatch
Redshift
RDS
Aurora
Azure Data Factory
Azure Synapse
Microsoft Fabric
Fabric Pipelines
Fabric Lakehouse
Power BI
BigQuery
Pub/Sub
Kafka
Kafka Connect
Spark Structured Streaming
Event-driven architecture
Real-time processing
Event ingestion
Stream enrichment
Data delivery guarantees
Event replay
Stream observability
HIPAA
PCI-DSS tokenization
RBAC
Encryption at rest
Encryption in transit
Audit trails
Lineage for audits
Data provenance
Secret management
Compliance data access
Dimensional modeling
Star schema
Snowflake schema
Conformed dimensions
Semantic layer support
Customer 360
Financial ledger modeling
Healthcare analytics
Quality aggregation
KPI modeling
Report optimization
Git
CI/CD
Jenkins
GitHub Actions
Terraform
CloudFormation
Docker
Monitoring
