Tobias Caouette

Camas, WA

Summary

Experienced Data Engineer with a strong background in developing data pipelines, ETL/ELT processes, and data modeling. Specialized in implementing robust data infrastructures and optimizing data flows, particularly in regulated healthcare, clinical, and pharmaceutical settings.

Overview

5 years of professional experience

Work History

Sr. Data Engineer

Trinity Life Sciences
08.2024 - Current
  • Developed a Python-based Data Lake Library for Azure Databricks, streamlining the configuration of extractors and pipelines
  • Implemented a custom data quality suite ensuring only QC-passed data is published
  • Extensively utilized PySpark, configurable Spark sessions, catalogs, Iceberg format, and Data Lake v2 for advanced data modeling

Sr. Clinical Informatics Engineer, Data Engineering

Helix
02.2024 - 07.2024
  • Architected and implemented AWS-based data solutions for integrating Clinical EHR datasets with Genomic data using OMOP common data model, improving data accessibility by 40%
  • Developed a comprehensive data profiling system using complex analytics, enhancing data visibility across the entire schema and reducing data discrepancies by 25%
  • Led migration from PostgreSQL to Apache Iceberg, resulting in 50% improved query performance and enhanced scalability
  • Optimized ETL processes using AWS Glue and Spark, increasing data retrieval efficiency by 30% and overall system performance by 25%
  • Implemented IaC using AWS CDK (TypeScript) for consistent and secure infrastructure provisioning
  • Designed and orchestrated data pipelines with Airflow, ensuring 99% accuracy and reducing manual interventions by 70%

Data Analyst, Research

Truveta
05.2023 - 10.2023
  • Engineered data analytics pipelines using Azure Synapse, SQL, PySpark, and Python, streamlining complex healthcare data analysis and reducing processing time by 40%
  • Developed custom utility packages that increased the speed of Real World Data analysis by 35%
  • Implemented robust data quality processes, improving research data reliability by 20%

Sr. Clinical Data Quality Analyst

OM1
04.2021 - 05.2023
  • Designed and implemented a data quality testing suite using Pytest and SQL, improving data integrity checks by 50%
  • Leveraged dbt for healthcare data modeling and transformation, streamlining the ETL process and reducing data load times by 30%
  • Created Tableau dashboards for data profiling, enhancing stakeholder visibility into data quality metrics

Clinical Data Specialist III

Bio-Rad Laboratories
07.2019 - 04.2021
  • Architected ETL processes using SQL Server, SSIS, AWS S3, and AWS Glue, facilitating seamless data integration for COVID-19 clinical trials
  • Developed and managed a custom LIMS using Django and SQL Server, improving laboratory workflow efficiency by 25%

Education

Bachelor of Science - Physics

University of Washington
Seattle, WA
01.2015

Skills

  • Data Quality
  • Data Analysis
  • Real World Data (EHR)
  • ETL development
  • Big Data Processing
  • Python Programming
  • Data Pipeline Design
  • Data Warehousing
  • Spark Development
  • Advanced SQL
  • Git Version Control
