Thomas Lynch

St. Louis

Summary

Principal Engineer with deep experience leading DevOps and Site Reliability transformations across large-scale, data-intensive, and mission-critical platforms. Proven production readiness steward with a track record of partnering across development, product, and operations teams to design, deploy, and operate highly available, secure, and observable systems. Expert in shifting reliability left through automation, CI/CD enablement, incident response maturity, and observability architecture. Known for calm, systematic incident leadership, blameless post-mortems, and improving platform resilience, velocity, and customer experience across globally distributed environments.

Overview

17 years of professional experience

Work History

Lead Platform Engineer / BizOps

Nestlé Purina
01.2023 - Current
  • Served as production readiness steward for enterprise cloud platforms across AWS, Azure, and GCP supporting high-availability digital properties.
  • Led design and rollout of a developer experience and platform enablement ecosystem (Backstage, GitHub, GitHub Actions), increasing adoption of standardized automation by 45% and reducing time-to-delivery by ~30%.
  • Drove adoption of golden-path infrastructure patterns, improving API and microservice discoverability and reuse, saving thousands of engineering hours and approximately $1M annually in external agency development costs.
  • Architected and implemented a centralized observability platform, reducing mean time to detect incidents by ~3 days and mean time to recover by ~1 week across previously opaque platforms.
  • Enabled faster root cause analysis, compliance visibility, and health monitoring through unified metrics and logging, significantly improving operational confidence.
  • Designed and led refactoring of a legacy platform, delivering ~$500K in annual cost savings through improved efficiency and reduced operational overhead.
  • Architected multiple platforms and sites and brought them in-house, optimizing delivery models and resource utilization, resulting in ~$3.5M in annualized cost savings.
  • Established enterprise incident response processes with runbooks, escalation paths, change management controls, and blameless post-mortems, shifting reliability from reactive to proactive.
  • Implemented ITSM feedback loops by analyzing incident and change data to identify resiliency gaps, inform platform improvements, and reduce repeat incidents.
  • Supported CI/CD operational gating and release readiness processes to ensure quality, stability, and compliance before promotion to higher environments.
  • Partnered with global development, product, and operations teams to shift reliability, change management, and operational requirements left into system design and delivery.

Senior Platform Engineer / Site Reliability Engineer

National Geospatial-Intelligence Agency (NGA)
01.2018 - 01.2023
  • Supported mission-critical, globally distributed AI/ML and computer vision platforms with strict availability, security, and performance requirements.
  • Improved production readiness and operational reliability, reducing recurring failure patterns and increasing platform stability.
  • Designed and operationalized MLOps workflows improving deployment reliability and reducing manual intervention.
  • Enhanced observability and monitoring, accelerating detection of failure conditions and root cause analysis.
  • Led and participated in incident response and post-incident reviews, contributing to improved recovery times and mission continuity.
  • Automated deployment and operational workflows under strict security and compliance constraints.
  • Partnered with engineers and stakeholders to integrate operational and reliability concerns earlier into system design.

Data Scientist / Geospatial Analytics Engineer

Patch Terra Geoexploration
01.2014 - 01.2018
  • Developed geospatial analytics and data science solutions supporting geoexploration initiatives.
  • Built data pipelines for large spatial datasets using Python.
  • Operationalized analytics workflows with a focus on reliability and reproducibility.

Geospatial Intelligence (GEOINT) Imagery Analyst

United States Army
01.2009 - 01.2014
  • Conducted geospatial and imagery analysis in support of operational and strategic missions.
  • Produced time-sensitive intelligence products under high-pressure conditions.
  • Maintained strict accuracy, reliability, and security standards.

Education

Bachelor of Arts - Anthropology

Florida Atlantic University
Boca Raton, FL
12.2014

Skills

  • Cloud infrastructure management
  • DevOps methodologies
  • Observability
  • Change management
  • SRE practices
