Samrin Zafar

Bellevue, United States

Summary

Lead SRE and Operations Analyst with over 6 years of experience at Expedia Group. Leverages data science, AIOps, and automation strategies to transition legacy operations into high-velocity, resilient environments. Built comprehensive monitoring frameworks that eliminate manual toil and ensure 24/7 system availability.

Overview

8 years of professional experience

Work History

Lead - Site Reliability Engineer

Expedia Group
Seattle, United States
03.2025 - 01.2026
  • Incident Management: Led high-priority incident resolution in 'war rooms,' ensuring rapid triage while mentoring newer engineers on troubleshooting. Maintained peak readiness by keeping the team calibrated to live-site performance.
  • In-House Primary Monitoring Tool Development: Identified critical monitoring gaps and orchestrated the end-to-end development of an enterprise-scale, in-house primary monitoring tool. Led cross-functional POCs with vendors including PagerDuty, BigPanda, Datadog, Dell, Splunk, and FireHydrant, and defined requirements that replaced third-party limitations with a proprietary AIOps system capable of automated incident correlation and real-time revenue impact analysis.
  • Automation & Correlation: Optimized the internal monitoring tool for advanced incident correlation and automation, reducing manual effort. Recent key enhancements included AI-generated summaries, incident and bridge creation directly from the tool, bidirectional sync with other platforms to prevent duplication, similar-incident grouping, change-related incident tracking, suspected root-cause identification, and automated audit reports.
  • Global Gap Remediation Framework: Designed and led the implementation of a global 'Gap Process' for 24/7 NOC pods in partnership with the Problem Management team, establishing accountability standards, ensuring timely corrective actions across Service Owners, and enabling data-driven identification of bottlenecks and SLA breaches.
  • Executive Alignment: Performed revenue impact analysis during major site events and helped translate complex failures into financial insights for audits and high-loss incidents.
  • Team Formation & Management: Built a high-performance NOC Tier 1 team from the ground up, overseeing the full lifecycle of hiring, onboarding, and technical training. Applied pedagogical strategies to accelerate 'time-to-autonomy', transforming the unit into a robust first-line response team. Designed a sustainable 24/7 global coverage model with co-pods and used empathetic 1-on-1 mentorship to maintain high retention and keep the team calibrated to live-site performance without burnout.
  • AI/ML Literacy Program: Designed and delivered a 'Fundamentals of Data Science & AI' curriculum to introduce core concepts, terminology, and internal AI tools. The program demystified AI capabilities and shifted the team mindset from fear of replacement to a culture of augmentation, driving higher operational efficiency and aligning the organization with company-wide AI adoption goals. Established guardrails and auditing standards for the integration of AI in operational workflows.

Site Reliability Engineer II

Expedia Group
Seattle, United States
04.2024 - 02.2025

Two-time Travel Excellence Award recipient for contributions in the Event Management space.

  • Incident Management: Performed triage and incident management in 'war rooms,' ensuring rapid resolution while mentoring newer NOC engineers on troubleshooting.
  • Event Analysis: Maintained peak readiness by conducting booking trend monitoring and real-time event analysis for production readiness, using tools such as Splunk, Datadog, Catchpoint, Grafana, Graphite, and PagerDuty.
  • Event Management: Led event management strategies and improved correlation models in the primary monitoring tool to reduce alert noise and accelerate root cause identification.

Site Reliability Engineer I

Expedia Group
Seattle, United States
03.2021 - 05.2024
  • Data Analysis: Analyzed large-scale telemetry and logs, building optimized queries to detect anomalies, trends, and early indicators of failure. Tested new releases, performed code checks with engineers, and defined and evaluated capabilities for enhanced event correlation models in the external event management tool to reduce noise, improve signal accuracy, and accelerate issue identification.
  • Incident Management: Led and supported incident response by rapidly assessing impact, identifying root causes, coordinating responders, service owners, and releases, and driving resolution to minimize MTTR.

Operations and Traffic Analyst - IOTA

Expedia Group
Seattle, United States
08.2017 - 09.2019
  • Monitoring: Proactively monitored and detected customer-impacting issues using observability tooling (Splunk, Catchpoint, Grafana/Graphite, PagerDuty, AWS).
  • Triage: Partnered with service owners, engineering teams, and incident managers to restore reliability and reduce MTTR.
  • Release Management: Completed a 6-month tenure with Release Management, covering an absence in the release office; oversaw releases in test and prod, monitored the environment, and performed impact analysis and rollbacks as needed.

Education

Data Science Immersive Course

General Assembly
Seattle, WA
01.2020

Master's in Physics - Nuclear Physics

Fergusson College
Pune, India
01.2011

Bachelor's degree in Physics - Astrophysics

Fergusson College
Pune, India
01.2009

Skills

  • Incident management
  • Event management
  • Change management
  • Root cause analysis
  • Infrastructure automation
  • Data analysis
  • Data visualization
  • Event correlation strategies
  • Automation optimization
  • AI/ML integration
  • Training programs
  • Team building
