
Shanmuga Priyan Vadugan Murugan

Frisco, TX

Summary

Strategic and customer-obsessed senior leader with 16+ years of experience leading customer-facing technical support and engineering teams. Proven track record in managing escalation processes, driving Major Incident Management (MIM), and delivering against SLAs, SLOs, and customer satisfaction KPIs. Skilled at developing and mentoring high-performing teams of Field Engineers, Solution Consultants, and Support Leaders. Adept at engaging executive stakeholders to resolve critical escalations and foster long-term customer success. Experienced in partnering cross-functionally with Product Management, Professional Services, and Engineering to identify gaps, implement best practices, and continuously improve service quality and operational outcomes.

Overview

17 years of professional experience
1 Certification

Work History

Manager, Site Reliability Engineering

7-Eleven Inc
Irving, Texas
05.2019 - Current
  • Lead the SRE function for 7-Eleven’s end-to-end digital ecosystem (the 7NOW customer app, store operations app, and web platform), managing a team of 24 and owning monitoring, observability, incident management, and AI-driven automation across multi-cloud AWS and GCP environments.
  • Architected and operationalized enterprise-wide observability infrastructure with custom telemetry dashboards across New Relic, AWS CloudWatch, and ELK Stack; established proactive alerting for customer-facing APIs, mobile endpoints, and store systems across all digital platforms.
  • Led a comprehensive New Relic cost optimization initiative — auditing data ingestion pipelines, rationalizing custom attributes, reducing redundant log forwarding, and right-sizing ingest contracts — delivering $160K in annual cost savings while maintaining full observability coverage across customer, store, and web platforms.
  • Re-architected New Relic ingest pipelines by implementing targeted filtering, cardinality reduction, and sampling strategies at the agent and pipeline level, eliminating unnecessary data volume without degrading alert fidelity or dashboard accuracy.
  • Designed and deployed an automated observability model using Microsoft Power Automate, enabling self-service alerting workflows, automated incident notification routing, and scheduled business KPI reporting — significantly reducing manual toil for the SRE team.
  • Drove end-to-end agentic cloud monitoring integration on GCP, extending observability coverage across multi-cloud environments and enabling intelligent, event-driven monitoring workflows.
  • Owned the complete incident management lifecycle: defined and maintained SLIs and SLOs with a structured severity model, enforced runbook discipline, and drove a blameless postmortem culture; improved MTTR by 40% and reduced recurring escalations through root cause analysis.
  • Tracked and drove attainment of KPIs including SLA/SLO compliance, MTTR, MTTD, availability, and error rates; improved issue resolution time by 35% through data-driven escalation trend analysis and process improvements.
  • Spearheaded AI-driven initiatives to reduce manual observability toil, automating alert triage, anomaly detection, and batch KPI reporting workflows — improving team efficiency and accelerating incident detection and response across the 7NOW platform.

Senior AWS/DevOps Engineer

Capital One, Wipro Technologies
Richmond, VA
03.2015 - 05.2019
  • Worked on end-to-end migration from on-premises infrastructure to AWS for the Capital One Card organization, collaborating with the initial DevOps team to design and implement enterprise-grade CI/CD pipelines.
  • Designed and standardized CI/CD frameworks using Jenkins and AWS services, enabling automated, scalable deployments across multiple environments.
  • Transformed legacy system administration workflows into DevOps practices, adopting automation, infrastructure as code, and cloud-native tooling.
  • Gained deep hands-on expertise across AWS services through a 2-year large-scale cloud migration initiative, supporting compute, networking, security, and storage services.
  • Implemented and managed escalation management and incident resolution processes using New Relic, ELK Stack, and Splunk to improve system visibility, SLA adherence, and support reliability across enterprise cloud platforms.
  • Partnered with application, security, and platform teams to ensure secure, compliant, and highly available cloud deployments.

Software Engineer

Capital One, Wipro Technologies
Chennai, India
09.2010 - 03.2015
  • Provided onsite UNIX/Linux system administration support for customer card–related ETL and batch processing systems, ensuring adherence to SLA and SLO commitments.
  • Managed incident response, triage, and root cause analysis, consistently meeting MTTR targets for high-priority production issues.
  • Supported and maintained ETL pipelines and application infrastructure on UNIX/Linux platforms, including performance tuning and system optimization.
  • Implemented proactive monitoring and alerting, reducing MTTD and MTTR while improving system stability and availability.
  • Collaborated with application, database, and network teams to support financial data accuracy, security, and regulatory compliance.

Data Analyst

CMC Ltd, Ikya Human Capital Solutions
Chennai, India
08.2009 - 09.2010
  • Performed data ingestion, cleansing, validation, and quality assurance for high-volume financial datasets.
  • Reported and presented insights to leadership, ensuring data consistency and accuracy.
  • Collaborated with cross-functional teams to support financial data integrity and reporting standards.

Education

Master’s - Software Engineering

Anna University
Chennai

Skills

  • Cloud Platforms: Azure, AWS (EC2, S3, Route 53, CloudFront, EBS, VPC, RDS, CloudWatch, ELB, Auto Scaling, IAM, Lambda, Security Policies, CloudFormation)
  • Languages: Python, JavaScript, Bash, Node.js, Ruby, HTML, CSS
  • DevOps Tools: Jenkins, Docker, Kubernetes, Chef, Ansible, Puppet, Terraform, CloudFormation
  • Monitoring: New Relic, CloudWatch, ELK Stack, Datadog, Prometheus, Grafana, GCP, Mixpanel, LogRocket, Splunk, Intune, Kiali
  • Messaging: SNS, SQS, Kafka
  • Databases: Oracle, MS SQL Server, MySQL, MongoDB, PostgreSQL
  • Network Protocols: TCP/IP, UDP, DHCP, HTTP, VPN, DNS, NTP, FTP, SSH, Telnet
  • Application Servers: WebLogic, Apache Tomcat, JBoss, WebSphere
  • Operating Systems: Linux (Red Hat, Ubuntu), UNIX, Windows, macOS

Certification

AWS Certified AI Practitioner – Amazon Web Services (AWS), 2025
