Sharidha Ambalath

Site Reliability Engineering Lead / Manager
Alpharetta, GA

Summary

Credible track record of directing cloud infrastructure optimization initiatives and improving system uptime through strategic implementation of AWS and GCP solutions, automated recovery processes, and enhanced monitoring coverage. Agile-minded, technically inclined professional known for partnering with cross-functional SRE teams throughout cloud transformation projects, implementing Kubernetes clusters, and establishing robust CI/CD pipelines to decrease deployment times. Well-versed in designing and executing mission-critical reliability processes, championing blameless post-mortems, and orchestrating incident response strategies, while maintaining high-availability standards across multi-cloud environments. Noted for implementing cost-optimization strategies across AWS and GCP platforms, leveraging expertise in Terraform, Ansible, and container orchestration to build secure cloud-native infrastructures. Adept at translating complex technical requirements into actionable roadmaps, mentoring teams in cloud-native best practices, and building automated infrastructure solutions to accelerate overall operational efficiency.

Collaborative leader who partners with coworkers to promote an engaged, empowering work culture. Documented strengths in building and maintaining relationships with a diverse range of stakeholders in dynamic, fast-paced settings.

Overview

14 years of professional experience

Work History

Sr Lead System Reliability Manager

ADP
01.2022 - Current
  • Modernized incident response model, leading effective resolutions and conducting thorough blameless post-mortems to pinpoint root causes and bolster future system responses
  • Facilitated continuous integration and ensured smooth transitions into production environments by orchestrating CI/CD workflows utilizing Jenkins, Docker, and Git
  • Significantly enhanced setup speeds and resource management efficiencies by automating infrastructure provisioning for cloud-native applications using Terraform
  • Spearheaded optimization of AWS-based cloud infrastructure, leveraging tools like EC2, S3, VPC, and CloudWatch to boost performance and slash costs by 30%
  • Strengthened operational security practices to ensure rigorous daily security protocols and enhance overall system defenses

Site Reliability Engineer

Equifax
01.2017 - 01.2022
  • Steered design and deployment of cloud-native applications and infrastructures on platforms like AWS and Google Cloud
  • Accomplished a 20% improvement in system uptime and a 30% reduction in operational costs by leading cloud-native infrastructure implementations in AWS and Google Cloud
  • Architected cloud-native applications utilizing AWS services such as EC2, Lambda, and RDS, and incorporated monitoring solutions for comprehensive visibility into production systems
  • Improved deployment and orchestration of microservices and containerized applications with Helm by directing the setup of Kubernetes clusters in AWS using CloudFormation
  • Slashed deployment times and reduced manual errors by 80% via automation of CI/CD pipelines using Jenkins and GitLab
  • Directed seamless, end-to-end migration of on-premises infrastructure to AWS Cloud, minimizing disruption and applying AWS best practices for optimal cost and performance
  • Amplified system reliability by automating recovery strategies and expanding monitoring capabilities

Application Support Engineer

Bridge2Solutions
01.2016 - 01.2017
  • Ensured high availability and smooth handling of traffic surges by engineering and deploying auto-scaling and load-balancing solutions for microservices architectures
  • Automated configuration of auto-scaling groups, load balancers, and cloud resources by administering infrastructure provisioning using Ansible playbooks

Production Support Engineer / Lead Observability Engineer

AT&T
01.2011 - 01.2016
  • Expedited deployment of microservices by reducing manual efforts by 75% and accelerating production rollouts through development and launch of a CI/CD pipeline using Jenkins and GitLab CI
  • Led a team of engineers in designing, implementing, and optimizing monitoring systems to ensure proactive issue detection, enhance system performance, and maintain high availability across production environments
  • Collaborated closely with engineering, product, and operations teams to improve overall system reliability and performance while aligning with business goals
  • Defined and tracked key performance indicators (KPIs) related to service reliability, capacity, and efficiency, reporting regularly to leadership on team and system performance
  • Enabled detailed analysis and actionable insights from application logs and system metrics via configuration and maintenance of log aggregation systems using ELK Stack (Elasticsearch, Logstash, Kibana) and Splunk
  • Established real-time monitoring frameworks using tools such as Nagios, Prometheus, Datadog, and CloudWatch to continuously monitor system health and performance metrics

Education

Master of Science (MS) - IT Management

Grand Canyon University

Bachelor of Technology

SCMS School of Engineering and Technology

Skills

Site Reliability Engineering Management

Certifications and awards

Certified Kubernetes Administrator (CKA), Google Cloud Certified Associate Cloud Engineer, AWS Solutions Architect Associate, CompTIA Security+, One Equifax Award, Awesome SRE Award, Equifax

Prior experience

  • Software Engineer, Temenos
  • Software Developer, NTT Data (Keane)
