Collaborated with developers, analysts, and project managers to expedite incident resolution for absence/leave management software.
- Worked extensively with Customer Success Managers and engineering teams to resolve complex issues and enhance customer experience.
Performed software application and disaster recovery testing to guarantee business readiness following system failures.
- Managed on-call support for web- and client-based applications, files, and data feeds to prevent impact on business.
- Troubleshot incidents reported by end users to schedule system changes and identify permanent solutions.
- Used monitoring tools daily to establish baselines and detect potential issues, helping reduce P1 and P2 outages.
- Worked to exceed SLAs and KPIs, consistently maintaining high service quality, including an SLA achievement rate of over 85%.
- Conducted root cause analysis on recurring incidents to identify improvement opportunities in application design or support procedures.
- Led the Major Incident Management process, including incident classification and prioritization, escalation, collaboration, incident logging, and major incident review.
- Analyzed production issues from both business and application/code perspectives and outlined corrective actions.
- Met defined incident and outage response and resolution SLAs by sustaining an individual SLO rate of 85% or higher.
- Reduced tickets sent to L3 by sustaining an individual FCR rate of 35% or higher.
- Ensured RCAs for all P1 outages were completed within two business days and RCAs for all P2 outages were completed within three business days.
- Worked with the team to meet and exceed SLAs based on priority: P1 – 85% resolved in less than four hours; P2 – 80% resolved in less than eight hours; P3 – 80% resolved within three business days; and P4 – resolved within five business days.
- Maintained a monthly escalation rate of