Esther Fathima Rakesh

Summary

Innovative and results-driven Reliability Engineering Manager with over 13 years of progressive experience across software development, modernization, and cloud operations. Demonstrated success in AWS migration, AI-driven automation, and SRE leadership at The Hartford. Experienced in converting legacy systems into robust, cloud-native architectures while guiding high-performing teams. Proficient in AWS, GCP, Terraform, and observability tools, with a passion for building reliable, scalable, and intelligent systems.

Overview

13
years of professional experience
1
Certification

Work History

Manager, Reliability Engineering

The Hartford
Connecticut
06.2025 - Current
  • Manage a cross-functional team of Reliability Engineers focused on maintaining secure, scalable, and robust AWS-based systems; accountable for end-to-end cloud operations, cost optimization, and innovation initiatives driving resilience, performance, and automation.
  • Drive cloud operations strategy across AWS and GCP platforms.
  • Strengthen observability, resiliency, and incident response through AI-assisted automation.
  • Partner with business, architecture, and DevOps teams to prioritize modernization roadmaps.
  • Implement automated recovery systems reducing MTTR by 30%.
  • Oversee compliance, audit readiness, and production governance for mission-critical assets.
  • Champion AI integration using AWS Bedrock and GCP Vertex AI to enhance operational insights.
  • Lead a multi-disciplinary team of SREs and cloud engineers through mentoring, goal setting, and technical coaching.
  • Foster a continuous improvement culture emphasizing ownership, accountability, and knowledge sharing.
  • Spearhead quarterly innovation sprints, encouraging engineers to prototype automation and monitoring solutions.
  • Partner with leadership and HR to define career growth paths for engineers within SRE and CloudOps.
  • Recognized for empowering teams through empathy-driven leadership and strong communication.

Senior Staff Software Engineer, Reliability Engineering

The Hartford
Connecticut
04.2023 - 05.2025
  • Served as technical lead for AWS migration and cloud operations modernization.
  • Designed infrastructure automation with Terraform and CloudFormation to streamline provisioning.
  • Integrated Splunk ITSI, Dynatrace, and CloudWatch for unified observability.
  • Implemented Bedrock-based AI models for incident prediction and alert auto-resolution.
  • Collaborated across teams to embed SRE principles into CI/CD workflows.
  • Drove runbook automation and incident reduction initiatives.
  • Coached engineers and earned distinction as primary subject matter expert for AWS reliability patterns.
  • Held end-to-end accountability for IT assets within a defined application portfolio, ranging from advanced digital storefronts to more traditional platforms.

Site Reliability Engineer

The Hartford
Connecticut
05.2020 - 04.2023
  • Own end-to-end accountability for IT assets within a defined application portfolio, ranging from advanced digital storefronts to more traditional platforms.
  • Drive stability and performance of the systems in the application portfolio to top quartile in the industry across both production and test environments.
  • Continuously improve the observability of our technology stack by evolving the usage of logs, metrics, tracing, dashboards, monitoring alerts, and integration of metrics with CI/CD pipelines.
  • Independently oversee triaging and service restoration of all high-impact incidents to facilitate permanent root cause elimination and reduce service restoration time.
  • Ensure transition from delivery to run (DTR) meets or exceeds all operational requirements.
  • Champion the migration of applications to open-source platforms and enable containerization and deployment to cloud platforms such as OpenShift, AWS, and Azure.
  • Own and drive adoption of DevOps tools and best practices across the application portfolio.
  • Ensure strict adherence to IT regulatory and compliance requirements for all applications, and drive remediation of all internal, regulatory, and SOX audit findings.
  • Review and influence major technical design changes and participate in the review of change management activities.
  • Ensure that all technical documentation and artifacts are accurately prepared, maintained and cataloged.
  • Accountable for keeping the IT application currency and infrastructure metadata repositories (PlanIT and CMDB) current.

Tech Lead

Cigna
Connecticut
05.2019 - 05.2020
  • Perform Level 2 (L2) production support activities for Cigna's Customer Service Group application suite.
  • Production problem resolution - respond to, investigate, and resolve problems with permanent countermeasures.
  • Provide on-call support for production-related problems after business hours.
  • Handle project-related escalations and resolve incidents and issues reported by application users through the HP Service Management tool.
  • Engage with Architects and Development teams as necessary to design the best solution for production problems.
  • Lead investigation and resolution efforts for critical, high impact problems, defects and incidents.
  • Coordinate with offshore team on production support activities and provide necessary guidance for day-to-day support using Skype and other collaboration tools.
  • Provide and drive runtime improvements for all applications before and after release.
  • Perform production checkout activities after production releases, infrastructure maintenance, scheduled recycles and fix releases.
  • Monitor day-to-day application performance, response time and availability.
  • Interact with Enterprise Project Team and participate in their release/project meetings.
  • Provide monthly reports of incidents and known root causes.

Tech Lead

The Hartford
Connecticut
01.2018 - 05.2019
  • Adopt the Agile/Scrum framework and prioritize enhancements and new initiatives based on dependencies and business priority.
  • Design service-oriented architecture based on user requirements and develop SOAP/REST API services in Java, IBM WebSphere Transformation Extender (WTX), and DataPower.
  • Integrate Jenkins/UDeploy component deployment into WebSphere and DataPower servers to implement continuous integration and continuous deployment.
  • Guide and help the offshore team to complete sprints and deliver the services to production.
  • Work with Architects to deliver the security fixes.
  • Provide high-level estimates and plans for moving applications to production.

Developer L3

The Hartford
Connecticut
02.2017 - 12.2017
  • Provide consultation and strategic recommendations to development teams by quickly assessing and remediating complex issues.
  • Diagnose software problems and provide solutions or workarounds to ensure the highest level of availability for critical business applications.
  • Lead investigation and resolution efforts for critical, high impact problems, defects and incidents.
  • Coordinate with developers and operations professionals to drive improvements to the deployment process and environment for delivering software.

Developer L2

The Hartford
Bangalore
10.2014 - 01.2017
  • Design and develop enterprise application service interfaces using web-based standards such as SOAP, XPath, XSD, XSLT, and XML.
  • Optimize the existing legacy system to handle web-based standards.
  • Design and develop optimized transformation logic that improves the performance of all the critical applications.
  • Define service contracts for SOAP/REST based services that are consumed by various business applications and user interfaces so they interact seamlessly.
  • Transform complex business requirements into highly scalable technical solutions.
  • Review the test plan and test strategy built by QA team and identify opportunities for test optimization.
  • Design and develop automated SoapUI/ReadyAPI scripts to implement a continuous testing strategy.
  • Set up, configure, and maintain projects in source code management systems such as GitHub and TortoiseSVN.
  • Create technical design specification documents and own the sign-off for the development/system team.

Developer L2

The Hartford
Bangalore
08.2013 - 09.2014
  • Analyze various modules by understanding the requirements and raising queries.
  • Contribute to design activities such as preparing the Technical Specification Document (TSD).
  • Develop programs based on the client's specific requirements.
  • Design and develop user interfaces.
  • Support User Acceptance Testing (UAT) during the testing phase.
  • Resolve issues found during the various phases of testing.
  • Contribute to performance tuning of the application.

Developer L1: Project Engineer

Nationwide Insurance
Bangalore
10.2012 - 07.2013
  • Responsible for development, support, maintenance and implementation of components of a project module.
  • Work on problems of relatively complex scope.
  • Unit test the modules.
  • Fix integration defects.

Education

Master of Science - Software Engineering

Birla Institute of Technology
Pilani

Bachelor’s - Computer Application

Kristu Jayanti College
Bangalore

Skills

  • AWS
  • ECS
  • Lambda
  • Step Functions
  • S3
  • CloudWatch
  • Glue
  • Bedrock
  • GCP
  • Vertex AI
  • Cloud Storage
  • Terraform
  • AWS CloudFormation
  • Python
  • Shell
  • Groovy
  • XSLT
  • JavaScript
  • Jenkins
  • GitHub Actions
  • AnthillPro
  • UDeploy
  • Splunk ITSI
  • Dynatrace
  • ServiceNow Integrations
  • Tableau Desktop
  • Sumo Logic
  • GitHub
  • SVN
  • IBM DB2
  • IMS DB
  • MS SQL Server

Certification

  • Google Cloud Certified – Generative AI Leader, 08/01/25
  • AWS Certified AI Practitioner, 02/01/25 - 02/01/28
  • AWS Certified AI Practitioner – Early Adopter, 02/01/25
  • AWS Certified Solutions Architect – Associate, 07/01/23 - 07/01/26
  • Terraform Associate (003) – HashiCorp, 11/01/24 - 11/01/26
  • Splunk IT Service Intelligence Certified, 06/01/23
  • Tableau Desktop Specialist, 10/01/21
  • Duck Creek Author – Basic Developer, 08/01/20
  • AWS Certified Cloud Practitioner, 11/01/20 - 11/01/23
  • Sumo Logic Certified – Fundamentals, 05/01/20 - 05/01/22
  • Blaze Advisor – FICO
  • Adaptability & Resilience (McKinsey & Company), 12/01/23
  • Problem Solving (McKinsey & Company), 02/01/24
  • Business Strategy (McKinsey & Company), 04/01/24
  • Management Accelerator (McKinsey & Company), 04/01/24
