A highly experienced software professional with more than 17 years in the software industry, specializing in Site Reliability Engineering (SRE) and Performance Engineering. Certified in SAFe Agile and ISTQB, with a strong focus on maintaining system reliability, optimizing performance, and ensuring scalability. Expertise includes implementing SRE principles, conducting performance testing, and driving improvements to achieve operational excellence. Known for strong problem-solving skills and a commitment to delivering high-quality, resilient software systems.
· Design, implement, and manage observability solutions using tools such as Dynatrace, Prometheus, and Grafana.
· Develop metrics, alerts, and silences for comprehensive system monitoring.
· Automate infrastructure tasks using Ansible and Terraform.
· Script solutions in Python and Bash to enable automation across the infrastructure.
· Propose and implement innovative ideas to reduce manual workload and improve operational efficiency through automation.
· Troubleshoot and resolve system issues with an SRE mindset, focusing on root cause analysis and corrective actions.
· Develop and enhance documentation, including application guides, runbooks, and system configurations.
· Reduce toil by automating repetitive manual tasks.
· Plan, design, and execute scalable and redundant system architecture to meet organizational goals.
Observability Tools: Hands-on experience with Dynatrace, Prometheus, and Grafana
Infrastructure Automation: Proficiency in Ansible, Terraform, and GitLab CI/CD
Scripting Languages: Advanced skills in Python and Bash
Cloud Platforms: Proficient in provisioning and configuring resources on AWS
SRE Practices: Familiarity with troubleshooting using SRE principles, root cause analysis, and corrective action planning
Documentation: Strong ability to write clear, concise, and detailed technical documentation and runbooks
System Architecture: Solid understanding of scalability and redundancy principles
Incident Management: Investigating and analyzing incidents and resolving the underlying issues
Production support:
Performance Tools: LoadRunner, JMeter, and Gremlin for chaos testing
Certified SAFe 4 Practitioner (Scaled Agile).
Certified by ISTQB (International Software Testing Qualifications Board).
Holder of the NCC B Certificate from 51 Assam Technical Squadron, Senior Division (Air Force Wing). As a National Cadet Corps cadet, attended the Pre-Vayu Sainik Camp and the Vayu Sainik Camp. Earned a parasailing certificate after completing parasailing training at the Jorhat Air Force camp.
Received an award from American Express for excellent performance.
Received the PayPal Appreciate Talent (PAT) award for the 2nd quarter at PayPal.
Received the Cookie award from Tech Mahindra Ltd.