Pavan Gudhe

Manchester, CT

Summary

DevOps and Site Reliability Engineer (SRE) with over 7 years of experience automating complex infrastructure, ensuring high availability, and maintaining system reliability across cloud and on-premises environments. Expertise in containerization, CI/CD pipeline development, cloud infrastructure, and monitoring solutions. Proven ability to optimize deployment processes, enhance system performance, and lead seamless production rollouts, reducing downtime and improving overall system reliability.

Overview

4
years of professional experience

Work History

Infrastructure Administrator (SRE/DevOps)

Pragma Edge
Jacksonville, FL
11.2022 - Current
  • Architected and implemented CI/CD pipelines for MFT applications using Terraform, Helm, and Ansible, automating the provisioning of infrastructure in OpenShift and reducing manual intervention by 50%.
  • Designed deployment pipelines to promote applications across environments (development to production), ensuring consistency and minimizing deployment risks.
  • Automated environment patching and upgrades through Ansible-based pipelines, improving deployment speed and reducing downtime for critical services.
  • Collaborated with development teams to integrate observability best practices into the CI/CD pipeline, ensuring early detection of issues and aligning observability tools with security compliance requirements.
  • Automated the collection and processing of observability data using custom scripts and integrations with Grafana and Splunk, reducing manual intervention and improving monitoring accuracy.
  • Troubleshot and debugged issues across environments, working closely with development and operations teams to identify root causes and resolve system failures.
  • Integrated Sysdig and Splunk with a unified Grafana dashboard for real-time monitoring and log analysis, improving system health visibility and troubleshooting efficiency.
  • Analyzed system performance trends using Prometheus and Grafana, identifying bottlenecks and collaborating with development teams to optimize resource allocation, reducing response times by 20%.
  • Conducted regular system health checks and tuning to maintain optimal performance and reliability.
  • Managed on-call support to handle incidents, ensuring quick recovery and minimal downtime during unexpected issues.
  • Defined and tracked SLOs and SLIs for system performance and availability, achieving a consistent uptime of 99.99% and ensuring proactive monitoring of critical metrics.
  • Implemented auto-scaling strategies and managed resource allocation to address traffic spikes, ensuring high system availability and performance.
  • Managed OpenShift clusters, including administration of Calico networking for efficient pod-to-pod communication and network security.
  • Built custom services to calculate and monitor SLA performance metrics, ensuring system uptime and stability by proactively addressing potential bottlenecks and failures.
  • Implemented capacity planning and fault tolerance strategies to enhance system scalability and reliability, ensuring minimal downtime during high-traffic events.
  • Automated planned downtimes and system maintenance tasks using Ansible and shell scripts, minimizing manual intervention and ensuring consistent processes across environments.
  • Participated in retrospectives, contributing to continuous improvement by discussing system reliability, deployment challenges, and ways to improve pipeline efficiency and incident response.
  • Introduced proactive measures and improvements to reduce incident frequency and improve system stability.
  • Performed Disaster Recovery (DR) activities every 6 months to validate system reliability and recovery processes, ensuring operational resilience in case of system failures or outages.
  • Met a Recovery Time Objective (RTO) of 10 minutes during DR activities, maintaining strict recovery timelines and validating system recoverability during outages.
  • Wrote a postmortem report after each incident, documenting the changes made and lessons learned to prevent similar issues and continuously improve system stability.

Senior Build Engineer (DevOps)

Pragma Edge
Jacksonville, FL
11.2020 - 05.2023
  • Developed end-to-end release pipelines integrated with SonarQube for code quality scanning and WhiteSource for vulnerability scanning, ensuring secure and high-quality code deployments.
  • Architected and built Docker images for deployment in various environments, optimizing for performance and security.
  • Created and maintained Helm charts to deploy multiple products in Kubernetes and OpenShift, simplifying application management and version control.
  • Automated the deployment and management of Kubernetes clusters using Amazon EKS, ensuring secure and scalable environments for containerized applications.
  • Built comprehensive CI/CD pipelines encompassing integration, deployment to test servers, and regression testing, streamlining the SDLC process for faster and more reliable releases.
  • Automated the build and deployment process using Jenkins, Ansible, and Terraform, enhancing operational efficiency and consistency across environments.
  • Developed release pipelines that handled deployment post-JIRA ticket approvals, ensuring proper change management and documentation.
  • Implemented advanced monitoring for Amazon EKS clusters using Prometheus and Grafana, ensuring real-time insights into cluster health and application performance. Developed auto-scaling strategies to dynamically adjust resources during traffic spikes, maintaining high availability and optimizing costs.
  • Developed custom Ansible modules to automate repetitive tasks, streamlining operations and improving efficiency across teams. Published these modules within the organization for wider use, enabling standardized automation practices.
  • Achieved a 25% reduction in post-release defects by integrating comprehensive testing and quality checks into the pipeline.
  • Updated release changelogs to Confluence pages using APIs, improving transparency and tracking of release changes and progress.
  • Integrated JMeter for performance testing and Selenium for automated functional testing in the pipelines, ensuring robust application quality and performance.
  • Used Ansible for infrastructure provisioning, automating setup and configuration tasks to support continuous integration and deployment processes.
  • Implemented automated regression testing within the pipelines to identify issues early, improving the overall reliability and stability of releases.

MFT Build Engineer (DevOps/SRE)

Pragma Edge
Jacksonville, FL
01.2022 - 10.2022
  • Developed custom shell scripts to automate partner onboarding via ServiceNow tickets using REST APIs, streamlining integration processes and enhancing efficiency.
  • Consolidated partner onboarding across multiple data centers into a single ServiceNow ticket, using GitHub Actions pipelines to keep partner integration and deployment centralized and consistent.
  • Implemented a deployment pipeline to automate onboarding processes and PCM framework installations, ensuring consistency and reliability across environments and data centers.
  • Developed custom PEM activities by leveraging B2B REST APIs and IBM Sterling File Gateway (SFG) APIs, enhancing integration and automation capabilities within the MFT environment.

Infrastructure Administrator (DevOps/SRE)

Pragma Edge
Jacksonville, FL
03.2021 - 01.2022
  • Installed and configured Sterling B2B Integrator using Ansible, automating deployment across on-prem virtual machines to ensure high availability and consistent configuration, contributing to a reliable and resilient system.
  • Developed custom Ansible roles for managing adapters and services in Sterling B2B Integrator, optimizing operational efficiency and system performance through standardized automation practices.
  • Used B2B APIs to automate service management and optimize Sterling B2B operations, reducing manual maintenance tasks and ensuring robust system functionality with minimal downtime.
  • Monitored system performance and availability, implementing proactive measures to address potential issues before they impacted service levels, in line with SRE principles.
  • Enhanced B2B Integrator stability by automating configuration management, resulting in a 20% decrease in system downtime.
  • Documented system configurations and operational procedures, creating detailed records for internal use to facilitate troubleshooting, maintenance, and knowledge sharing within the team.
  • Contributed to post-incident reviews, analyzing system performance and issues to identify improvements and enhance the overall reliability and stability of the B2B integration environment.

Infrastructure Administrator (DevOps/SRE)

  • Deployed PEM applications in Docker containers using Ansible, automating the setup and scaling of application environments to ensure consistent, repeatable deployments across development, testing, and production environments.

Fullstack Developer

Mirus Systems, India
  • Developed a Java Spring Boot web application, designing and implementing features to meet business requirements and enhance functionality.
  • Hosted the Spring Boot application on AWS EC2 instances, ensuring scalable and reliable deployment for production environments.
  • Dockerized the Spring Boot application, creating Docker images to streamline deployment processes and enable consistent environments across development, testing, and production.
  • Configured and optimized AWS EC2 instances to support the performance and scaling needs of the containerized application, enhancing overall reliability and efficiency.
  • Developed Java APIs to support application functionality, providing robust and scalable endpoints for integration with various services and platforms.
  • Integrated a payment gateway into the application, facilitating secure and efficient transaction processing for users.
  • Added Google Firebase integration, leveraging its services for real-time data synchronization, user authentication, and application analytics to enhance overall functionality and user experience.

AWS Cloud Support

Indian Institution of Hardware and Technology, India
  • Provisioned and managed EC2 instances, ensuring scalable and efficient compute resources for applications and services.
  • Managed EBS volumes, including creation, attachment, and optimization, to support data storage and performance requirements.
  • Created and configured Route 53 DNS entries to ensure reliable domain name resolution and support for high-availability architectures.
  • Managed Virtual Private Cloud (VPC) settings, including subnet configuration and security group management, to maintain secure and well-architected network environments.
  • Created and managed S3 buckets, implementing proper access controls and data lifecycle policies to ensure secure and cost-effective storage solutions.
  • Assisted in the development of Lambda functions, providing support for serverless computing tasks and integrating with other AWS services to enhance application functionality.

Education

Master of Science - Computer and Information Sciences

New England College
Henniker, NH
07-2024

Bachelor of Science - Computer Science

Gayatri Vidya Parishad College of Engineering
06-2020

Skills

  • Windows
  • RHEL and CentOS Expertise
  • Shell Scripting
  • Java Development Expertise
  • Python
  • Skilled in Go Development
  • SQL Proficiency
  • XML
  • Splunk Implementation Experience
  • OpenTelemetry (OTel)
  • Datadog
  • Prometheus Monitoring Expertise
  • Grafana Data Visualization
  • Disaster Recovery Planning
  • Storage Management
  • Containerization Expertise
  • Kubernetes Management
  • OpenShift Administration
  • Incident Management
  • Problem Diagnosis

Personal Information

Title: DevOps/SRE Engineer

Timeline

Infrastructure Administrator (SRE/DevOps)

Pragma Edge
11.2022 - Current

MFT Build Engineer (DevOps/SRE)

Pragma Edge
01.2022 - 10.2022

Infrastructure Administrator (DevOps/SRE)

Pragma Edge
03.2021 - 01.2022

Senior Build Engineer (DevOps)

Pragma Edge
11.2020 - 05.2023

Infrastructure Administrator (DevOps/SRE)

Fullstack Developer

Mirus Systems, India

AWS Cloud Support

Indian Institution of Hardware and Technology, India

Master of Science - Computer and Information Sciences

New England College

Bachelor of Science - Computer Science

Gayatri Vidya Parishad College of Engineering