
Aravind Kumar S

Dallas, TX

Summary

AWS DevOps Site Reliability Engineer with hands-on experience in designing and managing scalable infrastructure on AWS and GCP. Proficient in deploying and managing containerized applications using Kubernetes (EKS), creating and optimizing CI/CD pipelines, and automating infrastructure with Terraform and CloudFormation. Strong background in Site Reliability Engineering (SRE) practices including incident response, root cause analysis, and service monitoring. Experienced in setting up observability platforms using Prometheus, Datadog, CloudWatch, and Grafana for end-to-end infrastructure and application monitoring. Adept at implementing agile DevOps practices to increase deployment speed, system resilience, and operational efficiency.

Overview

6 years of professional experience
4 certifications

Work History

DevOps/Associate Site Reliability Engineer

Boost Mobile
08.2022 - Current
  • Expertise in architecting and maintaining Datadog and CloudWatch for centralized log, metric, and trace collection.
  • Architected and deployed Prometheus for metrics collection and integrated it with Grafana to build real-time dashboards monitoring Kubernetes (EKS) clusters, enabling proactive infrastructure insights, reducing mean time to resolution (MTTR), and alerting when an application is down or an infrastructure issue occurs.
  • Proven ability to develop monitoring dashboards for EKS clusters, APIs, databases, and microservices.
  • Implemented SRE best practices, including incident response runbooks and post-incident RCA documentation.
  • Experienced in 24/7 on-call production support and resolving critical outages.
  • Proficient in developing and deploying infrastructure using Terraform for VPCs, EKS clusters, IAM roles, and CI/CD pipelines.
  • Automated the provisioning and lifecycle management of EKS clusters and node groups (versions 1.19-1.25).
  • Designed, built, and maintained CI/CD pipelines for infrastructure and application onboarding.
  • Skilled in developing custom Helm charts and Python/Shell scripts for DevOps automation.
  • Experienced in designing and optimizing network configurations (NLBs, Route Tables, subnets).
  • Demonstrated ability to collaborate with application teams on requirements and architectural decisions.

DevOps Engineer

Idelic
01.2022 - 08.2022
  • Created CloudWatch alarms to monitor server performance, CPU utilization, disk usage, and related metrics.
  • Provisioned highly available EC2 instances using Terraform and CloudFormation, and wrote new plugins to support new functionality in Terraform.
  • Used Terraform as an infrastructure-as-code tool to manage application infrastructure such as storage and networking.
  • Configured Elastic Load Balancers with EC2 Auto Scaling groups using Terraform.
  • Designed roles and groups using AWS Identity and Access Management (IAM) and managed network security using AWS network access control lists (NACLs), provisioning both with Terraform.
  • Built declarative pipelines in Jenkins for freestyle jobs using Groovy.
  • Managed Datadog for monitoring.
  • Automated application deployment in the cloud with Docker, using the Elastic Container Service (ECS) scheduler.
  • Managed Kubernetes charts using Helm, created reproducible builds of Kubernetes applications, managed Kubernetes manifest files, and managed releases of Helm packages.
  • Created and managed a Docker deployment pipeline for custom application images in the cloud using Jenkins.
  • Used Terraform scripts to automate the provisioning of instances that had previously been launched manually for testing, staging, and production.
  • Extensively involved in infrastructure as code, execution plans, resource graphs, and change automation using Terraform.
  • Wrote Bash, Python, and PowerShell scripts to automate tasks.

DevOps Engineer

PayPal
05.2021 - 01.2022
  • Worked with Xoom TechOps teams on the role images to determine any special use cases.
  • Migrated the Packer, Puppet, and Terraform code to work with Oracle Linux.
  • Updated the testing suites as part of the migration to conform to Xoom and Jenkins pipeline standards.
  • Migrated 300 legacy Puppet modules from AWS to GCP.
  • Managed GCP services such as Compute Engine, App Engine, Cloud Storage, VPC, Load Balancing, BigQuery, and firewalls.
  • Used Terraform to provision environments on GCP.
  • Troubleshot VMs in GCP.
  • Worked with configuration management tools such as Puppet and Ansible.
  • Wrote manifests and Ruby scripts to customize Puppet configuration as required.
  • Modified Puppet modules and Ruby libraries within Puppet.
  • Set up the Puppet master and clients, and wrote scripts to deploy applications to Dev, QA, and production environments.
  • Wrote shell (Bash) and Ruby scripts.
  • Migrated Xoom Bitbucket repositories.

Jr Data Analyst

MARCH FIRST SYSTEMS. LLC
07.2020 - 05.2021
  • Worked on creating and removing files and directories from Linux servers.
  • Performed data analysis to evaluate data quality and resolve data-related issues.
  • Improved data quality, and designed and presented conclusions drawn from data analysis using Excel as a statistical tool.
  • Performed data analysis and data profiling using SQL queries.
  • Performed hardware configuration and operating system loads, and assisted with troubleshooting installation issues.
  • Developed array-based programs in Python using libraries such as NumPy.
  • Handled Tableau admin activities, including granting access, managing extracts, and performing installations.

AWS Cloud Administrator

SYNERGY TECHNOLOGIES
05.2018 - 05.2019
  • Configured AWS Route 53 to route traffic between different regions.
  • Created users and groups using IAM and assigned individual policies to each group.
  • Created SNS notifications and assigned the ARN to S3 for object-loss notifications.
  • Created load balancers (ELB) and used Route 53 with failover and latency routing options for high availability and fault tolerance.

Education

Bachelor’s degree - Computer Science and Engineering

Gandhi Institute of Technology and Management
01.2019

Master’s degree - Data Science in Management Science

University of North Texas
01.2021

Skills

  • Languages: Shell scripting, Python, Java, SQL, Ruby scripting
  • Source code control tools: Git, AWS CodeCommit
  • CI Tools: Jenkins, AWS CodePipeline, GitLab
  • CM Tools: Puppet, Chef, Ansible
  • Container services: Docker
  • Container orchestration: Kubernetes
  • Cloud Technologies: Amazon Web Services, Google Cloud Platform
  • Monitoring tools: Datadog, CloudWatch, Grafana, Prometheus
  • Web services: RESTful
  • Databases: Oracle, MS SQL, IBM DB2, MySQL
  • Infrastructure as code: Terraform, CloudFormation, Vault
  • Development Tools: IntelliJ, Eclipse, Visual Studio, Notepad
  • Tools & Framework: Kafka, Microservices
  • Containerization technologies
  • Version control systems
  • Scripting languages proficiency
  • Load balancing techniques
  • Software development lifecycle
  • Microservices architecture

Accomplishments

Academic Project Achievement

  • Completed hands-on cloud-focused coursework and projects during Master’s program, including designing scalable architectures, implementing DevOps pipelines, and deploying containerized applications on AWS and GCP.

Operational Excellence

  • Successfully led root cause analysis and infrastructure troubleshooting in production environments, reducing system downtime by 40% through proactive monitoring and automation.

Cross-Team Impact

  • Partnered with development and QA teams to streamline CI/CD workflows and improve deployment reliability, accelerating release cycles by 30% while enhancing system resilience.

Certification

  • AWS Solutions Architect
  • Prometheus Certified Associate
  • AWS AI Practitioner Certification
  • Certified Kubernetes Administrator (CKA)

Languages

English
Native or Bilingual

Timeline

DevOps/Associate Site Reliability Engineer

Boost Mobile
08.2022 - Current

DevOps Engineer

Idelic
01.2022 - 08.2022

DevOps Engineer

PayPal
05.2021 - 01.2022

Jr Data Analyst

MARCH FIRST SYSTEMS. LLC
07.2020 - 05.2021

AWS Cloud Administrator

SYNERGY TECHNOLOGIES
05.2018 - 05.2019

Master’s degree - Data Science in Management Science

University of North Texas

Bachelor’s degree - Computer Science and Engineering

Gandhi Institute of Technology and Management