Experienced Site Reliability Engineer with 5+ years in AWS Cloud environments. Adept at improving system reliability, automating infrastructure, and ensuring optimal performance of applications. Proven expertise in cloud architecture, monitoring, and incident management. Strong background in scripting, DevOps practices, and cloud-native technologies.
Excellent knowledge of data mapping and of extract, transform, and load (ETL) processes across different data sources.
Well versed in Amazon Web Services (AWS) Cloud services such as EMR, S3, Lambda, Triggers, Glue, Step Functions, CloudWatch Events, Redshift, RDS, SNS, SQS, Athena, Kinesis Data Firehose, Kinesis Data Streams, and EC2.
Skilled in Infrastructure as Code (IaC) using Terraform, AWS cloud architecture, CI/CD pipelines, and Kubernetes management to ensure scalability, reliability, and high availability.
Proficient in monitoring, logging, and security practices aligned with industry standards such as SOC 2 and ISO 27001.
Proven ability to collaborate effectively with development teams and mentor junior engineers to foster a culture of excellence.
Strong work ethic with a desire to succeed and make significant contributions to the organization.
Experience working both independently and collaboratively to solve problems and deliver high quality results in a fast-paced, unstructured environment.
Overview
6 years of professional experience
1 Certification
Work History
Site Reliability Engineer
Toyota Motors Ltd
03.2022 - Current
Designed and implemented highly available and scalable infrastructure on AWS
Automated infrastructure provisioning using Terraform and AWS CloudFormation
Developed CI/CD pipelines with Jenkins, leading to 40% faster deployments
Architected and managed AWS cloud environments, optimizing for cost-efficiency, security, and performance.
Monitored system performance and reliability using Prometheus and Grafana
Conducted root cause analysis and incident management, reducing downtime by 30%
Microservices Migration to AWS: Led a project to migrate legacy monolithic applications to a microservices architecture on AWS.
Utilized Docker for containerization and Kubernetes for orchestration, resulting in improved scalability and fault tolerance
Automated Disaster Recovery: Designed an automated disaster recovery solution using AWS Lambda, S3, and CloudFormation
Deployed and managed Kubernetes clusters for microservices-based applications, ensuring reliability and efficient resource utilization.
Integrated monitoring solutions using Prometheus and Grafana, enhancing system visibility and proactive issue resolution.
Implemented cloud security best practices, ensuring systems adhered to SOC 2 and ISO 27001 compliance standards.
The automated disaster recovery solution reduced the recovery time objective (RTO) from hours to minutes
Worked closely with cross-functional teams to ensure seamless integration of infrastructure and applications.
Site Reliability Engineer
RBC Bank
02.2020 - 01.2022
Managed AWS infrastructure and optimized cost by implementing best practices
Created and maintained Docker containers for microservices
Designed scalable infrastructure using Terraform, ensuring high availability across production and staging environments.
Developed automated scripts in Python and Bash for routine tasks
Automated deployment processes, reducing manual intervention and increasing release frequency by 30%.
Set up logging and monitoring solutions with ELK Stack and CloudWatch
Collaborated with development teams to improve application performance and reliability
Implemented Helm and ArgoCD to streamline Kubernetes application deployment and GitOps workflows.
Built centralized logging solutions with the ELK stack and AWS CloudWatch, enabling effective troubleshooting and analytics.
Implemented a CI/CD pipeline using Jenkins, Docker, and Kubernetes, which automated the deployment process and reduced deployment times by 50%
Managed IAM roles and policies in AWS, enhancing system security and reducing unauthorized access.
Conducted a comprehensive analysis of AWS usage and implemented cost-saving measures such as instance right-sizing and reserved instances, achieving a 20% reduction in monthly cloud expenditure
Mentored junior engineers, promoting knowledge sharing and driving team excellence.
BI Developer
Icrea Info Tech Private Limited
12.2018 - 01.2020
Automated configuration management using Ansible
Implemented Kubernetes for container orchestration
Managed version control using Git and integrated with CI/CD tools
Ensured compliance with security standards and conducted regular audits
Provided on-call support and resolved high-severity incidents
Infrastructure as Code (IaC) Initiative: Spearheaded the transition to Infrastructure as Code using Terraform, leading to more consistent and reproducible infrastructure deployments
Security Enhancement Project: Implemented security best practices across the AWS environment, including setting up VPCs, security groups, and IAM roles, which enhanced the overall security posture of the organization
Education
MS in Computer Information Systems
Christian Brothers University
Memphis, TN
Bachelor in Computer Science Engineering
KL University
Guntur, India
Skills
AWS Monitoring and Automation
Containerization Proficiency
Kubernetes Deployment Management
Continuous Integration with Jenkins
Infrastructure Automation
Ansible Configuration Management
Python Programming Expertise
Bash Scripting
Proactive Engagement
Java Programming Expertise
Prometheus Metrics Analysis
Grafana Performance Monitoring
ELK Stack Proficiency
Splunk Data Analysis
Source Code Management
Repository Maintenance Skills
Bitbucket Version Control
MySQL Database Management
PostgreSQL
DynamoDB Optimization
Linux System Administration
Windows Operating System Knowledge
Jira Workflow Optimization
Confluence Documentation
Effective Slack Communication
Accomplishments
Achieved a 40% improvement in deployment speed by developing and implementing CI/CD pipelines using Jenkins.
Achieved enhanced scalability and fault tolerance by containerizing applications with Docker and orchestrating them with Kubernetes.
Achieved 30% reduction in downtime by performing root cause analysis and managing incidents with Prometheus and Grafana.
Documented and resolved issues in legacy applications during the migration, leading to a seamless transition to AWS microservices architecture.