ASHOK REDDY N

Tampa, FL

Summary

Dynamic AWS Cloud DevOps Support Engineer with 8.5 years of experience in monitoring, maintaining, and optimizing cloud infrastructure. Proven track record of resolving incidents within SLA using ServiceNow and Jira, while enhancing system performance through effective troubleshooting and scalable solutions. Skilled in CI/CD pipelines and infrastructure deployment with Terraform, and proficient in monitoring tools such as Dynatrace, Datadog, AWS CloudWatch, Splunk, and ELK Stack. Adept at incident, problem, and change management, adhering to ITIL standards, and driving continuous improvement initiatives.

Overview

11 years of professional experience
1 Certification

Work History

Sr. AWS Cloud Support Engineer

ATOS Syntel GDC
08.2021 - 04.2024

Client: American Family Insurance USA


  • Provided robust technical support and maintenance for AWS cloud infrastructure, ensuring optimal performance and reliability for American Family Insurance USA
  • Collaborated closely with cross-functional teams to design and implement scalable solutions, enhancing operational efficiency
  • Resolved complex technical issues promptly, minimizing downtime and improving system stability
  • Conducted proactive monitoring and troubleshooting of AWS services and applications, ensuring adherence to SLAs
  • Contributed to disaster recovery planning and execution, ensuring business continuity during critical incidents
  • Documented processes, procedures, and best practices, facilitating knowledge sharing and team training
  • Proactively monitored systems in Dynatrace and Datadog for potential issues and resolved incidents within SLA in ServiceNow and Jira
  • Maintained AWS and GCP cloud infrastructure and checked the availability of critical services
  • Handled incident, problem, and change management in ServiceNow, managed defects in Jira, and created documents and knowledge articles in Confluence
  • Monitored AWS cloud infrastructure with Dynatrace and Datadog and checked logs in AWS CloudWatch, Splunk, and ELK Stack for root cause analysis
  • Deployed infrastructure with Terraform
  • Debugged issues with ELK (log management), Postman, and Dynatrace PurePath
  • Worked with hybrid-cloud and multi-cloud scenarios and hosting options such as IaaS, PaaS, and serverless technologies
  • Optimized system performance by identifying bottlenecks and implementing scalable solutions
  • Created and maintained detailed documentation for system architecture, incident reports, and operational procedures
  • Gained hands-on experience with AWS DevOps tools, Jenkins CI/CD, Docker, Terraform, and Kubernetes.

Senior App Support Engineer

Fix Stream
10.2020 - 08.2021

Client: Resolve Systems USA


  • Installed, upgraded, and troubleshot Resolve Automation products across diverse customer environments, ensuring seamless functionality and customer satisfaction
  • Troubleshot DB connectivity and configuration issues on Linux servers, resolving issues promptly to minimize downtime and enhance performance
  • Utilized Elasticsearch commands to monitor cluster health, indices, and shards, effectively troubleshooting Elasticsearch issues
  • Analyzed Log4j application logs to identify and resolve customer issues efficiently
  • Created and managed Run Books in Resolve UI, customized action tasks per customer requirements, and monitored Run Book alerts using Worksheets
  • Administered user accounts in Salesforce, managed tickets, and adhered to SLAs to prioritize and resolve issues promptly
  • Addressed P1 outages, raised enhancement requests based on customer needs, and escalated issues to DevOps and product teams for resolution
  • Troubleshot connectivity issues in RabbitMQ and Tomcat logs, resolving load balancing and basic networking issues in cluster environments
  • Resolved ServiceNow (SNOW), EWS, and DB gateway issues to ensure seamless product operation
  • Authored KB articles on new issues, ensuring knowledge dissemination and proactive issue resolution
  • Provided on-call support on weekends as per roster, joining customer calls via Zoom, Teams, and RingCentral to resolve issues promptly
  • Participated in weekly scrums with developers to discuss customer requests and updates, contributing to product enhancements
  • Created escalation tickets in Jira, tracked updates and bugs in Jira, and utilized Confluence for the latest product updates and bug fixes
  • Distributed product installation files to customers using Nextcloud, ensuring seamless deployment and customer satisfaction
  • Monitored SaaS server information and billing using the Lumen portal, ensuring operational efficiency and customer satisfaction
  • Troubleshot SOAP and REST API issues, resolved Gateway Filter UI rendering issues, and addressed TLS/SSL and JVM issues
  • Resolved import/export issues and customer Run Book issues through effective remote server troubleshooting and support.

Cloud Systems Engineer

Oracle
05.2019 - 11.2019

Project: Oracle Retail Cloud Project


  • Managed working environments using Chef for configuration management and wrote Ansible playbooks and roles to provision machines in diverse environments
  • Utilized Ansible Modules to automate infrastructure changes, ensuring efficient deployment and configuration management
  • Deployed WAR files in WebLogic application servers and monitored deployments through WebLogic console, ensuring application uptime and performance
  • Conducted system health checks, implemented proactive alerting, and maintained historical performance records for customer environments
  • Troubleshot memory, backup, and storage issues based on alerts, ensuring system reliability and performance optimization
  • Addressed SFTP connectivity and password reset issues promptly to minimize downtime and maintain data security
  • Demonstrated proficiency in networking protocols such as HTTP, DNS, and TCP/IP, ensuring seamless network operations
  • Executed daily tasks including CPU patch updates, upgrades, and password rotations during maintenance windows, minimizing security risks and maintaining system integrity
  • Restarted services in Tomcat and WebLogic servers, resolving application downtime and ensuring continuous service availability
  • Leveraged Confluence for documentation and collaboration, ensuring up-to-date information sharing and knowledge management
  • Provided 24/7 support in rotational shifts, promptly resolving tickets based on severity levels and meeting SLAs consistently
  • Restarted and monitored adapter services, ensuring integration and data flow continuity
  • Communicated effectively with customers and internal teams via Slack, desk phones, and emails, fostering collaborative problem-solving and maintaining high customer satisfaction.

Senior Production Support Engineer

Epsilon
04.2017 - 05.2019

Project: Epsilon Harmony Email Campaign USA


  • Installed and configured AWS services including EC2, S3, ELB, IAM, AMI, Snapshots, EBS, and Auto Scaling, ensuring optimized performance and scalability
  • Maintained production servers (core, pipeline, RTM, reporting, MTA, etc.) and resolved issues promptly based on customer tickets, minimizing downtime and ensuring smooth operations
  • Managed IAM roles and permissions, troubleshooting access level issues and ensuring security compliance
  • Monitored API calls and system services, resolving issues related to server availability, disk space, CPU, memory, and processes
  • Automated application deployment and configuration management using Ansible, streamlining deployment processes and ensuring consistency across environments
  • Created and managed S3 buckets, implemented policies, and utilized S3 for customer file import/export storage and backup solutions
  • Set up and managed EBS volumes, attached them to EC2 instances, and performed backup and recovery using snapshots
  • Created AMI images of critical EC2 instances as backup using AWS CLI, ensuring disaster recovery readiness
  • Deployed end-to-end solutions from EC2 instance creation to infrastructure setup, ensuring seamless integration and functionality
  • Monitored AWS services including EC2 and S3 through CloudWatch, ensuring proactive identification and resolution of performance issues
  • Utilized Nagios and Dynatrace for Linux server monitoring, creating and closing tickets as per SLAs in a 24x7 support environment
  • Tracked project bugs and tasks using Jira, facilitating efficient collaboration and issue resolution
  • Implemented and tested RESTful APIs using Postman, ensuring reliable API functionality and integration.

Linux System Administrator

Soft Brij IT Solutions
06.2013 - 09.2014
  • Ensured high availability and performance of on-premise systems while enhancing automation and scalability
  • Deployed and managed configuration with Puppet, improving efficiency and reliability of system deployments
  • Deployed and troubleshot Jenkins builds, ensuring smooth integration and delivery pipelines in on-premise environments
  • Managed provisioning and de-commissioning of on-premise infrastructure, optimizing resource utilization and reducing costs
  • Applied patches and performed RHEL upgrades on Linux servers according to maintenance schedules, ensuring system security and stability
  • Managed file systems, performed SQL queries on production databases, and developed shell scripts for automating health checkups and reducing manual tasks
  • Conducted audit checks including RCA, log analysis, cleanup, disk space checks, backups, and security checks using scripting tools
  • Worked with protocols such as HTTP, SMPP, UCP, TCP/IP, ensuring smooth operation of network services
  • Monitored system resources, key processes, and scheduled jobs such as backups using Nagios and ELK for centralized log monitoring
  • Addressed infrastructure alerts triggered by Nagios, ensuring prompt resolution and minimal downtime
  • Scheduled regular tasks using Crontab, ensuring timely execution and maintenance of critical system operations
  • Provided rapid resolutions for production issues, creating change requests, work orders, and problem tickets in Epsi and ServiceNow and obtaining the necessary approvals
  • Tracked project bugs and tasks using Jira, facilitating efficient collaboration and issue resolution among teams.

Education

Bachelor of Technology

Jawaharlal Nehru Technological University
05.2013

Skills

  • Code Management System: Git, Bitbucket
  • Build Tools: Maven
  • Continuous Integration Tools: Jenkins
  • Configuration Deployment Tool: Chef
  • Configuration Management Tool: Ansible for Linux
  • Virtualization: Docker, Kubernetes, Amazon AWS/EC2
  • Application Server: Tomcat
  • Middleware: WebLogic
  • Cloud Platforms: Amazon Web Services
  • Monitoring Tools: Nagios, Datadog, ELK, Dynatrace, Grafana
  • Database System: Oracle, MySQL
  • Platforms/Operating System: Linux
  • Software Methodologies (SDLC): ITIL
  • Project and Ticket Management: Jira, ServiceNow, Salesforce
  • Content Management: Confluence and SharePoint
  • Messaging: RabbitMQ, Kafka

Certification

DevOps Certified from Collabera TACT

