Managed MSSQL databases, optimized performance, and manipulated data via CLI
Worked on full-stack software development projects, contributing to design, coding, troubleshooting, and deployment processes across multiple platforms
Demonstrated expertise in Linux administration (Oracle, Red Hat, CentOS), including scripting, managing network configurations, and setting up virtualized server environments, especially in cloud-based deployments
Enhanced system reliability by leveraging DevOps tools such as Ansible, containers, and Kubernetes, and optimizing log processing workflows using Grafana and the ELK Stack (Elasticsearch, Kibana, Logstash)
Monitored system performance and troubleshot issues in the production environment
Conducted network protocol analysis and reverse engineering for troubleshooting and improving system performance
Sr Site Reliability Engineer
Cohesity
San Jose, CA
09.2019 - 07.2024
Developed and provided exemplary technical support across a diverse spectrum of hardware and software stacks
Resolved unexpected technical difficulties and communicated solutions with clients and representatives
Triaged escalations and priority calls and provided resolutions
Contributed as a vital member of the Support Escalations team specializing in Cohesity Data Platform, with a focus on data backup, restore, cloud, and file services
Monitored, debugged, and managed the deployment, installation, upgrade and troubleshooting of the Cohesity platform, addressing network and storage-related issues
Served as Designated Engineer for Platinum accounts, overseeing deployment planning and conducting Proof of Concept (POC) sessions in collaboration with the Sales Team
Developed and maintained automation scripts in Python, Bash, and PowerShell to streamline repetitive tasks, automate deployments, and enhance operational efficiency
Managed containerized applications in Kubernetes clusters, ensuring high availability, fault tolerance, and efficient resource utilization
Worked with the internal QA team to develop test scripts in AWS Lambda for internal lab servers and testing purposes
Mentored new hires and junior engineers, fostering their professional development
Opened product improvement and regression issues with the development team using JIRA
Supervised replacement of hardware such as nodes, NICs, cables, SSDs, HDDs, and power supplies
Performed root cause analysis of production incidents and provided recommendations for improvement
System Reliability Engineer
Nutanix
San Jose, CA
09.2018 - 09.2019
Provided technical assistance and troubleshooting support to customers, resolving issues related to operating systems, networking, and cloud technologies
Collaborated with engineering teams to escalate and resolve complex technical issues, ensuring minimal downtime and disruption for customers
Contributed to the development of knowledge base articles, FAQs, and other support documentation to assist customers in self-service troubleshooting
Worked with technology partners (e.g., VMware, Citrix, Microsoft) to resolve issues and push improvements across the ecosystem
Provided support on weekdays and off-hours, both as needed and on a scheduled rotation
Technical Support Engineer
Panasas Inc
Sunnyvale, CA
05.2016 - 09.2018
Troubleshot NFS, CIFS, DirectFlow, and networking (layers 2-5) issues, client (Linux, Windows, macOS) connectivity, and storage problems
Developed documentation of customers' technical environments, including network topology diagrams and asset inventories
Provided technical support to customers by troubleshooting and resolving hardware and software issues
Education
Master of Science - Electrical Engineering
San Jose State University
San Jose, CA
12.2015
Bachelor of Science - Electrical and Electronics Engineering
SRM University
Tamil Nadu, India
05.2013
Skills
Operating Systems: Linux (Ubuntu, CentOS), Windows, macOS, Unix
Monitoring and Visualization: Datadog, Grafana, Kibana, ELK Stack
Containers and Orchestration: Docker, Kubernetes, cluster setup and management