Madhumitha Karthikeyan

Buffalo, NY

Summary

Skilled and accomplished Site Reliability / DevOps Engineer with a robust background at Yahoo, adept in cloud computing and virtualization, ensuring 24/7 uptime with proactive risk management. Strong scripting skills in Bash and Python. RHCSA Linux+ certified to enhance system reliability. Known for capacity planning and project management expertise that improve operational efficiency. Proven track record of managing projects, troubleshooting complex issues, and providing technical support. Experienced in developing and maintaining Linux systems and Kubernetes clusters.

Overview

8 years of professional experience
1 Certification

Work History

Production Engineer

Yahoo
02.2020 - Current
  • Monitored server performance to ensure availability of mission critical services.
  • Experience operating and owning end-to-end availability and performance of system servers (bare metal and VM) in a production environment for Yahoo's Media properties (sole DevOps Engineer / SRE for media properties), with a strong understanding of Linux systems.
  • Proficient with Ansible and Chef, using them to automate routine tasks and deploy changes to frameworks.
  • Proven understanding of and hands-on experience with application containers and container orchestration systems such as Docker and Kubernetes. SRE support for a caching stack running fully on Kubernetes.
  • Experience running virtual machines under open source (OpenStack and Yahoo managed provisioning).
  • Building, automating, and maintaining infrastructure in Amazon Web Services and GCP
  • A firm grasp of IP networking, load balancing, DNS, DHCP, storage concepts, basic networking, and iptables.
  • Proficient understanding of networking principles and of transport and application protocols, especially TCP/IP, TLS, and HTTP/S, along with RHCE knowledge.
  • Expertise using monitoring tools and problem ticketing systems.
  • Strong source control experience including branching, merging, and rebasing (Git).
  • Strong problem-solving, analytical, and troubleshooting abilities. Engage in capacity planning, demand forecasting, software performance analysis, and systems tuning.
  • Knowledge of and proven experience with CDNs and HTTP cache/proxy technologies. Experience supporting CDN delivery for media properties
  • Experience writing code in Bash and Python to diagnose routine infrastructure problems and identify appropriate solutions while partnering with multiple teams.
  • Maximized productivity by 80% by developing and implementing innovative production processes for new and existing products, thereby also improving performance, scalability, and reliability.
  • Train, guide, and delegate work to the Operations team from a leadership position by breaking down information in a systematic and communicable manner.
  • Aggressively triage and diagnose user-facing service outages, providing guidance and technical advice to the Ops team and getting involved as required to resolve medium and higher severity incidents.
  • Designed and maintained system documentation for infrastructure technology installation, configuration, and troubleshooting
  • Provide on-call support, responding to pings, pages, and alerts to investigate issues in our systems.

Production Engineering Ops (NOC)

Yahoo
10.2016 - 02.2020
  • Responsible for monitoring and maintaining the stability of Yahoo's production and corporate infrastructure 24x7x365
  • Escalate and follow up on problems within internal Oath properties through to incident closure
  • Monitor complex and dynamic system applications; triage and resolve alerts
  • Full stack support of Oath products, services and processes
  • Partner with development, operations and business counterparts to ensure near 100% uptime of all services
  • Report generation and data analysis on key metrics using Splunk
  • Work closely with SRE team to help set up monitoring and maintain 90% or higher resolution rate of Incidents managed for each property to provide continuous availability of services
  • Use automation tools such as Chef and Screwdriver to decrease end-to-end deployment times and increase reliability
  • Develop dashboards which help in monitoring network and infrastructure
  • Write simple bash scripts to automate everyday work
  • Retire unused systems to improve resource utilization and reduce maintenance costs
  • Identify and correct root cause of various system alarms
  • Good knowledge of OpenStack; worked on many projects involving CI/CD
  • Additional experience with industry standard monitoring and ticketing tools such as Moogsoft, Netcool, ServiceNow and JIRA
  • Experience with Linux, HTTP/HTTPS, DNS, NFS, RAID and network protocols such as TCP/IP and OSPF.

Education

Master of Science (M.S.) - Computing Security

Rochester Institute of Technology
Rochester, NY
05.2016

B.Tech in Information Technology

SSN College of Engineering
Tamil Nadu, India
05.2014

Skills

  • Cloud Platforms: AWS (EC2, S3, Lambda, VPC)
  • Containerization: Docker, Kubernetes
  • Infrastructure as Code: Terraform, Ansible, Chef
  • CI/CD: Screwdriver, Jenkins, GitLab CI
  • Scripting: Python, Bash
  • Monitoring and Logging: Grafana, Prometheus, Splunk
  • Security: IAM, VPC, Security Groups, Encryption
  • Tools: JIRA, Moogsoft, ServiceNow, Netcool

Certification


  • Red Hat Certified System Administrator
