Laxmi Mulka

Little Elm, TX

Summary

Results-driven IT professional with more than 7 years of experience in Site Reliability Engineering (SRE), Cloud Engineering, and Application Support. Specializing in automation, incident management, monitoring, and troubleshooting of complex production environments, I bring a deep understanding of managing cloud infrastructure, optimizing system reliability, and ensuring seamless application performance. My expertise spans Linux, AWS Cloud, Autosys job scheduling, CI/CD automation, and multi-tiered application support. I apply industry best practices to improve system reliability, availability, and performance through proactive monitoring and automation. Quick to learn and adapt to new environments, I am driven by continuous learning and by navigating change successfully.

Overview

11 years of professional experience

Work History

SRE

Mastercard
11.2023 - Current
  • Created custom monitoring dashboards using Splunk and Dynatrace to track system performance metrics, resulting in a 50% reduction in time to detect issues
  • Led efforts in defining and managing on-call support behaviors and troubleshooting protocols, reducing MTTR (Mean Time to Recovery) by 30%
  • Implemented and managed SLIs, SLOs, and SLAs across critical services, ensuring reliable service delivery and tracking system health metrics
  • Led on-call rotation and managed production incidents, driving resolution through collaboration with development and operations teams to ensure minimal service disruption
  • Spearheaded post-mortem analysis for critical incidents, identifying root causes and implementing preventive measures to mitigate future risk
  • Configured Dynatrace monitoring for cloud-based infrastructure (AWS, Azure, GCP) to ensure high availability and performance of containerized applications running on Kubernetes, improving system uptime
  • Integrated Dynatrace with cloud environments (AWS, GCP, Azure) to monitor infrastructure health, resource utilization, and application performance, providing a unified view of the full-stack ecosystem
  • Used Dynatrace's user experience monitoring (Real User Monitoring - RUM) to track user interactions and optimize application performance, improving page load times by 20%
  • Created custom Dynatrace dashboards to monitor KPIs, service health, and application performance metrics, providing key insights to stakeholders and reducing time spent on manual monitoring
  • Monitored compliance with SLAs and security standards using Dynatrace’s AI-driven insights and alerts, ensuring adherence to service-level and regulatory requirements
  • Led the migration of on-premises infrastructure to AWS, utilizing services like EC2, S3, and VPC to reduce infrastructure costs and improve scalability
  • Implemented AWS IAM policies and roles to enforce fine-grained access control, ensuring compliance with security standards
  • Integrated AWS CloudTrail and CloudWatch for comprehensive logging and auditing of API calls and system changes, ensuring security and compliance in highly regulated environments
  • Participated in implementation plan review meetings
  • Technologies: Splunk, Dynatrace, AWS (EC2, S3, VPC, IAM), CloudWatch, AWS Systems Manager, Kubernetes, Docker, Autosys V11, Oracle DB, ServiceNow, Jira, Shell Scripting, Bash Scripting, Remedy, PostgreSQL
  • Client: Fannie Mae, Reston, VA

Production Support Engineer

Fannie Mae
Virginia, VA
10.2021 - 11.2023
  • Provided on-call support for critical production systems.
  • Ensured that all IT infrastructure was up to date with the latest security patches and upgrades.
  • Developed and implemented procedures for production support operations.
  • Compiled and submitted monthly and yearly reports.
  • Performed troubleshooting tasks on complex distributed systems.
  • Participated in post-incident reviews to document lessons learned from outages and failures.
  • Scheduled maintenance activities during off-peak hours.
  • Developed scripts to automate routine operational tasks such as backups and monitoring.
  • Analyzed logs from multiple sources to identify patterns or trends in system behavior.
  • Monitored system performance, identified issues and implemented solutions.
  • Implemented change management processes to ensure consistency across environments.
  • Created detailed documentation of process flows and configuration changes.
  • Participated in cross-functional teams with manufacturing, quality, marketing, and procurement departments.
  • Generated reports on system availability, performance metrics, and SLA compliance.
  • Provided training and guidance to junior members within the team on production support activities.
  • Performed root cause analysis to identify the underlying causes of incidents and implemented corrective actions when necessary.
  • Coordinated with development teams to ensure timely resolution of incidents.
  • Demonstrated strong problem-solving skills, resolving issues efficiently and effectively.
  • Worked successfully with a diverse group of coworkers to accomplish goals and address issues related to our products and services.
  • Worked with cross-functional teams to achieve goals.

Production Support Engineer

Verizon
Irving, TX
09.2020 - 09.2021
  • Provided 24x7 Level 1 and Level 2 support for production applications, including deployment of new software, upgrades, and application code to remediate known defects
  • Escalated to Level 3 in-house and third-party support teams as needed
  • Worked with business and I.T. associates and outside vendors to plan, deploy, document, and maintain additions and changes to the distributed application/middleware environments
  • Troubleshot application issues, assisted with root cause analysis, and documented known issues in the knowledge base, sharing them with the team for better assistance
  • Experienced with the procedures and policies required for escalation and outage resolution, with strong documentation skills
  • Strictly followed ITIL standards for incident and problem management to prevent future recurrence
  • Provided training and supporting documentation for end users
  • Worked with ServiceNow and Remedy incident tracking/ticketing systems
  • Conducted weekly incident/problem management calls with the development team
  • Provided 24/7 on-call support for production as a strong team player with good analytical skills
  • Performed daily production system checks, monitored and analyzed resources, and proactively participated in weekly team meetings with managers
  • Excellent logical, analytical, and debugging skills, with a strong work ethic and good interpersonal skills
  • Assisted customers with more difficult technical issues that required more personalized, in-depth care
  • Onboarded and trained all incoming junior tech support specialists
  • Fast learner and good team player, proactive in solving problems and providing the best solutions
  • Monitoring Tools: Splunk, Kibana, Grafana, Datadog
  • Databases: Oracle 19.8/12c/11g, MSSQL v2012/v2014/v2016/v2019
  • Environment: RHEL V5/6/7, LDAP, Autosys V11, Oracle DB, ServiceNow, Jira, Shell Scripting, Bash Scripting, CRON, WinSCP, NFS, Storage SAN, Outlook, PuTTY, Notepad++, Microsoft Skype for Business

L2 Production Support Engineer

Advansoft International Inc
Hyderabad, India
07.2016 - 03.2019
  • Provided Level 2 application and infrastructure support in a global environment
  • Managed major incidents and problem resolution, ensuring that all incidents were logged, progressed, updated, authorized, expedited, and resolved within the scope of the service
  • Chaired conference calls with stakeholders
  • Automated manual month-end processes and created jobs using Autosys JIL scripting
  • Performed additions, changes, and deletions to the scheduled batch workload as authorized by the application developers
  • Monitored batch workflow and performed root cause analysis
  • Performed application and infrastructure monitoring using tools such as New Relic
  • Installed New Relic agents and AutoSys agents on various RHEL Linux and Windows servers
  • Deployed JIL files to Autosys, corrected minor issues in the JIL, and redeployed
  • Put jobs on ice and on hold during releases, and took them off ice and off hold after releases
  • Monitored system resources, logs, and disk usage, and scheduled backups and restores
  • Troubleshot multiple application and Linux issues and documented the issues that recurred
  • Added, modified, and deleted user/group accounts and set up users’ work environments using Shell/Bash scripts
  • Monitored and re-submitted failed batch jobs on Agent and analyzed logs to find and document the root cause
  • Modified and optimized backup schedules and wrote Unix ksh scripts for them
  • Improved backup automation and productivity through scripting enhancements
  • Monitored application and system performance and databases, and acknowledged escalations within the defined SLA using monitoring tools such as New Relic
  • Scheduled automatic and repetitive jobs using commands and shell scripts in crontab
  • Wrote Shell/Bash scripts to automate day-to-day administration activities wherever possible
  • Monitored and reported on daily NetBackup activity using Unix shell scripts to proactively avoid issues
  • Worked closely with the war room to prioritize normalization issues
  • Environment: RHEL V7.5.x, LDAP, Autosys r11.3, Oracle V12c, DB2 V 9.x, ServiceNow, Jira, Shell Scripting, Bash Scripting, CRON, WinSCP, NFS, SolarWinds, New Relic V4.11/6.4.x, PuTTY, HTTP, HTTPS, FTP, Microsoft Outlook, Microsoft Teams

Application Support Engineer

Value Labs
Hyderabad, India
06.2014 - 06.2016
  • Analyzed logs, helped the application team identify the root cause of issues, and worked proactively to avoid recurrence
  • Scheduled Autosys jobs using JIL for various applications
  • Created automation scripts in Shell/Bash to automate recurring activities such as disk space cleanup, log rotation, and email alerts for high CPU utilization
  • Identified security vulnerabilities and applied patches on various Linux servers
  • Identified AutoSys jobs to put on ice or on hold, take off hold, or (occasionally) force start during maintenance and release activities
  • Set up NFS file systems for NetApp volumes exported to a large number of Linux systems
  • Troubleshot issues related to disk space, CPU utilization, mounts, ownership, permissions, and networking (such as DNS not resolving)
  • Documented complex issues and transferred knowledge to the internal team during weekly team meetings
  • Stopped, started, and restarted batch jobs during scheduled maintenance and release activities
  • Monitored servers and tuned application performance using stat commands (vmstat, nfsstat, iostat, etc.)
  • Tracked ownership for incident, problem, and change management
  • Deployed a large number of VMs with Linux desktop images running RHEL 5/6/7/7.5.x
  • Provided production support for both Linux desktops and servers running RHEL 5/6/7
  • Created clones and templates of virtual machines and deployed Oracle VMs through templates
  • Set up Linux desktop VMs for application teams
  • Environment: RHEL V5/6/7, LDAP, Autosys V11, Oracle DB, ServiceNow, Jira, Shell Scripting, Bash Scripting, CRON, WinSCP, NFS, Storage SAN, Outlook, PuTTY, Notepad++, Microsoft Skype for Business

Education

Master of Science - Embedded Systems And VLSI Design

Scient College of Engineering And Technology
Hyderabad, India
04-2014

Skills

  • RHEL (5x/6x/7x)
  • HP UNIX
  • Ubuntu
  • Windows
  • AWS (EC2, S3, Lambda, RDS, VPC, IAM)
  • Azure
  • GCP
  • Jenkins
  • GIT
  • Artifactory
  • Nexus
  • Shell
  • Bash
  • JIL
  • Autosys v110/v113/v120
  • Cron
  • Oracle 12c/19c
  • MSSQL (2012/2014/2016)
  • DB2
  • Elasticsearch
  • Logstash
  • Kibana (ELK Stack)
  • Splunk
  • Datadog
  • Grafana
  • Dynatrace
  • AppDynamics
  • New Relic
  • AWS CloudWatch
  • ServiceNow
  • Jira
  • Remedy
  • HTTP
  • HTTPS
  • TCP/IP
  • FTP
  • SFTP
  • NFS
  • DNS
  • VPN
  • MS Office
  • Office 365 suite
  • Microsoft Teams
  • Notepad
  • WinRAR
  • WinZip
  • CISCO
  • Pulse Secure VPN
  • Skype for Business
  • SharePoint
  • Confluence
  • Microsoft Lync
  • NX Client
  • PuTTY
  • WinMerge
  • WinSCP
  • Excellent communication
  • Customer relations
  • Time management
