Cloud Site Reliability Engineer with 7+ years of IT experience, currently working at Bank of America. Improved the reliability of production systems on cloud platforms. Proficient in troubleshooting and debugging technical systems and in resolving complex issues quickly. Skilled in Kubernetes and Azure cloud platforms. Strong problem-solving, leadership, and communication skills.
Overview
13
years of professional experience
1
Certification
Work History
Site Reliability Engineer
Bank of America
Dallas, TX
09.2024 - Current
Provided technical guidance on the design, implementation, and maintenance of cloud infrastructure.
Developed and implemented monitoring solutions to improve system reliability.
Monitored system performance using metrics such as latency, throughput, and availability.
Ensured high availability and scalability of applications across multiple environments.
Researched and evaluated new technologies to enhance platform reliability and stability.
Troubleshot complex issues related to application architecture and system configurations.
Created automated scripts for software deployments and configuration management tasks.
Technical Lead
Infinite Solution
Austin, Texas
07.2024 - 09.2024
Company Overview: Worked as Technical Lead at Infinite Solution for the client Brightspeed
Managed the offshore and on-site teams
Provided level 3 production support for the application hosted on GCP and Kubernetes
Set up dashboards in Dynatrace and Splunk for observability and monitoring
Managed and supported software releases through continuous integration and continuous deployment (CI/CD) pipelines, ensuring smooth and efficient delivery of updates
Used Linux commands for effective troubleshooting
Managed IT incidents, resolved root causes, and oversaw change management processes to ensure efficient operations
Developed and maintained Linux shell scripts for monitoring and automating routine tasks, improving overall operations performance
Used GitHub for version control and code management, and JIRA for artifact tracking
Worked closely with product owners and business analysts to set up monitoring and support new deliverables
Lead Application Support Engineer
Bank of America
Dallas, Texas
12.2022 - 05.2024
Company Overview: Provided level 3 application support for a critical natural language processing application
Provided level 3 application support for a critical natural language processing application, including troubleshooting and resolving issues related to functionality, performance, and integration
Conducted root cause analysis for recurring issues with multiple technology teams, such as Cloud Support, Container Support, and HBase, to prevent recurrence
Analyzed risks and remediated them within the deadlines provided
Fixed container vulnerabilities and mitigated the associated risks
Managed Kubernetes clusters, ensuring scalability, high availability, fault tolerance, and load balancing for containerized applications
Designed and maintained Splunk dashboards, reports, and alerts to provide actionable insights into system performance and security
Collaborated with cross-functional teams to integrate Splunk with various data sources and enterprise systems
Implemented Splunk best practices for data ingestion, indexing, and searching to optimize system performance
Automated routine tasks using Splunk's advanced features and scripting capabilities to improve operational efficiency
Wrote SQL queries for production issue analysis and reporting
Skilled in leveraging Splunk for comprehensive data analysis, developing insightful dashboards, and conducting real-time system monitoring
Experienced in troubleshooting complex issues and optimizing performance
Managed and supported software releases through continuous integration and continuous deployment (CI/CD) pipelines, ensuring smooth and efficient delivery of updates
Used Linux commands for troubleshooting
Managed IT incidents, resolved underlying problems, and oversaw change management processes to ensure smooth and efficient operations
Worked closely with product owners and business analysts to set up monitoring and support new deliverables
Coordinated and implemented a disaster recovery site in the cloud to improve reliability and resiliency for a business-critical application
Developed and maintained Linux shell scripts for monitoring and automating routine tasks, improving overall operations performance
Used GitHub for version control and code management, and JIRA for artifact tracking
Used Splunk for log monitoring and Dynatrace for system monitoring
Understood the end-to-end application flow and project objectives, which helped improve the operational excellence and performance of the application
Lead Software Engineer
Bank of America
Chennai, INDIA
01.2015 - 01.2018
Company Overview: Supported the trading environment for the GSL group and ensured that all trades flowed as required
Primarily involved in application support and administration
Skilled in the execution of Linux commands to troubleshoot and resolve issues, with a proven track record of proficiency in navigating Linux environments
Experienced in leveraging Splunk for robust data analysis, developing and managing insightful dashboards, and conducting real-time system monitoring
Proficient in troubleshooting complex issues, optimizing performance, and implementing security best practices to ensure efficient and secure data handling within Splunk environments
Supported front-end developers with technical issues as they arose
Performed daily SOD and EOD checks before the market opened and after it closed
Monitored ITRS for alerts and troubleshot issues accordingly
Monitored Informatica workflow jobs and AutoSys batch jobs
Performed releases in coordination with the development team, deployed applications from the command line, and performed application testing
Proven track record in efficiently managing IT incidents, solving recurring issues, and overseeing smooth change implementations
Used SQL Server and Sybase databases to query trade reports and validate trade status
Produced and maintained the knowledge base and documentation for the wiki implementation
Systems Executive
Cognizant Technology Solutions
Chennai, INDIA
01.2012 - 12.2014
Company Overview: Installed, configured, and administered Oracle WebLogic 11g and web servers
Installed, configured, and administered Oracle WebLogic 11g and web servers such as Oracle HTTP Server and Apache in UAT and production environments
Proficient in executing Linux commands for effective troubleshooting purposes, leveraging extensive experience in navigating and utilizing Linux-based systems to diagnose and resolve technical issues
Deployed WAR and JAR files using Jenkins and handled enterprise-level code releases across all environments
Maintained load balancing, high availability and failover of servers
Tuned JVM settings and garbage collection algorithms
Provided operational support for critical issues and outages
Notified affected teams and higher management
Diagnosed issues and provided or suggested patches and interim solutions
Facilitated performance load and stress testing on various applications
Established SSL handshakes between systems using certificates to support HTTPS
Handled disaster recovery drills on a half-yearly basis, recovering all CCC applications in the DR bubble