
ARHAM ZIA

Cary, NC

Summary

  • Bilingual Senior Cloud Platform Engineer with over 7 years of experience automating software development, deployment, and infrastructure management, enabling companies to improve efficiency, reduce costs, and enhance product quality. With in-depth knowledge of Kubernetes, I have designed and architected infrastructure using Terraform, implemented CI/CD pipelines with Harness and Jenkins, implemented security measures with Wiz, and established observability solutions using New Relic and CloudWatch, integrating PagerDuty for alert management and GitHub Actions for streamlined workflows. Drawing on experience implementing DevOps processes, I have transformed continuous integration and delivery initiatives, resulting in faster and more reliable software releases. I have a deep understanding of cloud technologies and have successfully built and managed highly scalable cloud-based architectures. Additionally, my skills in containerization and orchestration technologies have allowed me to break down monolithic applications into microservices, improving developer workflow, scalability, and performance. Overall, I am a results-driven and collaborative team player, committed to delivering innovative solutions that drive business success.

Overview

8 years of professional experience
1 certification

Work History

Senior Cloud Platform Engineer / SRE

Cloud Software Group
Raleigh, NC
07.2023 - Current
  • Apply expertise in Kubernetes and Helm charts to manage and orchestrate containerized applications, including setting up clusters, deploying to production environments, and ensuring scalability, reliability, and high availability.
  • Craft infrastructure-as-code solutions using Terraform and develop automation scripts with Bash, enhancing operational efficiency.
  • Architect and implement robust CI/CD pipelines, streamlining application build, test, and deployment processes in cloud ecosystems.
  • Engineer resilient, self-healing infrastructure and networking solutions for PaaS and IaaS on AWS, prioritizing security and reliability.
  • Deployed ArgoCD and managed the entire lifecycle of deploying a microservice on Kubernetes clusters from scratch.
  • Spearhead GitOps-based deployment strategies, with particular proficiency in ArgoCD implementation.
  • Spearheaded the migration of critical collaboration and version control tools (Jira, Bitbucket, Confluence, GitHub) during a complex corporate acquisition, ensuring seamless transition and minimal disruption.
  • Led a strategic cost-optimization initiative by implementing comprehensive AWS resource tagging and orchestrating the migration of legacy applications to Kubernetes, significantly reducing technical debt and improving operational efficiency.
  • Architected a transition from AWS Secrets Manager to HashiCorp Vault for microservices, enhancing security posture and playing a pivotal role in negotiating favorable licensing terms for the organization.
  • Developed and implemented automated scripts to streamline local development environments, dramatically improving developer productivity and reducing setup time for new team members.
  • Prioritize and execute performance optimization initiatives, ensuring the platform consistently meets or exceeds high-performance benchmarks.
  • Developed and implemented a comprehensive monitoring and observability framework for applications and microservices on Kubernetes, integrating New Relic and Splunk with PagerDuty and Slack for proactive alerting, defining SLOs and SLAs, and resolving performance and scalability issues to enhance site reliability and customer satisfaction.
  • Architected and implemented high-availability load balancing solutions using HAProxy, optimizing application performance and reliability across distributed systems while ensuring seamless traffic distribution and failover capabilities.
  • Architected and implemented comprehensive AWS networking infrastructure for new accounts, including VPC design, subnet configuration, route tables, security groups, and network ACLs, establishing a secure and scalable foundation for deploying resources and applications.
  • Served as the primary point of contact for MongoDB infrastructure, designing and automating pipelines to create high-availability clusters for microservices across multiple regions, ensuring robust data management and operational efficiency.
  • Migrated logging from Splunk to OpenSearch, deploying Fluent Bit on application servers and Kubernetes, redesigning alerting mechanisms, and ensuring smooth adoption by development teams through continuous monitoring and support.
  • Implemented Okta integration for new applications, managing user access and permissions to ensure secure and streamlined authentication for designated teams.
  • Enforce cloud security best practices, including IAM policies, security groups, and VPC configurations, ensuring compliance and data protection.

Lead DevOps Engineer

Prometheus
Raleigh, NC
03.2022 - 06.2023
  • Automated infrastructure deployment (IaC) with Terraform and managed cloud-based systems at scale, resulting in a 40% reduction in manual configuration errors and improved system reliability.
  • Designed and implemented a CI/CD pipeline for a Kubernetes-based microservices architecture, reducing deployment time by 50% and increasing overall system reliability by 30%.
  • Streamlined a Kubernetes-based autoscaling solution, resulting in a 25% reduction in infrastructure costs and a 15% increase in application performance during peak traffic periods.
  • Implemented automated backup and recovery procedures, reducing data loss by 80% and improving system availability by 25%.
  • Automated the configuration, installation, and deployment of many systems within Cloud Services, including the monitoring system, using Ansible and Python.
  • Standardized scripts and tools to automate operational tasks, reducing manual effort and improving team productivity by 40%.
  • Automated the deployment and configuration of cloud-based applications using ArgoCD, reducing deployment time and minimizing human error.
  • Collaborated with development teams to ensure applications were designed for scalability and reliability, resulting in a 30% increase in customer satisfaction and a 25% reduction in support tickets.
  • Designed and implemented an automated solution for provisioning, configuring, and managing AWS services, reducing manual effort by 80% and improving system reliability by 30%.
  • Led the migration of on-premise infrastructure to a cloud-based environment, resulting in a 60% reduction in hardware costs and improved system scalability.
  • Led and participated in DevOps-related projects, resulting in a 25% improvement in project success rate and a 10% increase in team morale.
  • Developed automated deployment processes and scripts to ensure a smooth transition from development to production.
  • Configured, managed, and monitored cloud-based services such as AWS EC2, S3, EBS, ELB, RDS using Terraform and Ansible.
  • Analyzed existing applications for performance bottlenecks and implemented solutions to improve scalability.

Linux System Engineer

AT&T
Middletown, NJ
01.2019 - 03.2022
  • Deploying, configuring, and maintaining IT infrastructure on Linux-based systems such as RHEL/CentOS 6, 7, and 8.
  • Managing and automating ESXi host configuration across many hosts and clusters.
  • Automating the patching of Linux servers by deploying Ansible playbooks to harden against security vulnerabilities.
  • Building and deploying Docker containers to break up a monolithic app into microservices.
  • Building servers using AWS CloudFormation templates, e.g. launching EC2 instances, assigning roles and policies using IAM, and implementing Auto Scaling, ELBs, and security groups in the defined VPC.
  • Provisioning, configuring, monitoring, troubleshooting, and managing various storage services such as AWS S3, Glacier, EBS, and EFS.
  • Working with Git to push, pull, and commit source files such as YAML playbooks to and from the master Git server.
  • Implementing and deploying optimal RAID clustering configurations on production systems.
  • Managing user authentication and access controls with LDAP and Active Directory application profiles.
  • Remediating network stability issues by capturing traffic with tcpdump and analyzing it in Wireshark.
  • Configuring NFS servers, mounting exported NFS resources, and fixing any mounting issues.
  • Conducting root cause analysis to remediate issues and implement future-proofing measures.

Linux Support Technician

Boeing
Chicago, IL
01.2017 - 11.2018
  • Implemented and administered RHEL virtual and physical server infrastructure.
  • Collaborated with teams to upgrade equipment and design new techniques.
  • Installed and configured Linux OS and hypervisors for virtual machines, created golden images, and customized configurations.
  • Configured an autofs server and NFS shares, and built software packages on Red Hat Linux using YUM and RPM.
  • Troubleshot server performance issues and resolved memory problems.
  • Developed an automation framework using crontab and anacron.
  • Configured NIC bonding for redundancy and improved performance.
  • Transferred data using SCP and maintained technical documentation.

Education

Trained in RHCSA

Corvit Institute of Technology
01.2016

BS - Computer Science

Superior College
01.2015

Skills

Hard Skills

  • CI/CD, Jenkins, and Harness
  • AWS services (EC2, S3, EKS, RDS, Lambda, etc.)
  • Oracle Cloud Infrastructure (VCN, OKE, Object Storage, and Vault)
  • Docker and Kubernetes
  • Ansible
  • Agile/Scrum methodology
  • Continuous Integration (Kaniko Builds, Trivy scan, and Maven)
  • Continuous Deployment (ArgoCD)
  • GitLab and GitHub
  • Disaster Recovery Planning
  • Application security
  • Infrastructure as Code (IaC), Terraform
  • HashiCorp Cloud and Vault
  • Cloud: AWS (EC2, EKS, IAM, S3, NACL, DynamoDB, VPC, EBS, Route 53, AMI, Glacier, CloudWatch)
  • Performance optimization and HA (Redis and RabbitMQ)
  • System reliability and scalability optimization
  • Security policy development and implementation
  • Linux Server (RHEL 6, 7, and 8), (CentOS 7 and 8), (Debian Bullseye), (Ubuntu)
  • Virtualization: VMware, vSphere, vCenter, vMotion, ESXi 5.5 & 6, KVM, VDI
  • Apache, Tomcat, and httpd
  • NetBackup, SCP, Rsync, and Bacula
  • Storage management & File Sharing NAS, SAN, DAS, LVM, RAID, NFS, SAMBA
  • Nagios, Dynatrace, Splunk, New Relic, OpenSearch, and PagerDuty
  • Firewalld, SELinux, SSH, Nessus, iptables
  • Active Directory, OpenLDAP, ACL, RBAC, Okta, and Keycloak
  • Databases: Postgres, MySQL, RDS, DynamoDB, and MongoDB
  • Server management, Incident & Disaster Management
  • Decommissioning, patching, and provisioning
  • Booting Process: GRUB, PXE Boot & Kickstart, Kernel Tuning & Parameters
  • Networking Protocols: TCP/IP, NIS, DHCP, DNS, NFS, LDAP, SSH, SMTP, SNMP, SSL, HTTP, NTP, VPN, NIC Bonding

Soft Skills

  • Collaboration and Teamwork
  • Communication and Interpersonal Skills
  • Problem Solving and Troubleshooting
  • Adaptability and Flexibility
  • Time Management and Prioritization
  • Continuous Learning and Improvement
  • Attention to detail and accuracy
  • Leadership and Mentoring
  • Decision Making and Strategic Planning
  • Conflict Resolution and Negotiation
  • Creativity and Innovation
  • Emotional Intelligence and Relationship Building

Certification

  • Certified Kubernetes Administrator (CKA), CNCF - Apr 2023
  • Credly: https://www.credly.com/badges/e1d154d2-5825-4740-b621-20618094f010/public_url
