
Senior Infrastructure & AI Infrastructure Engineer with 15+ years of experience designing, automating, and operating large-scale Linux, Kubernetes, cloud, and enterprise infrastructure across mission-critical production environments. Experienced in building highly available platforms supporting SaaS, low-latency, and enterprise workloads with a strong focus on reliability, automation, observability, and operational excellence.
At Workday and MEMX, I have designed and operated production Linux and Kubernetes platforms, implemented Infrastructure-as-Code using Terraform, automated infrastructure with Python, Bash, Chef, and Ansible, and managed enterprise storage, virtualization, monitoring, and cloud infrastructure across AWS and GCP. My experience includes Kubernetes operations, Helm, ArgoCD, CI/CD automation, observability using Prometheus, Grafana, ELK, Splunk, and InfluxDB, bare-metal provisioning, PXE/Kickstart automation, incident response, performance tuning, and root-cause analysis.
To expand my expertise into AI infrastructure, I recently earned the NVIDIA Certified Associate – AI Infrastructure and Operations (NCA-AIIO) certification and have built practical skills through hands-on labs and self-directed projects involving NVIDIA GPU infrastructure, GPU-enabled Linux servers, CUDA fundamentals, NVIDIA AI Enterprise concepts, Kubernetes GPU Operator, GPU scheduling, AI cluster operations, model serving, inference infrastructure, and Multi-Instance GPU (MIG). This complements my extensive Linux and Kubernetes platform engineering background and enables me to support modern GPU-accelerated AI and HPC environments.
I enjoy building reliable, automated, scalable infrastructure and continuously expanding my expertise in AI infrastructure, GPU computing, cloud-native platforms, and production operations.
Additional Professional Development (Personal Learning & Hands-on Labs)
• Provisioned and administered Red Hat Linux, Solaris, and Microsoft Windows server environments across enterprise data centers.
• Managed VMware, Solaris Zones, LDOMs, KVM, Veritas Cluster Server, Solaris Clustering, VMware HA, LVM, ZFS, UFS, VxFS, and storage multipathing.
• Performed rack-and-stack operations, hardware installs, cable management, switch/router/firewall installation support, failed disk/NIC/PSU replacement, and asset tracking.
• Managed storage technologies including EM