Senior Site Reliability Engineer with over 15 years of IT experience and 10 years of Big Data Platform Engineering experience in architecting & provisioning scalable Kubernetes, EKS, EMR and Hadoop clusters.
Currently administer and manage 25+ on-premises Kubernetes clusters comprising 1000+ worker nodes, alongside 100+ EKS clusters scalable to 3000+ nodes, processing and streaming 10K+ jobs per day. Also administered 10+ Hadoop clusters comprising 700+ nodes and 18+ petabytes of data. I have over 8 years of experience in AWS Cloud Platform Engineering, provisioning end-to-end cloud platform solutions that include IaaS, data lakes, data replication, and data streaming.
• Created full-stack Hadoop, EMR, Qubole, and Kubernetes clusters.
• Implemented Apache Ranger on cloud data lakes to mask PII and PCI data.
• Upgraded Hadoop Clusters from Cloudera CDP to Hortonworks HDP.
• Implemented R, RHive, and RStudio on Hadoop clusters.
• Implemented high-availability servers for NameNode, ResourceManager, and Hive.
Open Source Project Contributions:
Jetstream - Data replication tool to move data from on-premises to the cloud.
Data Highway - Data streaming platform to stream data from on-premises to the cloud.
Apiary - Tool to provision data lakes in the cloud.
• Designed and provisioned HDP 2.x Hadoop clusters via Ambari automation.
• Administered 8 Hadoop clusters with 500+ nodes and 2.5+ petabytes of data.
• Upgraded the Hadoop distribution from Cloudera (CDP) to Hortonworks (HDP).
• Implemented a high-availability setup for NameNode, ResourceManager, and Hive.
• Encrypted PII/PCI data using IBM Guardium.
• Automated replication of critical data to the disaster recovery cluster.
• Led the offshore team.
• Managed 4000+ production POS servers for Home Depot stores.
• Built new production servers by installing the relevant packages and software.
Kubernetes Administration