Baskar Jayakumar

Senior Site Reliability Engineer
Lynnwood, WA

Summary

Senior Site Reliability Engineer with over 15 years of IT experience and 10 years of Big Data Platform Engineering experience in architecting & provisioning scalable Kubernetes, EKS, EMR and Hadoop clusters.

Currently administer and manage 25+ on-premise Kubernetes clusters comprising 1000+ worker nodes, alongside 100+ EKS clusters scalable to 3000+ nodes, processing & streaming 10K+ jobs per day. Also administered 10+ Hadoop clusters comprising 700+ nodes & 18+ petabytes of data. Over 8 years of experience in AWS Cloud Platform Engineering, provisioning end-to-end cloud platform solutions including IaaS, Data Lakes, Data Replication and Data Streaming.

Overview

15 years of professional experience
1 Certification
4 years of post-secondary education

Work History

Senior Site Reliability Engineer

Salesforce
04.2020 - Current
  • Manage 25+ on-premise & 100+ EKS Kubernetes clusters comprising 3000+ nodes and handling 10K jobs & petabytes of data per day.
  • Manage all Spark infrastructure that runs on top of Kubernetes.
  • Build & maintain deployable Spark versions from the open-source community for internal customers.
  • Patch open-source Spark with security vulnerability fixes.
  • Implemented HPA (Horizontal Pod Autoscaler) to resolve persistent issues related to resource overhead.
  • Implemented automated deployment of Kubernetes to on-premise clusters using AWS CodeDeploy.
  • Implemented cost-saving measures by adopting hardware & software optimizations in EKS & AWS.
  • Conducted root-cause analysis after major incidents to identify areas for process improvement.
  • Collaborate with stakeholders to optimize infrastructure performance.

Big Data Systems Engineer

Expedia Group
09.2015 - 04.2020

• Created full-stack Hadoop clusters, EMR clusters, Qubole clusters & Kubernetes clusters.
• Implemented Ranger on cloud Data Lakes to mask PII & PCI data.
• Migrated Hadoop clusters from Cloudera CDH to Hortonworks HDP.
• Implemented R, RHive & RStudio in Hadoop clusters.
• Implemented High Availability for NameNode, ResourceManager & Hive.


Open Source Project Contributions:

• Jetstream - Data replication tool to move data from on-premise to Cloud.
• Data Highway - Data streaming platform to stream data from on-premise to Cloud.
• Apiary - Tool to provision Data Lakes in Cloud.

Big Data Administrator/Onsite Lead

The Home Depot
03.2013 - 08.2015

• Designed & provisioned HDP 2.X Hadoop clusters via Ambari automation.
• Administered 8 Hadoop clusters with 500+ nodes & 2.5+ petabytes of data.
• Migrated Hadoop distribution from Cloudera (CDH) to Hortonworks (HDP).
• Implemented High Availability setup for NameNode, ResourceManager & Hive.
• Encrypted PII/PCI data using the IBM Guardium tool.
• Automated replication of critical data to the Disaster Recovery cluster.
• Led the offshore team.

Linux Administrator

Tata Consultancy Services
12.2008 - 02.2013

• Managed 4000+ production POS servers for Home Depot stores.
• Built new production servers by installing the relevant packages & software.

Education

Bachelor of Engineering - Electronics & Communication Engineering

Anna University
Chennai, India
08.2004 - 05.2008

Skills

Kubernetes Administration

Certification

Hortonworks Certified Hadoop Administrator 2.X, 008-000273, 05/01/2015

Publications

Autonomous Security Robotic System, Indian Institute of Technology, 09/01/2007

Timeline

Senior Site Reliability Engineer

Salesforce
04.2020 - Current

Big Data Administrator/Onsite Lead

The Home Depot
03.2013 - 08.2015

Linux Administrator

Tata Consultancy Services
12.2008 - 02.2013

Bachelor of Engineering - Electronics & Communication Engineering

Anna University
08.2004 - 05.2008

Big Data Systems Engineer

Expedia Group
09.2015 - 04.2020