Rahul Jain

San Jose, USA

Summary

Dynamic Sr. Data Platform Engineer with a proven track record at Jenius Bank, Apple, and Palo Alto Networks, excelling in Terraform automation and cloud infrastructure management. Expert in building resilient GCP and AWS services and fostering collaboration across teams. Passionate about driving innovation and improving performance in AI/ML workloads while ensuring adherence to security protocols.

Overview

14
years of professional experience
1
Certification

Work History

Sr. Data Platform Engineer

Jenius Bank
Remote
04.2022 - Current
  • Built infrastructure to support AI/ML workloads, consisting of GPU (H100/A100) and TPU-attached machines on GKE and GCE instances.
  • Developed reusable Terraform modules to provision private GCP services including CloudSQL, BigQuery, Vertex AI Studio, GKE, and API gateway with Kong and APIM on Azure.
  • Hosted LLM-serving infrastructure on an autoscaling GKE cluster, with Kong API gateway handling authentication and authorization.
  • Set up API endpoints to support a highly resilient, highly available multi-regional service.
  • Implemented MLOps practices to streamline AI and ML workload management using Azure DevOps Pipelines.
  • Set up automation scripts using the Google APIs for Python, run via Azure DevOps Pipelines.
  • Built and maintained open-source Confluent Kafka on GCP for real-time ingestion pipelines, enabling encryption and authentication, and automating topic, producer, and consumer management.
  • Collaborated with teams to containerize microservice applications with Docker and deploy them in production, implementing Blue/Green and canary deployment strategies.
  • Coordinated with Infosec teams to maintain security protocol compliance.

Sr. Cloud DevOps Engineer

Apple
Cupertino, USA
06.2021 - 04.2022
  • Built reusable Terraform modules to provision private GKE and EKS clusters and other GCP/AWS services such as CloudSQL, IAM, Docker, Filestore, Artifact Registry, Secret Manager, KMS using CMEK, networking, IAM factory VPC, etc.
  • Developed a centralized system for provisioning Kubernetes clusters across multiple cloud providers and project/account setups using IaC tools such as Terraform Cloud, Ansible, Jenkins, and GitHub.
  • Worked with teams to containerize microservice applications and deploy them to production using Blue/Green and canary deployment strategies.
  • Implemented Kubernetes best practices utilizing the GCP Cloud Foundation Toolkit to enhance deployment reliability.
  • Implemented OPA policies for gatekeeping and Calico network policies to restrict Kubernetes network traffic.
  • Created automation scripts with Python Google APIs, scheduling execution through CI/CD Jenkins pipelines for streamlined deployment processes.
  • Set up Secret Manager to store service account keys, with retention and rotation policies.
  • Worked with Infosec teams to adhere to security guidelines.

Sr. DevOps Engineer - Data Operations Team

Palo Alto Networks
Santa Clara, USA
08.2017 - 05.2021
  • Architected and built central IT Datalake in GCP utilizing SaaS services and Kafka for seamless integration with diverse data sources.
  • Built CI/CD automation pipelines using Jenkins, GitHub, Terraform, and Ansible to promote infrastructure as code, including GKE, Docker, GCE, Dataproc, Dataflow, CloudSQL, Cloud Composer, and BigQuery.
  • Built scalable Airflow infrastructure for workflow scheduling.
  • Maintained and scaled production Hadoop, Kafka, and Spark clusters, improving scalability, service reliability, capacity, and performance by adding or removing nodes. Upgraded CDH versions and installed the CDP platform on premises and in the cloud. Collaborated with development teams to install updates and patches and to manage version stability across Hadoop and Spark offerings, including Cloudera components such as Kudu.
  • Worked with development and QA teams to design ingestion pipelines and integration APIs and to provide Hadoop ecosystem services. Participated in an occasional on-call rotation supporting the infrastructure, troubleshooting incidents and narrowing down possibilities to find the root cause.
  • Managed encryption at rest for data clusters, overseeing security, authorization, and access control while automating management of large Big Data clusters.
  • Configured security for the Kafka cluster to enhance data protection, covering encryption, authentication (SSL and Kerberos SASL), and authorization (ACLs).
  • Developed certificate generation and renewal system, ensuring zero downtime for real-time pipelines.
  • Built a monitoring and alerting platform using open-source tools such as Prometheus and Grafana.

Big Data Engineer

Infosys Ltd
Pleasanton, USA
09.2015 - 07.2017
  • Wrote Spark-based Python code to extract pricing and sales data from historic Gap subsystems.
  • Developed Spark jobs to extract real-time data from GAP online stores via Kafka queues.
  • Tuned performance of Spark and Hive jobs coded in Python.
  • Wrote Python scripts for monitoring and maintaining the clusters.
  • Configured and integrated new DataNodes into production and non-production Hadoop clusters to enhance data processing capabilities.
  • Installed HDFS clusters, including node setup and basic configuration of HDFS environments.
  • Maintained big data infrastructure environments running Hadoop 1.x and Hadoop 2.x on the Hortonworks distribution.

L3 Lead for DELL Clerity Legacy Mainframe system: DevOps/Operations

Infosys Ltd
Pleasanton, USA
01.2012 - 08.2015
  • Managed critical mainframe applications for GAP, serving as the primary contact for issues affecting 60% of the total applications on the GAP mainframe.
  • Facilitated migration of mainframe code to Clerity, mentoring stakeholders on the new environment to develop their skills and increase awareness of recent advancements.
  • Collaborated with clients and vendors (Dell, CA, Syncsort) to resolve significant technical challenges and product bugs.
  • Served as the primary contact for technical difficulties faced by various application teams.

Education

Bachelor of Engineering - Electronics and Communication

University of Rajasthan
Jaipur

Skills

Automation tools

  • Terraform
  • Ansible
  • Jenkins
  • Harness
  • JFrog Artifactory
  • GitHub
  • Nginx
  • Azure DevOps

Databases

  • BigQuery
  • Spanner
  • MySQL
  • NoSQL
  • DB2
  • Hive
  • Kudu
  • Impala
  • Scylla
  • Snowflake
  • Firestore
  • CDH/CDP Cloudera Hadoop
  • Spark
  • Sqoop
  • MapReduce
  • Confluent Kafka
  • HDFS
  • Altan

Compute tools

  • Dataproc
  • Dataflow
  • GKE

IDEs/Applications

  • JIRA
  • ServiceNow

Monitoring

  • Prometheus
  • Grafana
  • Airflow
  • Datadog

Data Science/AI ML

  • Dataiku
  • GCP Vertex AI
  • Workbench
  • Cloud services
  • GCP AI platform

AWS

  • EC2, EKS, RDS
  • AWS CLI
  • CloudFormation
  • ASG
  • Azure
  • VNET
  • MLOps

Cloud architecture

Infrastructure management

Performance tuning

Cloud infrastructure management

Linux system administration

DevOps methodologies

  • MLOps implementation
  • Network security
  • API design and development
  • Infrastructure as Code

Certification

  • CNCF - Certified Kubernetes Administrator - CKA
  • CNCF - Certified Kubernetes Security Specialist - CKS
  • HashiCorp Certified - Terraform Associate
  • Google Cloud Certified - Professional Machine Learning Engineer
  • Google Cloud Certified - Professional Cloud DevOps Engineer
  • Google Cloud Certified - Professional Cloud Architect
  • Google Cloud Certified - Professional Cloud Security Engineer
  • Google Cloud Certified - Associate Cloud Engineer
  • Big Data - Hortonworks HDP Certified Administrator - HDPCA
  • Amazon Web Services - Solutions Architect - Associate
  • Big Data - CCA175 - Cloudera Spark and Hadoop Developer Certification
  • SnowPro Core Certification

Technical Profile

Terraform, Ansible, Jenkins, JFrog Artifactory, GitHub, Nginx, Azure DevOps, BigQuery, MySQL, NoSQL, DB2, DB2 LUW, Hive, Kudu, Impala, Scylla, Snowflake, Firestore, CDH/CDP Cloudera Hadoop, Spark, Sqoop, MapReduce, Confluent Kafka, HDFS, Sentry, ZooKeeper, ELK, EMR, Ambari, Cloudera Manager, Kibana, Dataproc, Dataflow, GKE, JIRA, ServiceNow, Prometheus, Grafana, Airflow, Kubernetes, Dataiku, GCP Vertex AI, Workbench, Cloud SQL, Docker, Composer, App Engine, VPC, VPC Service Controls, Firebase, AI Platform, EC2, EKS, RDS, AWS CLI, CloudFormation, ASG, VNET

Timeline

Sr. Data Platform Engineer

Jenius Bank
04.2022 - Current

Sr. Cloud DevOps Engineer

Apple
06.2021 - 04.2022

Sr. DevOps Engineer - Data Operations Team

Palo Alto Networks
08.2017 - 05.2021

Big Data Engineer

Infosys Ltd
09.2015 - 07.2017

L3 Lead for DELL Clerity Legacy Mainframe system: DevOps/Operations

Infosys Ltd
01.2012 - 08.2015

Bachelor of Engineering - Electronics and Communication

University of Rajasthan