Mayur Narang

San Jose, USA

Summary

Accomplished Lead DevOps Engineer at MedCrypt, specializing in CI/CD pipelines and cloud infrastructure. Achieved a 50% reduction in deploy time through automation and optimized Kubernetes management. Proven ability to enhance security compliance while fostering team collaboration, driving significant improvements in operational efficiency and incident management.

Overview

12 years of professional experience
1 Certification

Work History

Lead DevOps Engineer

MedCrypt
08.2023 - Current
  • Company Overview: MedCrypt provides cybersecurity solutions for medical device and healthcare technology companies.
  • CI/CD & Developer Platform Enablement: Built and maintained CI/CD pipelines using GitHub Actions, Jenkins, Bitbucket Pipelines, and GitLab CI to support Node.js, Python, and Go-based workloads. Standardized Helm and Kustomize-based deployments with ArgoCD GitOps integration and secure promotion workflows. Integrated SBOM checks (Trivy, Syft), rollback hooks, image signing, and approval gates. Reduced deploy time by 50% and enabled self-service deploys across QA, staging, and production environments.
  • Cloud-Native Infrastructure on AWS and Azure: Designed and operated infrastructure on AWS (EKS, EC2, ECS, Lambda, RDS, S3, IAM, Route 53) and Azure (AKS, Azure Functions, VNets, Blob Storage, Azure DevOps). Used modular Terraform with workspaces and remote state for consistent provisioning across environments. Built VPCs, subnets, NAT gateways, and private ingress paths to isolate production traffic. Managed Kubernetes add-ons like CoreDNS, Cluster Autoscaler, and ExternalDNS with secure defaults.
  • Containerization & Orchestration: Managed Kubernetes-based workloads across EKS and AKS with Helm charts and Kubernetes-native patterns. Operated services behind Ingress-NGINX and ALB/NLB with traffic shaping and zone-aware autoscaling. Containerized legacy services with Docker and Docker Compose, enabling smoother migration and consistent test environments. Integrated application-level and sidecar-level health checks for improved rollout safety.
  • Monitoring & Observability: Deployed and managed an observability stack including Prometheus, Grafana, Loki, Fluentd, Datadog, ELK, Coralogix, and OpenTelemetry. Instrumented services with tracing, custom metrics, and log correlation across environments. Built dashboards for error rates, latency, saturation, and throughput. Defined and enforced SLOs for core services and created actionable alerts routed via Alertmanager, Slack, and PagerDuty. Integrated X-Ray and CloudWatch Logs for full request traceability in production.
  • Infrastructure as Code & Environment Provisioning: Delivered Terraform modules for EKS/AKS clusters, IAM, networking, DNS, monitoring, and secrets. Integrated CloudFormation and Pulumi (where applicable) for hybrid team needs. Used Ansible for pre-bootstrapping and OS-level provisioning in EC2/VM-based systems. Maintained reproducible infrastructure with version-controlled change management, policy enforcement, and automated drift detection.
  • Security, Compliance & Secrets Management: Enforced least-privilege IAM and RBAC policies across AWS and Kubernetes clusters. Automated secret management using Vault, AWS Secrets Manager, and SOPS. Integrated SBOM scanning in pipelines, rotated secrets via automation, and enforced TLS/mTLS for all internal services. Supported CIS benchmark alignment and passed multiple SOC2 reviews with zero major findings. Built IAM audit tools and GitHub Apps for visibility into policy misconfigurations and access drift.
  • Release Strategies & Deployment Safety: Implemented blue/green deployments, canary releases, and feature-flagged rollouts using ArgoCD, CodeDeploy, and LaunchDarkly-style feature-flag tools. Linked PRs with deployment events, release health metrics, and automatic rollback logic based on error thresholds. Ensured consistent traffic management via weighted routing and pre/post-deploy checks. Reduced incident risk and recovery time during peak releases.
  • Automation, Tooling & Workflows: Automated platform workflows using Python, Bash, and Makefiles. Built CLI wrappers for Terraform, Helm, ArgoCD, and Vault access. Scheduled CronJobs and AWS Step Functions for maintenance tasks, backup validation, and config sync. Created Argo Workflows for periodic automation jobs and GitHub Apps for dynamic deployment handling and policy enforcement.

Sr. DevOps Engineer

Lending Club Bank
08.2019 - 08.2023
  • Company Overview: Lending Club is a digital marketplace bank offering transparent credit products for consumers and investors.
  • EKS & ECS Adoption for Banking Microservices: Migrated legacy EC2-based applications to containerized workloads on EKS and ECS, supporting a hybrid architecture. Provisioned EKS clusters with Terraform using secure node group configurations, IAM roles for service accounts, and Kubernetes RBAC policies. Enabled autoscaling and zonal distribution using Cluster Autoscaler and taints for compliance-sensitive services. Defined internal Helm standards and integrated admission controllers for workload validation.
  • RDS Management & DR Testing: Managed RDS PostgreSQL and MySQL clusters with cross-region failover, backup encryption, PITR, and event-driven snapshot verification. Tuned performance parameters for transactional consistency and optimized vacuum and replication lag thresholds. Created automated DR validation jobs that performed snapshot restores and simulated availability zone outages to confirm recovery workflows.
  • GitHub Actions-Based CI/CD Pipelines with Compliance Gates: Designed and maintained CI/CD workflows in GitHub Actions with gated promotion logic, rollback hooks, image validation, and changelog diffing. Integrated SBOM generation, static scans, and approval flows tailored to audit readiness. Enabled PR-based deployments with strict branch protections and tracked deployment history via GitHub commit status.
  • Coralogix Observability & System Insights: Shipped metrics, logs, and traces from EKS workloads to Coralogix via OpenTelemetry and Fluentd. Built Grafana dashboards covering latency trends, error rates, API health, and deploy-time regression detection. Tuned alerts using correlation filters and routed incidents to the correct teams using structured metadata and escalation tagging.
  • Infrastructure Automation & Bootstrap Scripts: Used Terraform with workspaces and remote state for VPC, RDS, EKS, and IAM configuration. Wrote environment bootstrap scripts in Bash and Python to wire up Route 53 DNS, CloudWatch logs, external secrets, and initial Helm releases. Maintained consistent deployment patterns using Makefiles and reusable module templates.
  • IAM, Secrets, and Configuration Management: Automated IAM policy generation for application roles with scope-based access and rotated secrets through AWS Secrets Manager and custom GitHub Actions. Enforced TLS across services, validated ACM cert expiry, and blocked misconfigured resources pre-deploy using CI-integrated policy validation.
  • Disaster Recovery & Documentation Automation: Defined and tested DR plans for all stateful services using snapshot audits, failover automation, and zone-isolation tests. Created annotated runbooks and recovery workflows tied to infrastructure tags and metrics thresholds. Reduced recovery time objectives (RTO) across core systems by implementing proactive recovery validation tooling.
  • Internal Tooling & Team Enablement: Developed CLI tools in Python to automate service onboarding, namespace provisioning, Helm release validation, and secrets injection. Standardized CI scaffold templates and integrated preflight checks into all repo bootstraps. Led DevOps onboarding and knowledge sharing sessions for new engineering hires.

Cloud Site Reliability Engineer

Box Inc
02.2017 - 08.2019
  • Company Overview: Box is a leading cloud content management platform used by enterprises to collaborate securely.
  • Early AWS Infrastructure with Terraform & Bash Tooling: Pioneered Box's initial move to Infrastructure-as-Code by building Terraform modules for EC2 instances, security groups, IAM roles, and S3 storage. Developed shell scripts to wrap Terraform for safety, enforce tagging, and validate network ACLs pre-deploy. Reduced manual provisioning times from days to under 30 minutes.
  • CI/CD with Jenkins and Release Workflow Automation: Maintained and scaled Jenkins pipelines for Java/Node.js services with stage-aware deployment logic. Implemented pre-flight checks, artifact validation, and blue/green promotion steps using shared Groovy libraries. Standardized rollback procedures across teams, improving deployment confidence during peak traffic hours.
  • Python & Shell-Based Infrastructure Automation: Wrote Python and Bash scripts for key infra tasks: EC2 backup rotation, snapshot tagging, and RDS failover validation. Developed job schedulers and system monitoring agents for internal tools. Created automated diff-checkers for IAM policy drift detection and baseline enforcement. Shared scripts via internal CLI repo.
  • Monitoring with CloudWatch Logs & Custom Alarms: Implemented CloudWatch logging across EC2 and backend services, building dashboards for CPU, memory, and IO patterns. Created metric filters for common error patterns and configured alerting thresholds tied to service SLAs. Conducted monthly reviews with app teams to decommission unused alarms and reduce alert fatigue.
  • Security and Access Controls Across Environments: Established IAM role templates and assumed-role workflows for dev, staging, and prod environments. Built automated audit scripts to flag overly permissive roles or missing encryption settings on EBS and S3. Partnered with compliance to define access tracking via CloudTrail and hardened defaults across environments.
  • Developer Support and Internal Platform Enablement: Supported onboarding of engineers to cloud environments through docs, CLI wrappers, and pairing sessions. Built environment scaffolding tools for new services, reducing friction for developers launching microservices. Coordinated internal RFCs and release calendars, creating shared operational visibility for the platform.

Sr. Siri HPC Engineer

Apple Inc.
05.2015 - 02.2017
  • Company Overview: Apple is a global leader in consumer technology and innovation.
  • HPC & Real-Time Platform Optimization: Optimized Siri's compute platform by tuning kernel networking, NUMA topology, and TCP stack for ultra-low latency. Focused on high-throughput environments supporting natural language processing and ML inference, ensuring sub-second response times at scale.
  • Reliability Automation & Tooling: Built internal Python tools for log correlation, system performance visualization, and event-driven anomaly detection. Supported analysis of Siri infrastructure during iPhone launch events and WWDC keynotes. Reduced detection-to-mitigation time by automating log aggregation and alerting dashboards.
  • Configuration & Compliance Automation: Developed and maintained Ansible playbooks and shell scripts to enforce consistent infrastructure configurations across data centers. Implemented audit logging, immutable configurations, and hardening policies. Enabled teams to meet Apple security guidelines and reduce drift across fleets.
  • Monitoring Modernization: Led migration from Nagios to Prometheus for high-availability telemetry. Defined SLOs and SLIs for model-serving services. Improved visibility into job performance and cluster health, driving better incident management.
  • Security Engineering: Introduced kernel-level access controls using AppArmor and SELinux. Enforced privileged action auditing and restricted lateral movement via hardened SSH. Implemented short-lived access tokens via HSM-backed authentication.
  • Cross-Functional Ops Collaboration: Partnered closely with ML engineers to design infrastructure for Siri's model lifecycle - from training to inference. Provided tailored infrastructure with real-time observability and performance tuning. Documented lessons learned for reproducibility and scalability across teams.
  • Linux Fleet Standardization: Supported both physical and virtual server provisioning with PXE, cloud-init, and Ansible. Enabled fast spin-up of GPU and CPU-optimized machines and reduced average provisioning time from hours to minutes across dev/test/production stages.

DevOps Intern

10.2013 - 11.2015

Education

M.S. - Computer and Electrical Engineering

Northeastern University
Boston, MA
05.2015

B.E. - Electrical Engineering

Manav Rachna International University
India
05.2013

Skills

  • CI/CD pipelines
  • Infrastructure automation
  • Cloud infrastructure
  • Container orchestration
  • Monitoring tools
  • Security compliance
  • Disaster recovery
  • GitHub Actions
  • Kubernetes management
  • Terraform provisioning
  • Team collaboration
  • Technical documentation
  • Incident management
  • Scripting languages
  • Process optimization
  • Version control systems
  • Virtualization technologies
  • Monitoring and logging
  • Release management
  • Continuous deployment
  • Configuration management
  • Maintenance and troubleshooting
  • Performance optimization
  • Linux operating system
  • System administration
  • Infrastructure as code

Certification

  • AWS Certified Solutions Architect - Professional, 04/17/24 - 04/17/27
  • Cloudera Certified Developer for Apache Hadoop
  • Cisco Certified Network Associate (CCNA), 416674170388DNWI
  • Cisco Certified Network Professional (CCNP, Routing), 30829382
  • Cisco Certified Network Professional (CCNP, Switching), 1917929097
  • LPIC-1

Skills

AWS (EKS, ECS Fargate, EC2, Lambda, ALB/NLB, S3, RDS, CloudWatch, IAM, Route 53), Azure (AKS, Functions, VNets, Blob), GCP (GKE, Cloud Functions), Terraform (modular, remote state, workspaces), Helm, Kustomize, Ansible, Docker, Bash, Python, GitHub Apps, Makefiles, Argo Workflows, GitHub Actions, Jenkins Pipelines, Bitbucket Pipelines, GitLab CI, ArgoCD, Semantic Versioning, SBOM Scanning (Trivy, Syft), Canary & Blue/Green Deployments, Auto Rollbacks, OpenTelemetry, Prometheus, Grafana, Coralogix, Fluentd, OpenSearch, Alertmanager, AWS X-Ray, Datadog (past), ELK Stack, IAM, RBAC, AWS Secrets Manager, Vault, SOPS, TLS/mTLS, OPA Policy Checks, CIS Benchmarks, SOC2/PCI-DSS Readiness, PostgreSQL (RDS), MySQL, MongoDB, Redis, EFS, Azure Blob Storage, Node.js, TypeScript, JavaScript, Go (working knowledge), React, Internal CLI Tools, Onboarding Scripts, Environment Scaffolding, Runbooks, GitOps Workflows, Slack Integrations, Template Repositories
