Manikantha Naik

Plainsboro, NJ

Summary

Results-driven Cloud Platform & DevOps Engineering Leader with 14+ years of experience architecting, automating, and managing large-scale cloud infrastructures across AWS, Azure, and OCI. Proven success in building and leading high-performing DevOps teams to deliver enterprise-grade, data-driven cloud platforms using Terraform, Terragrunt, Kubernetes, Spark, and Presto. Expert in Infrastructure as Code (IaC), CI/CD automation, and hybrid cloud architecture, with strong experience supporting AI and Data Science teams by designing scalable compute environments and real-time analytics infrastructure. Recognized for driving cloud best practices, automation efficiency, and cost optimization, while enabling innovation through AI-integrated observability and self-service developer platforms.

Overview

15 years of professional experience
1 Certification

Work History

Senior Site Reliability Engineer

SYNCHRONOSS TECHNOLOGIES
09.2016 - Current
  • Led the cloud platform engineering team to design, deploy, and maintain scalable infrastructure across AWS, Azure, and OCI using Terraform, Terragrunt, and CI/CD pipelines.
  • Owned end-to-end infrastructure creation, deployment, and production maintenance, ensuring high availability, security, and performance optimization.
  • Architected hybrid cloud solutions integrating on-prem and AWS workloads using NetApp StorageGRID in an Active/Active setup fronted by F5 load balancers.
  • Developed “Sparklens”, a per-cluster Kubernetes and Spark monitoring application built with Cloudera AI, offering self-service troubleshooting, cost insights, and rightsizing recommendations.
  • Built and optimized Spark, Presto, and Hadoop clusters on Kubernetes and AWS for enterprise analytics workloads, improving performance and scalability.
  • Implemented custom NGINX round-robin routing with Route53 DNS health checks to replace ALB ingress, saving over $200K annually in AWS data transfer costs.
  • Deployed Atlantis and Terragrunt for automated IaC management, enabling on-demand non-prod environment provisioning via GitOps workflows, which reduced idle compute costs and saved $150K+ annually.
  • Led Terraform core and provider plugin upgrades, ensuring compatibility, security compliance, and stability across AWS and EKS modules.
  • Managed EKS cluster upgrades and migrations with zero downtime, standardizing governance and cluster lifecycle automation.
  • Enhanced observability using Prometheus, Grafana, and ELK, integrating real-time dashboards and proactive alerting systems.
  • Mentored engineers on IaC best practices, Kubernetes lifecycle management, and multi-cloud automation.
  • Participated in strategic planning and design sessions with product management, software engineering, platform engineering, and operations teams.
  • Redesigned the Synchronoss Analytics application stack to run on Docker containers and Kubernetes (K8s) for multi-AZ deployments using AWS Spot and GPU nodes with MinIO and S3 storage.
  • Architected a multi-region disaster recovery cluster for MapR, configured as an active/passive pair across the AWS East and West regions.
  • Leading Cloud & Infrastructure Engineering
  • Key Achievements:
  • Reduced provisioning time by 80% via single-click Terraform and Terragrunt deployments.
  • Achieved >$350K annual AWS cost savings through automation and optimization.
  • Standardized IaC across AWS, Azure, and OCI, improving scalability and compliance.
  • Delivered AI-powered self-service observability tools, reducing SRE dependency and improving developer autonomy.
  • Optimized deployment processes, resulting in faster release cycles and reduced errors.
  • Collaborated with cross-functional teams to develop, test, and deploy scalable software solutions.
  • Implemented cost-saving measures by optimizing resource utilization across cloud-based infrastructure environments.

Linux Systems Administrator

DELTZIA SCM SOLUTIONS
02.2015 - 06.2016
  • Managed Cloudera Hadoop clusters and optimized Spark and Presto workloads for large-scale analytics.
  • Automated deployments using Ansible and Bash scripting; integrated CI/CD pipelines with Jenkins and monitoring via ELK.
  • Client: TPVision India
  • Managed Linux server environments, ensuring optimal performance and security compliance.
  • Automated system monitoring processes to enhance operational efficiency and reduce downtime.
  • Implemented backup solutions, safeguarding critical data against loss or corruption.

System Support Engineer

MASTERCOM TECHNOLOGY SERVICE
09.2013 - 01.2015
  • Administered Linux servers and storage systems, automated maintenance tasks, and implemented proactive monitoring to improve uptime.
  • Client: Tata Communications
  • Diagnosed and resolved technical issues across diverse systems to enhance operational efficiency.
  • Analyzed support trends to identify recurring issues, driving continuous improvement in service delivery standards.

System Administrator

RCS TECHNOLOGIES
07.2011 - 09.2013
  • Supported Linux environments, managed user access, patching, and configuration for distributed systems.
  • Client: Sling Media Pvt. Ltd.

Education

Master of Computer Applications - Cloud Computing

JAIN (Deemed-to-be University)
2023

Bachelor of Computer Applications

Dr. C.V. Raman University
Bilaspur
2019

Diploma - Electronics and Communication

Board of Technical Education
Bangalore
2010

Skills

  • Big Data Ecosystem Components: HDFS, Hive, Oozie, Airflow, Spark, Spark SQL, Drill, and ZooKeeper
  • Cloud Platforms: AWS (primary), Azure, Oracle Cloud (OCI), OpenStack, VMware
  • Infrastructure as Code (IaC): Terraform, Terragrunt, Atlantis, Ansible, CloudFormation
  • CI/CD & Automation: Jenkins, AWS CodePipeline, GitHub Actions, Bamboo
  • Containerization & Orchestration: Kubernetes (EKS), Docker, Helm, Istio, Rancher
  • Big Data & Analytics Infrastructure: Spark, Presto, Hadoop, Airflow, Kafka
  • AI/Data Science Support: Building and managing infrastructure for model training, MLOps pipelines, and analytics workloads
  • Monitoring & Observability: Prometheus, Grafana, ELK, Loki, LogicMonitor, AI-powered observability tools
  • Programming & Scripting: Python, Bash, Ruby
  • Databases: PostgreSQL, MySQL, MariaDB, MongoDB, Elasticsearch, and OpenSearch

Certification

CKA: Certified Kubernetes Administrator

