Nitesh Brahmachandra Vaidyanath

San Jose, CA

Summary

I am seeking a Senior or Principal DevOps Engineer / Site Reliability Engineer position at a reputable organization, with challenges that draw on my multi-platform experience and deepen my knowledge and skills.

Overview

years of professional experience

Work History

Principal Software Engineer

Palo Alto Networks

12.2023 - Current

Design, develop, and implement highly scalable software features and infrastructure on our next-generation security platform, ready for cloud-native deployment, from inception to completion.
Work with development and quality assurance groups to achieve the best quality by being hands-on and creating tools, processes, and systems that produce transparency, alignment, and direction.
Profile, optimize, and tune systems software (management, control, and data plane) for efficient cloud operation.

Site Reliability Engineer Manager

Apple

03.2023 - 11.2023

Act as the Service Owner, designing and mapping key performance indicators to achieve the organization’s mission
Lead the definition of requirements, priorities and planning of engineering deliverables
Implement structured engineering and operations processes
Lead the team in daily agile SRE practices, ensuring proper team focus on priorities, achievements, and deliverables
Optimize velocity and efficiency of delivery, and drive continuous improvement

Site Reliability Engineer

Apple

03.2022 - 02.2023

Worked on creating a Terraform provider that retrieves infrastructure details from inventory to update the NetScaler load balancer.
Worked on a Kubernetes operator for managing CRUD operations on GSLB and DNS.
Set up alerts and dashboards for applications.
Automated GSLB traffic failover using Golang.
Onboarded new apps on Kubernetes.
Migrated existing microservices from bare metal to Kubernetes.
Onboarded PCI and non-PCI applications.
Managed API gateway (nginx) configs for applications.
Handled on-call duties.

Senior Site Reliability Engineer

Palo Alto Networks

01.2019 - 03.2022

Worked on migrating microservices to a Kubernetes cluster.
Automated AWS infrastructure and EKS and GKE clusters using Terraform.
Designed an architecture replacing ELB with an HTTP API gateway and a VPC link between the Istio ingress load balancer and the API gateway.
Created a Kubernetes operator using Kubebuilder.
Set up the Istio mesh using the operator, adding virtual services, destination rules, and an ingress gateway.
Automated bringing up the entire Kubernetes infrastructure and application services using Terraform.
Worked on integrating with GitLab, which supports Terraform IaC CI/CD.
Worked on setting up Cortex, Loki, and Tempo for the observability stack.
Set up GitLab CI/CD for Kubernetes deployment using Terraform.
Contributed to the Cortex open source project.
Worked on creating Consul Golang modules.
Performed major Consul, Vault, and MongoDB upgrades.
Automated aggregation of application metrics using Golang and Python (for applications that do not expose Prometheus endpoints); used Grafana Cloud Agent and vmagent to scrape metrics from Prometheus endpoints.
Implemented Kubernetes event-driven autoscaling for Kafka consumers.
Upgraded Ubuntu from 14.04 to 18.04.
Troubleshot application issues.
Set up Grafana with MySQL as backend storage so that staging and production environments have exactly the same configs, data sources, and dashboards.
Helped with hiring and team building.
Worked on setting up Strimzi Kafka on a Kubernetes cluster.
Set up a Locust cluster for Kafka load testing.
Mentored new hires.

Site Reliability Engineer

Cisco

06.2018 - 12.2018

Worked on big data analysis of client events.
Used Apache Spark (PySpark) for batch data processing, storing the output on HDFS as Parquet files.
Used Impala to query data from the Parquet files.
Used Qlik Sense for visualization.

DevOps Engineer

Netskope

06.2015 - 05.2018

Configuring, managing, and troubleshooting issues.
Using Ansible to deploy new services to production machines.
Automated the complete AWS stack using Terraform and Ansible.
Planning and creating virtual machines, or provisioning physical machines, AWS instances (using Terraform), or Docker containers, as per the application requirements.
Deploying using Jenkins.
Configuring load balancers (F5, nginx, and HAProxy).
Building Debian packages or Docker containers using Jenkins.
Troubleshooting issues and escalating to developers to fix bugs in code.
Data center management.
Infrastructure planning and deployment.

NOC Administrator

DreamWorks Animation Bengaluru

04.2015 - 05.2015

Quickly learned new skills and applied them to daily tasks, improving efficiency and productivity.
Carried out day-to-day duties accurately and efficiently.

Platform Operations Engineer

Akamai Technologies

03.2012 - 04.2015

Member of the network team within the Network Operations Centre, coordinating with ISPs on networking issues.
Troubleshot day-to-day networking issues, such as packet loss and connectivity issues, by telnetting into routers and switches.
Working knowledge of BGP. Good knowledge of routing protocols such as RIP, EIGRP, and OSPF.
Automated manual tasks using Perl and Bash scripts.
Also part of the core group within the NOC for automating procedure guidelines for issue handling.
Handled server software releases and maintenance.
Good knowledge of Perforce.
Underwent CCNA-level networking training.
Handled multiple high-priority incidents while rallying resources during the same.
Worked with other stakeholders and pursued root cause determination.

Education

M. Tech - Computer And Information Sciences

Birla Institute of Technology

Pilani

01.2018

B. Tech - Telecommunication

AMC Engineering College

Bengaluru, India

06.2011

Skills

Container Orchestration (Kubernetes)
Cloud Computing (AWS and GCP)
IaC (Terraform)
DR Architecture for High Availability and Reliability of services
Configuration Management (Ansible and Puppet)
Programming Languages (Golang, Python, and Bash scripting)
DBA (MongoDB, Redis, Elasticsearch, MariaDB)
Observability Stack (Prometheus, Cortex, Loki, Thanos, Grafana, TICK)
CI/CD (GitLab and Jenkins)
Istio service mesh
Architecture design
Software Development Lifecycle

Additional Information

ACADEMIC PROJECTS:

My objective was to create a general active monitoring system that measures performance in real time so that, based on the data, traffic can be rerouted to different servers. Netskope is a CASB (cloud security) company that monitors customer data, so all customer traffic must be sent through the Netskope proxy for real-time analysis, and performance must be monitored continuously to give customers better response times. For passive monitoring, all client traffic is directed to a proxy server, and performance is analyzed from the proxy's current CPU, memory, iostat, number of connections, etc., by capturing the response time at each hop, i.e., client → proxy → server. If performance degrades, traffic should be diverted to another available proxy for better performance. Data analysis reveals the thresholds for good performance in terms of CPU, memory, I/O, and network stats. If sufficient proxies are unavailable, the system should be able to create either a cloud (AWS) instance or a container proxy.
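
The rerouting decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the names ProxyStats, Thresholds, is_degraded, pick_proxy, and the provision callback are hypothetical, and the threshold values are placeholders rather than thresholds derived from the project's data analysis.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProxyStats:
    """Snapshot of one proxy's health (hypothetical structure)."""
    name: str
    cpu_pct: float
    mem_pct: float
    connections: int
    response_ms: float  # measured client -> proxy -> server response time


@dataclass
class Thresholds:
    """Placeholder limits; real values would come from data analysis."""
    cpu_pct: float = 80.0
    mem_pct: float = 85.0
    connections: int = 10_000
    response_ms: float = 250.0


def is_degraded(stats: ProxyStats, limits: Thresholds) -> bool:
    """A proxy counts as degraded if any metric exceeds its limit."""
    return (
        stats.cpu_pct > limits.cpu_pct
        or stats.mem_pct > limits.mem_pct
        or stats.connections > limits.connections
        or stats.response_ms > limits.response_ms
    )


def pick_proxy(
    current: ProxyStats,
    pool: List[ProxyStats],
    limits: Thresholds,
    provision: Callable[[], str],
) -> str:
    """Return the name of the proxy that should receive traffic.

    Stay on the current proxy while it is healthy; otherwise divert to the
    healthy proxy with the lowest response time; if none is available,
    call provision() to create a new cloud instance or container proxy.
    """
    if not is_degraded(current, limits):
        return current.name
    healthy = [
        p for p in pool
        if p.name != current.name and not is_degraded(p, limits)
    ]
    if healthy:
        return min(healthy, key=lambda p: p.response_ms).name
    return provision()


if __name__ == "__main__":
    pool = [
        ProxyStats("proxy-a", 95.0, 60.0, 1200, 410.0),
        ProxyStats("proxy-b", 40.0, 50.0, 800, 90.0),
    ]
    # proxy-a exceeds the CPU and response-time limits, so traffic moves.
    print(pick_proxy(pool[0], pool, Thresholds(), lambda: "new-proxy"))
```

In this sketch the provision callback stands in for whatever mechanism would launch an AWS instance or a container proxy; it is kept as a plain function argument so the decision logic can be exercised without any cloud dependency.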
