Narasimhan Venkadeswaran

San Jose, California

Summary

Results-driven Staff Site Reliability Engineer with over 15 years of experience in designing, implementing, and maintaining scalable systems. Proven expertise in leveraging automation, monitoring, and incident response to enhance system reliability and performance. Adept at collaborating with cross-functional teams to drive best practices in software development and operations. Committed to fostering a culture of continuous improvement and resilience in high-availability environments.

Overview

17 years of professional experience
1 Certification

Work History

STAFF RELIABILITY ENGINEER

DOMINO DATALAB
04.2021 - Current
  • Architected and implemented the end-to-end migration of Domino's self-hosted Teleport unified access plane to the cloud-hosted offering, ensuring a safe and secure access plane for Domino cloud and on-prem deployments. This large effort covered designing and implementing a Terraform repository of modules for the Teleport cloud deployment using CircleCI, creating Helm charts from scratch for Kubernetes client interaction, designing and implementing a break-glass procedure, audit logging, granular RBAC in Kubernetes, alert integration with PagerDuty, thorough documentation, coordination with multiple teams, and multiple enablement sessions for the entire engineering organization.
  • Embedded SRE for platform teams, championing and driving adoption of DevOps culture, including reducing incident costs, defining the definition of done for new and existing services, improving monitoring and observability, training teams to run postmortems, and reducing monitoring ingestion costs by 30% through dimensionality reduction.
  • Designed and implemented end-to-end event routing on the existing IaaS framework for monitoring and alerting, using Terraform, for all Domino customers, achieving a 30% automated reduction in incidents.
  • Part of a two-member team that created an SLO framework for Domino's critical user journeys, based on Terraform and the OpenSLO framework, creating value for customers and engineering. Work involved exploring current SLIs, identifying gaps and working with engineering teams to address them, creating SLOs, and finally enabling alerting to destinations such as Slack, PagerDuty, and Jira.
  • Created Python libraries and checks for critical services, improving their observability and reducing MTTR during customer incidents.
  • Own and maintain Teleport (unified access plane) for all Domino deployments. Responsibilities include: (1) creating a CI/CD pipeline using CircleCI for upgrades; (2) building Teleport AMIs for Domino; (3) managing Helm charts for the Teleport agent; (4) defining and implementing RBAC support in Teleport for different roles within the organization; and (5) maintaining infrastructure using Terraform.

PRINCIPAL ENGINEER

YAHOO
12.2007 - 04.2021
  • Principal production engineer who owned and operated the media content platform for Yahoo, serving 100k rps. Achievements include scaling a 30 TB Cassandra backend in 5 data centers, and redesigning and implementing the platform together with engineering to reduce end-to-end latency from 1.5 s to 500 ms and to reach four-nines availability. Received the Yahoo Spot and U Rock awards.
  • Principal production engineer managing the unified API middle tier, running on Kubernetes and serving 50k rps for Yahoo sites including Finance, Sports, and the home page. Technical skill set covers ingress routing on Kubernetes, service mesh using Istio, MBP canary analysis using Kayenta, and mentoring junior engineers.
  • Senior Tech Lead for the Media Analytics real-time pipeline (Druid, Apache Storm, Apache Kafka, and API stacks), providing real-time analytics on content for editors. Role included maintaining and upgrading the real-time analytics platform, monitoring, and observability. Championed and adopted Imply Pivot, a powerful visualization tool for real-time analytics.
  • Tech Lead managing the private cloud (based on chroot jails) for Yahoo media, hosting hundreds of frontend (Node.js) and backend (Java) applications.

Education

Master of Arts - Economics

MKU University Madurai
04.2014

Master of Science - Multimedia Technology

College of Engineering Guindy
Chennai
04.2006

Bachelor of Technology - Information Technology

Madurai Kamaraj University
03.2004

Skills

Programming - Python, Bash, Go

IaC - Terraform, Ansible

Version Control - Git

Documentation/Sprint Management - Confluence, Jira

NoSQL - Apache Cassandra, Redis, MongoDB, Druid, Memcached, HBase

Logging - Fluentd, Fluent Bit, Splunk, ELK

RDBMS - MySQL, PostgreSQL

Container Orchestration and Cloud - Kubernetes, Docker, AWS, Azure, GCP

Queueing/Stream Processing/Big Data - Kafka, RabbitMQ, Pulsar, Apache Storm, Hadoop

Incident Management - PagerDuty

Deployment Pipeline - CircleCI, Argo, Jenkins

Monitoring & Observability - Prometheus, Grafana, New Relic

Ingress Proxies - Apache Traffic Server, Nginx

Additional Information


IPCOM000250809D - Method and System for Gossip Protocol based disaster recovery for Distributed web systems.

IPCOM000223854D - Method and System for Access Control of Mobile Network Applications using Fingerprints as Structural Patterns.

Certification

  • CKAD: Certified Kubernetes Application Developer. Issued Jul 2022 · Expires Jul 2025. Credential ID LF-72g4g7hudy
  • HashiCorp Certified: Terraform Associate. Expires May 2024. Credential ID 9cc04409-823d-45c5-b81b-e09cc727b5d6
  • CKA: Certified Kubernetes Administrator. Expires Apr 2025. Credential ID LF5qk3imd1ql
  • Microsoft Azure Fundamentals (AZ-900). Issued Feb 2022. Credential ID 992669881
  • AWS Certified Cloud Practitioner. Expires Dec 2024. Credential ID AWS02543957
  • MySQL Database Administrator
  • RHCA (Red Hat Certified Administrator)

Travelling, Playing Board Games, Reading

I enjoy travelling with family, playing board games such as checkers, chess, and Sequence, and reading books across different domains.
