Results-driven Staff Site Reliability Engineer with over 15 years of experience in designing, implementing, and maintaining scalable systems. Proven expertise in leveraging automation, monitoring, and incident response to enhance system reliability and performance. Adept at collaborating with cross-functional teams to drive best practices in software development and operations. Committed to fostering a culture of continuous improvement and resilience in high-availability environments.
Programming - Python, Bash, Go
IaC - Terraform, Ansible
Version Control - Git
Documentation/Sprint Management - Confluence, Jira
NoSQL - Apache Cassandra, Redis, MongoDB, Druid, Memcached, HBase
Logging - Fluentd, Fluent Bit, Splunk, ELK
RDBMS - MySQL, PostgreSQL
Container Orchestration and Cloud - Kubernetes, Docker, AWS, Azure, GCP
Queueing/Stream Processing/Big Data - Kafka, RabbitMQ, Pulsar, Apache Storm, Hadoop
Incident Management - PagerDuty
Deployment Pipeline - CircleCI, Argo, Jenkins
Monitoring & Observability - Prometheus, Grafana, New Relic
Ingress Proxies - Apache Traffic Server, Nginx
IPCOM000250809D - Method and System for Gossip Protocol Based Disaster Recovery for Distributed Web Systems.
IPCOM000223854D - Method and System for Access Control of Mobile Network Applications using Fingerprints as Structural Patterns.
I enjoy traveling with family, playing board games such as checkers, chess, and Sequence, and reading books across different domains.