Daniel Esponda

McKinney, TX

Summary

Highly energized self-starter with 10+ years of experience, looking to stay on the leading edge of software, systems, and site reliability engineering.

Proven track record of leading platform teams that deliver best-in-class, easy-to-use enterprise platforms, reducing operational cost and ensuring security and compliance with audit programs such as PCI, SOC, and FedRAMP.

Overview

10 years of professional experience

Work History

Staff Site Reliability Engineer

VMware
Remote, TX
11.2021 - Current
  • Provide technical leadership for VMware's largest SaaS Kubernetes platform, with 5,0000+ nodes and 100+ Kubernetes clusters at 99.99%+ platform uptime
  • Led architecture and development of Kubernetes operators to manage all aspects of the Kubernetes cluster fleet
  • Hands-on coding for critical modules of the platform
  • Participate in and influence code reviews and design reviews for robust and scalable products
  • Architect/Design/Code/Test/Automate major modules of the Kubernetes platform
  • Owned delivery of Istio, API gateway, Kuberhealthy, Vault, and kube-prometheus-stack tenant services
  • Maintain platform compliance against a variety of audits (PCI/HIPAA/FedRAMP)
  • Work across geographical boundaries to deliver well-integrated solutions

Senior Site Reliability Engineer

Toyota Connected
Plano, TX
10.2019 - 11.2020
  • Built organization-wide platforms that accelerate development and significantly
    reduce the cost of production workloads
  • Improved system availability through use of standard Kubernetes clusters that
    implemented best practices for scaling and use of AWS spot instances
  • Implemented software reliability standards, alerting, and incident management by setting standards for key performance indicators of systems
  • Built APIs that let teams self-service administration of Vault, ELK, and AWS
    accounts with security guardrails
  • Automated AWS account policies through policy as code via Cloud Custodian
  • Deployed global ELK cluster handling 3TB/day of data ingest at a small fraction of
    the cost of commercial applications

Site Reliability Engineer Manager

Capital One
Plano, TX
02.2016 - 10.2019
  • SRE for enterprise-wide Kubernetes cluster
  • Provide enterprise Kubernetes coaching
  • Provide AWS and GCP expertise to teams deploying to the cloud
  • Manage the architecture, site reliability practice, and chaos engineering of the Kubernetes platform for the enterprise in AWS and GCP
  • Advise on cloud-native software architecture
  • Migrated 100+ microservices from on-premises servers to AWS and then onto Kubernetes

Education

Bachelor of Science - Computer Science

The University of Texas at Dallas
Richardson, TX
2013

Skills

  • Kubernetes
  • Docker
  • Monitoring (Prometheus, Datadog, Wavefront)
  • Kubernetes Operators
  • Golang
  • Python
  • Java
  • Databases (Oracle DB, MySQL, MongoDB)
  • Networking (TCP, DNS, HTTP, HTTPS)
  • Linux
  • Terraform
  • Agile Methodologies
  • AWS
  • APM

Timeline

Staff Site Reliability Engineer

VMware
11.2021 - Current

Senior Site Reliability Engineer

Toyota Connected
10.2019 - 11.2020

Site Reliability Engineer Manager

Capital One
02.2016 - 10.2019

Bachelor of Science - Computer Science

The University of Texas at Dallas