Site Reliability Engineer with a drive to automate processes and simplify workloads using modern Infrastructure-as-Code practices and CI/CD pipelines. Self-driven in communicating across teams to coordinate smooth technical handoffs of projects, backed by detailed internal documentation. Strong technical aptitude and ability to research and solve complex issues independently using official and unofficial documentation. Able to establish processes, then automate them to track and implement system solutions. Experienced in configuring CI/CD pipelines and scripting deployment activities. Expertise in configuration management and Agile & Scrum methodologies.
Overview
11 years of professional experience
Work History
Senior Site Reliability Engineer
LexisNexis Risk Solutions, IDAnalytics
08.2021 - Current
Developed Helm charts with Kustomize overlays for multi-environment deployments of an API mocking product used by the Performance Testing team
Wrote, troubleshot, & maintained sophisticated Azure infrastructure using prebuilt, internally maintained Terraform modules, including Azure Kubernetes, Service Bus, Log Analytics, Azure Container Registry, and Cost Management Budgets & Alerts
Utilized Terraform-bootstrapped ArgoCD instance within AKS cluster to deploy Helm chart for application
Configured Horizontal Pod Autoscalers based on production traffic patterns to minimize cost and efficiently utilize resources in-cluster
Spearheaded a 6-month project to migrate complex Argo Workflows & Argo Events Batch Processing system from one data center to another, allowing automated billing of client batch requests
Reverse engineered existing architecture of nested Helm charts for replication & created detailed documentation using MermaidJS for architecture and ongoing support
Transitioned ownership of the system to another team while training & mentoring team leads to become SMEs
Rewrote & migrated all GitLab CI/CD pipelines for use in GitHub Actions
Created & managed new Jenkins pipelines for GitHub repositories to build and package code
Updated Dockerfile images to upgrade underlying software & create base/intermediary images for deployment
Scoped, planned, & executed a manual rotation of the on-prem Kubernetes CA & certificates ahead of their expiration dates to ensure continued cluster functionality
Updated code for deploying on-prem VM-backed Kubernetes clusters using Terraform, Flatcar Linux, Hyperkube, and Cobbler
Scoped & executed Kubernetes version upgrades across non-prod & production clusters with zero downtime for application workloads
Scoped & executed NetApp Trident version upgrades across non-prod clusters for persistent storage backends to Kubernetes clusters
Used existing Ansible playbooks to provision new VM builds for various internal teams
Utilized GitLab pipelines to patch production & non-prod VM operating systems
Site Reliability Engineer
Kyriba
05.2021 - 08.2021
Developed Terraform Module monorepo for rapid deployment of AWS resources
Developed Terraform to deploy AWS resources for platform deployment, including VPCs, subnets, security groups, peering connections, routes, and EC2 instances
Used Terraform to create Chef configuration for provisioning & bootstrapping of EC2 instances
Utilized on-prem Terraform Enterprise to coordinate build & development of Terraform resources with team members
Collaborated with team members using Bitbucket/Git to coordinate work in branches and pull requests utilizing infrastructure-as-code best practices
Participated in sprint-based development workflow using Jira tickets to organize work (epics, stories, tasks, etc.)
Deployed microservices-based application using Azure DevOps on Azure Service Fabric clusters
Ensured HTTP availability checks for microservices were updated across numerous dev and production environments monitored by Azure Application Insights
Developed Docker container to consume Vault secrets for encryption by in-house command line application
Developed Helm chart to deploy Docker container as Kubernetes job for ad-hoc encryption of secrets to be stored in Vault
System Engineer
Kyriba
01.2018 - 05.2021
Worked on Operations team for a global Software-as-a-Service platform providing financial services to enterprise clients
Wrote a web app in Golang that authenticated to Jira, parsed tickets from the REST API, and formatted the resulting text required for SaaS deployments; the app was compiled and packaged into a Docker container
Wrote a command line utility in Golang to simplify SSH key installations to an FTP server
Developed Terraform to deploy a 3-node Docker Swarm cluster in VMware, using Chef to bootstrap nodes
Used Jira ticketing to track ticket queues and work to be done
Partially automated the DB statistics report by using Python to concatenate multiple CSV files, decreasing the time to generate the report by more than 85% (from 4 hours to 30 minutes)
Modified configuration files for various services on Linux (FTP, Apache, etc.)
Administered VMware ESXi infrastructure using vSphere 6.0 web interface
Ensured timely OS updates on a monthly basis for ~500 CentOS/RHEL servers using Spacewalk
Configured SSH key authentication for SFTP endpoints using centralized Chef configuration
System Administrator
CentrexIT
01.2013 - 01.2018
Provided outsourced IT service, support, security and leadership for small and medium-sized businesses in the greater San Diego area
Managed entire IT environments for more than 65 companies, ranging from 10 to 250 staff
Administered compute/storage on VMware ESXi private cloud with ~600 nodes
Managed distributed backup and replication system for 200TB+ of data
Utilized PowerCLI to administer VMs and audit VM configurations
Led team of 10 for managed service provider with small to medium-sized biotech and medical clients
Scripted new-user onboarding, reducing processing time from eight hours to 10 minutes while maintaining accuracy
Managed large-scale hardware and software upgrade projects relating to network systems
Education
Bachelor of Arts - Interdisciplinary Computing & The Arts - Music
A privately hosted, single-node Kubernetes cluster featuring:
K3s distribution running on bare metal
ArgoCD-driven GitOps workflow, all version-controlled in a GitHub repository
Kube-Prometheus-Stack monitoring with alerting connected to a private Slack workspace
Grafana for metrics visualization
MetalLB for service load balancing
Built-in Traefik ingress controller
Internal CoreDNS deployment for service name resolution
Migration from FluxCD to ArgoCD for GitOps tool comparison (in progress)
Traits
Strong adaptability
Self-directed
Well organized
Results-driven
Team player
Empathetic
Innovative
Critical thinker