
DARRELL TANG

San Diego, CA

Summary

Site Reliability Engineer with a drive to automate processes and simplify workloads using modern Infrastructure-as-Code practices and CI/CD pipelines. Self-driven communicator who coordinates smooth technical handoffs between teams, supported by detailed internal documentation. Strong technical aptitude, with the ability to research and solve complex issues independently using official and unofficial documentation. Able to establish processes and automate them to track and implement system solutions. Experienced in configuring CI/CD pipelines and scripting deployment activities, with expertise in configuration management and Agile & Scrum methodologies.

Overview

11 years of professional experience

Work History

Senior Site Reliability Engineer

LexisNexis Risk Solutions, IDAnalytics
08.2021 - Current
  • Developed Helm charts with Kustomize overlays for multi-environment deployments of API Mocking product utilized by Performance Testing team
  • Wrote, troubleshot, & maintained sophisticated Azure infrastructure using prebuilt, internally-maintained Terraform Modules. Modules included Azure Kubernetes, Service Bus, Log Analytics, Azure Container Registry, Cost Management Budgets & Alerts
  • Utilized Terraform-bootstrapped ArgoCD instance within AKS cluster to deploy Helm chart for application
  • Configured Horizontal Pod Autoscalers based on production traffic patterns to minimize cost and efficiently utilize resources in-cluster
  • Spearheaded a 6-month project to migrate complex Argo Workflows & Argo Events Batch Processing system from one data center to another, allowing automated billing of client batch requests
  • Reverse engineered existing architecture of nested Helm charts for replication & created detailed documentation using MermaidJS for architecture and ongoing support
  • Transitioned ownership of the system to another team while training & mentoring team leads to become SMEs
  • Rewrote & migrated all GitLab CI/CD Pipelines for use in GitHub Actions
  • Created & managed new Jenkins Pipelines for GitHub repositories to build and package code
  • Updated Dockerfiles to upgrade underlying software & create base/intermediary images for deployment
  • Scoped, planned, & executed plan to manually rotate on-prem Kubernetes CA & Certificates to ensure continued cluster functionality before expiration dates
  • Updated code for deploying on-prem VM-backed Kubernetes clusters using Terraform, Flatcar Linux, Hyperkube, and Cobbler
  • Scoped & executed Kubernetes version upgrades across non-prod & production clusters with zero downtime for application workloads
  • Scoped & executed NetApp Trident version upgrades across non-prod clusters for persistent storage backends to Kubernetes clusters
  • Used existing Ansible playbooks to provision new VM builds for various internal teams
  • Utilized GitLab Pipelines to patch production & non-prod VM operating systems

Site Reliability Engineer

Kyriba
05.2021 - 08.2021
  • Developed Terraform Module monorepo for rapid deployment of AWS resources
  • Developed Terraform to deploy AWS resources for platform deployment including: VPC, Subnets, Security Groups, Peering Connections, Routes, EC2 instances
  • Used Terraform to create Chef configuration for provisioning & bootstrapping of EC2 instances
  • Utilized on-prem Terraform Enterprise to coordinate build & development of Terraform resources with team members
  • Collaborated with team members using Bitbucket/Git to coordinate work in branches and pull requests utilizing infrastructure-as-code best practices
  • Participated in sprint-based development workflow using Jira tickets to organize work (epics, stories, tasks, etc.)
  • Deployed microservices-based application using Azure DevOps on Azure Service Fabric clusters
  • Ensured HTTP availability checks for microservices were updated across numerous dev and production environments monitored by Azure Application Insights
  • Developed Docker container to consume Vault secrets for encryption by in-house command line application
  • Developed Helm chart to deploy Docker container as Kubernetes job for ad-hoc encryption of secrets to be stored in Vault

System Engineer

Kyriba
01.2018 - 05.2021
  • Worked on Operations team for a global Software-as-a-Service platform providing financial services to enterprise clients
  • Administered on-premise Docker Swarm clusters to manage services & container deployments
  • Wrote a web app in Golang which authenticated to Jira, parsed tickets from REST API and formatted resultant text required for SaaS deployments; app was built and compiled into a Docker container
  • Wrote a command-line utility in Golang to simplify SSH key installations to an FTP server
  • Developed Terraform to deploy 3-node Docker Swarm cluster in VMware using Chef to bootstrap nodes
  • Used Jira ticketing to track ticket queues and work to be done
  • Partially automated DB statistics report by using Python to concatenate multiple CSV files, decreasing time to generate the report by over 85% (from 4 hours to 30 minutes)
  • Modified configuration files for various services on Linux (FTP, Apache, etc.)
  • Administered VMware ESXi infrastructure using vSphere 6.0 web interface
  • Ensured timely OS updates on a monthly basis for ~500 CentOS/RHEL servers using Spacewalk
  • Configured SSH key authentication for SFTP endpoints using centralized Chef configuration

System Administrator

CentrexIT
01.2013 - 01.2018
  • Provided outsourced IT service, support, security and leadership for small and medium-sized businesses in the greater San Diego area
  • Managed entire IT environments for more than 65 companies, ranging from 10 to 250 staff
  • Administered compute/storage on VMware ESXi private cloud with ~600 nodes
  • Managed distributed backup and replication system for 200TB+ of data
  • Utilized PowerCLI to administer VMs and audit VM configurations
  • Led team of 10 for managed service provider with small to medium-sized biotech and medical clients
  • On-boarded new users accurately and reduced processing time through scripting from 8 hours to 10 minutes
  • Managed large-scale hardware and software upgrade projects relating to network systems

Education

Bachelor of Arts - Interdisciplinary Computing & The Arts - Music

University of California - San Diego
La Jolla, CA

Skills

    • GitLab/GitHub
    • AWS/Azure
    • AWS CLI
    • Cloud Networking
    • SaaS Operations
    • Terraform/TF Enterprise
    • Vault
    • Packer
    • Golang & Python
    • Bash & PowerShell
    • Linux Administration
    • VMware
    • CentOS/RHEL
    • Kubernetes/GitOps

Portfolios

  • github.com/DarrellTang/argocd-k3s
  • github.com/DarrellTang/sre-exercises

Accomplishments

  • A privately hosted, single-node Kubernetes cluster featuring: K3s distribution running on bare metal, ArgoCD-driven GitOps workflow, all version controlled in a GitHub repository
  • Kube-Prometheus-Stack monitoring with alerting connected to a private Slack workspace
  • Grafana for metrics visualization
  • MetalLB for service load balancing
  • Built-in Traefik ingress controller
  • Internal CoreDNS deployment for service name resolution
  • Migration from FluxCD to ArgoCD for GitOps tool comparison (In Progress)

Traits

  • Strong adaptability
  • Self-directed
  • Well organized
  • Results-driven
  • Team player
  • Empathetic
  • Innovative
  • Critical thinker
