Aida Khalelova

Maineville, Ohio

Summary

Highly skilled Site Reliability Engineer with over 5 years of experience in designing, implementing, and maintaining cloud infrastructure on AWS. Proficient in managing and monitoring observability infrastructure using tools such as Datadog, Prometheus, Grafana, and the EFK stack. Experienced in developing automation tools and scripts to improve system efficiency and reliability. Strong troubleshooting and problem-solving abilities combined with expertise in networking principles and technologies. Well-versed in CI/CD tools, containerization technologies, and Terraform for streamlined deployment processes. Excellent written and verbal communication skills, with a proven ability to collaborate effectively with cross-functional teams. Committed to staying up to date with the latest trends and technologies in the dynamic field of DevOps.

Overview

6 years of professional experience
1 certification

Work History

Site Reliability Engineer

ActiveCampaign
01.2021 - Current
  • Managed and maintained AWS infrastructure comprising 100+ EC2 instances, ensuring redundancy and scalability for operational uptime and SLA compliance.
  • Implemented Ansible, CircleCI, and Packer to transform mutable AWS infrastructure into an immutable architecture, reducing deployment times by 50% and enhancing system reliability.
  • Developed Ansible tasks to proactively monitor connections, detecting failures and latency issues to ensure optimal system performance and superior customer experience.
  • Implemented secure retrieval and storage of critical system secrets using AWS Secrets Manager, ensuring consistent data integrity across 50+ applications.
  • Streamlined the CI/CD pipeline by migrating from legacy Jenkins to GitHub Actions, improving efficiency and removing the dependency on Jenkins.
  • Utilized observability tools like Datadog and SumoLogic to gain deep insights into platform performance and troubleshoot issues effectively.
  • Led the creation of an AWS EKS cluster using Terraform, showcasing technical expertise and driving innovation during an internal Hackathon project.
  • Leveraged Python scripting and automation techniques to optimize tasks, improving operational efficiency and reducing manual effort.
  • Participated in a 24/7/365 monitoring and incident response rotation, resolving outages and system issues promptly and conducting root cause analysis.
  • Contributed to the development of CircleCI custom orbs, automating repetitive processes within the CI/CD workflow.
  • Collaborated closely with development teams and system administrators to align systems with business requirements and ensure optimal performance.

DevOps Engineer

Trimble, Inc.
10.2018 - 12.2020
  • Automated the creation of Lambda functions for immediate notifications triggered by custom CloudWatch alarms, ensuring rapid incident response and resolution.
  • Collaborated with the development team to design and deploy Docker containers, facilitating the transition from a monolithic to a microservices architecture.
  • Implemented monitoring and alerting solutions using CloudWatch, Prometheus, and Grafana, enabling proactive system monitoring, quick incident detection, and efficient automated recovery.
  • Orchestrated the implementation of an EFK stack for logging within an EKS environment, improving log management and troubleshooting capabilities.
  • Configured and maintained a highly scalable, fault-tolerant 3-tier architecture for web applications on AWS, ensuring optimal performance and availability.
  • Utilized Terraform to automate the provisioning and management of RDS MySQL instances, simplifying database deployment and configuration.
  • Automated the creation of IAM roles and policies to ensure consistent and secure access control across different environments.
  • Established robust CI/CD pipelines using Jenkins, enabling frequent and reliable deployments while maintaining system stability.

System Administrator

Trimble, Inc.
10.2017 - 10.2018
  • Performed system administration tasks, such as installation, configuration, and monitoring of servers and applications.
  • Implemented and maintained security measures to protect sensitive data and systems, including access controls, encryption, and intrusion detection/prevention systems.
  • Proactively monitored system performance, troubleshot issues, and optimized system resources to ensure efficient and reliable operation.
  • Collaborated with other IT teams and departments to identify and resolve technical issues and provide timely support to end-users.

Skills

  • AWS: EC2, VPC, S3, ECR, EKS, Lambda, CloudWatch, CloudTrail, etc.
  • Containerization tools: Docker, Kubernetes
  • Linux distributions: Ubuntu, CentOS
  • IaC: Terraform
  • Configuration management: Ansible, Chef
  • Monitoring tools: AWS CloudWatch, Datadog, Prometheus, Grafana
  • Logging tools: Sentry, SumoLogic, Elasticsearch
  • Databases: MySQL, RDS, DynamoDB
  • Virtualization tools: VMware, VirtualBox
  • CI/CD tools: Jenkins, GitHub Actions, CircleCI
  • Scripting: Bash, Python
  • Agile tools: Trello, Jira

LinkedIn URL

https://www.linkedin.com/in/aidaaidarkulov/

Certification

AWS Certified Solutions Architect - Associate
