Jacob Garrett

Houston, TX

Summary

Data professional with experience designing, developing, and implementing data-driven solutions on AWS, Azure, and Google Cloud. Proven track record of optimizing CI/CD pipelines for data workloads, building scalable data warehouses, and automating data workflows and infrastructure. Proficient in DevOps principles and tools, including Docker containerization and Terraform, ensuring efficient and reliable deployments. Strong problem-solving skills and a demonstrated ability to collaborate with cross-functional teams to deliver high-quality, data-driven solutions that drive business value.

Overview

7 years of professional experience
5 years of data engineering experience

Work History

Senior Data Engineer

CATERPILLAR INC, HELIOS PLATFORM
Peoria, IL
2023.06 - 2023.11
  • Identified and eliminated wasteful Snowflake spending, reducing cloud warehousing costs by 15% through workload restructuring and automated shutdowns
  • Implemented warehouse adjustments, notifications, and automated shutdowns, optimizing Snowflake costs and improving efficiency by 20%
  • Replaced Flyway with a streamlined schema-change process for 5 teams, ensuring 100% data consistency and reducing errors by 30%
  • Utilized Terraform to provision and manage 20 Google Cloud Storage buckets, improving data storage efficiency by 25%
  • Established CI/CD templates using GitLab CI/CD for automated schema changes, streamlining development workflows by 40%
  • Developed and deployed Python Docker microservice, enabling real-time monitoring and adjustment of Snowflake costs, saving $10,000 monthly
  • Enhanced Azure DevOps pipelines with Python and Azure CI/CD in the PLF framework, increasing efficiency by 15%
  • Reduced deployment errors by 10% through rigorous testing and refinement of development and production pipelines
  • Implemented critical one-time changes in AWS SageMaker within sprint deadlines, utilizing Snowflake (SQL and Python) for problem identification and resolution, saving 50 hours of development time
  • Optimized existing cloud environments by identifying bottlenecks and implementing best practices for performance improvements
  • Managed disaster recovery planning for critical applications hosted in cloud environment, minimizing potential data loss or service outages during unforeseen events
  • Evaluated new cloud technologies and services, recommending strategic investments such as Snowflake Snowpark, Kafka, and schemachange to support business objectives
  • Led training sessions on cloud technologies, increasing team knowledge and fostering a culture of continuous learning
  • Championed containerization initiatives using Docker and Kubernetes, increasing application scalability while reducing the overhead associated with traditional virtualization techniques
  • Performed comprehensive analyses of existing workloads; identified opportunities to optimize their placement across various compute instances

Data Engineer

SONOBI, Push Data Team
2022.02 - 2023.06
  • Overhauled and optimized Amply Media data pipelines, migrating to Snowflake and Airflow, improving efficiency by 30%
  • Implemented advanced Snowflake features, boosting data processing efficiency by 40% and pipeline performance by 25%, saving $50,000 in costs
  • Developed Slack-integrated notification systems, enabling real-time pipeline monitoring and assisting in the AWS-to-bare-metal transition, reducing costs by 20%
  • Established testing protocols for data from 10+ API sources, ensuring 99% data quality and lineage
  • Pioneered AWS Lambda adoption, converting 20 Python scripts into event-driven processes, enabling real-time data handling from website interactions
  • Co-authored AWS infrastructure re-architecture, increasing data processing efficiency by 25% and reducing costs by 40%
  • Identified and eliminated inefficiencies, replacing 15 costly EC2 instances with Apache Spark architecture, saving $75,000 annually
  • Improved data quality by ensuring 100% data consistency across all pipelines
  • Collaborated with cross-functional teams to define requirements and develop end-to-end solutions for complex data engineering projects
  • Optimized data pipelines by implementing advanced ETL processes and streamlining data flow
  • Reduced operational costs by automating manual processes and improving overall data management efficiency
  • Analyzed large datasets to identify trends and patterns in customer behaviors
  • Prepared documentation and analytic reports, delivering summarized results, analysis and conclusions to stakeholders
  • Built databases and table structures for web applications
  • Explained data results and discussed how best to use data to support project objectives

Data Engineer

MUFG UNION BANK, Project Shared Services Team
2021.01 - 2022.02
  • Spearheaded implementation of a version-control automation tool, resulting in a 25% improvement in code management efficiency and a 30% enhancement in data management application functionality
  • Designed and refined 12 Tableau dashboards, advancing data visualization capabilities for production teams by 40%, enabling data-driven decision-making and strategic planning that led to 15% increase in operational efficiency
  • Ensured project success by leading quality assurance and architectural design for 8 projects, maintaining rigorous standards that reduced errors by 20%, and mentoring 5 developers to foster a culture of excellence
  • Developed innovative ELT and ETL applications that generated labor cost savings of over $100,000 annually and optimized data management and reporting workflows, directly enhancing leadership decision-making capabilities and reducing time-to-insight by 30%
  • Transformed user experience by overhauling web interface with React, resulting in 50% improvement in user engagement, 25% increase in efficiency, and 35% boost in usability scores based on user feedback
  • Aligned business objectives with technical requirements through close collaboration with product owners, providing insights based on available data sources
  • Established strong working relationships with stakeholders across multiple departments, facilitating clear communication channels regarding project requirements and progress updates
  • Developed custom algorithms and advanced analytics solutions tailored to specific needs of various business units, enabling actionable insights that drove informed decision making
  • Implemented monitoring systems to proactively identify potential bottlenecks in data pipelines, ensuring optimal performance at all times

Data Engineer

ENTERPRISE PRODUCTS, Big Data Team
2019.11 - 2021.01
  • Developed a PI Web API application, improving InfluxDB data extraction efficiency by 15% and enhancing data handling and accessibility
  • Led Texas Railroad Commission (TRC) web scraping initiative, improving data completeness and addressing data integrity issues
  • Created Flask web applications for PI Data Collector tag data validation and Power Optimization, streamlining data verification and optimization
  • Integrated Alteryx into data pipeline, boosting pipeline performance by 5% and enhancing data processing capabilities
  • Optimized data processing by implementing Hadoop (MapR) and Spark frameworks for big data management
  • Improved collaboration between teams by creating comprehensive documentation detailing technical aspects of various big data solutions
  • Proactively addressed potential bottlenecks in existing and new ETL processes through regular monitoring, enabling seamless workflow operations
  • Designed scalable ETL pipelines for improved data ingestion, processing, and storage
  • Automated routine tasks through scripting languages, reducing manual effort and human error risks

Education

BACHELOR’S DEGREE

University of Texas, Austin, Texas
2017.01

Skills

  • Data Engineering: ETL/ELT Development (SQL, Python, Airflow), Data Warehousing (Snowflake), Data Processing (Apache Spark), Cloud Platforms (AWS, Azure), CI/CD Pipelines (Azure DevOps, Jenkins), Containerization (Docker, Kubernetes)
  • Cloud Engineering: AWS, Azure, and GCP (basic) services, Infrastructure as Code (Terraform), Container Orchestration (Docker, Kubernetes), Workflow Orchestration (Prefect, Airflow, Azkaban), CI/CD Implementation, DevOps Methodologies
  • Programming & Development: Python, SQL, Object-Oriented Programming, API Integration
  • Databases: Relational (Snowflake, Postgres), NoSQL (InfluxDB)
  • Data Visualization: Tableau, Power BI, Looker
  • Web Development: React (Basic), Web Scraping Techniques
  • Tools & Methodologies: Version Control (Git), Agile Methodologies, Data Analysis (Alteryx, Snowflake)

Early Experience

FROM 2015 TO 2019:

  • Utilized SQL and Python to develop data pipelines for extraction, transformation, and loading (ETL), building foundational knowledge of data engineering principles
  • Designed and implemented data models to support business intelligence (BI) tools, gaining familiarity with data modeling concepts for data warehousing
  • Contributed to data analysis projects by creating data visualizations to communicate insights to stakeholders, demonstrating early data storytelling skills

Technical Skills

SQL, Python, Airflow, Prefect, Azkaban, Snowflake, Apache Spark, AWS, Azure, GCP, Azure DevOps, Jenkins, Docker, Kubernetes, Terraform, DevOps Methodologies, Postgres, InfluxDB, Tableau, Power BI, Looker, React (Basic), Web Scraping, Git, Alteryx
