SAI NITHIN MAIPATH

Frisco, TX

Summary

  • Results-driven Data Engineer with 3+ years of experience building and managing modern data stacks with modern deployment tools. Specialized in creating efficient data pipelines, automating cloud infrastructure, and implementing data governance and DataOps practices using generative AI.
  • Explored modern DevOps tools such as Harness, implementing CI/CD, chaos engineering, and intelligent rollbacks in a unified platform using Harness delegates.
  • Capable of adapting open-source code to company needs; integrated AI with existing resources to reduce tool dependencies for data quality and other tasks.
  • Work with both traditional and modern data stacks, using different tools for different data engineering and data analysis tasks.
  • In-depth knowledge of the Apache Iceberg table format, including its ACID capabilities and its integration with the Trino query engine to transform Parquet and Avro files into aggregated tables in a medallion architecture.
  • Worked with large unstructured data sets to drive valuable insights using big data technologies. Integrated machine learning workflows into data pipelines, streamlining model deployment and management using Airflow. Optimized data preprocessing and feature engineering processes to enhance model performance and scalability, enabling seamless collaboration between data engineering and ML teams.

Overview

5 years of professional experience

Work History

Data Engineer

Terces Solutions
Dallas
05.2024 - Current
  • Designed and deployed scalable infrastructure using Terraform and GCP components to centralize operations for grocery stores and restaurants, integrating over 90 resources and streamlining CI/CD workflows with GitHub Actions and Harness
  • Built and optimized ETL pipelines using Airbyte, Docker, and Spark jobs in BigQuery, enabling seamless data ingestion and transformation to support real-time insights on sales, inventory, and operational trends
  • Ensured availability and data quality using Dataplex
  • Integrated LangChain with Airbyte to ask AI questions about data processing and batching
  • Managed data lakes with Apache Iceberg and Trino, delivering actionable insights for promotions, combo offers, and pricing strategies while improving query performance and scalability
  • Developed dashboards in Apache Superset, visualizing operational metrics and sales trends, empowering stakeholders to make data-driven decisions and enhance profitability
  • Implemented cost optimization strategies, including automated compute shutdowns post-transformation, reducing cloud expenses by 25% while ensuring uninterrupted service delivery

Data Engineer

Infosys LTD
India
08.2021 - 07.2022
  • Designed automated infrastructure provisioning for telecom clients using Terraform and AWS CloudFormation, reducing deployment time by 30% and enabling seamless scalability
  • Built modular transformation workflows with dbt in BigQuery, defining table schemas and creating aggregated views that enabled business insights for customer retention strategies
  • Used PostgreSQL for metadata storage and small-file storage during the transformation stages for future analysis
  • Orchestrated ETL pipelines with Airflow, rapidly resolving bottlenecks and ensuring uninterrupted data flow for high-volume transactional datasets
  • Stored metadata in NoSQL databases such as MongoDB and in MySQL databases, managed through the Nessie catalog, to avoid data loss during container refreshes or redeployments
  • Optimized query performance in BigQuery and Redshift, improving analytical speeds by 20% for monthly customer reports and promotions
  • Collaborated with cross-functional teams to standardize data mapping and workflows, reducing onboarding time for new projects by 25%, and worked collaboratively on data models to get the most out of the metrics the business required
  • Conducted cost analysis for cloud resources, implementing optimizations that saved 15% in operational expenses without compromising performance

Data Engineer

Terces Solutions
India
08.2019 - 07.2021
  • Optimized cloud infrastructure for F&B clients using AWS IAM, EC2, VPC, and CloudFormation, automating the deployment of over 20 resources and reducing setup time by 30%
  • Managed Terraform scripts and developed startup scripts using shell scripting and Python for all instances, reducing repetitive infrastructure management tasks by 50% to deliver monthly reports and promotions for customers
  • Applied version control using GitHub and automated deployment using GitHub Actions, reducing deployment time by 35%
  • Extracted and tested e-commerce and area-related data from APIs, managing the data in S3 and identifying key performance indicators from Google Maps, Zomato, and local government websites
  • Administered ingestion pipelines built on PySpark jobs and orchestrated with Airflow, rapidly identifying bottlenecks, analyzing logs to diagnose failures, and implementing quick resolutions to ensure seamless data flow
  • Leveraged Amazon Redshift for high-performance data warehousing and efficient large-scale data storage
  • Documented workflow templates for client projects and supported business analysis tasks to further reduce onboarding time by 20%

Education

Master of Science (M.S.) - Business Analytics

University of North Texas
05.2024

Bachelor of Technology - Mechanical Engineering

Vasavi College of Engineering
06.2021

Skills

  • Airbyte
  • Fivetran
  • Lambda
  • BigQuery
  • Apache Spark
  • dbt
  • Dagster
  • Apache Airflow
  • Amazon Redshift
  • Snowflake
  • Apache Kafka
  • AWS
  • GCP
  • Terraform
  • Docker
  • Portainer
  • Dataplex
  • Harness
  • Mini Kubernetes
  • Dremio
  • AWS Glue Data Catalog
  • Apache Iceberg
  • Apache Superset
  • Tableau
  • Medallion Architecture
  • Hadoop
  • Python
  • NoSQL
  • PostgreSQL
  • Shell Scripting
  • Parquet
  • YAML
  • JSON
  • Version Control
  • GitHub
  • Data Mapping
  • Data Modeling

Projects

Data Engineer for Foodlytics – Analytical Solutions for Grocery Store Chain

Project Overview:
Developed end-to-end data pipelines for a grocery store chain to analyze operational data, focusing on trends like seasonality, daily sales, and category performance. Integrated real-time reporting with automated dashboards and implemented optimized workflows for cost-effective data management.

Business Value:

· Introduced weekend combo promotions based on sales trend analysis, resulting in a significant revenue boost and enhanced customer satisfaction.

· Delivered actionable insights into product and category trends, enabling better inventory management and reducing stockouts.

· Built centralized dashboards to empower stakeholders with real-time operational insights, driving data-driven decision-making across multiple stores.
