
Sai Chandana Gamini

Orlando, FL

Summary

Experienced Sr. Data Engineer with 5+ years in designing and optimizing scalable data pipelines and cloud-based architectures (AWS & Azure). Expert in Python, PySpark, Spark, and ETL automation, driving cost-efficient solutions and seamless migrations. Skilled in big data technologies, data modeling, and Infrastructure as Code (Terraform). Proven ability to deliver high-quality, reliable data workflows for analytics and machine learning teams. Certified Azure Data Engineer with strong collaboration and communication skills.

Overview

8 years of professional experience
1 Certification

Work History

Sr. Data Engineer

QLX Consulting Company, USA
10.2024 - Current
  • Designed, developed, and maintained scalable data pipelines using Python and PySpark to support pricing analytics across USA, Germany, and Canada markets.
  • Optimized and managed complex data workflows using AWS Glue and AWS Athena, improving data processing efficiency by 60%.
  • Collaborated with data scientists, analysts, and stakeholders to deliver data solutions aligned with business requirements.
  • Implemented efficient data storage solutions using Parquet and Apache Hudi, enabling faster query performance and reduced storage costs.
  • Monitored and troubleshot data pipelines to ensure high availability, performance, and reliability.
  • Developed and enforced data engineering best practices, including data governance, security, and comprehensive documentation.
  • Developed Disaster Recovery (DR) modules using Terraform and AWS native services (S3, RDS snapshots, ECS service replication) to ensure high availability.
  • Implemented ETL processes to extract, transform, and load data from diverse sources, ensuring data quality, integrity, and consistency.
  • Performed user acceptance testing (UAT) on Docker images and managed the container lifecycle, pushing images to Amazon ECR and deploying them to Amazon ECS; ensured successful deployment of new features and monitored Amazon CloudWatch for performance metrics, logs, and potential issues during rollout.
  • Automated cloud resource provisioning using Terraform scripts for Infrastructure as Code (IaC), reducing manual setup efforts by 80%.
  • Designed and implemented Apache Airflow DAGs to orchestrate complex ETL workflows, ensuring timely and reliable data movement across multiple services.
  • Integrated version control and environment-based deployments (dev, UAT, prod) through Jenkins and GitHub Actions, enabling safe rollouts with rollback options.
  • Automated testing, code deployment, and monitoring processes through CI/CD pipelines to enhance reliability.
  • Worked with Amazon VPC (Virtual Private Cloud) to support high-compute workloads and integrate them with pricing models and simulations for regional teams.

Data Engineer

Northern Trust, USA
06.2023 - 10.2024
  • Designed and implemented scalable data architectures using Azure Synapse Analytics, Azure Data Lake, and Databricks to enhance performance.
  • Developed ETL pipelines with Azure Data Factory and Databricks, integrating data from SQL databases, APIs, and on-premises systems into a centralized data warehouse.
  • Played a crucial role in migrating data from on-premises systems to Azure, utilizing Data Factory and Databricks, achieving a 30% reduction in operational costs.
  • Leveraged Databricks notebooks for advanced data transformations and machine learning workflows, enabling more accurate predictive analytics.
  • Utilized Python (Pandas), SQL, and Spark within Databricks for data transformation, ensuring clean and analysis-ready data.
  • Automated routine data processing tasks using Apache Airflow and Databricks, reducing manual work by approximately 40%.
  • Created interactive Power BI dashboards to monitor data trends and support informed decision-making.
  • Managed project tasks and team collaboration through Jira and Confluence, ensuring clear communication and timely delivery.
  • Worked with Hadoop, Spark, and Databricks for processing large datasets, building data pipelines that supported the company's analytics initiatives.
  • Wrote custom Hive queries, Pig scripts, and Databricks SQL to optimize data workflows, improving processing efficiency.

Azure Data Engineer

IFFCO TOKIO General Insurance Company
07.2020 - 06.2022
  • Developed and optimized complex T-SQL and PL/SQL stored procedures, functions, and triggers to streamline data processing and ensure data integrity across multiple insurance platforms.
  • Implemented advanced indexing strategies and query optimization techniques, resulting in a 40% improvement in report generation times.
  • Conducted thorough data analysis and validation using SQL, ensuring accuracy and consistency in policy and claims data.
  • Designed and deployed robust SSIS packages to automate ETL processes, facilitating seamless data migration from legacy systems to Azure-based solutions.
  • Utilized SSIS features such as Conditional Split, Lookup, and Derived Column transformations to cleanse and transform data effectively.
  • Leveraged Azure Data Factory to orchestrate complex data pipelines, integrating data from on-premises SQL Server and Oracle databases into Azure Data Lake Storage.
  • Utilized Azure Synapse Analytics and Azure SQL Database to build scalable data models supporting real-time analytics and reporting.
  • Collaborated with cross-functional teams to implement data governance and security protocols using Azure Purview and Azure Key Vault, ensuring compliance with industry standards.

Data Analyst

CitiusTech, India
01.2018 - 06.2020
  • Developed a customer segmentation model using clustering algorithms in Python on AWS SageMaker, enhancing targeted marketing efforts and increasing customer engagement by 25%.
  • Led the migration of the company's data warehouse from on-premises servers to Snowflake on AWS, resulting in a 40% reduction in maintenance costs and improved data access speeds.
  • Designed and implemented an ETL process using AWS Glue to integrate CRM data into the company's analytics platform, enabling real-time reporting and seamless data flow.
  • Developed interactive dashboards in Looker, hosted on AWS, to monitor operational metrics, leading to a 15% improvement in operational efficiency.
  • Conducted data cleansing and transformation using AWS Data Wrangler to improve data quality, reducing data-related issues by 35%.
  • Utilized SQL and AWS Redshift for complex queries and analyses, providing insights that informed product development and strategic planning.
  • Built a predictive maintenance model using machine learning techniques in Python on AWS SageMaker, reducing equipment downtime by 20% and optimizing maintenance schedules.
  • Spearheaded the transition of financial reporting tools from Excel to Tableau, Azure, Power BI, and AWS QuickSight, cutting report preparation time by 50% and improving visualization capabilities.
  • Collaborated with product teams to analyze user data on AWS, identifying key trends and contributing to a 10% increase in user retention.

Education

Master of Science - Computer Science and Technology

University of Texas at Arlington
05.2023

Bachelor of Science - Information Science and Technology

Sri Venkateswara College, Bangalore, India
06.2021

Skills

  • Programming & Analysis: Python, SQL, PySpark, Hive, Pig, Excel, Jupyter Notebook
  • Big Data & Data Engineering: Spark, Hadoop, Databricks, PySpark Streaming, Apache Hudi, Parquet
  • Cloud Platforms: AWS (S3, EC2, Lambda, EMR, SageMaker, Glue, Athena, Redshift, QuickSight), Azure (Synapse Analytics, Data Lake, Data Factory), Google BigQuery
  • ETL & Automation: Azure Data Factory, Databricks, Apache Airflow, Apache NiFi, AWS Glue, CI/CD Pipelines
  • Infrastructure & DevOps: Terraform (IaC), Git
  • Data Modeling & Management: MySQL, SQL Server, Snowflake, Data Governance & Security
  • Visualization: Power BI, Tableau, Looker, AWS QuickSight, matplotlib
  • Machine Learning: Model Development (Python, SageMaker, Databricks)
  • Other Tools: Jira, Confluence
  • Methodologies: Agile, Scrum
  • Soft Skills: Time Management, Communication, Team Collaboration

Certification

Azure Data Engineer Associate
