Avinash Yerramilli

Hutto, TX

Summary

Data Engineer with over 12 years of experience designing, implementing, and optimizing data solutions across the Retail, Telecom, Banking, and Healthcare domains. Expertise includes Snowflake, ETL development with Talend and Informatica, and enterprise data warehousing. Proven track record of delivering scalable, efficient data architectures aligned with business needs.

Overview

12 years of professional experience

Work History

Senior Data Engineer

Skidmore
Remote
11.2023 - 04.2024
  • Led the design and implementation of scalable Azure data solutions, enhancing data visualization capabilities and optimizing business processes
  • Architected and assembled large, complex data sets that met business and technical requirements
  • Orchestrated end-to-end ETL processes using Azure Data Factory, T-SQL, and Spark SQL to extract, transform, and load data into Azure Data Storage
  • Developed and maintained data pipelines using Azure Data Factory and Azure Databricks for optimized data processing and transformation
  • Built scalable data pipelines using Azure Databricks to process large datasets
  • Optimized Spark jobs for performance and cost-effectiveness within Azure Databricks
  • Collaborated with data scientists to deploy machine learning models in Azure Databricks
  • Ensured data security and compliance by implementing best practices in Azure Databricks environments
  • Utilized Microsoft SQL Server Management Studio (SSMS) for database management, optimization, and query development
  • Configured and monitored Azure Data Factory pipelines using Azure Monitor and other monitoring tools to ensure reliable and scalable data workflows
  • Collaborated with data engineers and architects to design and optimize data movement strategies using Azure Data Factory, ensuring compliance with best practices and architectural guidelines
  • Implemented error handling, logging, and retry mechanisms within Azure Data Factory pipelines to enhance data pipeline reliability and fault tolerance
  • Developed SQL queries in Power BI Desktop to validate both static and dynamic data
  • Built automated reports and dashboards using Power BI and other reporting tools to provide actionable insights to stakeholders
  • Designed and developed DBT data models to transform data for analysis and loading into data warehouse solutions
  • Loaded data into Snowflake using SnowSQL, optimizing performance and ensuring data integrity
  • Designed and implemented Snowflake databases, schemas, tables, and views to align with business requirements
  • Managed Snowflake user access and privileges based on business roles, ensuring data security and compliance
  • Leveraged Snowflake features such as Clone and Time Travel for data versioning and historical analysis (illustrated in the sketch below)
  • Created mapping documents outlining data flow from source to targets to ensure effective data integration and transformation
  • Leveraged Autosys and Jenkins for job scheduling and automation of data processes
  • Maintained thorough documentation of data architecture, processes, and workflows to facilitate knowledge sharing and compliance
  • Collaborated with cross-functional teams to ensure data requirements were captured and fulfilled
  • Optimized Snowflake performance by tuning queries and managing data partitioning
  • Collaborated with data analysts and scientists to leverage Snowflake for advanced analytics and reporting
  • Environment Used: Azure Services (Data Lake, Storage, SQL, Databricks, Synapse Analytics), Snowflake (Data Warehouse), Power BI (Reporting environment), DBT (Data Build Tool), GitLab CI/CD, SQL, Python.
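
To illustrate the Clone and Time Travel usage above: a minimal sketch using the Snowflake Python connector. The ORDERS table and all connection parameters are placeholders, not details from this engagement.

```python
# Minimal, hypothetical sketch of Snowflake Time Travel and zero-copy clone.
# All identifiers and credentials below are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # placeholder
    user="etl_user",           # placeholder
    password="***",
    warehouse="ANALYTICS_WH",
    database="SALES_DB",
    schema="PUBLIC",
)
cur = conn.cursor()
try:
    # Time Travel: query the table as it looked one hour ago.
    cur.execute("SELECT COUNT(*) FROM orders AT (OFFSET => -3600)")
    print("rows one hour ago:", cur.fetchone()[0])

    # Zero-copy clone of that historical state for downstream analysis.
    cur.execute(
        "CREATE OR REPLACE TABLE orders_snapshot CLONE orders AT (OFFSET => -3600)"
    )
finally:
    cur.close()
    conn.close()
```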

Senior Data Engineer/Snowflake Developer

Rheem Manufacturing Inc.
Plano, TX
01.2022 - 11.2023
  • Utilized SnowSQL to load data from AWS S3 into Snowflake, deploying scripts for object updates and executing SQL queries, DDL, and DML operations
  • Designed and implemented secure data pipelines integrating on-premises and cloud data sources into Snowflake, leveraging Ab Initio for ETL processes and real-time data updates via Snowpipes from AWS S3 (sketched below)
  • Prepared Snowflake databases, schemas, tables, and views to align with business requirements, optimizing data processing
  • Managed Snowflake user access and privileges based on business roles, automated data ingestion with AWS Lambda functions, and optimized data processing using Snowflake Multi-Cluster Warehouses and Virtual Warehouses
  • Demonstrated expertise in Snowflake cloud technology, including utilizing Snowflake Clone and Time Travel for data versioning and historical analysis
  • Designed and implemented data warehousing solutions using Snowflake
  • Developed ETL pipelines to load data into Snowflake from various sources
  • Implemented Python-based data quality checks and validation to ensure accuracy, integrated data from MongoDB databases, and designed DAX queries for Power BI reports
  • Developed Snowflake pipelines for efficient data processing and flow, leveraging Talend for complex ETL workflows to ensure seamless integration and transformation
  • Collaborated with data engineering teams to architect and develop data pipelines using Scala and Spark for batch and streaming data processing
  • Engaged with stakeholders to elicit and analyze business requirements for BI solutions, identifying key metrics and reporting needs
  • Designed and implemented complex data models within Snowflake to align with business requirements and industry standards
  • Leveraged AWS services such as Lambda, Glue, S3, IAM, and Kinesis to build scalable and reliable data solutions
  • Developed and maintained automated data processing workflows using GitLab CI/CD pipelines, ensuring robust and efficient data operations
  • Conducted data analysis and implemented test automation to validate data integrity and performance across various stages of data pipelines
  • Developed and maintained data processing scripts using Python for various ETL tasks
  • Automated data workflows and reporting processes using Python
  • Integrated Python scripts with cloud services for data extraction, transformation, and loading
  • Utilized Python for data analysis and visualization to support business decision-making
  • Managed source code repositories and CI/CD pipelines using GitLab
  • Collaborated with development teams to ensure code quality and adherence to best practices
  • Automated deployment processes using GitLab CI/CD for continuous integration and delivery
  • Monitored and resolved issues in GitLab pipelines to ensure smooth development workflows
  • Environment Used: Snowflake, AWS (S3, Lambda, Glue, IAM, Kinesis), Ab Initio, Python, Power BI, Talend, MongoDB, Scala, Spark, DAX Queries, SQL, GitLab CI/CD.
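
A hedged sketch of the Snowpipe auto-ingest pattern described above, assuming a hypothetical landing bucket, stage, pipe, and target table. A production setup would use a storage integration for S3 credentials, elided here.

```python
# Minimal, hypothetical sketch of continuous S3-to-Snowflake ingestion via
# Snowpipe. Bucket, stage, pipe, and table names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="etl_user", password="***")
cur = conn.cursor()
cur.execute("USE SCHEMA SALES_DB.RAW")

# External stage over the S3 landing prefix (credentials/integration elided).
cur.execute("""
    CREATE STAGE IF NOT EXISTS s3_landing
      URL = 's3://example-bucket/landing/'
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""")

# Pipe that copies each newly arrived file as S3 event notifications fire.
cur.execute("""
    CREATE PIPE IF NOT EXISTS orders_pipe AUTO_INGEST = TRUE AS
      COPY INTO raw_orders FROM @s3_landing
""")
cur.close()
conn.close()
```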

Data Engineer

Delta Dental
Remote
07.2019 - 01.2022
  • Designed and deployed scalable Azure solutions using Azure VMs, Azure Blob Storage, Azure Functions, and Azure Resource Manager for infrastructure deployment
  • Collaborated with cross-functional teams to gather data requirements and provided Python-based solutions for effective data transformation and integration
  • Developed automation scripts using Python and Azure CLI to streamline infrastructure configuration, enhancing operational efficiency and security compliance
  • Conducted security audits and implemented Azure IAM roles, encryption, and access controls to ensure robust data security and compliance
  • Provided ongoing production support, resolving issues related to Azure infrastructure and applications, and administered Snowflake roles for effective data governance
  • Implemented data pipelines and workflows to automate data integration and processing, optimizing data flow and accuracy
  • Conducted performance tuning and query optimization to enhance query performance and reduce response times
  • Managed and monitored Azure SQL Database instances to ensure high availability, reliability, and data integrity, implementing database backup and recovery strategies
  • Led the migration of on-premises databases to Azure SQL Database using Azure Database Migration Service (DMS) for seamless data transfer
  • Developed reports in Looker based on Snowflake connections, validating insights with Azure SQL Database data for accuracy
  • Collaborated with business users to gather data visualization requirements, designing Power BI Dashboards and Reports for data-driven decision-making processes
  • Utilized Power BI for creating dynamic and drill-down visualizations, enabling real-time trend analysis from various databases, including Snowflake
  • Developed and optimized PySpark applications for distributed data processing, leveraging Apache Spark's parallel computing capabilities to handle large-scale datasets efficiently
  • Implemented complex data transformations and aggregations using PySpark DataFrame APIs to prepare data for analytics and machine learning tasks (see the sketch below)
  • Integrated PySpark with cloud-based data platforms such as Azure Databricks to orchestrate and automate data workflows in scalable and cost-effective ways
  • Collaborated with data scientists and analysts to deploy machine learning models using PySpark MLlib, facilitating predictive analytics and data-driven decision-making
  • Implemented data pipelines in PySpark to ingest, transform, and load data from diverse sources into data lakes or data warehouses, ensuring data quality and reliability
  • Environment Used: Azure Services (VMs, Blob Storage, Functions, Resource Manager, SQL Database, DMS, IAM), Snowflake, Power BI, PySpark, Azure Databricks, Python, Azure CLI.
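
A minimal sketch of the kind of PySpark DataFrame aggregation described above; the input path, column names, and claim semantics are illustrative assumptions, not project specifics.

```python
# Hypothetical PySpark aggregation: monthly claim totals per plan.
# Paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("claims-aggregation").getOrCreate()

# Placeholder ADLS path for the raw claims data.
claims = spark.read.parquet("abfss://data@account.dfs.core.windows.net/claims/")

# Aggregate to monthly totals, ready for reporting or ML feature pipelines.
monthly = (
    claims
    .withColumn("claim_month", F.date_trunc("month", F.col("service_date")))
    .groupBy("plan_id", "claim_month")
    .agg(
        F.count("*").alias("claim_count"),
        F.sum("paid_amount").alias("total_paid"),
    )
)
monthly.write.mode("overwrite").parquet(
    "abfss://data@account.dfs.core.windows.net/marts/monthly_claims/"
)
```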

Snowflake/Talend Developer

KPI Partners
Newark, CA
01.2018 - 07.2019
  • Implemented Agile Scrum methodologies for requirements gathering, analysis, and sprint planning
  • Designed dimensional models for data marts using star and snowflake schemas (a DDL sketch follows this role)
  • Managed extraction, transformation, and load (ETL) processes using Talend, optimizing data integration workflows
  • Automated data loading into Snowflake via Snowpipes and implemented continuous data integration from various sources
  • Implemented scripts for object updates and executed DDL and DML operations in Snowflake
  • Developed robust data solutions to ensure smooth data flow from various on-premises and cloud data sources into Snowflake, leveraging Talend for ETL
  • Assisted in the migration from legacy ETL processes to Talend, enhancing data processing performance and maintainability
  • Provided technical support for Talend deployment, ensuring high availability and performance of data pipelines
  • Utilized Talend to transform data for analysis, designing complex ETL workflows to integrate data from various sources into Snowflake
  • Designed and implemented data models to support business intelligence and analytics needs
  • Created ER diagrams and data flow diagrams to represent data structures and processes
  • Optimized data models for performance and scalability in large-scale data environments
  • Collaborated with stakeholders to gather requirements and refine data models based on business needs.
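
A minimal DDL sketch of the star-schema approach referenced above, with hypothetical dimension and fact tables; all names and types are illustrative only.

```python
# Hypothetical star-schema DDL for a sales data mart, executed through the
# Snowflake Python connector. Snowflake records but does not enforce the
# foreign-key constraints shown here.
import snowflake.connector

DDL = [
    """CREATE TABLE IF NOT EXISTS dim_customer (
         customer_key INTEGER AUTOINCREMENT PRIMARY KEY,
         customer_id  VARCHAR,
         name         VARCHAR,
         region       VARCHAR
       )""",
    """CREATE TABLE IF NOT EXISTS dim_date (
         date_key  INTEGER PRIMARY KEY,   -- e.g. 20190115
         full_date DATE,
         month     INTEGER,
         year      INTEGER
       )""",
    """CREATE TABLE IF NOT EXISTS fact_sales (
         customer_key INTEGER REFERENCES dim_customer (customer_key),
         date_key     INTEGER REFERENCES dim_date (date_key),
         quantity     INTEGER,
         amount       NUMBER(12, 2)
       )""",
]

conn = snowflake.connector.connect(account="my_account", user="etl_user", password="***")
cur = conn.cursor()
cur.execute("USE SCHEMA SALES_DB.MART")
for stmt in DDL:
    cur.execute(stmt)
cur.close()
conn.close()
```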

Data Engineer

Accenture
Austin, TX
02.2016 - 01.2018
  • Utilized ETL processes to extract and load data efficiently from Oracle databases and flat files into an Oracle Data Warehouse
  • Conducted analysis, documentation, and testing of workflows to ensure data accuracy and reliability
  • Debugged Informatica mappings and validated data in target tables after loading
  • Monitored workflows using Workflow Manager and Workflow Monitor to ensure smooth execution of data processes
  • Automated data loading and transformation tasks by creating and scheduling workflows/worklets using Workflow Manager
  • Implemented various transformations (e.g., Update Strategy, Lookup, Filter, Router) to manipulate data effectively during ETL processes
  • Developed mappings for loading data into Slowly Changing Dimensions, preserving historical data for accurate reporting (logic sketched below)
  • Ensured proper execution of incremental and complete data loads, managing dependencies for efficient data processing
  • Maintained OLAP systems to deliver high-quality data supporting Decision Support Systems (DSS)
  • Designed ETL processes using Informatica ETL tool to integrate data from diverse sources such as flat files, SQL Server, Teradata, and Oracle
  • Utilized Scala's functional programming capabilities to write concise and efficient code for data manipulation and computation
  • Contributed to open-source Scala projects and libraries, demonstrating active participation in the Scala community
  • Mentored junior developers in Scala best practices, functional programming concepts, and Spark optimization techniques
  • Designed, deployed, and managed Snowflake data warehouse solutions to support various analytics and business intelligence needs
  • Developed and implemented ETL processes using Talend for efficient data transformation and loading into Snowflake
  • Built robust data pipelines using Talend to extract data from various sources, transforming and loading it into Snowflake for analysis
  • Created and optimized Snowflake queries to support complex reporting requirements, ensuring high performance and accuracy
  • Designed and implemented secure data sharing and collaboration solutions using Snowflake, enabling seamless data access across teams and departments
  • Automated data quality checks and validation processes to ensure the accuracy and reliability of data in Snowflake
  • Developed automated test scripts to ensure data quality and accuracy
  • Implemented continuous testing processes using tools like Selenium and JUnit
  • Collaborated with QA teams to identify test requirements and create test plans
  • Monitored and maintained automated test environments to ensure reliability and performance
  • Environment Used: ETL (Informatica), Oracle Data Warehouse, Workflow Manager, Workflow Monitor, OLAP, Scala, Snowflake, Talend, SQL Server, Teradata.
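
The Slowly Changing Dimension loads above were built as Informatica mappings (Update Strategy plus Lookup transformations); as a hedged illustration of the equivalent set-based Type 2 logic, with hypothetical staging and dimension tables:

```python
# Hypothetical SQL sketch of Type 2 SCD logic equivalent to the Informatica
# mappings described above. Table and column names are placeholders; the
# two statements run in order inside one transaction, e.g.
#   cursor.execute(SCD2_EXPIRE); cursor.execute(SCD2_INSERT); conn.commit()

# Step 1: close out the current row for any customer whose attributes changed.
SCD2_EXPIRE = """
    UPDATE dim_customer d
       SET effective_to = CURRENT_DATE,
           is_current   = 0
     WHERE d.is_current = 1
       AND EXISTS (SELECT 1
                     FROM stg_customer s
                    WHERE s.customer_id = d.customer_id
                      AND (s.name <> d.name OR s.region <> d.region))
"""

# Step 2: insert a fresh current row wherever no current row exists.
SCD2_INSERT = """
    INSERT INTO dim_customer (customer_id, name, region,
                              effective_from, effective_to, is_current)
    SELECT s.customer_id, s.name, s.region, CURRENT_DATE, NULL, 1
      FROM stg_customer s
      LEFT JOIN dim_customer d
        ON d.customer_id = s.customer_id AND d.is_current = 1
     WHERE d.customer_id IS NULL   -- no current row: new or just-changed customer
"""
```

Running the expire before the insert is what makes this work: changed customers lose their current row in step 1, so step 2 gives both them and brand-new customers a fresh current version.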

SQL Developer

Service Stream
Melbourne, AU
02.2012 - 12.2015
  • Performed regular database maintenance and development tasks, such as backups, restores, and index optimization, to ensure data integrity and availability
  • Monitored database performance and proactively identified and resolved performance bottlenecks or issues
  • Designed, developed, and implemented SQL database solutions, including data models, tables, stored procedures, and views
  • Collaborated with cross-functional teams to understand data requirements and assisted in developing efficient database solutions
  • Worked closely with application developers to optimize database queries and improve overall system performance
  • Troubleshot and resolved database-related problems in a timely and efficient manner
  • Ensured database security and data privacy by implementing access controls and encryption techniques
  • Created and maintained documentation related to database design, architecture, and processes
  • Stayed up to date with the latest database technologies and trends to propose improvements and innovative solutions
  • Designed and optimized complex SQL queries for data extraction and analysis
  • Managed and maintained database schemas, indexes, and stored procedures
  • Conducted performance tuning and query optimization to enhance database performance (see the sketch below)
  • Ensured data integrity and security through the implementation of SQL best practices.
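
A minimal sketch of the index-plus-query tuning pattern described above, assuming SQL Server via pyodbc; the DSN, table, index, and columns are hypothetical.

```python
# Hypothetical tuning sketch: cover a frequent lookup with a nonclustered
# index and run it as a parameterized query. All identifiers are placeholders.
import pyodbc

conn = pyodbc.connect("DSN=ServiceDB")  # placeholder connection string
cur = conn.cursor()

# Covering index so the lookup below avoids a full table scan.
cur.execute("""
    CREATE NONCLUSTERED INDEX IX_orders_customer_date
        ON dbo.orders (customer_id, order_date)
        INCLUDE (total_amount)
""")
conn.commit()

# Parameterized query: plan reuse and no string concatenation.
cur.execute(
    """SELECT order_date, total_amount
         FROM dbo.orders
        WHERE customer_id = ? AND order_date >= ?""",
    12345, "2015-01-01",
)
for row in cur.fetchall():
    print(row.order_date, row.total_amount)

cur.close()
conn.close()
```

The INCLUDE column makes the index covering, so the query is answered from the index alone.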

Education

Master's in Information Systems

University of Ballarat
03.2011

Bachelor of Computer Science

Jawaharlal Nehru Technological University
07.2008

Skills

  • Modern Relational Databases
  • SQL and Query Optimization
  • Cloud Platforms (AWS, Azure)
  • ETL Tools (Talend, Informatica)
  • Snowflake Data Warehouse
  • Data Modeling and Transformation
  • Reporting and Visualization
  • Python Programming
  • PySpark
  • Cluster Computing Frameworks
  • Data Engineering
  • GitLab CI/CD
  • Data Analysis
  • Test Automation
