NAGA Y

Arlington, TX

Summary

Azure Data Engineer at Westlake Corporation with a proven track record of architecting complex ETL workflows and optimizing data models. Skilled in Azure Data Factory and SQL, with experience delivering actionable insights and collaborating with BI teams to support data-driven decision-making and improve performance for large-scale data workloads.

Overview

5
years of professional experience

Work History

Azure Data Engineer

Westlake Corporation
Houston, USA
08.2024 - Current
  • Architected and managed complex ETL workflows in Azure Data Factory, integrating on-premises and cloud data sources for seamless data movement.
  • Administered Azure SQL Database and Azure Synapse Analytics, applying indexing and query optimization techniques for large-scale workloads.
  • Designed dimensional data models for analytics in Azure Synapse Analytics (formerly Azure SQL Data Warehouse).
  • Built real-time streaming data pipelines with Azure Stream Analytics and Azure Databricks for immediate data processing and analysis.
  • Optimized Azure Blob Storage and Azure Data Lake Storage Gen2 for scalable data storage, ensuring efficient access for downstream analytics.
  • Performed performance tuning on Azure SQL Database and Cosmos DB, including query optimization and indexing strategies for high-throughput workloads.
  • Integrated multiple data sources via Azure Data Factory, automating workflows with Azure Logic Apps and Functions.
  • Collaborated with BI teams to prepare data for reporting using Power BI and SSRS, generating actionable insights.

AWS Data Engineer

Comerica Bank
Dallas, USA
02.2023 - 07.2024
  • Managed complex ETL workflows using AWS Glue, Lambda, and Kinesis for robust data processing.
  • Administered Amazon RDS and Redshift databases while applying indexing strategies to optimize performance.
  • Designed data models and pipelines with Redshift to enhance analytical capabilities.
  • Built real-time data pipelines with Kinesis to enable near real-time analysis.
  • Streamlined data storage with S3, Glacier, and Data Lake solutions for efficient dataset access.
  • Executed performance tuning on RDS and Redshift to support high-throughput workloads.
  • Implemented automated workflows with AWS tools to enhance data movement efficiency.
  • Collaborated with BI teams using QuickSight to generate comprehensive reports from datasets.

GCP Data Engineer

Novartis Pharmaceuticals
Hyderabad, India
06.2021 - 07.2022
  • Architected and managed complex ETL workflows with Google Cloud Dataflow and Dataproc, integrating data from BigQuery, Cloud Storage, and on-premises systems.
  • Administered Google BigQuery, Cloud SQL, and Cloud Spanner, employing performance optimization techniques like partitioning and query optimization for efficient data processing.
  • Designed advanced data models and warehousing solutions in Google BigQuery using dimensional modeling to support business intelligence.
  • Built real-time ingestion and processing pipelines with Google Cloud Pub/Sub and Dataflow to enable event-driven architectures.
  • Optimized Google Cloud Storage and BigQuery for large-scale data storage, ensuring efficient access and seamless GCP service integration.
  • Performed performance tuning on BigQuery, Cloud Spanner, and Cloud SQL by optimizing queries and indexing strategies for low-latency operations.
  • Integrated diverse data sources into unified workflows using Dataflow and Apache Beam, automating processing pipelines for scalability.
  • Collaborated with BI teams to prepare datasets for reporting using Data Studio and Looker, delivering actionable insights to stakeholders.

Data Engineer

Star Health and Allied Insurance
Hyderabad, India
06.2020 - 05.2021
  • Architected and developed efficient ETL pipelines, ensuring reliable data movement across diverse platforms using Apache Kafka, Spark, and Airflow.
  • Managed and optimized PostgreSQL, MySQL, and NoSQL databases to enhance query performance and ensure high availability.
  • Created cloud-based data warehouses with Amazon Redshift and Snowflake, designing scalable schema structures for seamless analysis.
  • Built real-time data processing pipelines with Apache Kafka and Flink for live ingestion and operational intelligence.
  • Automated data engineering workflows using Apache Airflow and AWS Lambda to reduce manual intervention in complex processes.
  • Leveraged monitoring tools such as Prometheus and Grafana to track health metrics of data pipelines for proactive management.
  • Integrated curated datasets with BI tools like Power BI and Tableau, enabling stakeholders to generate real-time insights.
  • Refined data workflows using Apache Spark, implementing parallel processing strategies that significantly reduced execution times.

Education

Master's - Computer Science

University of Texas at Arlington
Arlington, Texas, USA

Skills

  • AWS
  • Azure
  • GCP
  • Google Cloud Bigtable
  • Amazon S3
  • Google Cloud Storage
  • Apache Hadoop
  • Apache Spark
  • Apache Kafka
  • Apache Flink
  • Amazon Redshift
  • Google BigQuery
  • Snowflake
  • Azure Synapse Analytics
  • SQL
  • MySQL
  • PostgreSQL
  • SQL Server
  • Oracle
  • MariaDB
  • NoSQL
  • MongoDB
  • DynamoDB
  • Apache NiFi
  • AWS Glue
  • Azure Data Factory
  • Google Cloud Dataflow
  • Apache Airflow
  • AWS Kinesis
  • Azure Event Hubs
  • Google Cloud Pub/Sub
  • Star Schema
  • Snowflake Schema
  • Dimensional Modeling
  • Jenkins
  • Azure DevOps
  • GitLab CI
  • AWS CodePipeline
  • Google Cloud Build
  • Terraform
  • Power BI
  • Tableau
  • Looker
  • Qlik
  • Excel
  • TensorFlow
  • PyTorch
  • Azure Machine Learning
  • AWS SageMaker
  • Google Vertex AI
  • AWS Lambda
  • Google Cloud Functions
  • Azure Functions
  • CloudFormation
  • ARM Templates
  • HDFS
  • MapReduce
  • Hue
  • Hive
  • Pig
  • Sqoop
  • Impala
  • Apache HBase
  • Zookeeper
  • Flume
  • PySpark
  • Kubernetes
  • OpenShift
  • Google Kubernetes Engine
  • Azure Kubernetes Service
  • Amazon EKS

Timeline

Azure Data Engineer

Westlake Corporation
08.2024 - Current

AWS Data Engineer

Comerica Bank
02.2023 - 07.2024

GCP Data Engineer

Novartis Pharmaceuticals
06.2021 - 07.2022

Data Engineer

Star Health and Allied Insurance
06.2020 - 05.2021

Master's - Computer Science

University of Texas at Arlington