Unith G

Providence

Summary

Accomplished Data Engineer with expertise in designing robust ETL pipelines at Infosys, leveraging Snowflake and Teradata for optimized data processing. Proficient in SQL optimization and data validation, ensuring compliance with HIPAA standards. A collaborative problem-solver dedicated to enhancing data reliability and efficiency across healthcare systems.

Overview

4 years of professional experience

Work History

Data Engineer

Infosys
RI
10.2024 - Current
  • Designed and implemented ETL pipelines using Snowflake and Teradata, enabling seamless extraction, transformation, and loading of data across healthcare systems
  • Developed and optimized SQL queries and stored procedures in Snowflake and Teradata for efficient data processing and retrieval, improving query performance
  • Conducted thorough data validation and cleansing to ensure data accuracy and compliance with healthcare industry standards and regulations, including HIPAA compliance
  • Implemented data models and schemas in Snowflake, leveraging clustering and partitioning techniques to enhance data storage efficiency and query speed
  • Created comprehensive test plans and executed test cases for E2E testing, including data reconciliation, data completeness, and data accuracy checks
  • Managed large-scale data loads in Snowflake, using Snowpipe for continuous data ingestion and optimizing batch processing for large healthcare datasets (see the illustrative sketch after this list)
  • Utilized Snowflake’s capabilities such as Time Travel and Zero-Copy Cloning to ensure data reliability and simplify data recovery processes during migration
  • Supported post-migration activities, including data validation, system optimization, and user acceptance testing (UAT), ensuring a smooth transition with minimal disruption
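
A minimal sketch of the kind of Snowflake work described above, assuming the snowflake-connector-python package; the connection parameters and the table, pipe, and stage names (CLAIMS_RAW, CLAIMS_PIPE, @claims_stage) are hypothetical placeholders, not details from the actual project:

    # Illustrative sketch only: placeholder credentials and object names.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="example_account",
        user="example_user",
        password="example_password",
        warehouse="ETL_WH",
        database="HEALTHCARE_DB",
        schema="STAGING",
    )
    cur = conn.cursor()

    # Clustered table so date-range queries prune micro-partitions (clustering key is illustrative).
    cur.execute("""
        CREATE TABLE IF NOT EXISTS CLAIMS_RAW (
            claim_id STRING,
            member_id STRING,
            service_date DATE,
            amount NUMBER(12, 2)
        )
        CLUSTER BY (service_date)
    """)

    # Snowpipe for continuous ingestion from an external stage (stage name is hypothetical).
    cur.execute("""
        CREATE PIPE IF NOT EXISTS CLAIMS_PIPE AUTO_INGEST = TRUE AS
        COPY INTO CLAIMS_RAW
        FROM @claims_stage
        FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"')
    """)

    # Time Travel: row count as of one hour ago, a quick reconciliation check during migration.
    cur.execute("SELECT COUNT(*) FROM CLAIMS_RAW AT (OFFSET => -3600)")
    print("Row count one hour ago:", cur.fetchone()[0])

    cur.close()
    conn.close()

Clustering, Snowpipe-based continuous loading, and a Time Travel count check mirror the techniques listed above; in practice the connection would use key-pair or SSO authentication rather than an inline password.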

Data Engineer

Tangensis Inc
TX
05.2023 - 10.2024
  • Designed and implemented robust data pipelines using Google Cloud Platform (GCP) services including BigQuery, Cloud Dataflow, Cloud Storage, Cloud Pub/Sub, and Cloud Composer, ensuring efficient data ingestion, transformation, and loading (ETL) processes
  • Developed and maintained scalable ETL workflows using Cloud Dataflow (Apache Beam) for both batch and real-time data processing, optimizing data flow and minimizing latency (see the illustrative sketch after this list)
  • Managed data storage and retrieval processes using Google Cloud Storage and BigQuery, implementing partitioning, clustering, and data lifecycle policies to optimize performance and costs
  • Built and optimized data models in BigQuery, improving query execution times and enhancing data analytics capabilities
  • Managed data security and governance by implementing GCP IAM roles and permissions, ensuring data confidentiality and compliance with industry standards
  • Utilized Data Catalog for metadata management, enhancing data discoverability and governance across the organization
  • Implemented data ingestion solutions with Cloud Pub/Sub for real-time streaming data, improving system responsiveness and data availability
  • Utilized Cloud Composer (Apache Airflow) for orchestrating complex data workflows, ensuring data pipeline reliability and automating end-to-end data processes
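
A minimal sketch of a streaming Dataflow (Apache Beam) pipeline of the sort described above, assuming the apache-beam[gcp] package; the project, subscription, table, and schema names are hypothetical placeholders:

    # Illustrative sketch only: hypothetical project, subscription, and table names.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Streaming mode so the pipeline keeps reading from Pub/Sub.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                subscription="projects/example-project/subscriptions/events-sub")
            # Assumes each Pub/Sub message is a JSON object whose keys match the BigQuery schema.
            | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="event_id:STRING,event_ts:TIMESTAMP,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED)
        )

The streaming flag keeps the Pub/Sub read running continuously; a batch variant of the same pipeline would swap the source for a bounded read such as beam.io.ReadFromText.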

Data Engineer

IndustryArc
India
05.2021 - 08.2022
  • Designed, developed, and maintained data pipelines using AWS services including AWS Glue, AWS Lambda, Amazon S3, Amazon Redshift, and Amazon RDS to ensure efficient data extraction, transformation, and loading (ETL) processes
  • Implemented data ingestion workflows with AWS Kinesis and AWS Data Pipeline, optimizing real-time and batch data processing for improved system performance and data reliability
  • Developed and maintained ETL pipelines using AWS Glue and Python, ensuring seamless data integration from multiple sources into Amazon Redshift for analytics and reporting
  • Built and optimized data models in Amazon Redshift, improving query performance and reducing data retrieval times
  • Utilized AWS Lambda for serverless computing to automate data processing tasks, reducing system overhead and improving scalability (see the illustrative sketch after this list)
  • Managed data storage solutions with Amazon S3, implementing data partitioning and lifecycle policies to optimize storage costs and ensure data availability
  • Ensured data security and compliance by implementing AWS IAM roles and policies, encrypting sensitive data at rest and in transit using AWS KMS
  • Worked with AWS CloudWatch and AWS CloudTrail for monitoring, logging, and troubleshooting data pipeline issues, ensuring high availability and minimal downtime
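
A minimal sketch of a serverless load path of the sort described above, assuming boto3 inside an AWS Lambda runtime and the Redshift Data API; the cluster, database, table, and IAM role identifiers are hypothetical placeholders:

    # Illustrative sketch only: placeholder cluster, database, table, and role identifiers.
    import boto3

    redshift_data = boto3.client("redshift-data")

    def lambda_handler(event, context):
        """Triggered by an S3 put event; issues a Redshift COPY for the new object."""
        record = event["Records"][0]
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        copy_sql = (
            f"COPY staging.claims FROM 's3://{bucket}/{key}' "
            "IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role' "
            "FORMAT AS PARQUET"
        )

        # The Redshift Data API runs the statement asynchronously, so the Lambda
        # function does not hold a database connection open.
        response = redshift_data.execute_statement(
            ClusterIdentifier="example-cluster",
            Database="analytics",
            DbUser="etl_user",
            Sql=copy_sql,
        )
        return {"statement_id": response["Id"]}

Using the Redshift Data API keeps the function stateless and avoids managing database connections from Lambda; the same COPY pattern applies when the load is orchestrated by AWS Glue instead.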

Intern

AI-ROBOTICA
Bengaluru
01.2021 - 04.2021
  • Project: News App (Android Development), March 2023 - May 2023
  • API Integration: Generated and implemented API keys to seamlessly fetch and display real-time news content within the application, ensuring a constant flow of up-to-date information for users
  • Technology Stack: Developed the Android app using XML and Android Studio, leveraging the Kotlin programming language; this combination provided a robust foundation for a dynamic and responsive news application
  • Real-time News Access: Engineered features within the app to give users real-time access to the latest headlines, enhancing user engagement and positioning the app as a reliable source for staying informed
  • User-Friendly Design: Crafted a user-friendly interface through XML, focusing on an intuitive and visually appealing design; prioritized a seamless user experience to ensure easy navigation and accessibility of news content
  • Kotlin-Based Functionality: Leveraged Kotlin's concise syntax and advanced features to create a modern and efficient application
  • Current Affairs Integration: Developed a news app that enables users to stay up-to-date with the latest headlines, fostering awareness of and engagement with current affairs

Education

Master's - CS

University of Illinois Springfield
Springfield, IL
01.2023

Bachelor's - CS

Gitam University
India
01.2021

Skills

  • Python
  • SQL
  • PL/SQL
  • Snowflake
  • Teradata
  • Amazon Redshift
  • Google BigQuery
  • GCP Dataflow
  • GCP Cloud Storage
  • ETL
  • AWS Data Pipeline
  • Cloud Composer
  • Data Modeling
  • SQL Optimization
  • Serverless Computing
  • End-to-End Testing
  • HIPAA Compliance
  • AWS IAM
  • Data Validation
  • Cloud Monitoring
  • Git
  • Jenkins
  • Google Data Studio
  • Looker
  • Python Scripting
  • AWS CloudFormation

Timeline

Data Engineer

Infosys
10.2024 - Current

Data Engineer

Tangensis Inc
05.2023 - 10.2024

Data Engineer

IndustryArc
05.2021 - 08.2022

Intern

AI-ROBOTICA
01.2021 - 04.2021

Master's - CS

University of Illinois Springfield

Bachelor's - CS

Gitam University