
Saravannan Pushparaj

Plano

Summary

Results-driven ETL Developer with 11+ years of IT experience in the banking domain, specializing in ETL/ELT, Snowflake, dbt, and IBM DataStage. Proven expertise in designing, optimizing, and modernizing large-scale ETL/ELT pipelines, enterprise data warehouses, and real-time data solutions. Adept at leading technical teams, enforcing data governance, and delivering high-performance, secure, and scalable data platforms.

Overview

12 years of professional experience
2 certifications

Work History

Data Engineer / Technical Lead

United Services Automobile Association (USAA)
06.2014 - Current
  • Worked primarily on Banking Fraud, SCRA, and FCRA related projects.
  • Designed the end-to-end modernization of the enterprise data warehouse from Teradata to Snowflake using AWS services, improving scalability, performance, and cost efficiency.
  • Extracted large volumes of data from Teradata using AWS Glue and AWS DMS and staged it in Amazon S3 as compressed Parquet for efficient downstream processing.
  • Developed and maintained real-time data processing solutions on AWS using Kafka, Amazon Kinesis, and Lambda functions.
  • Configured Snowpipe to auto-ingest these streaming files into Snowflake (see the load sketch after this list).
  • Optimized Teradata EDW workloads, including complex BTEQ scripts, macros, and stored procedures.
  • Mapped and converted Teradata data types to Snowflake equivalents, ensuring data integrity and compatibility across thousands of tables during schema migration.
  • Built and orchestrated ETL pipelines using AWS Glue (PySpark), enhancing processing of data staged in S3 (see the Glue job sketch after this list).
  • Used Snowflake's COPY INTO to bulk-load data from S3 into staging tables, as sketched after this list.
  • Developed Python-based reconciliation scripts to validate data accuracy between Teradata and Snowflake, comparing record counts, aggregates, and row-level values (see the reconciliation sketch after this list).
  • Implemented data quality checks using Python (Pandas) to validate pipeline outputs in Snowflake and trigger alerts on schema or value mismatches.
  • Developed, tested, and deployed data orchestration pipelines in Python using qTest; monitored MWAA workflows and ensured modular, maintainable DAG design.
  • Containerized data preprocessing tasks using Docker, deployed them on ECS Fargate for on-demand, scalable processing, and stored images in Amazon ECR.
  • Implemented data security measures such as data masking to protect sensitive customer transaction information and ensure HIPAA and GDPR compliance.
  • Automated all deployments using CloudFormation templates integrated with Jenkins pipelines, supporting multi-environment rollouts and change tracking.
  • Automated cloud resource management tasks using Python scripts and the AWS SDK (Boto3), as sketched after this list.
  • Designed scalable search architectures to handle growing datasets and user queries using Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) and Amazon CloudFront.
  • Deployed and managed containerized Docker applications as microservices on Amazon Elastic Kubernetes Service (EKS), enabling scalable, portable deployment across environments.
  • Tuned and optimized SQL queries within AWS services, ensuring faster data retrieval and efficient processing of complex queries in Snowflake and Amazon Redshift.
  • Worked with Amazon Aurora for MySQL and PostgreSQL, using SQL to integrate and manage relational databases within AWS.
  • Utilized AWS Secrets Manager for credential handling and the AWS Glue Data Catalog as the metadata store.
  • Implemented robust security measures within AWS, including encryption, IAM policies, and audit logging, safeguarding sensitive healthcare data across Amazon S3.
  • Established disaster recovery and backup plans within AWS to ensure data availability in case of system failures or disasters, leveraging services like Amazon S3, AWS Backup, and Amazon RDS for redundancy and backups.
  • Wrote shell scripts for triggering and monitoring ETL jobs.
  • Engineered data processing solutions using Spark, enhancing real-time data analysis and processing capabilities.
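
The Snowpipe and COPY INTO bullets above correspond to a load pattern along these lines; a minimal sketch using snowflake-connector-python, where the account, credentials, stage, table, and file-format details are illustrative placeholders, not the actual project configuration:

    import snowflake.connector

    # Connect to Snowflake; in practice credentials come from AWS Secrets Manager.
    conn = snowflake.connector.connect(
        account="my_account",   # placeholder
        user="etl_user",        # placeholder
        password="...",         # placeholder
        warehouse="LOAD_WH",
        database="EDW",
        schema="STAGING",
    )
    try:
        cur = conn.cursor()
        # Bulk-load compressed Parquet files from an external S3 stage into a
        # staging table (the batch path). Snowpipe automates the same COPY
        # statement for continuously arriving streaming files.
        cur.execute("""
            COPY INTO STAGING.CUSTOMER_TXN
            FROM @EDW.STAGING.S3_STAGE/customer_txn/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """)
    finally:
        conn.close()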
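
The Glue (PySpark) pipeline bullet would look roughly like this job skeleton; the job parameters and the deduplication transform are illustrative assumptions:

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions

    # Resolve arguments passed in by the Glue job definition.
    args = getResolvedOptions(sys.argv, ["JOB_NAME", "source_path", "target_path"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read Parquet staged in S3, apply a simple transform, and write back.
    df = glue_context.spark_session.read.parquet(args["source_path"])
    df = df.dropDuplicates()  # illustrative transform
    df.write.mode("overwrite").parquet(args["target_path"])

    job.commit()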
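
The Teradata-to-Snowflake reconciliation bullet amounts to running identical checks on both systems and comparing the results; a minimal sketch, assuming the teradatasql and snowflake-connector-python drivers and a hypothetical CUSTOMER_TXN table:

    import teradatasql
    import snowflake.connector

    # Checks executed identically on both systems (table names are placeholders).
    CHECKS = {
        "row_count": "SELECT COUNT(*) FROM {table}",
        "amount_sum": "SELECT COALESCE(SUM(txn_amount), 0) FROM {table}",
    }

    def run_check(cursor, sql):
        cursor.execute(sql)
        return cursor.fetchone()[0]

    with teradatasql.connect(host="td_host", user="user", password="...") as td, \
         snowflake.connector.connect(account="acct", user="user", password="...") as sf:
        td_cur, sf_cur = td.cursor(), sf.cursor()
        for name, template in CHECKS.items():
            td_val = run_check(td_cur, template.format(table="EDW.CUSTOMER_TXN"))
            sf_val = run_check(sf_cur, template.format(table="EDW.PUBLIC.CUSTOMER_TXN"))
            status = "OK" if td_val == sf_val else "MISMATCH"
            print(f"{name}: teradata={td_val} snowflake={sf_val} -> {status}")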
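
The Boto3 automation bullet follows the standard start-and-poll pattern; a sketch with a placeholder Glue job name:

    import time
    import boto3

    glue = boto3.client("glue")

    # Start a Glue job run and poll until it reaches a terminal state.
    run_id = glue.start_job_run(JobName="edw-stage-load")["JobRunId"]  # placeholder name
    while True:
        run = glue.get_job_run(JobName="edw-stage-load", RunId=run_id)
        state = run["JobRun"]["JobRunState"]
        if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
            print(f"Glue job finished: {state}")
            break
        time.sleep(30)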

Education

Bachelor of Technology - Information Technology

Anna University
Chennai
06.2012

Skills

  • Cloud services: AWS, Snowflake
  • Data integration and transformation tools: dbt (Data Build Tool), IBM DataStage
  • Programming languages: SQL, Python, and Shell scripting
  • Database technologies: SQL databases (MySQL, Oracle)
  • Data warehousing and big data technologies: Teradata, Kafka, Spark
  • Scheduling: Control-M
  • CI/CD: Git, Jenkins
  • Performance tuning
  • Query optimization

Certifications

  • DBT Fundamentals
  • AWS Certified Cloud Practitioner
