
Preyaa Atri

Manhattan, NY

Summary

Team-oriented lead data engineer seeking to leverage 5+ years of experience in building data-intensive applications, tackling challenging architectural and scalability problems, data exploration and wrangling, and ETL pipeline development and analysis. Patient problem solver with an appreciation for clean code.

Overview

9 years of professional experience

Work History

Lead Data Engineer

MSC Industrial Supply
01.2022 - Current
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Installed Apache Airflow and orchestrated data pipelines by developing Airflow operators in Python.
  • Implemented SCD Type 1 and SCD Type 2 methodologies in SQL to maintain a historical data warehouse.
  • Designed and implemented data visualizations to effectively communicate data insights using Tableau.
  • Built and optimized a data warehouse on GCP using BigQuery, Composer, Cloud Storage, and Cloud Functions, reducing costs by 30% while ensuring data integrity and accuracy.

Data Engineer

ON Q FINANCIAL
08.2020 - 12.2021
  • Created and maintained data pipelines using Python and SQL.
  • Developed, constructed, tested, and maintained databases and the architecture of other large-scale processing systems.
  • Worked with business analysts to gather requirements and design reliable, scalable data pipelines on AWS EMR.
  • Tuned table designs in Amazon Redshift.
  • Created dashboards in Power BI.

Data Improvement Intern

GRAVY ANALYTICS
05.2019 - 09.2019
  • Analyzed foot traffic from signals received from electronic devices in crowds, using Python, R, and Power BI.
  • Transferred data from a database to Elasticsearch using Java.

Data Engineer

AMDOCS DEVELOPMENT CENTRE
07.2016 - 10.2017
  • Integrated Kafka with Spark Streaming using Scala.
  • Managed HDFS and loaded unstructured data.
  • Cleaned data and performed statistical analysis using Python and R.

Data Analyst

COGNIZANT TECHNOLOGY SOLUTIONS
07.2014 - 07.2016
  • Imported and exported data using Flume and Kafka.
  • Partitioned and bucketed data in Hive.
  • Migrated a data warehouse from Oracle to AWS Redshift.
  • Performed real-time batch processing with Spark on AWS EMR.

Education

Master of Science - Data Analytics Engineering

Volgenau School of Engineering, George Mason University
Virginia, USA

Bachelor of Technology - Computer Science

Institute of Engineering, Bundelkhand University
India

Skills

  • Big Data Ecosystems - HDFS, Hive, Pig, Sqoop, Flume, Kafka, Oozie, HBase, Spark, Zookeeper
  • Languages/Concepts - Scala, Python, R, Java, SQL, Machine Learning, NLP, HiveQL, C, C++, C#
  • Databases - MySQL, PostgreSQL, MongoDB, DynamoDB, Oracle 12c
  • Tools & Utilities - Tableau, Power BI, Control-M, AutoSys, SQL Developer, Elasticsearch and Kibana, Jenkins, Hadoop Framework, Microsoft Azure
  • AWS/Cloud Services - S3, EC2, Redshift, EMR, Data Lakes, Lambda
  • GCP - Apache Airflow, BigQuery, Composer, Cloud Storage
