
Nitesh Srivatsav

Seattle, WA

Summary

Experienced Data Engineer keen to help companies collect, collate and exploit digital assets. Practiced at cleansing and organizing data into new, more functional formats to drive increased efficiency and enhanced returns on investment.

Overview

7
years of professional experience

Work History

Data Engineer

Amazon LLC
09.2021 - Current
  • Built a real-time workflow that ingests metadata for Amazon resources (Redshift, S3, etc.) from across Amazon into a privacy monitoring and auditing system.
  • Built an event-driven pricing detection system using AWS Lambda and Kinesis that monitors the checkout prices of Amazon Devices products and alerts when a price breaches specified thresholds, saving over $10M across Devices every year.
  • Automated the Operational Excellence scorecard, which fetches AWS metrics data across the Devices org, stores it in a secure data lake, and publishes a weekly report.
  • Working with Data Scientists to build a data ingestion system that sources the top Devices products to advertise across all Amazon-supported marketplaces to drive Devices revenue.

Data Engineer

Vistra Energy Corp.
08.2018 - Current
  • Extensive experience building ETL/ELT pipelines with Python and PySpark that fetch data from disparate sources, clean and transform it, and load it into data warehouses, scheduled with tools like Airflow.
  • Experienced in provisioning AWS resources through CloudFormation stacks, with the CloudFormation templates (CFTs) kept under source control.
  • Designed and implemented event-driven data ingestion pipelines that load data from S3 into Snowflake, using S3 events that trigger Lambda functions through SQS.
  • Dockerizing applications and deploying them through CI/CD pipelines (GitLab CI/CD, Jenkins).
  • Working closely with Data Scientists to transform data from various sources, store it in partitioned Parquet tables in Hadoop, and extract actionable insights.
  • Generated detailed studies on potential third-party data handling solutions, verifying compliance with internal needs and stakeholder requirements.
  • Participating in Scrum and code reviews to understand complex business requirements and translate them into optimized, scalable solutions with efficient, high-quality code in Python or Java.

Education

Master of Science - Computer Science

The University of Texas At Dallas
Richardson, TX

Bachelor of Science - Computer Science

SRM University

Skills

  • Languages: Python, Java, Bash, and JavaScript
  • DevOps: Docker, GitLab CI/CD, Jenkins, Kubernetes
  • Big Data: Hadoop, Spark, Hive, Impala, Sqoop, MapReduce
  • Databases: Snowflake, Redshift, HDFS, Amazon S3, Oracle, Postgres, SQL Server
  • Workflow Management: Airflow, Crontab

Projects

Big Data Project "Twitter Sentiment Analysis" (Python, Scala) May 2017-Sep 2017

• Used Apache Kafka to stream tweets from the producer to the consumer and performed sentiment analysis on the live stream using the StanfordNLP library, classifying tweets as 'positive', 'negative' or 'neutral'.

• Classified 50,000 tweets and identified the number of tweets in each category originating from different US states using Elasticsearch and Kibana.

Big Data Project "Crime Rate Forecasting System" (PySpark, Scala) May 2017-Sep 2017

• Used the Portland crime rate dataset (829,384 rows and 19 columns), which I grouped into three clusters using K-means clustering in Spark MLlib.

• Developed a time-series forecasting system on the three clusters using the ARIMA (Autoregressive Integrated Moving Average) model, which forecast the crime rate for one month with an accuracy of 80%.
