Prayush Tatke

Austin, TX

Summary

Seasoned Data Engineer with 14 years of experience building scalable big data ETL/ELT solutions, both on-premises and in the cloud. Proven track record of managing projects from conception to completion. Startup experience, with a proven ability to rapidly deliver prototypes and proof-of-concept work that validates a proposed architectural approach.

Demonstrated proficiency in leading and mentoring individuals while building a cohesive team environment. Adept in Agile methodology and in team management skills such as delegation, competence management, and mentoring. Comfortable working with cross-cultural, multinational teams and interacting with people across hierarchical levels for smooth project execution. Strong decision-making and analytical skills, with excellent communication and management skills. Committed to staying current on emerging trends and technologies in big data and analytics.

Overview

14 years of professional experience
1 certification

Work History

Data Engineer-III

Expedia Group
04.2024 - Current

  • Created a Marketing Data Platform that gives the team a cohesive environment to easily create, manage, and troubleshoot ETL pipelines.
  • Created an interface for users to perform bulk marketing activities efficiently, saving a lot of manual effort.
  • Created an alerting system to efficiently monitor campaigns in Marketo.

Senior Data Engineer-II

Utopus Insights
08.2018 - 09.2023
  • Led cross-functional team as squad lead, serving as primary point of contact for product teams regarding platform-related queries, new requirements, changes, issues, and troubleshooting.
  • Designed and implemented robust data platform using Databricks and Delta Lake, ensuring efficient data processing and storage.
  • Developed and implemented data ingestion services using PySpark Jobs, Spark Structured Streaming, Lambdas, and Kinesis, enabling seamless data integration from various sources.
  • Created data enrichment services utilizing Lambdas, enhancing the quality and depth of data insights.
  • Designed and implemented data enabling services using PySpark Jobs and Lambdas, facilitating data accessibility and utilization.
  • Developed and implemented high-performance data pipelines for real-time data with strict latency constraints, ensuring timely and accurate data processing.
  • Designed and implemented data pipelines for large volumes of batch data, both historical and operational, enabling comprehensive data analysis and reporting.
  • Successfully managed end-to-end delivery of data pipelines to production, ensuring smooth and reliable data flow.
  • Built and mentored high-performing team, conducting regular 1-1s, providing guidance and support, and fostering a culture of continuous learning and growth.
  • Facilitated daily scrums, effectively delegated tasks, and conducted thorough code reviews to maintain code quality and adherence to best practices.

Analyst-I, Apps Programmer

Bank of America
04.2014 - 07.2018
  • Utilized Apache Storm, Apache Kafka, Apache Cassandra, Java, and ELK stack to develop and deploy scalable and efficient solutions.
  • Designed and developed high-performance, pluggable alerting system on top of streaming data, enabling real-time alerts for critical events.
  • Created configurable Data Delivery System that facilitated seamless transfer of upstream streaming data to downstream systems in required format.
  • Implemented robust system to validate incoming streams of events, ensuring data integrity and accuracy.
  • Led design and development of comprehensive monitoring system for project clusters and job execution, providing real-time insights into system performance and health.
  • Played key role in various infrastructure and cluster-level activities, including setting up and configuring Storm, Kafka, Cassandra, and other components.
  • Actively involved in cluster monitoring and maintenance activities, such as upgrades, releases, and database backups, ensuring smooth operation of the system.
  • Maintained clear and concise documentation of all system configurations, processes, and maintenance activities, ensuring knowledge transfer and ease of understanding for the team.

Senior Software Engineer

Infosys
04.2011 - 03.2014
  • Developed generic framework using Hadoop, Hive, MapReduce, Oozie, iBATIS, Oracle, MySQL, Shell Scripting, Python, and AutoSys to load data from various OLTP systems into Hadoop EDW environment.
  • Implemented Hive-Compaction module, utilizing HDFS API and Hive, to optimize data storage and retrieval processes.
  • Designed and developed a Hadoop Cluster-Maintenance framework in Java, leveraging HDFS API to clean temporary files and cache from cluster, ensuring optimal performance.
  • Created and executed functional tests for cluster Executer module, responsible for submitting Hive Queries to the Hadoop cluster.
  • Developed version deployment script in Python, streamlining the deployment process and ensuring consistency across environments.
  • Successfully migrated 150 TB of historical data from Teradata to Hadoop using the Historical Data Migration framework.
  • Prepared comprehensive design documents, user manuals, and functional test cases to facilitate understanding and usage of the framework.
  • Ensured accuracy and quality by conducting functional and regression testing throughout the development process.

Education

Bachelor of Engineering - Computer Science

Rajiv Gandhi Technical University

Skills

  • Big Data Frameworks: Spark, PySpark, Hadoop, Storm
  • Streaming Frameworks/Queues: Apache Kafka, Spark Structured Streaming, AWS Kinesis, AWS SQS, RabbitMQ, MQTT
  • NoSQL DBs: Cassandra, AWS DynamoDB, AWS DocumentDB, MongoDB, Redis Cache
  • Data Warehouse: Databricks Delta Lake, Redshift, Hive, Trino
  • DBs: MySQL, PostgreSQL, AWS RDS
  • Languages: Python, Java, Scala, SQL, SparkSQL, Bash
  • AWS Services: Lambda, Kinesis, API Gateway, RDS, ElastiCache, DynamoDB, DocumentDB, Redshift, SQS, SNS, ECS, EC2, S3, EMR, CloudWatch, EventBridge
  • Others: Airflow, Ansible, Jenkins, Jira, Bitbucket, Terraform

Certification

AWS Certified Solutions Architect-Associate
