
Dynamic software development professional with over 12 years of experience, including more than 5 years specializing in data engineering. Expertise in designing, implementing, and maintaining robust batch and streaming data platforms, leveraging full-stack proficiency in Python, SQL, and Java, complemented by extensive knowledge of various RDBMS and data lake technologies. Experienced in executing ELT and ETL processes using tools such as Snowflake, Databricks, Pig, Hive, MapReduce, Spark, and YARN, and in deploying jobs on Hadoop clusters with Cloudera and Hortonworks distributions. Skilled in creating automated data movement frameworks using Python scripts and the Airflow scheduler, with a solid understanding of cloud technologies such as Amazon S3, Redshift, and Google Cloud Platform.
Built and maintained an automated data ingestion platform that enabled the generation of 100K invoices across multiple business units, accounting for 500M in revenue.
Contributed to building a data replication ETL platform spanning heterogeneous sources and targets such as Oracle, Teradata, and Hadoop.
Supervised and mentored a team of 4 junior team members.