
Jude Sebastian Reginald

Toronto, ON

Summary

Data Engineer with 6 years of IT experience, including 5 years applying analytical and intuitive skills to build successful Big Data and cloud projects. Expertise in designing scalable Big Data applications and migrating data warehouse models to the cloud on large-scale distributed data. Extensive knowledge of data serialization formats such as Avro, Sequence Files, Parquet, JSON, and ORC. Proficient in Azure data services such as Azure SQL Database, Azure Cosmos DB, Azure Data Lake Storage, and Azure Data Factory. Skilled in ETL/ELT processes, data modeling, and data analysis. Responsible for creating and maintaining the architecture for RESTful APIs using Spring Boot. Strong experience with real-time data analytics using Spark Streaming and Kafka. Good experience developing applications in Python and Scala. Extensive hands-on work in the Azure cloud using Databricks, ADF, ADLS Gen2, SQL, Blob Storage, and Synapse.

Overview

11 years of professional experience

Work History

Senior Architect - BI

Lennox International
04.2022 - 06.2024
  • Interacted with end customers and gathered requirements for designing and developing a common architecture for storing retail data within the enterprise and building a Data Lake in the Azure cloud
  • Developed Geo Tracker applications using PySpark to integrate data from sources such as FTP and CSV files, processed with Azure Databricks and written into Snowflake
  • Developed Spark applications for data extraction, transformation, and aggregation from multiple systems, storing the results on Azure Data Lake Storage using Azure Databricks notebooks
  • Worked on Spark with Scala and converted the code to PySpark for the Geo Tracker application
  • Wrote unzip and decode functions in Spark with Scala and parsed XML files into Azure Blob Storage
  • Developed PySpark scripts to ingest data from source systems such as Azure Event Hubs into Delta tables in Databricks in reload, append, and merge modes
  • Optimized PySpark applications on Databricks, yielding significant cost reductions
  • Created pipelines in ADF to copy Parquet files from an ADLS Gen2 location to the Azure Synapse Analytics data warehouse
  • Environment: Azure ADF, Scala, PySpark, Spark, SQL, Snowflake, Databricks, GitHub, Azure Git, Kafka, ADLS Gen2, Azure Blob Storage.

Senior Architect - BI

Lennox India Technology Center
06.2014 - 03.2022
  • Replaced existing Hive scripts with Spark DataFrame transformations and actions for faster analysis of the data
  • Developed PySpark scripts that reduced organizational costs by 30% by migrating customer data from SQL Server to Hadoop
  • Handled JSON datasets and wrote custom Python functions to parse JSON data using Spark
  • Worked on best-buy applications using PySpark to integrate data from sources such as FTP and CSV files, processed using Azure Databricks
  • Developed Spark applications for data extraction, transformation, and aggregation from multiple systems, storing the results on Azure Data Lake Storage using Azure Databricks notebooks
  • Created pipelines in ADF to copy Parquet files from an ADLS Gen2 location to the Azure Synapse Analytics data warehouse
  • Generated weekly reports, ops reports, customer goals reports, and mobile scan-and-pay goals and usage reports on sales data using Power BI
  • Environment: Azure ADF, Scala, PySpark, Spark, SQL, Snowflake, Databricks, GitHub, Azure Git, Kafka, ADLS Gen2, Azure Blob Storage, Azure Synapse, Power BI.

BI Consultant

Accenture Technology Solutions
02.2013 - 04.2014
  • Worked on real-time streaming data from various source systems (EROMINUS, IVR, FINANCIAL, COMPLAINTS, FINANCE) across multiple streams, processing data from source AWS S3 to target Snowflake on AWS
  • Developed Lambda functions using Python for data migration services to load data from AWS S3 to Snowflake
  • Created Python Lambda functions for both historical and delta loads
  • Designed a Kafka producer client using Confluent Kafka and produced events into a Kafka topic
  • Subscribed to the Kafka topic with a Kafka consumer client and processed the events in real time using Spark
  • Environment: Python, Java, PL/SQL, S3, RDS, Kinesis, Lambda, Terraform, Makefile, CloudWatch, Kafka, Datadog, YAML, Docker, Snowflake, Jenkins, Shell Scripting, Databricks, Avro, Parquet, JSON.

Education

Master of Business in ERP - Information Technology

Victoria University
Singapore
11.2012

Bachelor of Science - Electronics And Communications Engineering

Anna University
India
04.2010

Skills

  • Big Data Ecosystem: HDFS, Pig, Hive, Oozie, Sqoop, Impala, Presto, and Spark (Spark SQL, DataFrame, and Dataset)
  • Cloud Ecosystem: Azure (Databricks, ADF, Synapse, ADLS Gen2)
  • Languages: Python, SQL, and Shell Scripting
  • Orchestration/Tools: Airflow, Oozie, Maven, Jenkins, IntelliJ, and GIT
  • CI/CD: Azure DevOps
  • Streaming: Spark Streaming, Kafka

Timeline

Senior Architect - BI

Lennox International
04.2022 - 06.2024

Senior Architect - BI

Lennox India Technology Center
06.2014 - 03.2022

BI Consultant

Accenture Technology Solutions
02.2013 - 04.2014

Master of Business in ERP - Information Technology

Victoria University

Bachelor of Science - Electronics And Communications Engineering

Anna University