Developed scalable data pipelines using Microsoft Fabric to ingest and process data for recommendation systems, partner ecosystems, product catalogs, solutions, and services.
Implemented semantic chunking using embedded LLMs to enhance data organization and retrieval efficiency.
Designed and executed data ingestion workflows for Azure AI Search Index to support indexing and chatbot integration.
Data Engineer
N-able
02.2023 - 02.2025
Developed custom ingestion processes in Matillion ETL to integrate data from diverse sources into the Snowflake data warehouse
Led end-to-end setup of Python-based data synchronization from Pendo to Snowflake, automating ETL workflows for real-time analytics
Integrated Snowpark to enhance data ingestion speed and efficiency, leveraging advanced processing capabilities for scalable workflows
Implemented uniform long-term contract flagging across all data products, improving data governance and accessibility for enhanced reporting
Engineered a data validation framework in Matillion ETL using Snowflake SQL stored procedures, improving data accuracy and reliability so teams could make decisions from trustworthy data
Improved database efficiency by setting flags across large code repositories using Snowflake SQL
Led the creation of Azure release pipelines, integrating files from GitHub with the necessary validations
Built Python validations embedded within Matillion's stage framework, identifying a 25% record differential between raw and stage tables
Data Services Engineer
HSBC
08.2021 - 02.2023
Built reusable Python patterns for Core Data Mesh, aiding producers and consumers in their data journey
Deployed Airflow DAGs to automate ETL and ingestion pipelines on GCP
Developed a Python violation scanner tool for daily scanning of GCP projects, flagging issues to Symphony chat rooms
Automated the customer onboarding pipeline in Jenkins for GitHub repo creation, team setup, and cloud project updates
Constructed a data migration and validation pipeline in Python and MySQL, migrating on-prem Hadoop data to S3
Trained teams on GCP tools, built APIs for user dashboards, and created utility libraries for table creation DDLs
Contributed to GCP data ingestion pipelines for HSBC Securities Services, using tools like GCS, BigQuery, Airflow, and Docker
Data Engineer
HSBC
05.2019 - 08.2021
Implemented a data partitioning framework in Python to ease maintenance of large tables and reduce response time when reading from and loading data to the data lake
Over 2 years of experience with Apache Kafka
2+ years of experience handling service account setups, applying firewall rules, managing capacity/quotas, and using Beeline on Hadoop clusters
Implemented homomorphic encryption using GCP during the HSBC Enterprise Engineer Program
2+ years of experience creating CI/CD pipelines and performing Sqoop, File, Juniper, and CDC (Change Data Capture) ingestions on Hadoop clusters
Mentored 4 interns in summer 2019, 2 interns in 2020, and 5 interns in 2021, helping them complete projects on big data ingestion pipelines
Senior Software Engineer
HSBC
04.2018 - 05.2019
Streamlined ingestion and automation for 300+ source systems
Prepared induction and training plans for new joiners on the team across the India, Toronto, and London regions
Logged and implemented development tasks based on system requirements and documented approaches on Confluence
Tested troubleshooting methods, devised innovative solutions, and documented resolutions for inclusion in the knowledge base for support team use
Software Engineer
HSBC
11.2015 - 03.2018
Gathered data on integration issues and vulnerabilities and provided recommendations for efficiency
Assisted with troubleshooting and issue resolution for current applications, supporting development work
Diagnosed SQL errors and issues in QlikView-generated scripts
Education
Bachelor's - Computer Science
Skills
Microsoft Fabric
Microsoft Azure
Python
MySQL
Snowflake Warehouse
Apache Airflow/Composer
Azure DevOps
Data Mesh
AWS
GCP
Core Java
Hive
BigQuery
S3
Matillion
Shell Scripting
Certification
Google Cloud Platform Certified Professional Data Engineer
HSBC - Certified Enterprise Engineer
Coursera - Data Engineering with Google Cloud Specialization
Linux Academy - Google Cloud Certified Professional Data Engineer