RAHUL SINGH

Canton, MA

Summary

Experienced and highly skilled Data Architect / Data Engineer with a track record of 11+ years in data architecture, integration, ETL development, cloud migration, administration, and engineering. Adept at migrating diverse databases and ETL workflows to platforms such as Teradata Cloud, Snowflake, AWS, and Azure SQL DB. Versatile across roles including Data Engineer, ETL Developer, and Data Architect, with a deep understanding of the entire software and data lifecycle. Meticulous in architecting solutions that align precisely with business requirements and implementing them with technical excellence.

Overview

12 years of professional experience
1 Certification

Work History

Senior Data Architect / Engineer

Tredence
Boston, MA
09.2023 - Current
  • Served as a Data Engineer within an investment team, developing a deep understanding of financial data requirements and industry-specific nuances.
  • Employed strategic SQL techniques for seamless querying and manipulation of complex financial datasets critical for investment decision-making.
  • Ensured data accuracy and reliability for the fund's quantitative models and analysis.
  • Applied advanced Python programming skills, with a focus on PySpark and pandas, to process and analyze large datasets efficiently on the client's Databricks and Snowflake platforms.
  • Developed custom algorithms to meet the specific quantitative and analytical needs of the hedge fund.
  • Specialized in integrating alternative data sources, contributing to a more holistic view of market trends and opportunities.
  • Applied innovative techniques to incorporate non-traditional datasets into the fund's analytical models.
  • Engineered and fine-tuned Databricks clusters to balance performance and cost, utilizing auto-scaling and custom configurations for efficient resource allocation.
  • Implemented dynamic cluster management to adapt to varying workloads and optimize processing times.
  • Orchestrated data workflows by designing and scheduling Databricks jobs, optimizing parallel processing to handle large datasets efficiently.
  • Utilized Databricks notebooks for collaborative and reproducible analysis, incorporating version control for code management.
  • Implemented Delta Lake to enhance data reliability, ACID compliance, and versioning within Databricks, ensuring data consistency for critical business processes.
  • Utilized time travel and schema evolution features for seamless data evolution.
  • Designed and implemented real-time data processing using Databricks Structured Streaming, ensuring low-latency, high-throughput processing for time-sensitive applications (see the streaming sketch after this list).
  • Integrated streaming pipelines with Delta Lake for reliable, transactional streaming analytics.
  • Designed and implemented multi-cluster Snowflake warehouses to handle concurrent workloads efficiently, optimizing query performance and resource utilization (a configuration sketch follows this list).
  • Configured auto-scaling policies to dynamically adjust warehouse size based on workload demands.
  • Implemented and managed Snowflake's data sharing functionality to securely and efficiently share data across different accounts, streamlining collaboration and data exchange.
  • Leveraged materialized views for performance optimization in shared datasets.
  • Utilized Snowflake's time travel feature for seamless data history tracking and point-in-time recovery, ensuring data consistency and compliance with regulatory requirements.
  • Configured fail-safe mechanisms to prevent data loss and maintain data integrity during system failures or accidental changes.
  • Implemented Snowflake's security features, including role-based access control (RBAC), encryption, and multi-factor authentication, to ensure data privacy and compliance with industry regulations.
  • Configured network policies and Virtual Private Snowflake (VPS) for secure data access.
  • Orchestrated end-to-end data pipelines using Snowflake, integrating with tools like Apache Airflow and Data Build Tool (DBT) for workflow automation and data transformation.
  • Utilized Snowflake tasks and stored procedures for efficient and automated data processing workflows.
  • Demonstrated a deep understanding of financial markets and instruments, acquired through extensive experience as a Data Engineer in hedge fund environments.
  • Collaborated closely with quantitative analysts and portfolio managers to align data solutions with the fund's investment strategies.
  • Implemented and managed version control in GitLab, ensuring a streamlined and controlled deployment process for data platform enhancements.
  • Deployed GitLab CI/CD pipelines to automate testing and deployment, minimizing downtime and maximizing reliability.
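
A minimal PySpark sketch of the Structured Streaming pattern described above, assuming a Databricks-style environment; the Auto Loader source, storage paths, and table names are illustrative placeholders rather than production code.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    # On Databricks a session named `spark` already exists; built here for completeness.
    spark = SparkSession.builder.appName("streaming-to-delta").getOrCreate()

    # Read a stream of JSON events via Auto Loader (a Databricks-specific source).
    events = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/market_events/")  # hypothetical mount point
    )

    # Drop records with no price before landing the data.
    clean = events.where(col("price").isNotNull())

    # Append to a Delta table; the checkpoint gives exactly-once delivery.
    query = (
        clean.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", "/mnt/checkpoints/market_events/")
        .toTable("analytics.market_events")  # hypothetical target table
    )
    query.awaitTermination()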
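
A sketch of the multi-cluster warehouse and scheduled-task setup described above, using snowflake-connector-python; the warehouse, schema, and task names are hypothetical.

    import os
    import snowflake.connector

    # Credentials are placeholders; real values would come from a secret store.
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        role="SYSADMIN",
    )
    cur = conn.cursor()

    # Multi-cluster warehouse that scales out under concurrent load.
    cur.execute("""
        CREATE WAREHOUSE IF NOT EXISTS analytics_wh
          WAREHOUSE_SIZE = 'MEDIUM'
          MIN_CLUSTER_COUNT = 1
          MAX_CLUSTER_COUNT = 4
          SCALING_POLICY = 'STANDARD'
          AUTO_SUSPEND = 300
          AUTO_RESUME = TRUE
    """)

    # Scheduled task that refreshes a reporting table every hour.
    cur.execute("""
        CREATE TASK IF NOT EXISTS refresh_positions
          WAREHOUSE = analytics_wh
          SCHEDULE = '60 MINUTE'
        AS
          INSERT INTO reporting.positions_hourly
          SELECT * FROM staging.positions
    """)
    cur.execute("ALTER TASK refresh_positions RESUME")

    cur.close()
    conn.close()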

Senior Data Architect

Resideo Technologies
Canton, MA
09.2022 - 08.2023
  • Analyzed existing SQL scripts and designed solutions to implement them with the Spark framework in Python and Scala
  • Extracted, transformed, and loaded data from source systems into Azure data storage services using a combination of Azure Data Factory, Spark SQL, and U-SQL (Azure Data Lake Analytics); see the ingestion sketch after this list
  • Ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure SQL DW) and processed it in Azure Databricks
  • Created pipelines in ADF using linked services and datasets to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse
  • Implemented data integration and synchronization solutions between Snowflake and Databricks, ensuring seamless data transfer across platforms
  • Worked with Terraform templates and modules to automate Azure IaaS virtual machines and deployed virtual machine scale sets in the production environment
  • Leveraged advanced Databricks skills to optimize and scale the platform for increased efficiency
  • Demonstrated expertise in Python, with a focus on PySpark and pandas, for efficient data processing and analysis within the Databricks environment
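
An illustrative PySpark sketch of the ADLS ingestion and Spark SQL transformation pattern above; the storage account, container, and table names are assumptions, and authentication is presumed to be handled by the cluster's service principal.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("adls-ingest").getOrCreate()

    # Source path in ADLS Gen2 (placeholder account and container).
    source = "abfss://raw@mystorageacct.dfs.core.windows.net/sales/*.csv"

    df = (
        spark.read
        .option("header", True)
        .option("inferSchema", True)
        .csv(source)
    )

    # A Spark SQL aggregation standing in for the original SQL scripts.
    df.createOrReplaceTempView("sales_raw")
    daily = spark.sql("""
        SELECT order_date, SUM(amount) AS total_amount
        FROM sales_raw
        GROUP BY order_date
    """)

    # Land the result as a Delta table for downstream consumers.
    daily.write.format("delta").mode("overwrite").saveAsTable("curated.daily_sales")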

Senior Data Architect

Teradata
Atlanta, GA
10.2020 - 09.2022
  • Worked on the migration plan from on-premises Teradata to Snowflake on Azure and AWS
  • Designed, created, and implemented both single- and multi-cluster warehouses in Snowflake as part of workload migration from on-premises Teradata data warehouse environments
  • Created databases, schemas, tables, and event triggers, and used the FLATTEN function to load JSON data into Snowflake tables
  • Created pipelines in Snowflake to load history data as part of the migration from on-premises Teradata to Snowflake databases
  • Designed data pipelines using Fivetran and dbt to load and transform data in Snowflake tables
  • Migrated an entire Teradata database to Snowflake on GCP using data pipelines orchestrated in Airflow (see the DAG sketch after this list).
  • Built solutions to store data files in Google Cloud Storage buckets daily using Dataproc and processed the bucket files in Snowflake to load the tables
  • Spearheaded the migration of an on-premises data warehouse to Snowflake on AWS, optimizing query performance and reducing infrastructure costs by 40%
  • Led the design and implementation of a real-time data processing system using Kafka and AWS Kinesis, resulting in a 30% reduction in data latency and improved business insights
  • Created CI/CD pipelines in Azure DevOps, Jenkins, and GitLab for continuous integration and delivery across cloud platforms including Azure, AWS, and GCP
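
A minimal Airflow DAG sketch of the Teradata-to-Snowflake migration pipelines described above; the table list, schedule, and copy logic are simplified placeholders.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Hypothetical tables to migrate; a real run would read this from metadata.
    TABLES = ["customers", "orders", "order_items"]

    def copy_table(table_name: str) -> None:
        """Extract one Teradata table and load it into Snowflake (sketch only)."""
        # A real implementation would use teradatasql for the extract and
        # snowflake-connector-python (or an external stage) for the load.
        print(f"copying {table_name} from Teradata to Snowflake")

    with DAG(
        dag_id="teradata_to_snowflake_history_load",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        for table in TABLES:
            PythonOperator(
                task_id=f"copy_{table}",
                python_callable=copy_table,
                op_kwargs={"table_name": table},
            )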

Associate Data Architect

Wells Fargo
Charlotte, NC
03.2020 - 10.2020
  • Worked on encryption of sensitive and non-sensitive customer data in Teradata using a third-party encryption algorithm
  • Performed ER and dimensional modeling and designed data models using the Erwin Data Modeler tool
  • Manipulated data using pivot tables, pivot charts, and macros in Excel
  • Used Teradata utilities extensively, including FastLoad, FastExport, TPT, TPump, MultiLoad, and BTEQ
  • Directed development of project scope, including estimates, budgets, and schedules.

Lead Data Engineer

CGI
Lafayette, LA
06.2017 - 03.2020
  • Configured the Teradata database for customer messages and stored healthcare information in Teradata tables
  • Worked on Teradata CIM (Customer Interaction Manager) and RTIM (Real-Time Interaction Manager) to create marketing campaigns and capture user responses by channel
  • Performed performance tuning using Teradata Viewpoint and Teradata Active System Management to filter poorly performing queries
  • Tuned long-running queries to improve their performance
  • Performed code reviews using the Teradata Statistics Wizard.

Data Engineer

Teradata Corporation
Mumbai, Maharashtra
05.2012 - 05.2017
  • Worked closely with the data integration team to develop ETL solutions using tools such as Informatica, DataStage, and Teradata Tools and Utilities
  • Migrated DB2 data marts and subject-area code to the Teradata environment using DataStage ETL and the Teradata database
  • Loaded and unloaded Teradata tables using MultiLoad, FastLoad, FastExport, and BTEQ export utility scripts
  • Tuned 22 BTEQ scripts to run within the SLA time window
  • Loaded tables from Oracle to Teradata using the FastClone tool to complete the history load for the ODS and data mart layers
  • Installed Teradata QueryGrid to unload and load data between HDFS and Teradata
  • Monitored the database through Teradata Viewpoint.

Education

Bachelor of Science - Information Technology

Mumbai University
06.2011

Skills

  • Teradata Database 12-17.20 (TPT, BTEQ, TTU, Teradata stored procedures)
  • ETL (DataStage, Informatica)
  • Dashboards (Tableau, Looker)
  • SQL & Advanced SQL
  • AWS (S3, Glue, SNS, SQS, Lambda)
  • Azure Databricks (Delta Tables, Delta Live Tables, Unity Catalog)
  • CI/CD (GitLab, Jenkins)
  • Python (PySpark, Boto3)
  • Unix shell scripting
  • RDBMS (DB2, SQL Server, Oracle, Teradata)
  • Erwin Data Modeler
  • Performance tuning (SQL, Spark)
  • Cloud migration
  • Azure (Data Factory, Synapse, ADLS Gen 2, Blob Storage, Data Explorer), Terraform
  • Snowflake (Snowpipe, stages, streams, tasks, Snowpark, stored procedures)
  • Data Build Tool (dbt), Fivetran
  • Apache Spark, Scala

Certification

  • Teradata Certified Professional
  • Teradata Certified SQL Developer
  • Teradata Certified Specialist
  • Amazon Web Services Solutions Architect
  • Databricks Lakehouse Fundamentals

References

References and supporting documentation can be provided upon request.

Interests

Playing table tennis, cricket, and soccer; listening to music.

Timeline

Senior Data Architect / Engineer

Tredence
09.2023 - Current

Senior Data Architect

Resideo Technologies
09.2022 - 08.2023

Senior Data Architect

Teradata
10.2020 - 09.2022

Associate Data Architect

Wells Fargo
03.2020 - 10.2020

Lead Data Engineer

CGI
06.2017 - 03.2020

Data Engineer

Teradata Corporation
05.2012 - 05.2017

Bachelor of Science - Information Technology

Mumbai University