Plays a hands-on role in driving the overall technical and data architecture within Fidelity Brokerage Technology (FBT) and Fidelity Institutional Technology (FIT).
Provides technical leadership in architecting, designing, and building highly scalable operational, analytical, and reporting database applications.
Understands legacy database application criticality (tier ratings) and defines/recommends cloud-agnostic database solutions. Examples: 1) OLTP databases from legacy DB2/Oracle into CockroachDB; 2) operational databases from legacy Oracle into Aurora PostgreSQL, Oracle RDS, or Oracle on EC2 instances; 3) analytical database workloads into Snowflake; 4) NoSQL key/value databases with Aerospike or AWS DynamoDB, and document data in Nuxeo.
Defines and implements a cloud batch data strategy that supports the modernization of legacy batch workflows (Informatica) into AWS Batch, SnapLogic, or EMR.
Defines a zero-data-copy architecture (using the Segment tool) from the enterprise analytical platform (Snowflake) into consuming applications: Salesforce Sales Cloud, Marketing Cloud, and Personalization Cloud.
Architects modern data integration frameworks and highly scalable distributed systems using open-source technologies and established data architecture designs/patterns.
Designs highly resilient multi-region database applications that account for region/zone/node failover. Example: synchronous multi-region replication for Tier 0 database applications.
Designs enterprise master data management (MDM) solutions spanning multiple data sources. Defines match/merge algorithms to maintain a golden record across sources in Reltio MDM.
Designs real-time data streaming use cases with Confluent Kafka and consumes Kafka topics with the Snowpipe Streaming API. Provides solutions to handle duplicate messages from Kafka topics (see the deduplication sketch after this list).
Defines optimized ways to share data with internal/external consumers using file transmissions, Snowflake data shares, zero-data-copy architecture, etc. (see the data-sharing sketch after this list).
As a data architect, partners with solution architects to enable operational databases for FIT sales and marketing applications and FBT risk workstations, and to integrate data with analytical and CRM/Marketing Cloud platforms.
Designs and develops data solutions using the Spark Scala and PySpark frameworks. Designs common reusable Spark Scala frameworks and components to build complex ETL/ELT data transformations using Spark SQL and Spark Scala native libraries. Develops data pipelines using Spark SQL, Spark RDD, DataFrame, and Spark native libraries (see the PySpark sketch after this list).
Evaluates, prototypes, and recommends emerging technologies and platforms. Examples: SnapLogic, Twilio Segment, Salesforce Marketing Cloud, Personalization Cloud.
As a data architect, defines the optimal way to bring external social media interaction data, covering organic and paid channels from Google, LinkedIn, and Reddit, into the Snowflake analytical layer for marketing AI/ML use cases.
Defines an optimized way for sales/marketing applications to consume the output of AWS SageMaker AI/ML models (legacy models in IBM SPSS); see the SageMaker sketch after this list.
Solutions, designs, architects, and builds scalable and resilient software according to DevOps practices, using continuous integration and continuous deployment tools (Jenkins and GitHub).
Builds and improves applications and provides technical guidance on performance improvement, observability, and automation of data solutions.
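The duplicate-message handling mentioned above (Confluent Kafka feeding Snowflake through the Snowpipe Streaming API) is commonly resolved downstream in the warehouse. Below is a minimal sketch of that approach, assuming each event carries a unique EVENT_ID; the table and column names are hypothetical placeholders, not the production schema.

```python
# Sketch: keep only the latest copy of each EVENT_ID so replayed Kafka messages
# do not create duplicates. RAW_ORDER_EVENTS / ORDER_EVENTS are hypothetical names.
import snowflake.connector

DEDUP_SQL = """
MERGE INTO ORDER_EVENTS tgt
USING (
    SELECT *
    FROM RAW_ORDER_EVENTS
    QUALIFY ROW_NUMBER() OVER (PARTITION BY EVENT_ID ORDER BY INGEST_TS DESC) = 1
) src
ON tgt.EVENT_ID = src.EVENT_ID
WHEN NOT MATCHED THEN
    INSERT (EVENT_ID, PAYLOAD, INGEST_TS)
    VALUES (src.EVENT_ID, src.PAYLOAD, src.INGEST_TS)
"""

def deduplicate(conn_params: dict) -> None:
    # Run the idempotent MERGE; already-seen EVENT_IDs are simply skipped.
    conn = snowflake.connector.connect(**conn_params)
    try:
        conn.cursor().execute(DEDUP_SQL)
    finally:
        conn.close()
```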
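For the data-sharing item above, a sketch of publishing a Snowflake secure data share (consumers read live data, no copy is made) is shown below; the database, schema, secure view, share, and consumer account names are assumptions.

```python
# Sketch: publish a Snowflake secure data share to an external consumer account.
# All object names below are hypothetical placeholders.
import snowflake.connector

SHARE_STATEMENTS = [
    "CREATE SHARE IF NOT EXISTS SALES_MARKETING_SHARE",
    "GRANT USAGE ON DATABASE EAP_DB TO SHARE SALES_MARKETING_SHARE",
    "GRANT USAGE ON SCHEMA EAP_DB.MARTS TO SHARE SALES_MARKETING_SHARE",
    # CUSTOMER_360_V is assumed to be a secure view (only secure views can be shared).
    "GRANT SELECT ON VIEW EAP_DB.MARTS.CUSTOMER_360_V TO SHARE SALES_MARKETING_SHARE",
    "ALTER SHARE SALES_MARKETING_SHARE ADD ACCOUNTS = PARTNER_ACCOUNT",
]

def publish_share(conn_params: dict) -> None:
    conn = snowflake.connector.connect(**conn_params)
    try:
        cur = conn.cursor()
        for stmt in SHARE_STATEMENTS:
            cur.execute(stmt)
    finally:
        conn.close()
```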
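For the Spark item above, the sketch below illustrates in PySpark the style of reusable transformation component described; function, column, and path names are illustrative only.

```python
# Sketch of a reusable transformation routine plus a Spark SQL step.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def standardize_columns(df: DataFrame) -> DataFrame:
    """Reusable routine: lower-case column names and trim string columns."""
    df = df.toDF(*[c.lower() for c in df.columns])
    string_cols = [f.name for f in df.schema.fields
                   if f.dataType.simpleString() == "string"]
    for c in string_cols:
        df = df.withColumn(c, F.trim(F.col(c)))
    return df

def load_accounts(spark: SparkSession, src_path: str) -> DataFrame:
    """Pipeline step: read raw parquet, standardize, then aggregate with Spark SQL."""
    raw = spark.read.parquet(src_path)
    standardize_columns(raw).createOrReplaceTempView("accounts_raw")
    return spark.sql("""
        SELECT account_id, status, SUM(balance) AS total_balance
        FROM accounts_raw
        GROUP BY account_id, status
    """)
```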
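For the SageMaker item above, a minimal sketch of how a sales/marketing application could consume a deployed real-time model endpoint; the endpoint name and payload layout are assumptions.

```python
# Sketch: call a SageMaker real-time endpoint and return the model's JSON response.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def score_lead(features: dict, endpoint_name: str = "lead-propensity-endpoint") -> dict:
    # "lead-propensity-endpoint" is a hypothetical endpoint name.
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(features),
    )
    return json.loads(response["Body"].read())
```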
Overview
19 years of professional experience
Work History
Principal Architect
Fidelity Investments
07.2022 - Current
Documented the current-state data flow architecture. Defined interim-stage and future-stage data architecture for sales, marketing, control & compliance, and risk workstation applications.
Analyzed the legacy monolithic on-prem Oracle operational database for sales & marketing to understand its subject areas and consumption patterns.
Defined the data migration roadmap and data strategy for CRM and marketing subject-area use cases.
Refactored/re-modeled or lifted-and-shifted database tables wherever required during the cloud migration journey.
Defined analytical workload use cases in the Snowflake enterprise analytical platform (EAP).
Used HVR for data replication from Oracle to Snowflake for history data migration; a custom Python utility is used for data validation (see the validation sketch at the end of this role).
Defined the batch application roadmap for migrating legacy Spring Batch/Informatica applications into the cloud (SnapLogic/AWS Batch/EMR).
Defined the data strategy for compliance & control audit from legacy DB2 to Snowflake; used Confluent Kafka to capture order logs from MQ, then the Snowpipe Streaming API to ingest the data into Snowflake.
Migrated the risk workstation data warehouse from on-prem Oracle to Snowflake; used HVR for history data migration.
Used CockroachDB to store Tier 0 data; defined a region failover strategy with an active/active configuration (see the CockroachDB sketch at the end of this role).
Designed a SnapLogic Ultra pipeline to capture Salesforce leads/opportunities/contact information and store it in AWS Aurora.
Completed a POC for high-volume data use cases comparing EMR vs. AWS Batch vs. SnapLogic.
Designed a zero-data-copy architecture to reference customer and finance data from the analytical platform.
Used data caching (Redis via AWS ElastiCache) for frequently used reference data in batch pipelines (see the caching sketch at the end of this role).
Documented and published database recommendation guidelines and patterns for OLTP vs. operational vs. analytical data workloads.
Documented a decision-tree matrix for batch tooling, e.g., where to use SnapLogic vs. AWS Batch vs. EMR.
Provided expertise and support for cloud migration into AWS RDS, Aurora PostgreSQL, DynamoDB, Snowflake, CockroachDB, and Aerospike.
Collaborated with system architects, design analysts and others to understand business and industry requirements.
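The custom Python validation utility mentioned above might look roughly like the sketch below, which compares per-table row counts between the Oracle source and the Snowflake target after HVR replication; connection parameters and table names are placeholders.

```python
# Sketch: row-count reconciliation between Oracle source and Snowflake target.
import oracledb
import snowflake.connector

TABLES = ["CUSTOMER", "ACCOUNT", "ORDERS"]  # illustrative table names

def compare_counts(ora_params: dict, sf_params: dict) -> dict:
    ora = oracledb.connect(**ora_params)
    sf = snowflake.connector.connect(**sf_params)
    mismatches = {}
    try:
        for table in TABLES:
            ora_count = ora.cursor().execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            sf_count = sf.cursor().execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            if ora_count != sf_count:
                mismatches[table] = (ora_count, sf_count)
    finally:
        ora.close()
        sf.close()
    return mismatches
```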
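For the CockroachDB Tier 0 item, the sketch below shows the kind of multi-region survivability settings that sit behind an active/active, region-failover design; the database name, regions, and connection string are hypothetical, and CockroachDB is addressed here through its PostgreSQL wire protocol.

```python
# Sketch: configure a CockroachDB database to survive the loss of an entire region.
import psycopg2

MULTI_REGION_DDL = [
    'ALTER DATABASE tier0_db SET PRIMARY REGION "us-east-1"',
    'ALTER DATABASE tier0_db ADD REGION "us-west-2"',
    'ALTER DATABASE tier0_db ADD REGION "us-central-1"',
    # With three regions, replicas can be placed so a full region outage is survivable.
    'ALTER DATABASE tier0_db SURVIVE REGION FAILURE',
]

def configure_regions(dsn: str) -> None:
    conn = psycopg2.connect(dsn)  # CockroachDB is PostgreSQL wire-compatible
    conn.autocommit = True
    try:
        cur = conn.cursor()
        for stmt in MULTI_REGION_DDL:
            cur.execute(stmt)
    finally:
        conn.close()
```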
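For the reference-data caching item, a minimal cache-aside sketch using Redis (ElastiCache); the key naming, TTL, and database loader are illustrative assumptions.

```python
# Sketch: check Redis first, fall back to the database on a miss, then cache the result.
import json
import redis

cache = redis.Redis(host="my-elasticache-endpoint", port=6379, decode_responses=True)

def get_reference_data(key: str, ttl_seconds: int = 3600) -> dict:
    cached = cache.get(f"refdata:{key}")
    if cached is not None:
        return json.loads(cached)
    value = load_from_database(key)          # hypothetical database lookup
    cache.setex(f"refdata:{key}", ttl_seconds, json.dumps(value))
    return value

def load_from_database(key: str) -> dict:
    """Placeholder for the real reference-data query."""
    raise NotImplementedError
```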
Senior Technology Solution Architect
Infosys Limited, American Express
11.2018 - 06.2022
Project: GDR (Global Data Repository) Modernization, migrating the existing GDR ETL application onto a NextGen Spark platform.
As an architect and senior data engineer, designed/architected Spark Scala/Python common reusable frameworks/components that are used extensively to develop ETL data pipelines.
Heavily used Spark SQL, Spark RDD/DataFrame, and reusable routines to write complex ETL data transformations.
Stored persistent data in AWS S3 with a private data encryption key.
Also worked on database re-modeling before migrating data from IBM DB2 to Oracle.
Responsibilities:
Migrating legacy ETL jobs (Ab Initio graphs, Informatica workflows, and Talend jobs) into Spark Scala
Designing and developing ETL data pipelines using the Spark Scala common routines/framework
Leveraging AWS Glue to build data pipelines
Making GDR logical data model changes while migrating the legacy DB2 database to Oracle
Developing Airflow DAGs; Airflow is used to orchestrate data pipelines and job scheduling (see the DAG sketch at the end of this list)
GitHub is used for code version management
Jenkins is used to build the Spark Scala code, with Maven release used for deployment
Performance tuning of Spark ETL data pipelines
Developing wrapper scripts that internally invoke Spark jobs (see the wrapper sketch at the end of this list)
Incorporating data encryption, archival, and incident management logic
UAT support and sign-off from product management teams; post-implementation support
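A minimal sketch of an orchestration DAG of the kind described above; the DAG id, schedule, and spark-submit commands are placeholders, not the production configuration.

```python
# Sketch: Airflow DAG that chains two Spark jobs in a daily schedule.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="gdr_daily_load",          # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval="0 2 * * *",
    catchup=False,
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="spark-submit --class com.gdr.Extract gdr-etl.jar {{ ds }}",
    )
    transform = BashOperator(
        task_id="transform",
        bash_command="spark-submit --class com.gdr.Transform gdr-etl.jar {{ ds }}",
    )
    extract >> transform
```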
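The wrapper scripts mentioned above could be rendered in Python roughly as below; the job class, jar, and arguments are hypothetical.

```python
# Sketch: wrapper that submits a Spark job and surfaces its exit status to the scheduler.
import subprocess
import sys

def run_spark_job(run_date: str) -> int:
    cmd = [
        "spark-submit",
        "--master", "yarn",
        "--class", "com.gdr.DailyLoad",   # hypothetical job class
        "gdr-etl.jar",
        run_date,
    ]
    result = subprocess.run(cmd)
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_spark_job(sys.argv[1]))
```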
Technology Lead
Infosys Limited, American Express
01.2012 - 10.2018
Virtual Payment (vPayment): modernizing legacy payment applications & products into next-generation, cost-effective solutions
Tokenization: building tokenization capability for the Amex Corporate Payment Services domain; identifying PII elements and tokenizing the attributes
Modernizing legacy accounts payable products
Corporate Payment Services (CPS) database re-modeling and application migration
Design E-R logical model with relationships across tables
Review ER logical model with database architect team
Implement changes to the logical design
Hand over ER model to database administrator
Co-ordinate with database admin team to complete physical tables/column creation
Review database tables
Document the process
Application development and unit testing; performance tuning of ETL jobs
Design and develop ETL data pipelines using the Talend and Ab Initio ETL tools
Migrating legacy ETL applications into Talend
Developing Unix wrapper scripts, which internally invoke Talend jobs
End-to-end testing, production deployment, and live support
Technology Analyst
Infosys Limited
11.2010 - 02.2012
Project: Aetna EDW (Claim/Member domain) Development
Role: ETL Tech Analyst
Project description: Ab Initio version uplift and database uplift in the Aetna claim/member domain
Responsibilities:
Design and develop DataStage ETL jobs in the member & claim domains
Define dimensional modeling; Ab Initio ETL version uplift from 2.x to 3.x
Modify the Ab Initio data pipeline components impacted by the version uplift
Complete unit testing and promote the changes to higher environments; database uplift from IBM DB2 to IBM PDOA
Changes to SQL query syntax and to PL/SQL procedures and functions
Data comparison and validation after data movement from IBM DB2 to the IBM PDOA platform
Sr. Data Warehousing Consultant
Virtusa India Pvt Limited
07.2010 - 11.2010
British Telecom
Project: OST Modernization; the requirement is to cut over the OST (One Siebel Tactical) flow and integrate it with the SIM warehouse
All OST functionality should be incorporated into SIM. OST deals with BT consumer data, and as part of this project all consumer data would be loaded into SIM instead of OST
Responsibilities:
Developing data pipeline using Informatica ETL tool
Develop reusable maplets to build OST integration with SIM
Performance tuning of ETL jobs during end-to-end runs in the pre-production environment
Unit and integration testing
Production rollout and live support
Tech Lead
TECH MAHINDRA LTD
01.2005 - 07.2010
General Electric Consumer Finance
Project: GE Consumer Finance – Collections & Recovery. Description: the collections module of GECF deals with delinquent account holders in the card industry
GE Cards maintains data for all transactions made by cardholders for their clients
The transactional data from the OLTP systems is loaded into the data warehouse, generating business reports that are part of the Decision Support System
Responsibilities:
Pre-transition and planning activities
Resource identification and Team formation
Reverse Presentation for KT
Shadowing and Reverse Shadowing
Displaying TechM confidence in SIR development activities to BT and obtaining transition sign-off
Development estimation, project timelines
High-level component design documents and low-level design document reviews. Preparing test specs and reviewing test cases prepared by team members
Development of new Ab Initio graphs as per business requirement
Performance tuning of Ab Initio ETL jobs
Development of housekeeping jobs and wrapper scripts. Code review and peer review