
Soumen Bag

Muskego, WI

Summary

Possess over 17 years of IT experience in Data Modeling for OLAP & OLTP, Oracle DBA & database design, data integration architecture, Test Data Management, Big Data, and Data Warehousing as a Developer & Solution Architect.

  • Extensive experience in Data Modeling, Database Design, Data Model Reverse Engineering, and Data Lineage Analysis
  • Extensive experience building data pipelines in Databricks and managing Unity Catalog and Delta Lake
  • Extensive experience with AWS Glue Crawler, ETL, Data Catalog, Iceberg & Lake Formation
  • Worked on Oracle database backup & recovery, database object management, export & import, user access management, and tablespace management as an Oracle DBA (L1)
  • Extensive experience in OLAP data model design, Data Warehousing, Data Marts, Star Schema & Snowflake Schema modeling, and dimensional modeling concepts
  • Proficient in building CI/CD pipelines using GitLab and Ansible; experienced with Flyway for schema management
  • Hands-on experience with NoSQL databases such as Cassandra and MongoDB
  • Manage Sprint Planning, Daily Scrum, Sprint backlog review, Retro & PI planning
  • Worked with Kohl's for 7 years as Data Analyst, Test Data Management Architect, Informatica Admin/Developer & Tableau Visualization Developer
  • Extensive experience in ER/Studio Data Architect, Erwin Data Modeler & Data Mart Web Portal
  • Good experience designing reports/dashboards in Power BI
  • Extensive experience in Informatica PowerCenter & TDM administration: install Informatica on both Linux and Windows, upgrade repository versions, apply EBFs and hotfixes, create roles and users, and assign privileges to users or groups
  • Extensive experience using Informatica PowerCenter 10.2 HF2 & ILM-TDM to implement ETL methodology in data extraction, transformation, loading, data subset, masking, profiling & synthetic data generation
  • Extensive experience in Test Data Management pillars such as Data Subset, Data Masking, Data Encryption/Tokenization, Data Profiling, Golden Data Architecture, Synthetic Data Creation, Data Refresh, and Data Virtualization
  • Data integration knowledge spanning on-prem databases and data lakes on AWS; worked on the Redshift data warehouse on AWS
  • Hands-on experience in Spark SQL & Machine Learning
  • Developed a TDM tool called Test Data Platform (TDP) in Core Java & Swing, capable of test data generation, data tokenization/encryption, and database refresh for relational & NoSQL databases
  • Extensive experience in Data Virtualization, Data Masking, point-in-time restore & Data as a Service using Delphix
  • Extensive experience in Google BigQuery & Bigtable
  • Extensive experience in DevOps CI/CD pipelines; integrated the TDM solution with the CI/CD pipeline
  • Worked on Informatica ILM, FAS, DVO, IDQ, IDE, Metadata Manager, and PowerExchange for CDC and Mainframe
  • Working experience in data integration between Oracle & Spanner databases
  • Experience conducting training on the Informatica suite of products across locations and projects
  • Experience and expertise developing SQL and PL/SQL code (procedures, functions, and packages) to implement ETL and business logic in Oracle, DB2, Teradata, and SQL Server
  • Good knowledge and experience in UNIX shell programming
  • Good working experience in the Core Java & Python programming languages
  • Achieves goals, objectives, and milestones in an accurate and consistent manner

Overview

18 years of professional experience
1 certification

Work History

Sr Data Engineer

Northwestern Mutual
Milwaukee, US
05.2020 - Current
  • Company Overview: Northwestern Mutual is a financial services organization that provides a range of insurance and investment products
  • Website: https://www.northwesternmutual.com
  • This project involves building an enterprise data strategy and modernizing the current on-prem ETL pipeline
  • As part of the project, define the strategy for a modern ETL pipeline using AWS Glue and Databricks, provide ETL best practices, set up naming standards for data modeling, design the database architecture, manage infrastructure, provide performance tuning guidance, and solve complex technical challenges
  • Perform PoCs on event-driven data pipelines and Change Data Capture using Qlik; implement Type-2 dimensions (see the sketch after this list)
  • Working as Data Architect & Data Analyst
  • Responsible for data modeling & reverse engineering for new & existing applications
  • Work with business analysts and app teams to understand requirements for the data model
  • Define end to end data integration & visualization strategy
  • Database design & performance tuning
  • Define Data Quality and Data Governance rules and best practice
  • Data Profiling to understand referential integrity between tables
  • Coordination with app teams and offshore teams
  • Manage Sprint Planning, Daily Scrum, Sprint backlog review, Retro & PI planning
  • Administering Erwin – Upgrade, Apply Patch, etc
  • Write PL/SQL blocks to export/import schemas and/or tables in Oracle using the Data Pump API
  • Automate data provisioning by writing code in Java, integrate it with DevOps tools like Jenkins, and check the code into the version control repository GitLab
  • Developed data pipelines to ingest and process large datasets
  • Created ETL scripts to move and transform data from various sources into a centralized repository
  • Performed advanced analytics on structured and unstructured data using SQL, Python, and R
  • Designed, built, and maintained high-performance databases for reporting and analysis
  • Optimized existing queries to improve performance and reduce server load
  • Provided technical support in troubleshooting issues related to data processing and storage systems
  • Built complex reports utilizing multiple sources of information from different systems
  • Adept at troubleshooting, identifying issues, and providing effective solutions
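A minimal sketch of the Type-2 dimension merge pattern referenced above, assuming the Databricks/Delta Lake stack named in this role; the table and column names (dim_customer, staging_customer, customer_id, address, eff_date, end_date, is_current) are illustrative assumptions, not the actual schema:

```python
# Minimal SCD2 (Type-2 dimension) merge sketch on Delta Lake.
# All table/column names below are hypothetical, not the actual schema.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

dim = DeltaTable.forName(spark, "dim_customer")   # existing dimension table
updates = spark.table("staging_customer")         # incoming source rows

# Source rows whose tracked attribute changed must do two things: expire the
# current dimension row and insert a new version. Pairing each with a NULL
# merge key yields a copy that can never match, so it drives the INSERT.
changed = (
    updates.alias("u")
    .join(dim.toDF().alias("d"), "customer_id")
    .where("d.is_current = true AND u.address <> d.address")
    .selectExpr("NULL AS merge_key", "u.*")
)
staged = changed.unionByName(updates.selectExpr("customer_id AS merge_key", "*"))

(dim.alias("d")
 .merge(staged.alias("s"), "d.customer_id = s.merge_key")
 .whenMatchedUpdate(                              # expire the current version
     condition="d.is_current = true AND d.address <> s.address",
     set={"is_current": "false", "end_date": "current_date()"})
 .whenNotMatchedInsert(values={                   # new customers and new versions
     "customer_id": "s.customer_id",
     "address": "s.address",
     "eff_date": "current_date()",
     "end_date": "NULL",
     "is_current": "true"})
 .execute())
```

The NULL merge-key copy is what lets a single MERGE both expire the current version of a changed row and insert its replacement.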

Data Analyst

Kohl’s
Milwaukee, US
10.2013 - 03.2020
  • Company Overview: Kohl's Corporation is an American department store chain headquartered in the Milwaukee suburb of Menomonee Falls, Wisconsin, operating 1,506 stores in 49 states
  • Website: https://www.kohls.com
  • This project facilitates environment, platform, and data related requirements and challenges in the Merchandise space
  • Integrate the test data provisioning & encryption/masking solution with the CI/CD build pipeline in Jenkins
  • Provide solutions for existing data issues & database performance issues so that applications run smoothly on the underlying data
  • Work with Data Architect to understand the Data Model
  • Responsible for data modeling & reverse engineering for new & existing applications
  • Work with business analysts to understand requirements for the data model
  • Integrate DDL scripts with Jenkins
  • Database design & performance tuning
  • Define rules & policies for PCI/PII data protection
  • Data Profiling to understand referential integrity between tables
  • Data health checkup script development and publish report in Tableau
  • Write Spark code to migrate large volumes of data from Oracle to MongoDB (see the sketch after this list)
  • Publish data health checkup report to stakeholders
  • Create & manage objects in Spanner & MySQL
  • Create data backup & restore strategy for Spanner & MySQL
  • Integrate data between the on-premises Oracle database and the Spanner database on GCP using Java
  • Write PL/SQL blocks to export/import schemas and/or tables in Oracle using the Data Pump API
  • Automate data provisioning by writing code in Java, integrate it with DevOps tools like Jenkins, and check the code into the version control repository GitHub
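A minimal sketch of the Oracle-to-MongoDB migration mentioned above, assuming PySpark with an Oracle JDBC driver and the MongoDB Spark connector (v10.x) available; hosts, credentials, partition bounds, and table/collection names are placeholders:

```python
# Sketch: bulk-copy an Oracle table to a MongoDB collection with Spark.
# Connection strings, credentials, and object names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("oracle-to-mongodb").getOrCreate()

# Read the source table over JDBC. Partitioning on a numeric key lets Spark
# pull large volumes through parallel connections instead of a single one.
src = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB")
    .option("dbtable", "MERCH.ITEM_SALES")
    .option("user", "etl_user")
    .option("password", "********")
    .option("partitionColumn", "ITEM_ID")   # numeric, indexed key
    .option("lowerBound", "1")              # placeholder bounds
    .option("upperBound", "10000000")
    .option("numPartitions", "16")
    .load()
)

# Write to MongoDB; the v10.x connector registers the "mongodb" source name.
(src.write.format("mongodb")
 .option("connection.uri", "mongodb://mongo-host:27017")
 .option("database", "merch")
 .option("collection", "item_sales")
 .mode("append")
 .save())
```

Partitioning the JDBC read on a numeric key is the piece that lets Spark extract large volumes in parallel rather than through a single connection.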

ETL & BI Architect

CNO Financial Group
Chicago, US
03.2013 - 10.2013
  • Company Overview: CNO Financial Group is an insurance company that distributes its products through various channels including independent agents and direct sales
  • Website: https://www.cnoinc.com
  • Current marketing systems exhibit noteworthy challenges: they inhibit growth and profitability due to a lack of flexibility, and they impact operational effectiveness through labor-intensive compensation reconciliations and manual processes
  • Delays in enhancements, a closed architecture, and a limited talent pool delay the reporting of sales results and agent metrics
  • Replace the current BLC Marketing Systems with a configurable, integrated system for agent on-boarding, compensation, hierarchy management, reporting, and analysis; reduce expenses associated with maintaining legacy platforms; manage compensation leakage; improve the speed-to-market of compensation plan changes; and provide more flexibility to define compensation plans
  • Working as Data Integration Lead
  • Involved in project estimation & preparing delivery timeline
  • Data Modeling
  • Data migration strategy preparation
  • Requirement Gathering from client
  • Requirement Analysis
  • High Level Design preparation
  • Validate project understanding & high-level design with client through sign-off mail
  • Coordinate with offshore team member
  • Coding & Unit Testing on need basis
  • Production Rollout
  • Create ILM data archive jobs
  • Archive data into FAS & validate archived data

ETL & BI Developer

Thrivent Financial
Minneapolis, US
02.2007 - 03.2012
  • Company Overview: Thrivent Financial for Lutherans is a Fortune 500 financial services organization that offers a range of financial products and services
  • Website: https://www.thrivent.com
  • The IM Data Member Data Mart release 1 (Missing Dates) install brought 10 million CIF contracts' issue and termination date information into TIW, furthering the IM Data program's philosophy of building TIW as a comprehensive, consolidated, reusable, 360-degree view of member data for operational and/or informational use
  • Capture changes in the customer's relationship with Thrivent in the Customer Relationship Cycle Event Fact
  • This new fact keeps track of each cycle of customer-Thrivent relationship events, such as when the customer is acquired, migrates through membership types, and terminates
  • This process also populates the Customer Relationship Cycle dimension table, which stores the customer relationship cycle number and status
  • Develop the compare processes to validate columns per the business requirements (see the sketch after this list)
  • Requirement Analysis
  • Project estimation & delivery timeline
  • High Level Design preparation
  • Validate project understanding & high-level design with client through sign-off mail
  • Design data model for DWH & Data Marts
  • Identify bottleneck of DWH & Data Marts & improve performance
  • Coordinate with onshore counterpart
  • Coding & Unit Testing of ETL Jobs
  • Defect Fix
  • Performance Monitoring & fix bottleneck of ETL jobs
  • Production Rollout
  • Postproduction warranty support
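The compare processes above were built in the project's ETL tooling; the sketch below re-expresses the core idea in PySpark for illustration only, with the table names, business key, and compared columns as assumptions:

```python
# Sketch of a column-compare/validation process, re-expressed in PySpark.
# Table names, the business key, and the compared columns are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

src = spark.table("stg_member_contract")   # source extract
tgt = spark.table("tiw_member_contract")   # loaded target

KEY = "contract_id"
COMPARE_COLS = ["issue_date", "termination_date"]

# Full outer join on the business key so rows missing on either side
# surface alongside column-level mismatches (null vs. value fails the
# null-safe equality and is flagged).
joined = src.alias("s").join(tgt.alias("t"), on=KEY, how="full_outer")

report = joined.select(
    KEY,
    *[
        F.when(F.col(f"s.{c}").eqNullSafe(F.col(f"t.{c}")), F.lit("OK"))
         .otherwise(F.lit("MISMATCH")).alias(f"{c}_status")
        for c in COMPARE_COLS
    ],
)

# Keep only rows where at least one compared column disagrees.
mismatches = report.where(
    " OR ".join(f"{c}_status = 'MISMATCH'" for c in COMPARE_COLS)
)
mismatches.show(truncate=False)
```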

Education

BACHELOR DEGREE IN ENGINEERING - Computer Science and Engineering

West Bengal University of Technology
India
06-2005

XII -

West Bengal Council of Higher Secondary Education

X -

West Bengal Board of Secondary Education

Skills

  • Data Warehousing
  • Machine Learning
  • NoSQL Databases
  • Data Modeling
  • Continuous Integration
  • Performance Tuning
  • Python Programming
  • Business Intelligence
  • Data Visualization

Certification

  • Informatica Developer Certified
  • Oracle Certified
  • LOMA 280 Certified

Projects

  • Data Integration and Enterprise Data Strategy – Engineering, Northwestern Mutual, Data Architect & DBA, Milwaukee, WI, 05/19/20 - Present: Build the enterprise data strategy and modernize the on-prem ETL pipeline using AWS Glue and Databricks; responsibilities as listed under Work History
  • Data Analyst, Kohl's, Milwaukee, WI, 10/28/13 - 03/22/20: Facilitate environment, platform, and data related requirements and challenges in the Merchandise space, including test data provisioning & encryption/masking integrated with the Jenkins CI/CD pipeline; responsibilities as listed under Work History
  • ETL & BI Architect, CNO, Chicago, IL, 03/22/13 - 10/25/13: Replace the BLC Marketing Systems with a configurable, integrated system for agent on-boarding, compensation, hierarchy management, reporting, and analysis; responsibilities as listed under Work History
  • ETL & BI Developer, Thrivent, Minneapolis, MN, 02/10/07 - 03/15/12: IM Data Member Data Mart release 1 (Missing Dates), bringing 10 million CIF contracts' issue and termination dates into TIW; responsibilities as listed under Work History

Timeline

Sr Data Engineer

Northwestern Mutual
05.2020 - Current

Data Analyst

Kohl’s
10.2013 - 03.2020

ETL & BI Architect

CNO Financial Group
03.2013 - 10.2013

ETL & BI Developer

Thrivent Financial
02.2007 - 03.2012

BACHELOR DEGREE IN ENGINEERING - Computer Science and Engineering

West Bengal University of Technology

XII -

West Bengal Council of Higher Secondary Education

X -

West Bengal Board of Secondary Education