Deepa Dave

Bensalem, PA

Summary

Innovative Lead Data Engineer with 17 years of experience and a record of efficient, high-quality delivery. Specialize in big data processing, machine learning model deployment, and cloud computing architecture. Excel at problem-solving, teamwork, and clear communication, ensuring successful project delivery and stakeholder satisfaction.

Overview

17 years of professional experience

Work History

Lead Data Engineer (NCG)

Blue Shield of California
08.2023 - Current
  • Design comprehensive end-to-end data pipelines on Azure using the Modern Data Integration Framework (MDIF) to facilitate efficient and secure data movement
  • Implement data ingestion processes to extract, transform, and load (ETL) data from various sources into Azure's raw and refined layers, applying necessary transformations and data quality checks
  • Establish seamless integration between Azure and Snowflake, orchestrating the movement of refined data from Azure to Snowflake's load layer for further processing and analytics (a minimal sketch follows this list)
  • Leverage DBT Core within Snowflake to perform data transformations and migrations between schemas, such as Raw Vault and Business Vault, ensuring data consistency and integrity
  • Work closely with data analysts and data scientists to understand their requirements and provide them with clean, reliable data for analysis and modeling
  • Document data flows, processes, and best practices for data engineering on Snowflake and DBT, and promote their adoption across the organization
  • Implement and manage data quality processes within the Collibra platform
  • Develop metrics and KPIs to measure the data quality of Critical Data Elements by implementing data quality rules and validations using Collibra DQ features
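
Illustrative sketch of the Azure-to-Snowflake load step referenced above, using the snowflake-connector-python package. The warehouse, database, stage, and table names are hypothetical placeholders, not the actual MDIF configuration:

    # Minimal sketch (not the actual MDIF pipeline): copy refined data from an
    # Azure-backed external stage into a Snowflake load-layer table. Downstream
    # Raw Vault -> Business Vault transformations are handled by DBT models.
    import os
    import snowflake.connector  # pip install snowflake-connector-python

    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="LOAD_WH",      # hypothetical warehouse
        database="EDW",           # hypothetical database
        schema="LOAD_LAYER",      # hypothetical load-layer schema
    )
    try:
        cur = conn.cursor()
        # COPY INTO picks up Parquet files landed in the external stage;
        # ON_ERROR aborts the statement so bad records fail the load visibly.
        cur.execute("""
            COPY INTO CLAIMS_REFINED
            FROM @AZURE_REFINED_STAGE/claims/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
            ON_ERROR = ABORT_STATEMENT
        """)
        print(cur.fetchall())  # per-file load results
    finally:
        conn.close()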

Lead Data Engineer (NCG)

Under Armour
Baltimore, MD
12.2020 - 07.2023
  • Extensively worked on a complex data migration from SAP HANA to the Snowflake cloud platform
  • Collaborated with stakeholders to understand data requirements and designed solutions accordingly
  • Evaluated and designed logical and physical databases; defined logical views and physical data structures
  • Supported clients with business analysis, documentation, and data modeling
  • Converted SAP HANA views to Snowflake dimension and fact tables
  • Installed and configured Apache Airflow for the S3 bucket and Snowflake data warehouse, and created DAGs to schedule and run the loads (see the sketch after this list)
  • Developed and maintained DBT projects for data transformation and orchestration
  • Wrote and managed DBT models, including SQL transformations and data pipeline configurations
  • Ensured best practices for version control, testing, and documentation within DBT projects
  • Built ELT solutions and data models architected as DAGs (directed acyclic graphs) to automate ETL pipelines using Python and SnowSQL
  • Implemented advanced Snowflake features such as data sharing, query performance tuning, tasks, streams, and zero-copy cloning
  • Gained direct experience with Snowflake utilities such as SnowSQL and Snowpipe
  • Used the Snowflake Python APIs to quickly create, resume, and suspend tasks
  • Built reusable and modular code using Jinja macros and stored procedures
  • Identified and addressed bottlenecks and inefficiencies in data pipelines and DBT processes
  • Managed multiple deadlines across 10-15 business workstreams to meet the dynamic needs of multiple clients
  • Automated resulting scripts and workflow scheduling
  • Used Airflow to ensure daily execution of jobs in production
  • Ensured data security measures were in place within Snowflake, adhering to best practices and compliance standards
  • Implemented strategies for data lineage, metadata management, and documentation
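
A minimal sketch of the kind of Airflow DAG described above: a daily run that copies files from an S3-backed stage into Snowflake. The DAG name, connection ID, stage, and table are hypothetical, and the apache-airflow-providers-snowflake package is assumed:

    # Minimal sketch of a daily S3-to-Snowflake load DAG (Airflow 2.x style).
    # Names below are hypothetical placeholders, not the actual project DAGs.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

    with DAG(
        dag_id="s3_to_snowflake_daily",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",  # guarantees the daily production run
        catchup=False,
    ) as dag:
        load_orders = SnowflakeOperator(
            task_id="load_orders",
            snowflake_conn_id="snowflake_default",  # hypothetical connection ID
            sql="""
                COPY INTO STAGING.ORDERS
                FROM @S3_LANDING_STAGE/orders/
                FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
            """,
        )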

Data Modeler (Capgemini)

Nordea Bank
Sweden
01.2017 - 11.2018
  • Collaborated with stakeholders to understand data needs and translate requirements into effective database designs
  • Designed conceptual, logical, and physical data models based on business requirements using ER/Studio or ERwin
  • Created entity-relationship diagrams (ERDs), data flow diagrams, and other modeling artifacts to represent data structures
  • Implemented database structures, tables, relationships, and constraints based on the designed data models
  • Used ER/Studio and ERwin proficiently to create, edit, and manage data models, and leveraged their functionality to streamline data modeling processes
  • Maintained documentation of data models, including version control and change history
  • Ensured adherence to data modeling standards, best practices, and naming conventions
  • Identified and recommended optimizations for database structures and relationships
  • Managed metadata associated with data models, including definitions, attributes, and relationships
  • Ensured accuracy and consistency of metadata across different data models and databases
  • Collaborated with data governance teams to ensure compliance with data standards, policies, and regulatory requirements
  • Identified opportunities for process enhancement and implemented improvements in data modeling methodologies

Data Architect (TCS)

GE Capital Americas
Norwalk, CT
11.2012 - 09.2015
  • Designed and developed data architectures based on business requirements, using Teradata as the database platform and Informatica for ETL (extract, transform, and load) processes
  • Owned and managed all changes to data models for assigned projects using the Teradata FSLDM
  • Created and maintained conceptual, logical, and physical data models in Teradata, defining tables, relationships, and structures
  • Designed, implemented, and optimized Teradata databases to ensure efficient data storage, retrieval, and scalability
  • Developed and maintained ETL workflows and data pipelines using Informatica PowerCenter and other Informatica tools to extract, transform, and load data into Teradata
  • Monitored and optimized Teradata databases and Informatica workflows for performance, ensuring efficient query execution and ETL processes
  • Identified and resolved bottlenecks in data processing and storage to enhance overall system performance
  • Implemented data integration strategies using Informatica, transforming and consolidating data from disparate sources into Teradata
  • Ensured data quality and consistency across different systems and databases through effective transformation processes
  • Implemented and maintained data security measures within Teradata, adhering to industry standards and compliance requirements
  • Collaborated with security teams to ensure data protection and privacy in accordance with regulatory guidelines
  • Collaborated with cross-functional teams, including data engineers, analysts, business stakeholders, and IT, to gather requirements and implement solutions
  • Communicated technical concepts and solutions effectively to non-technical stakeholders
  • Documented data architecture, database designs, ETL processes, and best practices for future reference and knowledge sharing
  • Implemented referential integrity using primary key and foreign key relationships (a minimal sketch follows this list)
  • Designed and implemented star schema models; identified and built fact, dimension, and aggregate tables to support users' reporting needs
  • Generated ad-hoc SQL queries using joins, database connections, and transformation rules to fetch data from legacy Oracle and Teradata database systems
  • Identified opportunities for process improvement and implemented enhancements in data architectures and ETL workflows
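
A sketch of the referential-integrity and star-schema pattern described above, using the open-source teradatasql driver. The host, table, and column names are hypothetical, and the FSLDM-specific structures are not reproduced:

    # Minimal sketch: one dimension and one fact table, with a primary key /
    # foreign key relationship enforcing referential integrity, plus an ad-hoc
    # join of the kind used for reporting extracts. All names are hypothetical.
    import os
    import teradatasql  # pip install teradatasql

    DDL = [
        """CREATE TABLE dim_customer (
               customer_id INTEGER NOT NULL PRIMARY KEY,
               customer_name VARCHAR(100)
           )""",
        """CREATE TABLE fact_receivable (
               receivable_id INTEGER NOT NULL PRIMARY KEY,
               customer_id INTEGER NOT NULL REFERENCES dim_customer (customer_id),
               amount DECIMAL(18, 2),
               as_of_date DATE
           )""",
    ]

    with teradatasql.connect(
        host="tdhost",  # hypothetical host
        user=os.environ["TD_USER"],
        password=os.environ["TD_PASSWORD"],
    ) as con:
        with con.cursor() as cur:
            for stmt in DDL:
                cur.execute(stmt)
            cur.execute("""
                SELECT d.customer_name, SUM(f.amount)
                FROM fact_receivable f
                JOIN dim_customer d ON d.customer_id = f.customer_id
                GROUP BY d.customer_name
            """)
            print(cur.fetchall())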

Senior Technical Lead (TCS)

Warner Bros
Burbank, CA
06.2007 - 10.2012
  • Collaborated closely with analysts to produce detailed solution approach design documents
  • Prepared low-level technical design documents, participated in the build and review of BTEQ, FastExport, MultiLoad, and FastLoad scripts, and reviewed unit test plans and system test cases
  • Wrote complex SQL queries to test data generated by the ETL process against the target database
  • Provided quick production fixes and was proactively involved in resolving production support issues
  • Managed both internal and external production schedules, commitments, and delivery timelines
  • Served as a key business and technical resource on complex and critical issues
  • Facilitated monthly meetings with clients to document requirements and explore potential solutions
  • Worked with MicroStrategy users to create and modify reports and validate data
  • Developed and implemented escalation procedures for contacting off-hour support personnel
  • Substantially decreased off-hour support calls by helping support personnel learn procedures to proactively resolve common failures

Education

Bachelor of Science - Engineering

Mumbai University
07.2004

Skills

  • Cloud Data Warehousing
  • Performance Tuning
  • Data Modeling
  • Data Migration
  • Database Design
  • Data Security
  • Relational Databases
  • Continuous Integration
  • Business Intelligence
