
Jackline Ntongai

Lee's Summit, USA

Summary

Strategic and detail-oriented Data Engineer with a background in analytics and a passion for building scalable, cloud-based data solutions. Experienced in designing and orchestrating automated data pipelines that support analytics, reporting, and machine learning workflows. Brings hands-on experience with modern tools and platforms, along with a strong understanding of data architecture, governance, and performance optimization. Adept at translating business needs into technical solutions that drive impact and efficiency.

Overview

9 years of professional experience
2 certifications

Work History

Senior Data Engineer

Burns and McDonnell
04.2024 - 08.2025
  • Migrated enterprise-scale data from on-premises systems to Azure-based cloud platforms, ensuring secure, scalable, and efficient transitions while maintaining data integrity.
  • Developed and maintained end-to-end ETL pipelines in Azure Databricks, using Delta Live Tables (DLT) and Autoloader to ingest and process data from Azure Data Lake Storage (ADLS) with the medallion architecture, ensuring high-performance data transformation.
  • Wrote efficient, scalable transformations using Databricks SQL, optimizing query performance across large distributed datasets and creating materialized views for downstream analytics.
  • Built machine learning models using Python and Spark MLlib transforming large, raw datasets into actionable predictions through feature engineering and model tuning.
  • Designed and deployed scalable data ingestion pipelines in Microsoft Fabric (Data Factory) to integrate data from on-premises SQL Server into OneLake, ensuring quality, performance, and seamless downstream consumption.
  • Designed a medallion architecture (Bronze, Silver, Gold) in Microsoft Fabric Lakehouse to transform raw data into curated, analysis-ready datasets, enhancing data quality and accelerating business insights.
  • Developed end-to-end ML pipelines in Databricks, automating data preprocessing, feature generation, model training, and evaluation with PySpark and MLflow to streamline workflows.
  • Deployed and monitored production ML models using MLflow and Databricks Model Serving, implementing automated retraining, CI/CD workflows, and real-time predictions for reliable, scalable performance.
  • Designed, orchestrated, and managed complex data workflows using Apache Airflow, implementing custom DAGs to automate scheduling, retries, and failure handling for streamlined pipeline execution and reliability.
  • Developed incremental and full load data ingestion frameworks across ADLS, Snowflake, and Databricks, enabling continuous syncs with Airflow-based orchestration and monitoring.
  • Integrated Snowflake for ELT workloads using Snowpipe and Tasks, automating warehouse scaling, and used dbt for transformations.
  • Conducted data quality assurance with dbt (Data Build Tool), performing rigorous data quality checks, schema validation, and automated testing to ensure the accuracy and integrity of the datasets.
  • Designed reusable, parameterized dbt components using Jinja templating and custom macros to enforce DRY principles and simplify model logic across data layers.
  • Integrated GitHub with Azure DevOps to manage version control, facilitate collaborative code reviews via pull requests, and automate CI/CD workflows for streamlined pipeline deployments and versioning.
  • Conducted continuous pipeline monitoring and performance optimization, troubleshooting issues to ensure efficient data ingestion, transformation, and delivery with minimal latency.
  • Collaborated closely with data analysts, data scientists, and key stakeholders to understand business requirements and translate them into high-performance, business-ready datasets that drive data-driven decision-making.
  • Implemented and managed Unity Catalog to enforce centralized data governance and secure access control across multiple Databricks workspaces, ensuring compliance with company policies.
  • Designed and organized catalogs, schemas, tables, and external locations to streamline data storage, discovery, and consumption for analytics and reporting teams.
  • Configured access controls using Azure AD groups, service principals, and role-based permissions, restricting sensitive data access to authorized users.
  • Enforced best practices in data reliability and observability, implementing monitoring systems, setting up alerting for pipeline failures, and running automated anomaly-detection tests to ensure consistent and reliable data processing.
  • Used Jira to manage sprint planning, backlog refinement, and task assignments, and to track progress in Agile projects.
  • Collaborated with teams through Jira boards to prioritize backlog items, document blockers, and ensure timely delivery of tasks.
  • Maintained technical documentation and project notes in Confluence, ensuring knowledge sharing on data architecture, workflows, pipeline monitoring schedules, and standards.
  • Collaborated in architectural discussions around multi-cloud strategy, including initial assessment of BigQuery for integration with existing Azure-based pipelines.
  • Designed and implemented scalable data pipelines to ensure efficient data flow across platforms.
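The incremental/full-load pattern described in the bullets above can be sketched in plain Python; the rows, column names, and watermark handling here are illustrative stand-ins, not code from any actual pipeline:

```python
from datetime import datetime

# Hypothetical stand-in for a source table; a real framework would read this
# from ADLS/Snowflake and keep the watermark in a metadata table.
SOURCE_ROWS = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 2, 1)},
    {"id": 3, "updated_at": datetime(2024, 3, 1)},
]

def incremental_load(rows, last_watermark):
    """Return rows changed since the last watermark, plus the new watermark."""
    batch = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in batch), default=last_watermark)
    return batch, new_watermark

# First run behaves as a full load; a second run with no source changes
# yields an empty batch and leaves the watermark untouched.
batch, wm = incremental_load(SOURCE_ROWS, datetime.min)
batch2, wm2 = incremental_load(SOURCE_ROWS, wm)
```

In a production setup, the orchestrator (e.g., Airflow) would persist the watermark between runs so each sync picks up only new or changed records.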

Senior MSBI Developer

State Street Corporation
06.2022 - 12.2023
  • Contributed to SQL performance enhancement and optimization of reporting data by fine-tuning indexes and stored procedures. Implemented T-SQL triggers to ensure consistent data input into the database.
  • Played a key role in the transition from Stellar to NetSuite by crafting SQL code and extracting the required data.
  • Applied ERwin Data Modeler to design and document fact and dimension table structures, ensuring consistency across data warehouse layers during migration to Azure.
  • Used ERwin Data Modeler to perform schema analysis and generate DDL scripts for creating and maintaining dimensional data models.
  • Integrated Azure Synapse Analytics with Azure Stream Analytics to process and analyze streaming data in real time, gaining insights from data as it is generated.
  • Wrote Spark scripts in Azure Synapse Studio to process and analyze clickstream data stored in the data lake, surfacing user preferences and trends.
  • Provided data orchestration capabilities in Azure Synapse, using tools like ADF to design and manage pipelines that move and transform data, ensuring it is ready for analysis.
  • Actively participated in migrating a sizable database from an old legacy application to the Azure platform.
  • Automated the creation of Azure cloud systems, including Resource groups, Azure Storage Blobs, and tables.
  • Utilized Azure DevOps services such as Azure Repos, Azure Boards, and Azure Test Plans for collaborative code development and project planning.
  • Developed ADF v2 pipelines for processing Fact and Dimension tables with complex transformations, including the maintenance of SCD Type 2/SCD Type 1 historical data.
  • Facilitated data ingestion into multiple Azure services, including Data Lake, Storage, Data Warehouse, and processing data in Azure Databricks.
  • Administered Azure Active Directory (Azure AD) users, groups, and devices, and participated in migrations between on-premises AD and Azure AD through Azure AD Connect.
  • Designed Power BI data visualizations, incorporating cross-tabs, scatter plots, pie charts, and density charts.
  • Leveraged Power BI Gateways to ensure the real-time updating of dashboards and reports.
  • Developed filters, reports, dashboards, and created various chart types, visualizations, and intricate calculations to manipulate data within Power BI.
  • Defined business-specific metrics and KPIs by using DAX to create custom calculations, measures and calculated columns in Power BI reports.
  • Created T-SQL stored procedures for SSRS reports and deployed these reports for account managers to access via web browsers.
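As a rough illustration of the SCD Type 2 maintenance mentioned above, here is a minimal sketch using in-memory SQLite as a stand-in for the warehouse (table and column names are hypothetical):

```python
import sqlite3

# Toy dimension table with SCD Type 2 bookkeeping columns.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dim_customer (
    customer_id INTEGER, name TEXT, is_current INTEGER,
    valid_from TEXT, valid_to TEXT)""")
conn.execute("INSERT INTO dim_customer VALUES (1, 'Acme', 1, '2023-01-01', NULL)")

def scd2_upsert(conn, customer_id, new_name, load_date):
    """Expire the current row if the attribute changed, then insert the new version."""
    cur = conn.execute(
        "SELECT name FROM dim_customer WHERE customer_id=? AND is_current=1",
        (customer_id,))
    row = cur.fetchone()
    if row and row[0] != new_name:
        conn.execute(
            "UPDATE dim_customer SET is_current=0, valid_to=? "
            "WHERE customer_id=? AND is_current=1",
            (load_date, customer_id))
        conn.execute(
            "INSERT INTO dim_customer VALUES (?, ?, 1, ?, NULL)",
            (customer_id, new_name, load_date))

scd2_upsert(conn, 1, "Acme Corp", "2024-06-01")
rows = conn.execute(
    "SELECT name, is_current FROM dim_customer ORDER BY valid_from").fetchall()
# rows -> [('Acme', 0), ('Acme Corp', 1)]
```

On change, the current row is expired (`is_current = 0`, `valid_to` set) and a new current version is inserted, preserving history — the same idea scaled up in the ADF pipelines.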

Data Engineer

Oracle Cerner
09.2020 - 05.2022
  • Wrote Python scripts for automating repetitive tasks, and created custom scripts for parsing and analyzing log files, identifying and resolving system issues proactively.
  • Collaborated with cross-functional team to develop and maintain dynamic web applications using Python and Django framework.
  • Served as a technical authority, providing expert guidance in evaluating new software projects and initiatives aimed at strengthening and extending the existing OLTP and OLAP applications.
  • Leveraged SQL Server and T-SQL to create, maintain, and refine tables, stored procedures, user-defined functions, views, and diverse index types (clustered, non-clustered, covering) to speed data retrieval, along with constraints, triggers, and relational database models.
  • Engineered efficient, custom stored procedures, Common Table Expressions (CTEs), User-Defined Functions (UDFs), DML triggers, indexes, tables, and views as integral components of the project.
  • Collaborated with developers on utility usage, SQL performance, test data, and troubleshooting.
  • Improved row-level operations by replacing WHILE loops and cursors with T-SQL window functions (e.g., ROW_NUMBER, LEAD, LAG, SUM() OVER (PARTITION BY), STRING_SPLIT/STRING_AGG) or recursive CTEs.
  • Actively contributed to the deployment of Data Mapping, Normalization, Batch jobs, Troubleshooting, Data migration, Data collation, Data cleansing, Entity Relationship Design (ERD) models, and application-driven design.
  • Built SSIS package configurations using XML configuration files, environment variables, and SQL Server tables, and implemented error handling through Event Handlers for diverse event types.
  • Applied hands-on expertise in Azure Data Factory V2 to run ETL/ELT processes from various sources, including Copy and ForEach Loop activities alongside Data Flows.
  • Migrated on-premises SQL Server schemas to Azure cloud platforms, including Azure SQL Database and Azure VMs, and oversaw the development and maintenance of databases across both on-premises and cloud environments.
  • Exhibited an advanced proficiency in Microsoft Azure Cloud Services, with a particular focus on Azure Databricks, Azure SQL Platform as a Service (PaaS) and Infrastructure as a Service (IaaS), Azure Synapse Analytics, and Azure Data Lakes, among other components.
  • Capitalized on Power BI to construct a semantic model, streamlining the analysis of business data and associations by importing pertinent data from the Azure Synapse data warehouse.
  • Cleaned and transformed raw data to resolve any inconsistencies and improve accuracy using ETL techniques.
  • Conducted QA for batch jobs, migrations, and merged datasets from multiple EHR and healthcare systems, ensuring unified and reliable patient records.
  • Monitored and tested pipeline performance, error handling, and data flows to ensure reliable delivery of production-ready data for analytics and reporting.
  • Merged data from multiple isolated EHR and other healthcare systems to create unified patient records.
  • Led the management, governance, and enforcement of row-level security mechanisms to improve performance and security across the data hierarchy.
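The cursor-to-window-function rewrite noted above can be demonstrated with SQLite (≥ 3.25, which supports window functions) standing in for SQL Server; the `visits` table is invented for illustration:

```python
import sqlite3

# Instead of looping over rows with a cursor, ROW_NUMBER() OVER (PARTITION BY ...)
# picks the latest visit per patient in a single set-based query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id INT, visit_date TEXT)")
conn.executemany("INSERT INTO visits VALUES (?, ?)", [
    (1, "2022-01-05"), (1, "2022-03-10"), (2, "2022-02-01")])

latest = conn.execute("""
    SELECT patient_id, visit_date FROM (
        SELECT patient_id, visit_date,
               ROW_NUMBER() OVER (
                   PARTITION BY patient_id ORDER BY visit_date DESC) AS rn
        FROM visits)
    WHERE rn = 1
    ORDER BY patient_id
""").fetchall()
# latest -> [(1, '2022-03-10'), (2, '2022-02-01')]
```

The same query shape works in T-SQL; the set-based form avoids per-row cursor overhead and lets the optimizer parallelize the scan.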

SQL/BI Developer

Vail System Inc
03.2018 - 08.2020
  • Interacted with business users and team leads to better understand business specifications and requirements, identifying data sources and development strategies.
  • Gathered and analyzed business requirements and translated business needs into long-term business intelligence solutions.
  • Involved in developing systems with a high level of performance and reliability.
  • Performed regular reviews on the database to identify any performance problems, inadequate programming, and data duplication problems.
  • Created complex stored procedures, triggers, functions (UDF), indexes, tables, views and other T-SQL code and SQL joins for SSIS packages and SSRS reports.
  • Created Views to enforce security and customized data access.
  • Loaded flat files and Excel sheets into the OLTP database, and developed T-SQL stored procedures, functions, expressions, CTEs, indexed views, and stage tables to process the data and populate the appropriate destination tables.
  • Implemented the changes in the existing ETL process to move data back and forth from ODS to all the integrated systems.
  • Deployed SSIS packages into various Environments (Development, Test and Production) using Deployment Utility.
  • Configured dimensional models for end users, including hierarchy perspectives.
  • Worked with Oracle-supplied packages, dynamic SQL, records, and PL/SQL tables.
  • Created complex stored procedures/execute task, updated and maintained ETL packages with high performance.
  • Created linked, drill-through, drill-down, cached, snapshot, and subreports using Report Server Projects and the Report Model Template in SSRS, and was responsible for deploying them.
  • Created interactive, accurate, insightful, and well-organized Power BI reports and dashboards to support businesses in their data-driven strategic planning.
  • Involved in designing, developing, debugging and testing of reports in SQL Server Reporting Services (SSRS).
  • Translated business requirements into normalized OLTP schemas and dimensional OLAP models (Star/Snowflake) to support reporting and analysis.
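The star-schema idea behind the OLAP models above can be shown with a toy example in Python, where a fact row resolves surrogate keys against dimension tables (all names and values are hypothetical):

```python
# Minimal star schema: dimensions keyed by surrogate keys, facts holding
# only keys and measures, joined at query time for reporting.
dim_product = {101: {"name": "Widget", "category": "Hardware"}}
dim_date = {20240115: {"year": 2024, "month": 1}}

fact_sales = [{"product_key": 101, "date_key": 20240115, "amount": 250.0}]

def enrich(fact, dim_product, dim_date):
    """Join fact rows to their dimensions, as a reporting query would."""
    return [
        {**row,
         "product": dim_product[row["product_key"]]["name"],
         "year": dim_date[row["date_key"]]["year"]}
        for row in fact]

report = enrich(fact_sales, dim_product, dim_date)
```

Keeping measures in narrow fact tables and attributes in dimensions is what lets the Star/Snowflake models support fast slicing by product, date, and other hierarchies.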

SQL/SSIS/ETL Developer

Turner Construction Co
04.2016 - 02.2018
  • Analyzed complex business requirements, created and modified complex stored procedures, and worked on performance tuning.
  • Largely used T-SQL in supporting Finance Team and constructing User Profiles, Relational Database Models, Data Dictionaries and Data Integrity.
  • Created complex T-SQL code such as Stored Procedures, functions, triggers, Indexes and views for the application.
  • Created optimized jobs to load data whose business rules depended on flat-file names or ragged hierarchies.
  • Developed SCD Type 0, 1, and 2 in SSIS to update old data in dimension tables when new source data is loaded, and automated the process by deploying the packages to SQL Server and creating jobs in SQL Server Agent.
  • Created packages using SSIS to extract the data from different flat files (.txt, .csv, .xlsx) and loaded to tables in SQL Server.
  • To import and export data across various files and databases, created ETL packages with different control flow options and data flow transformations such as Conditional Split, Multicast, Union All, and Derived Column.
  • Participated in creating calculated members, named sets, and advanced KPIs for SSAS cubes.
  • Collaborated with data architects and ETL developers to generate database DDL scripts and perform model-to-database synchronization using ERwin, improving consistency and reducing schema drift.
  • Used ERwin for metadata documentation and schema standardization, supporting clear communication between development and business teams.
  • Design, develop, and support complex integration processes (including interfaces) using SQL Server technology, stored procedures and SQL program code.
  • Built various complex stored procedures and Execute SQL Tasks, wrote C# code in Script Tasks for ETLs, and updated and maintained the ETL packages with high performance.
  • Created stored procedures using the nodes() method to load XML file data into SQL Server tables, and resolved database issues raised by QA and UAT.
  • Developed reports using Drill through, Drop Down and Hyperlinks to move to other reports and web pages using Microsoft SQL Server Reporting Services (SSRS).
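The Conditional Split / Derived Column pattern from the SSIS data flows above can be approximated in plain Python (the threshold and column names are invented for illustration):

```python
# Route rows to two outputs and derive a flag column, as an SSIS data flow
# would with a Derived Column followed by a Conditional Split.
rows = [
    {"order_id": 1, "amount": 50.0},
    {"order_id": 2, "amount": 500.0},
]

def conditional_split(rows, threshold=100.0):
    """Add a derived is_large flag, then split rows by that condition."""
    small, large = [], []
    for row in rows:
        derived = {**row, "is_large": row["amount"] >= threshold}
        (large if derived["is_large"] else small).append(derived)
    return small, large

small, large = conditional_split(rows)
```

In SSIS the same split is declared graphically, with each output wired to its own destination; the sketch just makes the row-routing logic explicit.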

Education

Master of Business Administration - Business Analytics

Park University
USA

Bachelor of Commerce - Finance Major

University of Nairobi
Kenya

Skills

  • Databases: SQL Server (2012–2019), Oracle, MySQL, Azure SQL
  • Data Modeling & BI: SSDT, SSRS, SSIS, SSAS, Power BI Desktop/Services, Tableau
  • Languages & Web Tools: T-SQL, PL/SQL, DB2, MySQL, C#, MDX, DAX, XML, HTML, JSON, R, Python (ML, PySpark)
  • Cloud & Big Data: Azure (ADLS, ADF, Synapse), GCP (BigQuery), Databricks (Unity Catalog, Spark, DLT, PySpark, Autoloader, MLflow), Snowflake, DBT, Microsoft Fabric
  • Version Control & Collaboration: GitHub, Azure Repos, Jira, Azure DevOps
  • Orchestration & Scheduling: Apache Airflow (DAGs), Azure Data Factory
  • Others: VS Code, MS Office, SQL Server Notification Services, Microsoft TFS, ERwin Data Modeler, EHR systems, Git

Certification

  • Databricks Certified Data Engineer Associate
  • Microsoft Certified: Azure Data Fundamentals

Languages

English

Timeline

Senior Data Engineer

Burns and McDonnell
04.2024 - 08.2025

Senior MSBI Developer

State Street Corporation
06.2022 - 12.2023

Data Engineer

Oracle Cerner
09.2020 - 05.2022

SQL/BI Developer

Vail System Inc
03.2018 - 08.2020

SQL/SSIS/ETL Developer

Turner Construction Co
04.2016 - 02.2018

Bachelor of Commerce - Finance Major

University of Nairobi

Master of Business Administration - Business Analytics

Park University