MARCELLUS N MBU

New Jersey, USA

Summary

Results-driven data engineering professional with 8 years of experience in architecting governed data lakes and managing large-scale data pipelines. Demonstrated success in optimizing data processes and automating infrastructure provisioning, significantly enhancing operational efficiency. Seeking a Senior Data Engineer position to apply expertise in data engineering and optimization to enterprise analytics initiatives.

Overview

9 years of professional experience
1 Certification

Work History

Data Engineer

Symrise
New Jersey, United States
08.2023 - 11.2025
  • Reduced AWS Glue ETL processing time by 60% across 40 workflows through PySpark optimization, accelerating data insights by 35%.
  • Designed large-scale pipelines using star-schema modeling and S3 partitioning, reducing storage overhead by 30% and improving query performance by 45%.
  • Orchestrated event-driven workflows with Glue, Lambda, and Step Functions, optimizing Kinesis shard utilization and implementing S3 lifecycle policies to reduce compute costs.
  • Developed automated Python pipeline to analyze seasonal demand for raw ingredients, integrating disparate S3 sources and saving 15 hours/week in manual processing.
  • Supported global operations for a fragrance and flavor manufacturer across 150+ countries.

Database Administrator/Engineer

Pulaski Savings Bank (via BEREANS IT)
Chicago, United States
05.2022 - 02.2023
  • Engineered high-availability PostgreSQL 17 RDS environments, utilizing Multi-AZ failover and Read Replicas to sustain 99.95% uptime for critical banking transactions.
  • Spearheaded the migration of 15+ legacy SQL databases to Amazon Aurora, leveraging AWS DMS for zero-data-loss cutovers, resulting in a 60% performance boost and 45% cost reduction.
  • Developed Row/Column Level Security (FGAC) and Dynamic Data Masking (DDM) in Aurora clusters to safeguard sensitive PII, enabling internal analysts to access only data relevant to their clearance levels.
  • Enhanced HIPAA and financial data compliance by integrating AWS Secrets Manager to eliminate hard-coded credentials and employing KMS encryption for data-at-rest across clusters.
  • Implemented an AAA (authentication, authorization, and accounting) framework for compliance, using IAM Fine-Grained Access and CloudTrail to eliminate unauthorized access incidents.
  • Modernized legacy infrastructure through cross-functional architecture reviews and AWS workload migration, reducing overhead by 40% via automation and proactive tuning.
  • Regional financial institution providing consumer and commercial banking services

Cloud Database Administrator

Symrise (via BEREANS IT)
New Jersey, United States
05.2017 - 03.2022
  • Monitored 50+ production environments via CloudWatch and communicated proactive risk alerts to stakeholders, reducing incident response time (MTTR) by 35% and ensuring 99.95% availability.
  • Optimized high-traffic RDS PostgreSQL databases, implementing automated Python-based autovacuum tuning that cut dead-tuple accumulation by 40%.
  • Resolved RDBMS bottlenecks through critical analysis, improving system performance and reliability.
  • Reviewed execution plans and managed resources in IaaS and PaaS (RDS/Aurora) to maintain 99.99% availability.
  • Managed 40+ MySQL and PostgreSQL databases, reducing disk I/O latency by 35% through memory/cache tuning and IOPS optimization.
  • Hardened database security using IAM, RBAC, and AWS KMS, producing documentation that supported strict regulatory compliance and 100% audit pass rates.
  • Architected a proactive observability framework using AWS Performance Insights, utilizing Database Load (AAS) and Top SQL metrics to identify I/O bottlenecks and achieve a 40% reduction in query latency.
  • Global fragrance and flavor manufacturer serving 150+ countries

Education

Bachelor of General Science

Rutgers, The State University of New Jersey
01.2012

Associate in Science

Essex County College
Newark, NJ
01.2009

Skills

  • PySpark and AWS
  • ETL pipelines
  • Data management
  • Data warehousing
  • Data modeling
  • Big data processing
  • Data ingestion
  • Data transformation
  • Data migration
  • SQL query tuning
  • DynamoDB and MongoDB
  • MySQL and Oracle
  • SQL Server and Athena
  • Amazon DAX optimization
  • S3 and ElastiCache usage
  • Cloud infrastructure tools
  • Infrastructure automation
  • IAM and VPC management
  • CloudWatch and CloudTrail monitoring
  • Resource provisioning
  • Terraform management
  • CI/CD practices
  • Kafka and Airflow
  • Linux administration
  • Python, Bash, and PowerShell scripting
  • Version control with Git
  • S3 policies
  • Data encryption
  • HIPAA compliance
  • Auto scaling solutions
  • Load balancing
  • Schema conversion
  • pgAdmin and DBeaver proficiency

Certification

  • AWS Certified Data Engineer - Associate, CertMetrics
  • CompTIA Security+ (SY0-701), d90369d9cf0c42f5a04b4129e54e60e1
  • AWS Certified Cloud Practitioner, CertMetrics
  • HashiCorp Certified: Terraform Associate (003), Credly

Accomplishments

  • Cost & Performance Optimization: Spearheaded an Oracle-to-PostgreSQL migration, cutting $120K in annual licensing costs while boosting operational efficiency by 65%.
  • Extreme Performance Tuning: Optimized Redshift and ElastiCache integrations using dist/sort keys and WLM tuning, delivering 80x faster read speeds and a 70% reduction in dashboard latency.
  • Serverless Data Lake: Designed a serverless data lake using S3 and Lambda, cutting storage costs by 70% and boosting conversion rates by 25%.
  • High-Availability Engineering: Consistently maintained 99.95% uptime across 50+ production environments for a €5B global supply chain and financial systems via Multi-AZ failover and Zero-Downtime Patching (ZDP).
  • Operational Efficiency: Automated infrastructure provisioning for 60+ environments using Terraform and CloudFormation, accelerating deployment cycles by 40% and eliminating 15 hours of manual work per week.
  • Scalable Data Architectures: Architected and deployed serverless pipelines ingesting 5+ TB of data per day and managed a 1M+ record data lake, driving a 20% lift in customer retention.

Training

  • AWS Training & Certification Completion Certificate, 2c484c17-6efa-4cb0-af17-1b5ca97fdc2d
  • AWS Training & Certification Completion Certificate, a2ca303d-278c-4f64-b53a-ca4330a12ecd

Projects

Cloud-Based Data Warehouse Solution on AWS (Redshift), AWS, 2025
  • Designed a cloud-based data warehouse using AWS to solve data fragmentation and support business insights.
  • Applied best practices in data operations and optimization through labs and real-world cases.
  • Deployed an Amazon Redshift Serverless data warehouse for a high-volume ticketing platform, ingesting 7M+ records daily.
  • Provisioned secure infrastructure (IAM, VPC), designed a scalable schema, and optimized S3 COPY pipelines to cut load times by 45%, enabling near real-time analytics for 1,000+ users.
  • Established a robust data governance and security framework by implementing Dynamic Data Masking (DDM), Row-Level and Column-Level Security (FGAC), and integrating CloudWatch audit logging for compliance and traceability.
  • Executed SQL joins/queries on 10M+ records, delivering insights to identify top venues and sales trends, improving event planning and revenue forecasting by 25%.

Work Preference

Job Search Status

Open to work

Work Type

Full Time, Contract Work, Part Time

Location Preference

On-Site, Remote, Hybrid

Salary Range

$100,000/yr - $200,000/yr
