Experienced data engineer with expertise in ETL development, data modeling, data pipeline design, data migration, data warehousing, big data processing, SQL and databases, and cloud computing platforms. Proven track record of delivering complex data projects across multiple industries. Adept at using a wide range of tools and technologies to design and implement scalable, efficient data solutions.
· Executed complex data transformations and integrations using Talend and SSIS across multiple client databases, ensuring data quality and consistency.
· Designed data models in SAP and Erwin for financial services, enhancing compliance reporting and audit trails for key stakeholders.
· Utilized Erwin and SAP to develop and maintain robust data models for financial reporting, which supported tax optimization projects for multinational clients.
· Deployed Databricks for scalable big data processing, enabling advanced predictive analytics and machine learning capabilities.
· Architected and optimized a multi-cloud data management solution on GCP and AWS, promoting flexibility and cost savings in data operations.
· Implemented solutions using Apache Spark and Cloudera Data Platform to analyze large-scale, diverse datasets, providing insights that drove strategic business decisions.
· Configured and maintained cloud-based data warehousing solutions using Google Cloud BigQuery, reducing data storage costs.
· Automated multiple data pipelines using Apache Airflow and Databricks, reducing manual intervention and increasing data reliability.
· Created a C# application to facilitate secure data transfers between client and internal systems.
· Implemented JavaScript-based interactive dashboards in web applications to enhance user engagement and data visualization capabilities.
· Developed a centralized data processing architecture using Apache Kafka and Apache Spark, reducing data latency from hours to minutes.
· Implemented IBM InfoSphere for comprehensive data migration, ensuring high availability and consistency across sales and distribution centers.
· Optimized data pipelines in Apache NiFi, enhancing data ingestion and extraction processes, which supported customer service analytics.
· Automated data quality checks using Microsoft Azure Data Factory, reducing manual interventions and ensuring high standards of data integrity.
· Led the development of a centralized operational data store using Microsoft SQL Server and SSIS, improving data accessibility for e-commerce logistics analysis.
· Engineered and optimized data integration flows with Talend, enhancing data synchronization across distributed systems.
· Developed and optimized data marts and implemented Tableau dashboards that provided actionable insights into customer behavior and sales trends.
· Managed the transition of data infrastructure to Microsoft Azure Synapse, achieving more scalable and cost-effective data storage solutions.
· Engineered an automated monitoring system using shell scripts to oversee data pipeline health, reducing downtime.
· Used Python to build a recommendation engine that boosted sales conversions.
· Spearheaded the development of an enterprise data warehouse (EDW) using Oracle Database and ODI, supporting complex financial reporting requirements.
· Engineered real-time fraud detection systems with Apache Flink and Apache Beam, decreasing fraudulent activities.
· Led the transition to cloud-based data warehousing with Amazon Redshift and Microsoft Azure Synapse, significantly improving data scalability and access.
· Implemented a strategic data model using star and snowflake schemas, facilitating enhanced analytical capabilities across business units.
· Constructed and maintained an EDW using Oracle Database and ODI that centralized critical financial data, supporting regulatory compliance and financial reporting.
· Utilized Apache Spark and Apache Kafka to develop a real-time fraud detection system, reducing fraudulent transactions.
· Configured and managed data pipelines in Apache NiFi, improving data flow efficiency between internal and external systems.
· Assisted in cloud migration initiatives to AWS, which included setting up AWS Glue for seamless data integration and analytics processes.
· Developed complex financial models in C++ for risk analysis, significantly enhancing portfolio management.
· Spearheaded the creation of an internal portal using HTML, CSS, and TypeScript, increasing operational efficiency.
· Managed Apache Hadoop and Cloudera Data Platform clusters for processing large-scale datasets, enabling efficient data analysis and insights.
· Developed comprehensive data pipelines using Google Cloud Dataflow and Apache Airflow, ensuring robust data governance and workflow management.
· Created and maintained NoSQL databases (MongoDB, Cassandra) to handle unstructured data, improving system performance and query response times.
· Integrated Tableau and Power BI solutions with data warehousing systems to deliver advanced visual analytics and business intelligence.
· Deployed and managed Apache Hadoop clusters, facilitating efficient processing of large-scale data sets used for predictive analytics in retail.
· Implemented Apache Flink for stream processing to provide near-real-time analytics for dynamic pricing models.
· Enhanced data lake capabilities using Amazon EMR, significantly improving data ingestion and retrieval times.
· Developed complex analytical models on Databricks, providing insights that drove high-value business strategies.
· Automated Hadoop cluster management tasks using Ruby scripts, reducing manual workload.
· Built a Python-based data validation tool that improved data accuracy in ETL processes.
· Designed and deployed ETL processes using ODI and Apache Spark, streamlining data collection and aggregation from various sources.
· Constructed and managed a data migration framework with AWS DMS, supporting seamless transitions during technology upgrades.
· Developed dynamic data models with Toad for Oracle, supporting scalable applications in a high-growth technology environment.
· Pioneered the adoption of cloud computing platforms (Azure and AWS), enhancing data storage and processing capabilities across the organization.
· Created a JavaScript framework for real-time analytics visualization used across the organization.
· Led the development of a Java-based application for workflow management that improved process efficiency.