Results-driven Data Engineer known for high productivity and efficient task completion. Skilled in big data processing frameworks such as Hadoop and Apache Spark, database management with SQL, and data visualization with tools such as Tableau. Excels at problem-solving, collaboration, and adaptability, applying technical skills to develop innovative data solutions across diverse environments.
Automated Data Pipeline for E-commerce Analytics Sep 2022 - Jan 2023
• Developed an automated data pipeline using Apache Airflow to extract, transform, and load e-commerce data into Redshift.
• Implemented data quality checks and validation scripts in Python and pandas to ensure data accuracy.
• Created Tableau dashboards to visualize sales performance, customer behavior, and product trends.
Real-Time Data Processing with Spark Streaming for IoT Devices Feb 2023 - May 2023
• Designed and implemented a real-time data processing system using Apache Spark Streaming and Kafka.
• Processed and analyzed streaming data from IoT devices, generating real-time insights for predictive maintenance.
• Deployed the solution on AWS EMR, leveraging S3 for storage and Redshift for data warehousing.
Real-Time Financial Data Analysis Platform Aug 2023 - Dec 2023
• Built a real-time data processing platform for financial transactions using Apache Flink and Kafka.
• Enabled real-time fraud detection and risk assessment by processing streaming data from financial systems.
• Integrated the platform with AWS services (Kinesis, S3, Redshift) for scalable and reliable data processing.