Junior Data Engineer with over one year of hands-on experience in Python, SQL, and BigQuery. Skilled in using Vertex AI, Pandas, NumPy, Scikit-learn, and PySpark to optimize data workflows and deliver actionable insights. Proficient in designing and implementing ETL pipelines for data processing and integration. Dedicated to improving data infrastructure and ensuring high data quality to support informed decision-making. Strong collaborator with cross-functional teams, focused on supporting organizational growth through data-driven solutions.
Associate Data Engineer with a background in designing, building, and maintaining data processing systems. Experienced in identifying patterns and trends in large datasets to help solve complex business challenges. Strengths include teamwork, problem-solving, and proficiency in SQL, Python, and Hadoop. Previous roles involved improving data collection procedures to increase overall data reliability and quality.
Real-time Data Processing Pipeline: Designed and implemented a real-time data processing pipeline to handle streaming data from IoT devices. The pipeline ingested, transformed, and stored data using Apache Kafka for message queuing, PySpark for data processing, and BigQuery for storage and querying. Technologies: Python, PySpark, Apache Kafka, BigQuery. Outcome: Reduced data processing latency by 30% and improved overall system throughput by implementing optimized data partitioning and parallel processing techniques.