Adept Spark Scala developer with a proven track record at JPMorgan Chase, improving data migration and processing efficiency by leveraging AWS, Spark, and Scala. Expertise in ETL methodologies and big data analytics, combined with strong problem-solving skills, has driven significant performance optimizations. Skilled in API integration and cross-team collaboration, consistently delivering project milestones with precision and agility.
Responsibilities:
· Develop and deploy ETL (Extract, Transform, Load) processes for efficient data migration from on-premises systems to AWS.
· Implement and optimize data transformation logic in Spark using Scala and Spark SQL for data cleansing, standardization, and enrichment (see Sketch 1 below).
· Configure and manage Amazon EMR clusters to run Spark jobs efficiently, ensuring scalability and reliability.
· Optimize EMR cluster performance through effective resource allocation, job scheduling, and cost management practices.
· Monitor EMR clusters for performance issues and implement tuning strategies to enhance job execution.
· Utilize Amazon S3 for robust data storage solutions, ensuring efficient data access and retrieval for Spark processing.
· Develop and manage S3 bucket policies to maintain data integrity, security, and compliance with best practices.
· Implement data lifecycle management policies in S3 to optimize storage costs and performance.
· Implement and manage AWS IAM (Identity and Access Management) policies to control secure access to AWS resources and data.
· Ensure all Spark jobs and data processing activities adhere to security best practices and compliance requirements.
· Use Apache Airflow (through Astronomer) to orchestrate and manage complex data workflows, scheduling, and task dependencies.
· Develop and maintain Airflow DAGs to automate and streamline ETL processes and data pipeline operations.
· Monitor and troubleshoot Airflow workflows to ensure reliable and timely execution of data tasks.
· Utilize Amazon SQS for managing message queues and integrate them with Spark applications to ensure reliable data ingestion and processing (see Sketch 2 below).
· Perform comprehensive testing, including unit testing, integration testing, and end-to-end testing, to validate data integrity and migration success (see Sketch 3 below).
· Continuously monitor and optimize the performance of Spark jobs, EMR clusters, and overall data processing workflows.
· Implement best practices for efficient data processing and resource utilization, including tuning Spark applications and optimizing AWS resource configurations (see Sketch 4 below).
· Document technical designs, configurations, migration processes, and best practices to facilitate knowledge sharing and support future projects.
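Illustrative code sketches:

Sketch 1: a minimal example of the Spark/Scala cleansing, standardization, and enrichment pattern described above. The bucket paths, column names, and tenure rule are illustrative assumptions, not the actual production pipeline.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CustomerCleansingJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("customer-cleansing")
      .getOrCreate()

    // Hypothetical S3 input location.
    val raw = spark.read.parquet("s3://example-landing-bucket/customers/")

    // Cleansing and standardization: trim and normalize strings,
    // drop records missing a primary key, and remove duplicates.
    val cleansed = raw
      .withColumn("email", lower(trim(col("email"))))
      .withColumn("country", upper(trim(col("country"))))
      .filter(col("customer_id").isNotNull)
      .dropDuplicates("customer_id")

    // Enrichment via Spark SQL: derive a tenure band from the signup date.
    cleansed.createOrReplaceTempView("customers")
    val enriched = spark.sql(
      """SELECT *,
        |       CASE WHEN datediff(current_date(), signup_date) > 365
        |            THEN 'returning' ELSE 'new' END AS tenure_band
        |FROM customers""".stripMargin)

    enriched.write.mode("overwrite").parquet("s3://example-curated-bucket/customers/")
    spark.stop()
  }
}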
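Sketch 2: a sketch of the SQS-to-Spark ingestion pattern (long-poll a batch, process it, delete only after a successful write). The queue URL and S3 path are hypothetical; the AWS SDK v2 calls are standard.

import scala.jdk.CollectionConverters._
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.{DeleteMessageRequest, ReceiveMessageRequest}
import org.apache.spark.sql.SparkSession

object SqsIngestJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sqs-ingest").getOrCreate()
    import spark.implicits._

    val sqs = SqsClient.create()
    // Hypothetical queue URL.
    val queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/example-ingest-queue"

    // Long-poll a batch of up to 10 messages per request.
    val request = ReceiveMessageRequest.builder()
      .queueUrl(queueUrl)
      .maxNumberOfMessages(10)
      .waitTimeSeconds(20)
      .build()
    val messages = sqs.receiveMessage(request).messages().asScala.toSeq

    if (messages.nonEmpty) {
      // Parse each message body as a JSON record and land it in S3.
      val df = spark.read.json(messages.map(_.body()).toDS())
      df.write.mode("append").parquet("s3://example-landing-bucket/events/")

      // Delete only after a successful write, so failed batches are retried.
      messages.foreach { m =>
        sqs.deleteMessage(DeleteMessageRequest.builder()
          .queueUrl(queueUrl)
          .receiptHandle(m.receiptHandle())
          .build())
      }
    }
    spark.stop()
  }
}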
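Sketch 3: a minimal ScalaTest unit test of the kind used to validate transformation logic. The standardizeEmails helper is a hypothetical stand-in for a factored-out production function.

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._
import org.scalatest.funsuite.AnyFunSuite

class CleansingSpec extends AnyFunSuite {
  // Local Spark session so the test runs without a cluster.
  private val spark = SparkSession.builder()
    .master("local[2]")
    .appName("cleansing-test")
    .getOrCreate()
  import spark.implicits._

  // Hypothetical transformation under test.
  private def standardizeEmails(df: DataFrame): DataFrame =
    df.withColumn("email", lower(trim(col("email"))))

  test("emails are trimmed and lower-cased") {
    val input = Seq("  Alice@Example.COM ", "bob@example.com").toDF("email")
    val result = standardizeEmails(input).as[String].collect()
    assert(result.sameElements(Array("alice@example.com", "bob@example.com")))
  }
}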
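Sketch 4: a sketch of commonly tuned Spark settings for EMR-hosted jobs; the specific values are assumptions that would be sized to the actual cluster and data volume.

import org.apache.spark.sql.SparkSession

object TunedSession {
  def build(): SparkSession =
    SparkSession.builder()
      .appName("tuned-etl")
      // Size shuffles to the cluster rather than the 200-partition default.
      .config("spark.sql.shuffle.partitions", "400")
      // Let Spark coalesce small shuffle partitions at runtime.
      .config("spark.sql.adaptive.enabled", "true")
      // Scale executors up and down with the workload to control EMR cost.
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.dynamicAllocation.maxExecutors", "50")
      // Use Kryo for faster, more compact shuffle serialization.
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .getOrCreate()
}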
Tools: AWS EMR, S3, SQS, IAM, Spark, Scala, Python, Airflow, Cassandra, Control-M, JIRA, Kafka.