
Mithelesh Solasa

Charlotte, VT

Summary

Data Engineer with 11+ years of experience in analyzing large data sets, data visualization, developing data pipelines, requirements gathering, and data validation.

Big Data professional with a strong focus on data architecture, data analytics, and ETL processes. Skilled in Hadoop, Spark, and Python, with significant experience designing and implementing scalable data solutions. Known for effective team collaboration, adaptability to changing project needs, and a results-driven approach that consistently drives project success.

Overview

8 years of professional experience
1 Certification

Work History

Big Data Engineer

Wells Fargo
03.2025 - Current
  • Project outline: As part of a consumer lending project, worked on EDL components involving data processing with PySpark and HQL against Oracle and Teradata systems.
  • Responsibilities:
  • Worked on HQL scripts that combine data from Oracle and Teradata, implementing complex logic in both SQL and HQL.
  • Wrote SQL queries for analysis purposes.
  • Made Autosys changes involving multiple child jobs, including changes to run conditions and schedules.
  • Debugged issues arising during job runs.
  • Implemented changes in PROD, raised change requests, and coordinated with teams to gather information.
  • Implemented PySpark changes for error handling and file changes when loading customer list data (see the sketch after this list).
  • Implemented shell script changes based on requirements and an understanding of the business logic and workflow.
  • Tools: Spark, shell scripting, Oracle, HQL, Linux, Kafka, Hadoop, SQL, Teradata, PySpark, Databricks
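
A minimal PySpark sketch of the kind of file-load error handling described in the customer-list bullet above; the schema, column names, and reject handling are illustrative assumptions, not the actual EDL code.

```python
# Illustrative sketch only: schema, columns, and reject handling are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("customer_list_load").getOrCreate()

# Expected layout of the customer list file, plus a column capturing malformed rows.
customer_schema = StructType([
    StructField("customer_id", StringType(), False),
    StructField("segment", StringType(), True),
    StructField("_corrupt_record", StringType(), True),
])

def load_customer_list(path):
    """Load a delimited customer list, separating good rows from malformed ones."""
    try:
        df = (spark.read
              .option("header", "true")
              .option("mode", "PERMISSIVE")  # keep malformed rows instead of failing the job
              .option("columnNameOfCorruptRecord", "_corrupt_record")
              .schema(customer_schema)
              .csv(path))
        df = df.cache()  # cache before filtering on the corrupt-record column
        good = df.filter(F.col("_corrupt_record").isNull()).drop("_corrupt_record")
        bad = df.filter(F.col("_corrupt_record").isNotNull())
        return good, bad
    except Exception as exc:  # surface load failures so the scheduler marks the job failed
        raise RuntimeError("Customer list load failed for {}".format(path)) from exc
```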

Big Data Engineer

Bank Of America
06.2023 - 01.2025
  • Project outline: Worked on a loan processing module that uses Hadoop and Oracle systems to load and process data and provide insights to end customers.
  • Responsibilities:
  • Implemented changes in PySpark and Scala code for Spark application development as part of a customer-matching pipeline that loads data to S3 based on input data.
  • Implemented shell script changes based on requirements and an understanding of the business logic and workflow.
  • Performed data cleansing and transformation using PySpark functions such as filter, select, and withColumn.
  • Wrote Python test cases to validate scenarios by loading sample data.
  • Implemented Autosys job changes as well as changes to shell script logic based on requirements.
  • Set up data in the UAT and dev lanes to make sure jobs run successfully.
  • Copied data to different clusters and handled missing data; worked on Delta Live Tables in Databricks.
  • Debugged Hive scripts and Spark jobs and used Python scripts for logging.
  • Worked in Databricks to automate reports using Python.
  • Imported real-time weblogs using Kafka as the messaging system and ingested the data into Spark Streaming.
  • Implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts.
  • Scheduled notebooks using the Databricks scheduler.
  • Worked on Spark Streaming from Kafka using PySpark and added streaming data sources.
  • Worked with the file system utility (dbutils.fs) of Databricks Utilities, including mounts and related utilities in Databricks.
  • Worked on building real-time pipelines using Kafka and Spark Streaming (see the sketch after this list).
  • Worked on consumer-side code to handle data from external Kafka sources.
  • Implemented changes in PySpark code for logging, transformations, and testing as per requirements.
  • Implemented an ETL pipeline using EMR, worked on Step Functions, and ran Lambda functions using Python.
  • Worked on Hive jobs as well as Spark configuration changes to accommodate recent script changes and make sure jobs run successfully.
  • Handled GitLab issues, resolved conflicts, raised PRs, and made changes according to review comments.
  • Talked to stakeholders to understand business logic and wrote SQL queries to analyze and report on data.
  • Tools: Spark, AWS, shell scripting, Scala, T-SQL, Linux, Kafka, Hadoop, Databricks, Snowflake, PySpark, EMR, S3, Glue
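
A minimal sketch of a Kafka-to-Spark real-time pipeline like the one described above, written with PySpark Structured Streaming; the broker address, topic name, and console sink are placeholder assumptions, and a production job would target Delta/S3/Hive and needs the spark-sql-kafka connector on the classpath.

```python
# Illustrative sketch only: broker, topic, and sink are placeholder assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("weblog_stream").getOrCreate()

# Read raw weblog events from a Kafka topic (requires the spark-sql-kafka package).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")
       .option("subscribe", "weblogs")
       .option("startingOffsets", "latest")
       .load())

# Kafka delivers key/value as binary; cast the value to string and keep the event time.
events = raw.select(
    F.col("value").cast("string").alias("log_line"),
    F.col("timestamp").alias("event_time"),
)

# Write the stream out; the console sink stands in for a real Delta/S3/Hive target.
query = (events.writeStream
         .outputMode("append")
         .format("console")
         .option("checkpointLocation", "/tmp/weblog_checkpoint")
         .start())
query.awaitTermination()
```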

Data Engineer

Capital One
04.2021 - 06.2023
  • Project outline: As part of model development, implemented changes to the Spark code and sent the data to the data science team for analysis.
  • Responsibilities:
  • Implemented changes in Java code based on requirements, reading data from a Cassandra database and writing to S3.
  • Worked on loading data to the data lake and created multiple Hive tables used for analysis.
  • Implemented Spark programs, worked on schema registration, and sent the data to One Lake.
  • Developed code to enhance the pipeline to cover new scenarios based on stakeholder feedback.
  • Identified vulnerabilities and Sonar issues and handled knowledge transfer (KT) for newcomers.
  • Experienced in containerization using Docker.
  • Extensively used Databricks, which provides a unified, open platform for data.
  • Participated in on-call support, identifying and resolving production issues.
  • Worked on Python script changes for AWS Lambda and cluster updates, which helped in the deployment of jobs.
  • Worked on AWS Step Functions to submit jobs and also used Airflow.
  • Experienced in writing SQL queries and PL/SQL functions, procedures, and triggers.
  • Built Python logic in AWS Lambda functions to handle event-driven workflows, such as real-time data ingestion, file processing from S3, and triggering downstream processes (see the sketch after this list).
  • Automated data quality checks in Python within AWS Glue to validate schema integrity, detect anomalies, and ensure data consistency across the data lake.
  • Implemented changes in PySpark code for logging, transformations, and testing as per requirements.
  • Implemented an ETL pipeline using EMR, worked on Step Functions, and ran Lambda functions using Python.
  • Used a Databricks cluster to enable auditing of the Spark output from S3.
  • Performed querying and reporting using Snowflake and Databricks.
  • Implemented Step Function changes and CloudWatch rules and built pipelines.
  • Tools: Spark, AWS, Splunk, SQL, Cassandra, Python, Hadoop, Linux, Databricks, Snowflake, PySpark, Scala, RDS, Glue
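
A minimal sketch of an event-driven AWS Lambda handler for S3 file processing as described above; the bucket/key handling, validation rule, and downstream SQS queue are hypothetical placeholders, not the actual pipeline code.

```python
# Illustrative sketch only: queue URL, validation, and downstream step are assumptions.
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

DOWNSTREAM_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ingest-queue"  # placeholder

def handler(event, context):
    """Triggered by S3 ObjectCreated events; validates each file and notifies downstream."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Basic validation: make sure the object exists and is non-empty before processing.
        head = s3.head_object(Bucket=bucket, Key=key)
        if head["ContentLength"] == 0:
            print(f"Skipping empty object s3://{bucket}/{key}")
            continue

        # Hand the file off to the downstream step (e.g. a Glue/EMR job trigger).
        sqs.send_message(
            QueueUrl=DOWNSTREAM_QUEUE_URL,
            MessageBody=json.dumps({"bucket": bucket, "key": key}),
        )
    return {"status": "ok"}
```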

Big Data Engineer

Blue Cross Blue Shield
11.2019 - 03.2021
  • Project outline: As part of an internal data migration, implemented Spark-based migration of data from a legacy mainframe, focused mainly on claims data.
  • Responsibilities:
  • Involved in review of functional and non-functional requirements.
  • Worked on large sets of structured, semi-structured, and unstructured data.
  • Analyzed large data sets by running Spark jobs.
  • Loaded data to Hive from the CIP database and performed ELT operations.
  • Tuned Spark jobs for performance.
  • Developed XML processing using Spark and Scala.
  • Developed test scripts for enrollment data testing.
  • Performed reconciliation between source and target attributes (see the sketch after this list).
  • Implemented ETL pipelines on AWS, using the storage layer to save the data.
  • Debugged data issues using Spark.
  • Strong knowledge of and implementation experience with EC2/EMR.
  • Worked on Hadoop Sqoop scripts for loading data to Hive.
  • Expertise in writing jobs for analyzing data using Hive and Spark.
  • Identified and debugged data issues using PySpark.
  • Involved in design discussions for the CIP migration; assisted with planning and execution of unit, integration, and user acceptance testing.
  • Worked with CI/CD using Jenkins integrated with the Git repository for building, testing, code review, and deployment of the built JAR file, shell scripts, and Oozie workflows to the destination HDFS paths.
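
A minimal PySpark sketch of the source-to-target reconciliation described above; the table names, key column, and compared attribute are hypothetical placeholders.

```python
# Illustrative sketch only: tables, key columns, and attributes are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("recon_check").enableHiveSupport().getOrCreate()

source = spark.table("cip_stage.claims_source")   # placeholder source table
target = spark.table("cip_dw.claims_target")      # placeholder target table
key_cols = ["claim_id"]

# 1. Row-count comparison between source and target.
src_count, tgt_count = source.count(), target.count()
print(f"source={src_count}, target={tgt_count}, diff={src_count - tgt_count}")

# 2. Key-level comparison: rows present on one side but not the other.
missing_in_target = source.select(key_cols).subtract(target.select(key_cols))
extra_in_target = target.select(key_cols).subtract(source.select(key_cols))
print(f"missing in target: {missing_in_target.count()}, extra in target: {extra_in_target.count()}")

# 3. Attribute-level comparison on a shared column (placeholder: claim_amount).
mismatches = (source.alias("s")
              .join(target.alias("t"), on=key_cols, how="inner")
              .filter(F.col("s.claim_amount") != F.col("t.claim_amount")))
print(f"attribute mismatches: {mismatches.count()}")
```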

Big Data Developer

AT&T
12.2017 - 10.2019
  • Responsibilities:
  • Responsibilities included gathering business requirements, developing a strategy for data cleansing and data migration, writing functional and technical specifications, creating source-to-target mappings, and designing data profiling and data validation jobs.
  • Extended the functionality of Hive and Spark with custom UDFs (see the sketch after this list).
  • Developed Tableau workbooks to perform year-over-year, quarter-over-quarter, YTD, QTD, and MTD analyses.
  • Implemented partitioning in Hive to assist users with data analysis.
  • Worked on the AWS cloud for data storage, scheduling, and job submission.
  • Wrote and executed test plans and test cases from requirements.
  • Worked on development of BRDS and DIMS, which are used for the content portal.
  • Used DataFrames in developing Spark code and worked on performance tuning of Spark applications.
  • Wrote HiveQL and Spark SQL queries to test developed tables and wrote shell scripts.
  • Used Oozie scripts for deployment of the application and Perforce as the version control software.
  • Performed DB activities such as indexing, performance tuning, and backup and restore.
  • Expertise in writing Hadoop jobs for analyzing data using HiveQL, and used Sentry.
  • Developed and deployed automated data pipelines for extracting data from multiple sources to Data Lake.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries.
  • Environment: Scala, Cloudera CDH, Hadoop, Pig, Tableau, Hive, AWS, MapReduce, HDFS, Sqoop, Impala, Flume, Oozie, Linux, Java, Python.
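
A minimal sketch of extending Spark with a custom UDF, as referenced above; the normalization rule, column names, and sample data are hypothetical placeholders.

```python
# Illustrative sketch only: the normalization rule and columns are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf_example").getOrCreate()

def normalize_phone(raw):
    """Strip non-digits and keep the last 10 digits of a phone number."""
    if raw is None:
        return None
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits[-10:] if len(digits) >= 10 else digits

# Register for DataFrame use and for SQL/HiveQL-style queries.
normalize_phone_udf = F.udf(normalize_phone, StringType())
spark.udf.register("normalize_phone", normalize_phone, StringType())

df = spark.createDataFrame([("(555) 123-4567",), ("555.987.6543 ext 2",)], ["raw_phone"])
df.withColumn("phone", normalize_phone_udf("raw_phone")).show(truncate=False)
```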

Education

Master's - Informatics

Northeastern University
Boston, MA
01.2016

Skills

  • Spark development
  • Apache Kafka
  • Performance tuning
  • Scala programming
  • Amazon Web Services
  • Big data analytics
  • Hadoop ecosystem
  • SQL and databases
  • SQL programming
  • Shell scripting

Certification

https://credentials.databricks.com/a2444547-505e-4ab4-83ff-b076b16bc59d#acc.FJ9wMkI5

Timeline

Big Data Engineer

Wells Fargo
03.2025 - Current

Big Data Engineer

Bank Of America
06.2023 - 01.2025

Data Engineer

Capital One
04.2021 - 06.2023

Big Data Engineer

Blue Cross Blue Shield
11.2019 - 03.2021

Big Data Developer

AT&T
12.2017 - 10.2019

Master's - Informatics

Northeastern University

Personal Information

  • Work Permit: H-1B
  • Visa Status: H-1B