Accomplished Senior Data Engineer with a proven track record at Blue Cross Blue Shield of Florida, enhancing data processing and analytics through expert ETL development and performance tuning. Skilled in collaborating across teams to deliver scalable data solutions, significantly improving data accuracy and efficiency. Demonstrates strong analytical skills and a commitment to driving project success.
Projects:
Ø Enterprise Claims Streaming Application:
· Developed multiple data ingestion batch jobs from different source systems to downstream databases.
· Built JSON-structured data objects and loaded them into PostgreSQL and MongoDB.
· Made heavy use of case classes, Spark actions/transformations, DataFrame joins, and window functions.
· Used Kafka streaming sessions to connect to topics and publish messages for real-time applications.
· Monitored the Kafka streaming jobs and provided real-time updates to the business teams.
Ø CIP Batch Extracts: (Legacy Mainframe to Spark/Scala Reporting)
· Generated fixed-length and pipe-delimited files per the vendor specification using the BeanIO framework.
· Used the Spark 2.0 API to integrate with BeanIO.
· Configured XML mapping files and bean classes to generate the fixed-length files.
· Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build the common data model, which reads its data from HDFS.
Ø Reconciliation between HBase and MongoDB:
· Identified the row key and column family of the source HBase table.
· Defined a catalog schema for the HBase tables.
· Used the MongoDB Spark Connector to connect to the MongoDB database.
· Created a CSV file using CSVWriter.
· Listed the data discrepancies and prepared an Excel sheet of the data missing in MongoDB.
Ø ETL Generic Framework:
· A configuration-based framework in which developers are not required to write code for straightforward data ingestion.
Ø UI for the Florida Blue Login Page:
· Updated the UI components of the Florida Blue website login page, which will be shared across the organization for Members, BAs, and Agents.
· Added UI components using Servlets and JSP, including Forgot User ID, Forgot Password, a notification page shown after a password reset, Need Help, and accessibility features.
Ø Business Continuity - Disaster Recovery:
The RxSS business model relies heavily on clients providing critical files (Formulary, Network, Plan) to produce an accurate pricing model for Members. Developed a process that relies entirely on internal claims, analyzes the data in the broader picture, and provides a pricing model.
· Created multiple Delta Lake tables using Scala/PySpark.
· Developed Databricks notebooks using multiple commands and created a notebook sequencer to execute the notebooks.
· Tuned Spark application performance by setting the right batch interval, the correct level of parallelism, and memory settings.