Experienced Data Engineer and AI Solutions Engineer with 12 years of experience in cloud migration, application design, and cutting-edge AI implementations. Proficient in AWS and Large Language Models (LLMs), with a strong background in the financial and pharmaceutical industries.
My current role as Principal Software Engineer at Fidelity Investments has been instrumental in orchestrating strategic migrations—transitioning from EC2 to SageMaker for unstructured document processing and from Airflow to Step Functions—reducing maintenance efforts and costs. I've expanded into AI-driven solutions, building Intelligent Document Processing (IDP) systems using Claude LLM for automated data extraction, developing a Document Flow application with session-based chat capabilities and custom prompt templates, and conducting proof-of-concept work on Model Context Protocol (MCP) for agentic applications. These innovations demonstrate our commitment to leveraging AI for continuous improvement and enhanced client satisfaction in the dynamic financial sector.
Part of the Intelligent Automation team, performing the following activities:
1. Created a common CI/CD pipeline in Jenkins to deploy AWS services, PostgreSQL scripts, and EKS applications
2. Performed migration from EC2 to SageMaker for unstructured document processing to extract 70+ elements; migrated orchestration from Airflow to Step Functions, which reduced maintenance effort and cost
3. Deployed LLM-based chatbots using the LangChain framework
4. Created a common IDP (Intelligent Document Processing) framework using AWS services such as Lambda, Step Functions, SQS, SNS, and Textract to orchestrate document processing and support the operations team in handling customer financial requests (see the sketch after this list)
5. Enhanced the Hardships entity extraction using Claude models and designed a robust prompt configuration system
6. Supported the team with all EKS-related deployments
7. Designed the multi-journey application to support the operations team in processing participants' money-out requests
8. Deployed and maintained third-party applications to support the data scientists' labeling needs
9. Guided the team on AWS-related problem statements
10. Designed and developed a dynamic queuing capability for IDP to process documents from multiple journeys without performance impact
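A minimal sketch of how the IDP intake step could hand documents to the Step Functions orchestration, assuming an S3-triggered Lambda and a journey-prefix naming convention; the handler, environment variable, and journey names below are illustrative assumptions, not the production implementation.

```python
# Illustrative sketch: an S3-triggered Lambda starts one Step Functions
# execution per uploaded document, passing the journey type so the state
# machine can route it to the right queue and models.
import json
import os
import urllib.parse

import boto3

sfn = boto3.client("stepfunctions")
STATE_MACHINE_ARN = os.environ["JOURNEY_STATE_MACHINE_ARN"]  # hypothetical name


def handler(event, context):
    """Start one IDP execution per document dropped in the intake bucket."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        journey = key.split("/")[0]  # e.g. "hardship/" or "money-out/" prefix
        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps({"bucket": bucket, "key": key, "journey": journey}),
        )
    return {"statusCode": 200}
```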
As part of the Intelligent Automation team, worked as a Technical Lead and performed the following:
1. Created an end-to-end pipeline with Airflow as the orchestrator and Celery on an EC2 Auto Scaling group for compute, performing real-time inference for the 50+ models built by data scientists to extract values from PDF documents
2. Created DevOps pipelines in Concourse to deploy AWS services such as EC2, S3, ELB, Lambda, SNS, RDS, and KMS, plus pipelines to deploy database objects in PostgreSQL
3. Deployed REST APIs in EKS using Helm templates
4. Created a serverless OCR pipeline using Lambda and Textract (see the sketch below)
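A minimal sketch of the serverless OCR step, assuming an asynchronous Textract job kicked off by a Lambda on S3 upload with completion published to SNS; the environment variable names are illustrative placeholders.

```python
# Illustrative sketch: Lambda starts an async Textract text-detection job
# for a PDF landing in S3 and asks Textract to notify an SNS topic on
# completion. A downstream consumer later fetches results with
# get_document_text_detection using the returned JobId.
import os

import boto3

textract = boto3.client("textract")


def handler(event, context):
    record = event["Records"][0]["s3"]
    response = textract.start_document_text_detection(
        DocumentLocation={
            "S3Object": {
                "Bucket": record["bucket"]["name"],
                "Name": record["object"]["key"],
            }
        },
        NotificationChannel={
            "SNSTopicArn": os.environ["TEXTRACT_SNS_TOPIC"],  # hypothetical
            "RoleArn": os.environ["TEXTRACT_ROLE_ARN"],       # hypothetical
        },
    )
    return {"JobId": response["JobId"]}
```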
As part of the Enterprise Data Lake team, worked as a Data Engineer and performed the following activities:
1. Created dynamic DAGs in Airflow to load data into Snowflake from multiple sources (see the sketch after this list)
2. Wrote efficient SQL queries against Snowflake tables with billions of records
3. Created solutions to perform history loads into Snowflake in SCD Type 6 format
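A minimal sketch of the dynamic-DAG pattern, assuming Airflow 2.x: one config entry per source system generates one DAG that loads staged data into Snowflake. The source list and the load callable are illustrative placeholders for the real configuration store and load logic.

```python
# Illustrative sketch: generate one Airflow DAG per configured source system.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

SOURCES = ["crm", "billing", "web_events"]  # would normally come from a config store


def load_to_snowflake(source: str, **_) -> None:
    # Placeholder for the actual COPY INTO / MERGE logic per source.
    print(f"Loading {source} into Snowflake")


for source in SOURCES:
    dag_id = f"snowflake_load_{source}"
    with DAG(
        dag_id,
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id="load",
            python_callable=load_to_snowflake,
            op_kwargs={"source": source},
        )
    globals()[dag_id] = dag  # register each generated DAG with the scheduler
```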
Worked as a Big Data Engineer for a US-based pharma client, modernizing their data lake platform using Redshift, S3, and Glue:
1. Migrated the data lake from RDS PostgreSQL to Redshift
2. Contributed to the platform development for loading data into Redshift using Spark
3. Created a configurable, automated mechanism to load SAP data into Redshift using Lambda, which reduced load time and cost (see the sketch after this list)
4. Mentored a team of seven people to enable them to deliver projects on time
5. Awarded Best Cloud Developer for the solutions provided
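A minimal sketch of what a config-driven SAP load via Lambda could look like, assuming extracts staged in S3 and loaded through the Redshift Data API; the cluster, schema, role, and event fields are illustrative, not the client's actual setup.

```python
# Illustrative sketch: Lambda reads a per-table config from the event and
# issues a Redshift COPY from S3 via the Redshift Data API.
import os

import boto3

redshift_data = boto3.client("redshift-data")


def handler(event, context):
    table = event["table"]          # e.g. "sap_sales_orders" (hypothetical)
    s3_prefix = event["s3_prefix"]  # e.g. "s3://sap-extracts/sales_orders/"
    copy_sql = (
        f"COPY analytics.{table} FROM '{s3_prefix}' "
        f"IAM_ROLE '{os.environ['REDSHIFT_COPY_ROLE']}' "
        "FORMAT AS PARQUET"
    )
    response = redshift_data.execute_statement(
        ClusterIdentifier=os.environ["CLUSTER_ID"],
        Database=os.environ["DATABASE"],
        DbUser=os.environ["DB_USER"],
        Sql=copy_sql,
    )
    return {"statement_id": response["Id"]}
```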
1. Created and maintained the AWS Redshift cluster that served as the source for the enterprise's commercial data used for reporting and analytics
2. Designed tables to provide good performance for reporting
3. Automated cluster maintenance to detect and rectify performance issues automatically
4. Performed a POC on AWS Redshift Spectrum and Athena to showcase the potential cost reduction and performance gains in storing and retrieving data
5. Automated access control for the database objects in the cluster
6. Implemented lifecycle management in S3 to reduce cost and comply with data governance requirements (see the sketch below)
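A minimal sketch of an S3 lifecycle rule of the kind described above, assuming aged objects move to Glacier and expire after a retention window; the bucket name, prefix, and day counts are illustrative.

```python
# Illustrative sketch: transition aged raw data to Glacier, then expire it
# once the (assumed) governance retention period has passed.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="commercial-data-lake",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 2555},  # roughly a 7-year retention
            }
        ]
    },
)
```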
1. Collected data from hospitals, such as patient, procedure, and diagnosis records, and loaded it into the corresponding databases in our system
2. Performed analytics on the data to provide metrics on care-provider performance, helping hospitals make better decisions
3. Automated the internal metric creation process using SSIS
4. Created new ETLs for any new file requirements
5. Worked on multiple POCs in AWS to move the application and data from on-premises to the cloud
1. Worked for the Financial Times (FT) through Pearson English and Gyanmatrix during the mentioned period
2. Created data marts using the FT's home-grown ETL framework, which were in turn used by many downstream applications
3. Created jobs to load the marts on a periodic basis
4. Optimized query performance in AWS Redshift
5. Created a report on the FT's subscription metrics and their corresponding usage, used weekly by FT top management to decide on campaign strategy
1. Worked as a Software Engineer with the Data Integration team as an ETL developer for the client Microsoft
2. Created ETLs to pull sales data from multiple systems and load it into the data warehouse
3. Created logic to calculate incentives for the sales team based on the sales data
4. Implemented new logic for the file upload process, which improved the end-user experience and reduced the manual effort required from developers
5. As part of another project, created a framework to monitor MS SQL Server and report the results via Power BI
1. Worked for a healthcare client on preparing their existing system to support the Affordable Care Act
2. Fine-tuned complex stored procedures in SQL Server after fully understanding their logic
3. Created a test suite used by 50+ developers for unit testing, which saved close to one hour per developer per day in testing effort