Akankshya Biswal

White Plains, NY

Summary

Data Engineer with over 2 years of experience developing data pipelines with Informatica PowerCenter at United Services Automobile Association (USAA), an insurance company, with strong analytical skills and proficiency in SQL and Python. I am eager to contribute to development projects that enable data to serve as a strategic asset. Equally confident working independently or collaboratively as needed, with excellent communication skills.

Overview

9 years of professional experience
1 Certification

Work History

Data Engineer

United Services Automobile Association, USAA
09.2022 - Current
  • Currently developing ETL programs that support extraction, transformation, and loading of data from Snowflake using Informatica PowerCenter 10.2.
  • Extracting data from Snowflake and performing data manipulation using Informatica transformations such as Expression, Joiner, Aggregator, Lookup, Filter, Sorter, Source Qualifier, and Union.
  • Debugging and performance-tuning Informatica mappings, sessions, and workflows.
  • Developing ETL pipelines in and out of the data warehouse using Snowflake and writing SQL queries using the Snowflake console.
  • Collaborating on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Optimizing SQL queries in the Informatica Source Qualifier.
  • Developing SQL queries to extract and manipulate data from multiple sources, including databases and flat files.
  • Using Unix scripts to run the Informatica workflows and control the ETL flow.
  • Contributing to the migration of data from the legacy system with CLR data to Guidewire with CC (ClaimCenter) data.
  • Designing and scheduling workflows using Control-M.
  • Improving data accuracy by implementing rigorous data validation processes.
  • Using MS Office tools such as MS Excel to validate data (VLOOKUP, Filter, Concat, etc.).
  • Running CI/CD pipelines through GitLab.
  • Working in an Agile environment alongside 17 team members, taking part in scrum meetings, sprint planning, and sprint reviews, with progress tracked in Jira.
  • Maintaining a thorough understanding of applicable regulations and standards related to insurance data analysis.

Data Scientist

SilverXis
08.2022 - 09.2022
  • Collected requirements from different customers and performed various statistical analyses using Pandas, NumPy, etc.
  • Built dashboards with Tableau for visualizing actionable insights from large datasets.
  • Created reports summarizing key findings from data exploration activities.
  • Worked in an Agile environment.

SAP ECC (ERP Central Component) Security Consultant & SAP GRC (Governance, Risk & Compliance)

Capgemini Technology Services India Limited
05.2015 - 10.2019
  • Technical Skills and Tools: SAP ECC Security, GRC 10.1 Access Control (ARA, EAM, ARM), SAP Fiori security, BW security, and HANA security, working with cross-functional teams.

Education

Master of Science in Computer Science

Pace University
New York, NY
05.2022

Skills

  • Python (NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch, Matplotlib, Statsmodels)
  • SQL and Databases (Oracle, MySQL, SAP HANA)
  • Data Visualization (Seaborn, ggplot, Plotly, Tableau)
  • ETL development (Informatica PowerCenter 10.2)
  • Snowflake
  • Microsoft Azure (Azure Machine Learning Studio, Azure Databricks, ADLS, ADF, Synapse)
  • AWS (SageMaker, S3, IAM, EC2)
  • Google Colaboratory, Google Cloud Platform (GCP)
  • Windows XP/Vista/Windows 7/Windows 10, UNIX/Linux, Microsoft Office Suite (Excel, PowerPoint, etc.)

Certification

Microsoft Certified Azure Fundamentals & Azure AI Fundamentals

Awards

  • Customer Delight Award
  • Valuable Contribution Award
  • Star Award at Capgemini Technology Services India Limited

Projects

Data Analysis of Health-care Charges Incurred by Patients in the United States

Aug. 2020 – Dec. 2020

  • The goal of the project was to study the distribution of the amount a patient has to pay across regions and urban/rural areas of the country, for different diseases and health conditions.
  • Implemented linear regression to predict the health-care charges a patient has to pay, using NumPy, Pandas, Seaborn, Matplotlib, ggplot, and Scikit-learn on AWS SageMaker Studio, with 83% model accuracy.

Credit Card Fraud Detection

Jan. 2021 – Mar. 2021

  • The goal of the project was to recognize fraudulent credit card transactions while dealing with a highly imbalanced dataset.
  • Evaluated classification models including Decision Tree, KNN, Naïve Bayes, SVC, Random Forest, and XGBoost to predict fraud, using NumPy, Pandas, Seaborn, and Matplotlib for data anomaly detection, ANOVA for feature selection, SMOTE oversampling to treat class imbalance, and Scikit-learn on AWS SageMaker Studio, with 89% model accuracy.

Data Analysis of NYC Taxi Ride Duration

Aug. 2021 – Oct. 2021

  • The goal of the project was to perform an exploratory data analysis (EDA) on NYC's Yellow Taxi Trip Records from 2020 to find correlations among the variables and improve ride time predictions.
  • Applied a deep neural network (NN) and a multi-layer perceptron (MLP) for regression modeling on the NYC Yellow Cab dataset. Also used Azure Databricks, Azure Data Lake Gen2, Azure Data Factory, and Spark Core for the analysis, with Spark SQL and PySpark where required, and designed the ETL pipeline.
