Shivani Gupta

Summary

Experienced data scientist with a solid foundation in statistical analysis, machine learning, and data visualization. Skilled in Python, R, SQL, and a range of data processing tools, with a focus on delivering actionable insights. Known for strong problem-solving and innovative use of data to drive business decisions.

Overview

4 years of professional experience

Work History

Data Scientist

National Institutes of Health
01.2023 - 01.2025

Objective

  • Enable seamless search of big-data medical research images and documents on the new NLM search interface

Responsibilities

  • Mapped biomedical research repository attributes to defined DATMM classes
  • Implemented a JSON-LD data structure aligned with the RDF schema
  • Designed GraphQL queries for enhanced search and built Tableau dashboards for visualization
  • Implemented Python-based A/B testing to evaluate the impact of incremental schema updates on the search experience
  • Automated repetitive tasks using scripting languages such as Python or R, significantly reducing time spent in the analytical process
  • Utilized advanced querying, visualization and analytics tools to analyze and process complex data sets.
  • Applied statistical and algebraic techniques to interpret key points from gathered data.
  • Worked with stakeholders to develop quarterly roadmaps based on impact, effort, and test coordination.
  • Designed interactive Tableau dashboards and reports, providing real-time business intelligence to stakeholders.

Data Modeling Engineer

National Library of Medicine
01.2022 - 01.2023

Objective

  • Automate standardization of ambiguous medical research document data submitted to NLM's common data repositories

Responsibilities

  • Translated business requirements into data-driven solutions, providing value-added insights that directly contributed to the project goals.
  • Developed custom ETL data pipelines to extract data from research documents and streamline data ingestion and processing, improving workflow efficiency
  • Analyzed large datasets to identify trends and patterns in submitted repository data
  • Generated embeddings using Sentence Transformers, loaded them into a FAISS vector index, and stored metadata in GCP to support document similarity search
  • Compiled, cleaned and manipulated data for proper handling
  • Coached and mentored junior data scientists on SAS and data mining techniques.
  • Identified, measured and recommended improvement strategies for KPIs across business areas.


Database Programmer

National Library of Medicine
01.2021 - 01.2022

Objective

  • Standardize PubMed portal search input fields based on end user search behavior analytics

Responsibilities

  • Programmed complex SQL queries on five years of PubMed portal search logs and identified user input patterns across 48 free-text input fields
  • Enumerated a succinct list of input values for the 48 search fields to eliminate free-text user input
  • Enhanced application functionality by developing custom stored procedures, triggers, and functions using SQL
  • Optimized search performance by implementing efficient indexing and query optimization techniques.
  • Contributed to project planning by providing accurate time estimates for programming tasks and milestones
  • Ensured data integrity with the implementation of comprehensive error handling and validation procedures.
  • Collaborated with system architects, design analysts and others to understand program requirements

Education

Master of Science - DSBA (Data Science & Business Analytics)

University of North Carolina Charlotte
08.2021

Master of Technology - Biotechnology

Kurukshetra University
08.2013

Skills

  • BERT data modeling
  • OpenAI model utilization
  • FAISS implementation
  • Proficient in TensorFlow
  • Experience with Scikit-learn
  • Statistical A/B testing
  • Data visualization expertise using Tableau
  • ETL process optimization
  • AWS data warehousing
  • Databricks platform expertise
  • Proficient in Pandas for data processing
  • Scikit-learn data analysis expertise
  • Skilled in PySpark data manipulation
  • Natural language processing
  • Proficient in Python programming
  • Statistical analysis

Notable Mentions

Published research paper: Synergistic antimicrobial activity of essential oil and chemical food preservatives
