Yamini Sri Vandana Penamakuru

Cincinnati

Summary

Insightful Data Scientist skilled in machine learning, big data analytics, and statistical modeling to deliver actionable insights. Strong in communication, problem-solving, and teamwork, enabling successful collaboration across departments to drive projects to completion.

Overview

11 years of professional experience
1 Certification

Work History

Machine Learning Engineer

Merck
08.2021 - Current

Worked under Agile methodology on the following projects:

  • Conducted research on chatbots such as Amazon Lex, Dialogflow, and Merck self-service chatbot.
  • Researched NLP techniques for intent detection and named entity recognition in chatbot systems.
  • Initiated AI-driven pilot to evaluate semi-automated compendium review using similarity analysis and entity extraction.
  • Created an algorithm for comparing text and highlighting differences across files.
  • Optimized retrieval processes by creating an automated solution for API data extraction.
  • Created Python test scripts to validate data pipeline integrity.
  • Designed a data-driven user interface with Streamlit for dynamic visual representation.
  • Developed and executed business capability strategy aligned with overall domain goals.
  • Revamped application framework to shift focus from disease-centric to product and drug name-centric.
  • Implemented Coveo Intelligent Search for efficient document retrieval solutions.
  • General Pre-Review Tool (GRT): An AI-assisted tool intended to increase efficiency, save time and costs, and reduce manual screening effort and errors, improving compliance and innovation across departments.
  • Optimized tokenizer for more accurate keyword combination handling.
  • Extracted text from PDFs using OCR.
  • Examined several Hugging Face NER models, including NER-Chemical-Bionlp, Biomedical-NER, HunFlair NLP, and BERT.
  • Worked on the Generative AI Summarization of Insights System (GENESIS) to summarize insights; experimented with the LexRank algorithm and developed AWS Lambda functions, Step Functions, and Terraform templates.
  • Engineered synthetic data generation using Generative AI for clinical trial forms.

Graduate Research Assistant

IU School of Informatics & Computing
11.2020 - 05.2021
  • Detection of low-head river dams using deep learning (project under Federal Government): Worked closely with Polis Center to develop a deep learning model that detects low-head dams from the geospatial images provided.
  • Engineered a CNN model in PyTorch that reached 84% accuracy.
  • Early detection of low-head dams helps reduce the risk of accidents for kayakers.
  • Collaborated with Polis Center to obtain geospatial images of Indiana counties for classification.
  • Collected, analyzed, and labeled raw datasets for model development using ArcGIS.
  • The project further aims to classify building types using the generated labels.

Analyst

Oracle India Pvt Ltd
09.2014 - 12.2019
  • Resolved incidents in employee and job database using Core HR module.
  • Skilled in SQL and in managing PeopleSoft modules such as Enterprise Learning Management, Recruiting, and Retrofitting.
  • Skilled in writing, testing, and deploying code into production efficiently.
  • Leveraged Oracle HCM Cloud to handle employee information and track absences.

Education

Masters - Applied Data Science

Indiana University-Purdue University
Indianapolis
05.2021

M.A - English Literature

IGNOU
Delhi, India
12.2017

B. Tech - Electronics and Communication Engineering

JNTU
Kakinada, India
05.2014

Skills

  • Python
  • R
  • SQL
  • HTML
  • CSS
  • JavaScript
  • PHP
  • Pandas
  • NumPy
  • SciPy
  • PyTorch
  • TensorFlow
  • Keras
  • Scikit-learn
  • Beautiful Soup
  • Matplotlib
  • D3.js
  • Oracle Global HR Cloud
  • PySpark
  • Tableau
  • ArcGIS
  • MS Access
  • MS Office
  • Confluence
  • Elasticsearch
  • Workplace Search
  • Leadership
  • Effective communication skills
  • Dynamic team player
  • Agile

Academic Projects

  • Forecasting Air Quality for Kern County, California using PySpark: Pre-processed the data using machine learning and implemented the time-series models ARIMA and SARIMA.
  • Exploratory Data Analysis (EDA) on Internet Movie Database (IMDb): Performed an in-depth EDA to identify the genre with the highest average rating.
  • Visualization and Analysis of World Map: Developed an interactive visualization by plotting the world map.
  • 'Little Bibliophile' - Online Story Books for Kids: Developed an interactive website for kids to read stories.
  • Latency Prediction and Anomaly Detection: Predicted latency between different AWS cloud regions using random forest and linear regression.

Certification

  • Dataiku Core Practitioner
  • Dataiku Machine Learning
  • Oracle HCM Cloud
