Insightful Data Scientist skilled in machine learning, big data analytics, and statistical modeling to deliver actionable insights. Strong in communication, problem-solving, and teamwork, enabling successful collaboration across departments to drive projects to completion.
Overview
11 years of professional experience
1 Certification
Work History
Machine Learning Engineer
Merck
08.2021 - Current
Worked in an Agile methodology on the projects below.
Conducted research on chatbots such as Amazon Lex, Dialogflow, and the Merck self-service chatbot.
Researched NLP techniques for intent detection and named entity recognition in chatbot systems.
Initiated AI-driven pilot to evaluate semi-automated compendium review using similarity analysis and entity extraction.
Created an algorithm for comparing text and highlighting differences across files.
Optimized retrieval processes by creating an automated solution for API data extraction.
Created Python test scripts to validate data pipeline integrity.
Designed a data-driven user interface with Streamlit for dynamic visual representation.
Developed and executed business capability strategy aligned with overall domain goals.
Revamped application framework to shift focus from disease-centric to product and drug name-centric.
Implemented Coveo Intelligent Search for efficient document retrieval solutions.
General Pre-Review Tool (GRT): An AI-enhanced tool intended to increase efficiency, save time and costs, and reduce manual screening effort and errors, thereby improving compliance and innovation across departments.
Optimized tokenizer for more accurate keyword combination handling.
Extracted text from PDFs using OCR technology.
Examined several Hugging Face NER models, including NER-Chemical-Bionlp, Biomedical-NER, HunFlair NLP, and BERT.
Worked on the Generative AI Summarization of Insights System (GENESIS) to summarize insights; experimented with the LexRank algorithm and developed AWS Lambda functions, Step Functions, and Terraform templates.
Engineered synthetic data generation using Generative AI for clinical trial forms.
Graduate Research Assistant
IU School of Informatics & Computing
11.2020 - 05.2021
Detection of low-head river dams using deep learning (project under the Federal Government): Worked closely with the Polis Center to develop a deep learning model that detects low-head dams from the geospatial images provided.
Engineered a CNN model in PyTorch that achieved 84% accuracy.
Early detection of low-head dams helps reduce the risk of accidents for kayakers.
Collaborated with Polis Center to obtain geospatial images of Indiana counties for classification.
Collected, analyzed, and labeled raw datasets for model development using ArcGIS.
The project further aims to classify building types using the generated labels.
Analyst
Oracle India Pvt Ltd
09.2014 - 12.2019
Resolved incidents in the employee and job database using the Core HR module.
Experienced in SQL and in managing PeopleSoft modules such as Enterprise Learning Management, Recruiting, and Retrofitting.
Skilled in writing, testing, and deploying code into production efficiently.
Leveraged Oracle HCM Cloud to handle employee information and track absences.
Education
Masters - Applied Data Science
Indiana University-Purdue University
Indianapolis
05.2021
M.A - English Literature
IGNOU
Delhi, India
12.2017
B. Tech - Electronics and Communication Engineering
Forecasting Air Quality for Kern County, California, using PySpark: Pre-processed the data using machine learning and implemented the time series models ARIMA and SARIMA.
Exploratory Data Analysis (EDA) on Internet Movie Database (IMDb): Performed an in-depth EDA to identify the genre with the highest average rating.
Visualization and Analysis of World Map: Developed an interactive visualization by plotting the world map.
'Little Bibliophile' - Online Story Books for Kids: Developed an interactive website for kids to read stories.
Latency Prediction and Anomaly Detection: Predicted latency between different AWS cloud regions using Random Forest and linear regression.
Certification
Dataiku Core Practitioner
Dataiku Machine Learning
Oracle HCM Cloud
Timeline
Machine Learning Engineer
Merck
08.2021 - Current
Graduate Research Assistant
IU School of Informatics & Computing
11.2020 - 05.2021
Analyst
Oracle India Pvt Ltd
09.2014 - 12.2019
Masters - Applied Data Science
Indiana University-Purdue University
M.A - English Literature
IGNOU
B. Tech - Electronics and Communication Engineering