Around 8 years of professional IT experience, including 5 years in Data Science.
Familiar with gathering, cleaning and organizing data for use by technical and non-technical personnel. Advanced understanding of statistical, algebraic and other analytical techniques. Highly organized, motivated and diligent with significant background in Data Analytics, Machine Learning (ML), Predictive Modelling, Natural Language Processing (NLP), Time series analysis and Deep Learning algorithms.
Experienced in facilitating the entire lifecycle of a data science project: Data Extraction, Data Pre-Processing, Feature Engineering, Dimensionality Reduction, Algorithm implementation, Back Testing and Validation.
Expert knowledge of machine learning algorithms such as Linear, Polynomial, and Logistic Regression, Regularized Linear Regression, SVMs, Neural Networks, Extreme Gradient Boosting, Decision Trees, K-Means, Gaussian Mixture Models, Hierarchical models, and Naïve Bayes.
Well versed in working with structured and unstructured data, time series data, and statistical methodologies such as hypothesis testing, ANOVA, multivariate statistics, regression, classification, modeling, decision theory, time series analysis, and descriptive statistics.
Good communication and presentation skills; willing to learn and adapt to new technologies.
Overview
1 Certification
2 years of post-secondary education
7 years of professional experience
Work History
Sr. Data Scientist
Caterpillar Inc
Peoria, IL
11.2017 - Current
Implemented multiple analytics projects, including fluid analysis, stock optimization, workforce optimization, sentiment analysis, machine failure prediction, and ad hoc reports on sales, process optimization, and defect minimization, to improve the business.
Developed analytical models to predict the lifetime of various components and provide actionable recommendations to prevent unscheduled maintenance.
Followed the Cross-Industry Standard Process for Data Mining (CRISP-DM) model to develop analytics models.
Developed a model to monitor work overload across facilities and teams based on employee timesheet data and to provide insights to the business.
Participated in all phases of the project life cycle, including data collection, data mining, data cleaning, EDA, model development, validation, and report creation.
Applied the ARIMA time series model to stock optimization and workforce optimization.
Used Logistic Regression, Random Forest, and XGBoost classifiers for fluid analysis and machine failure prediction.
Evaluated model performance and set thresholds on prediction probabilities, choosing the precision-recall tradeoff based on business requirements.
Built sentiment analysis on CAT products using NLP (bag-of-words model), a useful tool for managers interested in transforming web-based text into meaningful, quantifiable information.
Created ad hoc reports on sales, process optimization, and defect minimization.
Developed quarterly roadmaps based on impact, effort, and test coordination, working with the business to achieve short-term and long-term goals.
This project involved developing analytical models to predict the likelihood of a disease based on data analysis.
Performed data extraction, validation, and summarization of insights for analytical projects from the EHR system.
Model output provided insights to healthcare providers so that they could improve their processes.
Built Decision Tree, Random Forests and Neural Network models to find health risk levels of customers.
Validated the model using different metrics.
Also applied supervised classification techniques such as K-Nearest Neighbors (KNN), Logistic Regression, Decision Trees, and Random Forest for other use cases.
Environment: Python, NumPy, Pandas, scikit-learn, Matplotlib, Seaborn, Random Forest, and Oracle.
Java/J2EE Developer
Caterpillar Inc
Peoria, IL
11.2014 - 10.2015
To provide adequate support to application teams, the Global Information Systems division undertook a project to build an internal application that lets application and infrastructure support personnel monitor and troubleshoot issues in lower environments.
Developed RESTful services for performing business logic and to act as a façade for the data tier.
Followed Service-Oriented Architecture (SOA) and created various services for easy code maintenance.
Used Hibernate for object-relational mapping.
Implemented Spring MVC architecture to develop the presentation tier and business layer.
Used various front-end technologies for the web interface.
Java Developer
HDMS
Chicago, IL
09.2013 - 11.2014
Designed and developed various internal web applications.
Developed front-end applications using Core Java, Servlets, HTML5, JavaScript, jQuery, DOM, and CSS3.
Worked on JUnit to develop unit test cases.
Worked on SQL and PL/SQL programs to validate and code the database tables.
Education
Master of Science - Computer Science
University of South Dakota
Vermillion, SD
01.2012 - 08.2013
Skills
Machine learning
Accomplishments
Senior Data Scientist (SDS) – Data Science Council of America (DASCA), 2020
https://www.credbadge.com/credit/member/6442063714/.