Kawkab Abid

Data Scientist
New York, NY

Summary

With a solid background in software engineering and machine learning, I leverage data science to deliver business insights and improve decision making. I thrive in environments where I can bridge the gap between data science and business. I am familiar with gathering, cleaning and organizing data for use by technical and non-technical personnel, and have an advanced understanding of statistical, algebraic and other analytical techniques. Highly organized, motivated and diligent, with a significant background in mathematics, statistics and object-oriented programming, and with experience in Python, Java, SQL and Excel.

Overview

2 Certifications
4 years of post-secondary education
8 years of professional experience

Work History

Full Stack Data Scientist

Duke.AI
Dallas, TX
10.2018 - Current
  • Built and deployed a state-of-the-art Optical Character Recognition (OCR) model on AWS that extracts text from PDFs and images.
  • This work involved deep learning models and Flask.
  • Configured a relational PostgreSQL database on AWS RDS and defined the schema, fact tables and lookup tables.
  • Built REST APIs on API Gateway with authentication, connecting to EC2 and Lambda.
  • Developed an end-to-end pipeline to automate purchase order entry: extracting text from purchase orders, storing it in RDS databases and sending it to the ERP system.
  • Assisted front-end developers in taking incoming documents, classifying them, performing OCR and KVT, and mapping the results to an AWS DynamoDB table.
  • Developed a back-end Flask application in Microsoft Visual Studio that takes incoming JSON and converts it to XML, which is then injected into Microsoft Dynamics GP via eConnect.
  • Coordinated with clients to develop and present a full pipeline that takes data via email and injects it into the company’s ERP system.

Data Scientist

Seedstages
Los Angeles, CA
01.2017 - 09.2020
  • Designed and implemented an auto-posting tool with hashtag intelligence in Python that translates the daily subscription email into Twitter posts, leading to a 7% increase in social media revenue.
  • Developed an 87% accurate time series model using the DeepAR algorithm to predict revenue from web traffic metrics, and built a KPI dashboard to help inform strategic decision making.
  • Spearheaded and developed a pricing model in Python and SQL that analyzed competitor prices to optimize the product data feed for PLAs, increasing revenue by a gross margin of $1M.
  • Achieved a $100k reduction in manual labor costs by developing a GPT-2 model that auto-generates articles, pre-processing the training data to produce accurate text for each product type.
  • Led and managed a third-party analytics team to automate existing manual tasks, reducing manual work time by 4 hours/week.
  • Performed basic exploratory data analysis on a consultants’ psychology survey to gain insights.
  • Joined and pivoted multiple tables covering 75,000 consultants, based on transactional and product purchase history, using SQL queries.
  • Implemented Random Forest, Boosting, Logistic Regression, LDA and classification tree models.
  • Segmented 2,200 consultants with a KNN clustering algorithm on a 12-month rolling basis for 2017-2018.
  • Achieved 92% model accuracy in predicting consultant churn within two months of enrollment as a consultant.

Data Analyst Intern

New York Times
New York, NY
09.2015 - 12.2016
  • Performed sentiment analysis on comments from Facebook posts to identify users’ likes and dislikes, and created custom audiences for products to target personalized ads.
  • Implemented a sorting algorithm in Python to personalize product pages, which resulted in an 8% uplift in Purchase/User.
  • Completed data cleaning and data validation of existing spreadsheets to promote a robust data management platform, resulting in accurate data analysis and entry.
  • Identified, analyzed and interpreted trends or patterns in complex data sets by finding correlations and visualizing with charts.
  • Presented reports to clients and teammates regarding project progress and results.
  • Utilized various professional statistical techniques and maintained large databases to collect and analyze data from partners and customers.
  • Created various Excel documents to assist with pulling metrics data and presenting information to stakeholders for concise explanations of best placement for needed resources.

Education

Bachelor of Science - Mathematics Concentrated in Engineering, Chemistry

Hofstra University
Hempstead, NY
09.2015 - 12.2019

Skills

    Machine learning

Data Science Projects

Salary Prediction for Future Employee [Python]

  • Deployed a salary prediction application for the HR and talent acquisition team to help predict salaries for new job postings based on historical hiring data.
  • Implemented various ML algorithms in object-oriented Python for predictive modeling. Gradient Boosting performed best, with a mean squared error of 313, improving performance by 18%.

Understanding and Predicting Employee Turnover [Python]

  • Developed an employee turnover model to understand the attributes that lead an employee to leave their current company.
  • Used upsampling and downsampling to balance the data and applied 5-fold cross-validation with Logistic Regression to determine which method gave the best F1 score. Used SMOTE to train a Random Forest and achieved a 28.6% improvement in F1 score over the baseline.

Predicting Heart Disease [Python]

  • Developed an ML model using Scikit-Learn that predicts heart disease in a patient based on various medical attributes.
  • Implemented Logistic Regression, K-Nearest Neighbors and Random Forest classifiers. Achieved an 88.5% accuracy score by tuning the hyperparameters of Logistic Regression with RandomizedSearchCV and GridSearchCV.

Certification

AWS Certified Solutions Architect – Associate

Timeline

AWS Certified Solutions Architect – Associate

12-2020

Microsoft Certified: Data Analyst Associate

12-2020

Full Stack Data Scientist

Duke.AI
10.2018 - Current

Data Scientist

Seedstages
01.2017 - 09.2020

Bachelor of Science - Mathematics Concentrated in Engineering, Chemistry

Hofstra University
09.2015 - 12.2019

Data Analyst Intern

New York Times
09.2015 - 12.2016