LYDIA MASSOKO

Dallas, United States

Overview

6 years of professional experience
1 Certification

Work History

Data Scientist | AI Architect

Exxon Mobil
08.2022 - 12.2024
  • Company Overview: Client: Exxon Mobil, Texas, United States
  • Business Domain: Audit & AI
  • Applied hands-on presales expertise in AI/ML, collaborating on major analytics projects to design and implement tailored AI solutions that improved client engagements and operational success
  • Developed a PoC for question answering and matching over long audit-procedure documents using LangChain, Longformer, Milvus, AutoGPT, FAISS, and Cohere APIs
  • Digital coach for the audit process, providing semantic search and re-ranking over audit guidance documents using Cohere, Azure OpenAI, and Azure Cognitive Search
  • PoC supporting auditors in assessing the risk of an entity's business, using internal and external industry factors to identify its overall financial position, built with Azure OpenAI, Azure Cognitive Search, and Falcon 7B models
  • Scraping input from audit guidance standards such as FASB, SEC regulations, and AICPA
  • Configuring API keys and preprocessing input documents (reading, splitting/chunking)
  • Storing and indexing vectors in vector stores: Azure Cognitive Search and FAISS
  • Generating vector embeddings with embedding models such as Ada, Davinci, and Sentence-BERT
  • Document matching and re-ranking
  • Data summarization using the LLMs GPT-3.5 Turbo, GPT-4, and Falcon 7B
  • Fine-tuning the LLMs GPT-3.5 Turbo, GPT-4, and Falcon 7B
  • Validating prompts using Azure OpenAI documentation
  • Architecting and deploying CI/CD pipelines using Azure DevOps and GitHub Actions
  • Deploying the chat app to the Azure production environment
  • Validating prompt drift processes
  • Tools & Technologies: Python, VS Code, PyTorch, TensorFlow, GitHub Actions, Hugging Face, Cohere API, Azure Cognitive Search, Azure OpenAI Studio, AutoGPT, LangChain, FAISS, Milvus, Blob Storage, Sentence-BERT, Chainlit, Git, Jira, Azure Data Factory, Azure DevOps

Data Scientist

One Main Financial
Buffalo, United States
09.2018 - 08.2022
  • Company Overview: Client: One Main Financial (Bank) in Buffalo, NY
  • Business Domain: Banking, Auditing, IoT, Healthcare, Insurance & Pharmaceutical
  • Building traditional supervised and unsupervised AI models
  • Understanding client requirements and mapping problem definitions to AI/ML solutions
  • Working on RFPs, PoCs, and MVPs; creating roadmaps, architectures, and strategies to develop AI solutions; and participating actively in solution review meetings
  • Day-to-day interactions with end clients, leading client teams and project deliveries
  • Worked with other partner clients on EHR, claims, and health insurance datasets
  • Designed a block-level technical architecture and facilitated architecture work
  • Developed and deployed a Sharecare QnA and semantic search chatbot using Azure Cognitive Search, Form Recognizer, Azure OpenAI Studio, and Azure DevOps components
  • Developed an MVP of a medical document parser for table and text extraction from scanned fax documents in the MGB channel DB, which is created on Azure Databricks, using AWS Comprehend and Azure Form Recognizer, and deployed it with Azure DevOps
  • Assessment of denoising and two-stage object-detection models such as R-CNN variants
  • Testing feasibility of custom OCR models and fine-tuning BERT model variants
  • Validating state-of-the-art transformer-based NER models and fine-tuning with few-shot and zero-shot learning models
  • Developing and testing various custom NLU pipelines, with validation using customized BERT
  • Creating intents, rules, stories, and slots and running validation experiments using Dialogflow, Amelia, and RASA NLU
  • Reviewing APIs and CI/CD deployment processes using Azure DevOps and AWS SageMaker
  • Leading data-driven Scrum practices and tracking team activities using Azure DevOps
  • Reviewing model deployment processes in the production environment (AWS SageMaker)
  • Tools & Technologies: Python, PyCharm, VS Code, scikit-learn, NumPy, Matplotlib, Azure DevOps, PyTorch, TensorFlow, Dialogflow, Amelia, Kore BotBuilder, RASA Stack, BERT models, GPT-3, Azure Cognitive Search, Azure OpenAI, AutoGPT, LangChain, Milvus, Large Language Models (LLMs), Hugging Face, Elasticsearch, Tableau, Git, OpenCV, Pillow, Detectron2, Prodigy, EasyOCR, LayoutLM, Form Recognizer, CascadeTabNet, AWS Textract, Layout Parser, spaCy, AWS Comprehend, AWS SageMaker, Vertex AI, Azure Data Factory, Azure Databricks, Cosmos DB, Azure Blob Storage, GCP, Jira, whylogs

Education

Bachelor's degree - Mechanical Engineering

University at Buffalo
NY
01.2019

Skills

  • Python
  • R
  • SAS
  • Java
  • SQL
  • FQL
  • XML
  • MySQL Workbench
  • MongoDB
  • PostgreSQL
  • Oracle 10g
  • eXist-db
  • Neo4j
  • Redis
  • CosmosDB
  • Azure Blob Storage
  • Pinecone
  • Milvus
  • Qdrant
  • NumPy
  • scikit-learn
  • SciPy
  • Pandas
  • Matplotlib
  • Seaborn
  • Plotly
  • Bokeh
  • Scrapy
  • PyTorch
  • TensorFlow
  • PyCharm
  • R Studio
  • KNIME 3.4
  • SAS Enterprise Miner 7.1
  • Miner 52
  • Alteryx Designer
  • H2O
  • Spark MLlib
  • H2O AutoML
  • AWS SageMaker
  • Azure ML Studio
  • AutoKeras
  • Ggplot
  • Power BI
  • Tableau
  • QlikView
  • SAS VA
  • Keras
  • PyTorch Lightning
  • NVIDIA cuDNN
  • Deeplearning4j
  • N-Grams
  • Skip-grams
  • Word2vec
  • GloVe
  • NNLM
  • FastText
  • ELMo
  • ULMFiT
  • BERT-base
  • RoBERTa
  • DistilBERT
  • Wav2Vec2
  • NLTK
  • spaCy
  • iNLTK
  • IndicNLP
  • TextBlob
  • Gensim
  • AWS Comprehend
  • RASA
  • Kore Bot Builder
  • Botfront
  • Dialogflow
  • DeepPavlov
  • MS Bot Framework
  • Lucene
  • Elasticsearch
  • Solr
  • Torchaudio
  • DeepSpeech
  • SpeechRecognition
  • Google Cloud Speech API
  • OpenCV
  • Pillow
  • TensorFlow Lite
  • YOLO
  • SimpleCV
  • Scikit-image
  • EasyOCR
  • PyTesseract
  • LayoutLM
  • LayoutParser
  • Azure FormOCR
  • PaddleOCR
  • AWS Textract
  • Detectron2
  • Prodigy
  • AWS EC2
  • GCP
  • Azure
  • Heroku
  • H2O Driverless AI
  • Azure ML Service & Studio
  • Google Cloud AutoML
  • Azure Databricks
  • KNIME Cloud Server
  • SHAP
  • LIME
  • whylogs
  • Evidently
  • Alibi-detect
  • Facebook4j API
  • Facebook Graph API
  • Flask
  • FastAPI
  • Streamlit
  • Chainlit
  • Nginx
  • Uvicorn
  • Gunicorn
  • TomcatEE 7
  • Eclipse Kepler

Certification

  • Mastering Git with Bitbucket & GitHub, Udemy
  • Kubeflow Fundamentals: How to build ML/AI Pipelines, Udemy
  • Natural Language Processing: NLP with Transformers in Python, Udemy
  • Speech Recognition A-Z with Hands-on, Udemy
  • Large Language Models (LLMs), National University of San Diego, CA, USA

Languages

  • English
  • French
  • Italian

Personal Information

Title: Data Scientist
