RISHI ALLA

Ashburn, VA

Overview

year of professional experience

Work History

Associate Data Scientist

LinQuest

08.2023 - Current

- Created text classification models to help automate the transformation of raw data into domain-specific labeled datasets.

- Created pipelines that take in uploaded documents (PDFs, Word documents, OCR'd scanned documents, etc.) and clean them for use in NLP tasks.

- Used NLP methods and packages (spaCy, BERT, Hugging Face transformers, etc.) to extract metadata (titles, keywords, and summaries) from documents uploaded to a custom search engine.

- Used fuzzy matching to find similar entries across multiple datasets and build a master dataset containing all relevant information.

USSF Software Developer Intern

LinQuest

05.2023 - 08.2023

- Decoupled hard-coded data from the investment tool and implemented functionality that lets users select data through a GUI that queries a SQL database.

- Improved the existing Plotly dashboard to be more user-friendly, allowing users not well versed in code to use the investment tool.

- Created visualizations of differences in portfolio weights and constraints, and improved existing data visualizations to make them easier for all users to understand.

Education

M.S. - Data Science

George Mason University

Fairfax, VA

05.2024

B.S. - Data Science

George Mason University

Fairfax, VA

05.2023

Skills

Python, SQL, R, C, Fortran
Docker
Spark
Plotly
AWS

NLP: spaCy, BERT, Hugging Face, LangChain
Modeling: Support Vector Machines (SVM), Decision Trees, Naive Bayes, Deep Learning (Transformers, Convolutional Neural Networks, LLMs)
Python packages: scikit-learn, pandas, Matplotlib, seaborn, NumPy, PyTorch, TensorFlow
GUI Development with Dash

Certification

AWS Certified Solutions Architect - Associate (Issued by AWS)
DeepLearning.AI TensorFlow Developer (Issued by Coursera)

Projects

Patent Doc Code Classification
- Worked on a model that takes the text of patent descriptions and classifies them by document code.
- Implementing the ability to extract text from images of PDFs using pytesseract.

Airline Delay Model
- Created a regression model in TensorFlow to help predict monthly airline delays caused by factors controllable by airline carriers.

Plant Disease Classification
- Created multiple classification models, including k-nearest neighbors and convolutional neural networks, to help detect disease on plant leaves using a large image dataset.
