Computer Science Ph.D. student and Crerar Fellow at the University of Chicago with a joint appointment as an AI researcher at Argonne National Laboratory. Brings academic and industrial experience in exascale Natural Language Processing, Bioinformatics, Drug Discovery, Protein Design, and Self-Driving Laboratories. Passionate about training cutting-edge AI models on large datasets and fast supercomputers for protein design and drug discovery. Google-certified TensorFlow Developer, proficient in PyTorch, and experienced in production-ready software development in dynamic start-up environments as well as within billion-dollar Nasdaq-listed corporations, including Google X.
Evaluated the value of reasoning traces as retrieval resources for improving LLMs and SLMs on biological multiple-choice and open-ended question answering.
Link to manuscript: https://tinyurl.com/4xp3umr4
Working with Prof. Rick L. Stevens and Prof. Ian Foster on AI for Biology with a focus on Large Language Models, Retrieval-Augmented Generation, Reasoning Models, and Scientific Question-Answering.
Led the development of the first open-source HPC frameworks for scientific retrieval-augmented generation (RAG). The technology enabled indexing of over 3.6 million scientific papers for retrieval by LLMs. Argonne National Laboratory recognized the effort with an Impact Award.
Developed the proof of concept (PoC) for a multi-agent, tool-calling trading assistant, built and deployed on the Amazon Bedrock Agents platform. The PoC helped the team secure a round of funding.
Helped develop a reinforcement learning (RL) framework for designing novel malate dehydrogenase (MDH) enzymes via genome-scale LLMs. The project culminated in the AI-guided design of enzyme blueprints that are superior to their natural counterparts. The resulting technology led to a patent application.
Link to the publication: https://dl.acm.org/doi/abs/10.1145/3624062.3626087
Worked on several projects on high-throughput ligand-protein binding affinity prediction via transformers. Additionally, helped develop an RL-based framework for multi-objective alignment of LLMs for drug discovery. Each of these efforts led to a publication.
Developed and deployed Graph Neural Networks (GNNs) for molecular property prediction of proprietary PROTAC designs. Overcame the scarcity of relevant training data by inventing a novel data augmentation technique for PROTACs.
Worked as a member of the inaugural team of researchers at the Rapid Prototyping Lab at ANL. Developed a DNA assembly protocol in Python to automate the assay process via Opentrons OT-2 pipetting robots.
Teaching Assistant for CMSC 23360, Advanced Networks, in Spring 2020 at the University of Chicago.
Teaching Assistant for CMSC 20370/30370, Inclusive Technology: Design For Underserved and Marginalized Communities.
Teaching Assistant for CMSC 209, Computers for Learning.
At the MIT Teaching Systems Lab, developed, tested, and deployed Feed-it, a cloud-based Android application that facilitates pre-service training for K-12 mathematics and science teachers.