Accomplished Data Scientist and Software Engineer with 9+ years of experience delivering solutions in advanced analytics, machine learning, software engineering, and cloud infrastructure. Expert in designing end-to-end AI/ML pipelines, high-availability cloud systems, and fast, scalable APIs. Deep expertise in model training, deployment, optimization, and large language model (LLM) integration. Passionate about driving business value through data-driven insights and automation.
Overview
12 years of professional experience
Work History
Data Scientist
LexisNexis Legal & Professional
Raleigh, NC
01.2018 - Current
Developed LLM training data pipelines with a proxy integration layer (Claude 3.5, GPT-4o, Nova Premier) for scalable, taxonomy-aware passage vetting and cost-optimized inference.
Designed and deployed multi-stage passage validation frameworks combining LLM vetting and SME grading alignment to improve on-point recall and reduce noise in training sets.
Achieved >85% accuracy on legal document classification using Convolutional Neural Networks (CNNs).
Developed training data generation pipelines using an LLM proxy (Claude 3.5, GPT-4o) for document expansion and vetting.
Constructed memory-efficient CNN models with mixed precision training to optimize for cloud deployment.
Architected end-to-end AWS pipelines (EC2, Lambda, S3, DynamoDB, SQS, SSM) for large-scale document ingestion, classification, and model inference.
Automated CI/CD pipelines via Jenkins and AWS Systems Manager, cutting deployment time and improving environment reproducibility.
Enhanced observability and fault tolerance for distributed ML services using custom CloudWatch logging and structured event monitoring.
Designed FastAPI-based microservices for real-time document classification and retrieval APIs.
Built scalable embedding-based search systems combining BM25 and dense retrieval (FAISS) for legal document retrieval.
Software Engineer
LexisNexis Legal & Professional
Raleigh, NC
01.2016 - 01.2018
Built features for LexisNexis Plus using C#, .NET, TypeScript, and JSON/XML.
Developed Python- and Selenium-based bots for mass content updates and quality assurance.
Automated deployment of services using Azure DevOps, Jenkins, and AWS Systems Manager.
Managed production releases and monitored pre-production logs through Splunk.
Participated in Agile ceremonies, driving user story delivery and sprint planning.
Software Engineer (.NET Developer)
Greater New York Hospital Association (GNYHA)
New York, NY
01.2014 - 01.2016
Designed and deployed scalable AWS cloud solutions using EC2, ECR, and CloudFormation.
Automated deployment of Dockerized applications to EC2 using AWS SSM and Python scripts.
Defined Infrastructure-as-Code templates for EC2, Security Groups, and Load Balancers in YAML.
Developed ASP.NET MVC-based healthcare systems and APIs integrated with SQL Server backends.
Education
Master of Science (M.S.) - Data Science
Cabrini University
Pennsylvania
01.2020
Master of Technology (M.Tech) - Construction Technology Management
Indian Institute of Technology-Delhi
India
01.2012
Bachelor of Science (B.Sc.) - Engineering & Information Technology
Content Development Editor, Regulatory Compliance at LexisNexis Legal & Professional