Vikramjeet Singh

Seattle

Summary

Results-driven Staff Data Scientist with over 8 years of experience in AI/ML engineering, data science, and machine learning model development. Expertise in building and deploying production solutions, including AI-powered chatbots, regression models, and causal inference engines. Proven track record of leading complex projects, from feature engineering and model optimization to full-stack development with technologies like PyTorch, TensorFlow, LangChain, and AWS. Strong foundation in data mining, statistics, and causal inference, applied to drive business outcomes and improve product quality. Adept at mentoring teams, fostering technical growth, and optimizing processes through automation and scalable architectures. Passionate about using data-driven insights to solve real-world problems.

Overview

7 years of professional experience

Work History

Staff Data Scientist

Cisco Meraki
Seattle
09.2021 - Current

Tech Stack: Llama 3.1, Weaviate, LangChain, Hugging Face, GitLab CI/CD, Docker, Kubernetes, Streamlit, Airflow, PySpark, AWS Glue, FAISS, SHAP, RAGAS

  • Led AI Chatbot Development: Designed and deployed an AI-powered chatbot using a Retrieval-Augmented Generation (RAG) pipeline built on Llama 3.1, Weaviate, LangChain, and Hugging Face. Fine-tuned Llama 3.1 for domain-specific tasks and implemented CI/CD deployment using Docker and Kubernetes. Fine-tuned Instructor Embeddings using contrastive learning to improve retrieval quality.
  • Evaluation Framework for LLMs: Built an evaluation pipeline combining BLEU, ROUGE, BERTScore, and Word Mover's Distance (WMD) with Natural Language Inference (NLI) scores to assess the semantic and factual accuracy of LLM outputs. Integrated RAGAS (LLM-as-a-Judge) for automatic evaluation via GPT-based models, improving response reliability and evaluation scalability.
  • Improved Search and Reranking: Enhanced AI assistant performance using hybrid semantic search and reranking models to ensure more relevant, context-aware results.
  • Built Text-to-SQL Tool: Developed a natural language to SQL interface powered by GPT-3.5, FAISS, and LangChain, enabling non-technical users to query databases seamlessly.
  • Feature Registry Tool: Engineered a Streamlit-based analytics platform to track adoption and revenue impact of 170+ product features, backed by real-time data pipelines using AWS Glue and PySpark (patent pending).
  • Causal Inference Engine: Built a Propensity Score Matching engine to assess firmware quality and performance, guiding release decisions with statistical rigor.
  • Data Quality Enhancement: Implemented an NLP pipeline using FuzzyWuzzy for entity resolution and pattern matching, significantly improving data cleanliness.
  • Mentorship & Leadership: Led a team of 3 junior data scientists across initiatives in Generative AI, causal inference, and statistical data analysis. Provided technical mentorship, project direction, and career guidance.

Data Scientist

Marex Spectron
10.2018 - 09.2021

Tech Stack: Python, Linux, SSIS, NLTK, TextBlob, LDA, RNN

  • Developed Crude Oil Price Prediction Models:
    Built and deployed multiple models (Time Series, Regression, and Recurrent Neural Networks) to predict crude oil prices, leveraging historical data and market trends.
  • Applied Natural Language Processing (NLP) for Market Sentiment Analysis:
    Used NLTK and TextBlob to preprocess large volumes of unstructured financial data and perform sentiment analysis, capturing market sentiment to inform pricing models and decision-making.
  • Implemented Latent Dirichlet Allocation (LDA) for Topic Modeling:
    Used LDA to analyze large datasets of financial news articles, identifying underlying themes and topics. This helped in forecasting market movements based on trending financial discourse.
  • Data Integration and Pipeline Automation:
    Automated data extraction, transformation, and loading (ETL) workflows using SSIS and custom Python scripts, ensuring the seamless integration of various datasets for accurate model training and real-time analytics.

Data Scientist

TTX Company
Chicago
08.2018 - 10.2018
  • Developed ETA Prediction Model:
    Built and fine-tuned a Random Forest regression model to predict Estimated Time of Arrival (ETA) for product deliveries, improving delivery scheduling and customer satisfaction. Cleaned and processed over 10 million data points, performing advanced data manipulation, feature engineering, and handling missing values and outliers to enhance model accuracy and ensure reliable predictions.
  • Categorical Data Encoding:
    Applied Label Encoding and Target Encoding techniques to handle categorical data, ensuring that non-numeric features were properly transformed for use in the regression model without losing valuable information.

Data Scientist

LAZ Parking
Connecticut
04.2018 - 07.2018
  • Built an LSTM sequence model (4 units followed by 1 dense layer) on a dataset of 80 million rows and more than 30 features.

Education

Master of Science - Industrial Engineering

State University of New York At Buffalo
Buffalo
12.2017

Bachelor of Engineering - Mechanical Engineering

SRM University
05.2016

Skills

Technical Skills

  • Machine Learning & AI: Machine Learning, Deep Learning, Generative AI, Neural Networks, Computer Vision, Natural Language Processing (NLP), Large Language Models (LLM)
  • Frameworks & Libraries: PyTorch, TensorFlow, Keras, Scikit-Learn, Hugging Face, LangChain, NLTK
  • Vector Databases: Weaviate, FAISS
  • Data Engineering & Processing: PySpark, SQL, PostgreSQL, Snowflake, AWS, SSIS, Airflow, Linux
  • Cloud & Deployment: Docker, Kubernetes, GitLab CI/CD, Streamlit, Flask
  • Programming Languages: Python, R, Java
  • Statistics & Data Analysis: Statistics, Data Mining, Causal Inference
