Highly skilled and results-driven Data Scientist with 7 years of experience in applying advanced analytics, machine learning, and artificial intelligence to solve complex business problems. Expertise in developing and deploying machine learning models, creating data pipelines, and leveraging big data technologies. Proven track record of delivering impactful solutions in various industries.
Designed and implemented neural network architectures such as multi-layer perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN) for various machine learning tasks such as image classification, natural language processing, and time series analysis.
Trained neural networks using backpropagation with optimizers such as stochastic gradient descent (SGD), Adam, and RMSprop to improve model performance and convergence speed.
Regularized neural networks using techniques such as dropout, L1/L2 regularization, and early stopping to prevent overfitting and improve generalization ability.
Utilized Snowflake data platform for large-scale data storage, processing, and analytics in data science projects.
Conducted data exploration, cleaning, and transformation using Snowflake's SQL and Python APIs to prepare data for machine learning models.
Built and trained machine learning models on Databricks using Python libraries such as scikit-learn, TensorFlow, and PyTorch to perform data science tasks such as classification, regression, clustering, and time series analysis.
Conducted hyperparameter tuning for machine learning models using GridSearchCV and RandomizedSearchCV techniques to improve model performance and accuracy.
Used Spark MLlib to apply classification and regression models using vectorization.
Developed an intent-based chatbot using Python and the Rasa 1.8 framework to engage with customers and deliver immediate, tailored assistance with travel and hospitality information, using an LLM- and NLTK-based approach.
Good understanding of LLMs, Gen AI, LangChain, EmbedChain, Transformers, and Hugging Face.
Fine-tuned advanced LLMs such as Llama 2, GPT-4, and PaLM to improve their performance and adaptability to specific industry needs.
Hands-on experience with NLP (NLTK, spaCy, BERT, SBERT models).
Expertise in computer vision and image processing: developed and maintained brand identification for retail inventory management and facial recognition for secure authentication using YOLOv2 on the Darknet framework.
Evaluated multiple models based on affinity analysis, Wide & Deep neural networks, ALS matrix factorization, RBMs, and hybrid collaborative filtering with user-based and item-based models.
Used RMSE, F-score, precision, recall, Top-K Mean Average Precision, kappa score, and A/B testing to evaluate recommender performance in both simulated and real-world environments.
Applied research efforts to build a machine learned ranking solution to improve targeting for E-commerce mobile advertising.
Performed classification analysis using supervised algorithms such as Logistic Regression, Decision Trees, KNN, and Naive Bayes.
Worked on recurrent neural network models such as LSTMs and GRUs to analyze user reviews of businesses.
Proficient in building data pipelines with AWS services, including EC2, EMR, S3, Redshift, Glue, Lambda functions, Step Functions, CloudWatch, SNS, DynamoDB, and SQS.
Built multiple ML models like Fully-Connected Multi-Layer Neural Networks, RNN-GRU, Memory Network, Conv1D, GLM.
Trained and tested supervised and unsupervised machine learning models, and compared GPU models for optimization and performance.
Skilled in building and deploying machine learning models using deep learning libraries such as Keras and TensorFlow.
Overview
7 years of professional experience
Work History
Data Scientist / Machine Learning Engineer
Pilot Flying J
Atlanta, GA
04.2023 - Current
Spearheaded the implementation of Large Language Models (LLMs) for natural language processing tasks, improving customer interaction and satisfaction
Developed and deployed machine learning models for predictive maintenance, reducing downtime by 25%
Utilized Databricks for data preprocessing, model training, and deployment, streamlining the machine learning workflow
Collaborated with cross-functional teams to identify opportunities for applying machine learning and AI techniques to business problems
Implemented advanced machine learning algorithms for demand forecasting, reducing inventory costs by 10%
Designed and implemented data pipelines using AWS services such as EC2, S3, and Glue, improving data processing efficiency by 20%.
Led the development of a personalized recommendation engine for Pilot Flying J's loyalty program using advanced machine learning techniques.
Designed and implemented a scalable data pipeline on Databricks to ingest, preprocess, and analyze large volumes of customer transaction data.
Utilized MLflow for model tracking and experimentation, achieving a 25% increase in model accuracy through hyperparameter tuning and continuous model optimization.
Implemented Top-K Mean Average Precision evaluation metric to measure recommendation engine performance, ensuring high-quality recommendations for customers.
Conducted A/B tests and experiments to evaluate model variations and enhancements, contributing to data-driven model improvements.
Evaluated the model on multiple metrics, including Top-K Mean Average Precision, kappa score, precision, and recall.
The model reduced the time needed to customize configurations by 30%.
The recommender model played a pivotal role in driving a 13% growth in product sales, encompassing parts and warranty packages, through add-to-cart recommendations and session-based suggestions. It also reflected current trends and offered complementary parts.
Comprehensive LLM Architecture: Worked on a generative AI architecture for travel and hospitality domain data, designed to scale and adapt across industries, to provide a customer care support tool for hotel management.
Prompt Engineering & Adjustment: Carried out prompt engineering to improve the generative AI's query structures and increase the accuracy and contextual relevance of its results.
Good knowledge of Transformers, EmbedChain, LangChain, Gen AI, and LLMs.
Fine-tuned GPT-3 to personalize cold emails, resulting in a 20% increase in email open rates, a 4% improvement in response rates, and a 2% improvement in conversion rates.
Used NVIDIA CUDA and RAPIDS libraries to accelerate machine learning (ML) and data processing tasks on GPUs.
Wrote, executed, and optimized SQL queries for comprehensive data analysis and profiling.
Data Scientist / Machine Learning Engineer
Bayer
St Louis, MO
08.2021 - 03.2023
Built machine learning models for image recognition and classification, enhancing crop monitoring and yield prediction
Implemented deep learning algorithms, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), for analyzing agricultural data
Developed machine learning models to analyze agricultural data and optimize crop yields, leading to a 12% increase in productivity
Implemented a real-time monitoring system using Apache Kafka and Spark Streaming for data ingestion and processing
Utilized Azure Data Factory and Databricks for building ETL pipelines and performing data transformations
Collaborated with agronomists and domain experts to translate business requirements into machine learning solutions
Utilized LLMs for text analysis and sentiment analysis, providing valuable insights into customer feedback and market trends
Worked with Databricks to develop scalable data pipelines for processing large volumes of agricultural data.
Involved in computer vision tasks, particularly utilizing the YOLO algorithm to identify insects that pose a threat to crops, thereby enabling the prompt implementation of preventive measures.
Implemented a model for this application using a variety of Neural Networks and different neural architectures, subsequently deploying the model into a web application.
Established a comprehensive MLOps framework covering feature engineering, data preprocessing, versioning, model training and evaluation, batch inference, data validation, drift detection (both data and target), and model performance monitoring. Automated notifications were integrated to prompt model retraining as required.
Demonstrated expertise in employing Spark's machine learning library (MLlib) to construct and deploy scalable machine learning models
Data Scientist / Machine Learning Engineer
Safeway
10.2017 - 06.2021
Led the development of a recommendation system using collaborative filtering, increasing customer engagement by 20%
Implemented computer vision algorithms for automated inventory management, reducing out-of-stock instances by 15%
Created dashboards using Tableau for visualizing key metrics and trends, enabling data-driven decision-making
Developed and deployed machine learning models for demand forecasting, optimizing inventory management and reducing stockouts
Leveraged Databricks for data preprocessing, model training, and performance monitoring, ensuring scalability and efficiency
Collaborated with business stakeholders to identify key performance metrics and develop data-driven strategies.
Applied advanced deep learning algorithms to different datasets after cleaning the data, handling missing values, and implementing feature engineering as layers.
Used algorithms including Random Forest, Logistic Regression, Gradient Boosting, and XGBoost.
Predicted in-store labor demand for the next 30 days, which helped match staffing to sales and cut extra costs; as a result, sales increased by 15% and costs fell by around 30%.
Used the TensorFlow framework to build machine learning models and wrapped them in Databricks MLflow to track experiments and compare model runs.
Profiled multiple datasets to analyze relationships between features across categorical and numerical (continuous and discrete) data using Seaborn and Matplotlib.
Applied dimensionality reduction (PCA) to support prediction of potential outcomes.
Leveraged the Pandas API on Spark to analyze large datasets and used Spark MLlib Pipelines to run models on a distributed cluster.
Used HyperOpt for hyperparameter tuning over a defined search space, increasing model accuracy from 69% to 94%.
Created a full-scale Delta Lake solution in Databricks, mounted on Azure Data Lake Gen2 using SAS authentication.
Used Azure Data Factory to orchestrate Spark jobs and notebooks in Databricks using scheduled and tumbling window triggers.
Captured streaming data from Kafka using Delta Live Tables in Databricks, combined it with batch-layer data using a Lambda architecture, and created periodic aggregated statistics on top of the Delta Live Tables.
Hands-on experience with Azure cloud services (PaaS and IaaS), Azure Synapse Analytics, Azure SQL, Data Factory, Azure Analysis Services, Application Insights, Azure Monitor, Key Vault, and Azure Data Lake.
Used Spark MLlib to predict customer demand for certain products during long-weekend sales and created a score for high-demand products, which helped manage store inventory to meet that demand.
Machine Learning / Data Engineer
Aisle411 Inc.
St Louis, MO
01.2017 - 09.2017
Developed machine learning models for in-store navigation and personalized shopping recommendations.
Integrated Kafka with Spark Streaming for real-time analytics on streaming data sets.
Optimized Spark jobs for improved performance, scalability, and reliability.
Built dashboards and visualizations using Tableau for better insights into business operations.
Implemented data preprocessing and feature engineering techniques to improve model performance
Analyzed data to identify the most visited aisles and most frequently checked products, providing insights that improved the customer shopping experience and helped reduce unsold products by 30%.
Collaborated with software engineers to deploy machine learning models into production environments.
Used Spark DataFrames and Spark SQL extensively to build multiple ETL pipelines.
Converted RDDs to DataFrames to improve performance, using in-memory processing with SparkContext, Spark SQL, DataFrames, and pair RDDs.
Used histograms, bar plots, pie charts, scatter plots, and box plots to assess the condition of the data.
Education
Master of Science - Data Science
Northwest Missouri State University
Maryville, MO
04-2017
Bachelor of Science - Computer Science and Programming
Andhra University
India
04-2015
Skills
Large Language Models (LLM)
Natural Language Processing (NLP)
Snowflake
Data Monitoring
Vertex AI
Generative AI
Prompt Engineering
Kubernetes
Docker
MLOps
Apache Spark and SparkML
Azure Databricks
TensorFlow
PyTorch
XGBoost
Timeline
Data Scientist / Machine Learning Engineer
Pilot Flying J
04.2023 - Current
Data Scientist / Machine Learning Engineer
Bayer
08.2021 - 03.2023
Data Scientist / Machine Learning Engineer
Safeway
10.2017 - 06.2021
Machine Learning / Data Engineer
Aisle411 Inc.
01.2017 - 09.2017
Master of Science - Data Science
Northwest Missouri State University
Bachelor of Science - Computer Science and Programming