Book Recommendation System:
- Built a book recommendation system based on collaborative filtering.
- Conducted comprehensive Exploratory Data Analysis (EDA) to gain insights into the dataset, ensuring data quality and accuracy by cleaning and removing redundancies.
- Utilized Matplotlib and Seaborn libraries to create insightful data visualizations, enabling a deeper understanding of user-book interactions and preferences.
- Applied both user-based and item-based collaborative filtering approaches to provide tailored book recommendations to users.
Tech Stack: Python, NumPy, pandas, seaborn, Matplotlib, machine learning, SciPy, Flask, HTML, CSS, pickle, Google Colab
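The item-based approach mentioned above can be illustrated with a minimal sketch. It is not the project's actual code: the toy `ratings` data and the helper names `item_vectors`, `cosine`, and `recommend` are hypothetical, and it uses only the Python standard library rather than the NumPy/SciPy stack listed for the project.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {book title: rating on a 1-5 scale}.
ratings = {
    "ann": {"Dune": 5, "Emma": 1, "Ulysses": 4},
    "bob": {"Dune": 4, "Emma": 2, "Ulysses": 5},
    "cat": {"Dune": 1, "Emma": 5},
    "dan": {"Dune": 5, "Ulysses": 4},
}

def item_vectors(ratings):
    """Invert user -> book ratings into book -> {user: rating}."""
    items = {}
    for user, books in ratings.items():
        for book, score in books.items():
            items.setdefault(book, {})[user] = score
    return items

def cosine(a, b):
    """Cosine similarity of two sparse rating vectors (missing rating = 0)."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(user, ratings, top_n=3):
    """Rank unread books by the similarity-weighted average of the user's own ratings."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for book, vec in items.items():
        if book in seen:
            continue  # only score books the user has not rated yet
        sims = [(cosine(vec, items[other]), score) for other, score in seen.items()]
        total = sum(sim for sim, _ in sims)
        if total > 0:
            scores[book] = sum(sim * score for sim, score in sims) / total
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("dan", ratings))
```

A user-based variant would compare user rating vectors instead of book vectors and then aggregate the ratings of the most similar users.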
Spring Boot Microservices:
- Designed and developed a microservices-based web application as a part of my coursework.
- Utilized Java and Spring Boot to create microservices for various application components such as user authentication, product catalog, and order processing.
- Reduced API response time by 40% across the application's microservices components.
- Designed a MySQL database schema to store product and order data.
Tech Stack: Java, Spring Boot, Spring Eureka, RESTful services, MySQL, Apache Tomcat, IntelliJ IDEA, Postman, JUnit, OAuth2, SLF4J
Sales Data Analytics Using PySpark:
- Cleaned and preprocessed sales data using PySpark DataFrame operations, reducing data processing time by 50%.
- Engineered relevant features such as total sales, average sales per customer, and sales trends over time to enhance the analytical capabilities of the dataset.
- Utilized PySpark's SQL functionalities and DataFrame operations to perform in-depth exploratory data analysis, identifying correlations, outliers, and patterns within the sales data.
- Conducted statistical analysis using PySpark's built-in functions to derive meaningful insights into sales performance, customer demographics, and product preferences.
- Created bar charts, line plots, and histograms with Matplotlib and Seaborn from PySpark results converted to pandas, facilitating clear communication of analytical findings.
- Optimized PySpark code for improved performance by leveraging techniques like caching, partitioning, and parallel processing, ensuring efficient processing of large-scale sales datasets.
Tech Stack: PySpark, PySpark DataFrame, Matplotlib, Seaborn, Python, pandas
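The engineered features named above (total sales, average sales per customer, and sales trends over time) can be sketched as simple aggregations. This is an illustrative plain-Python version so that it runs without a Spark installation; the `sales` rows and the `sales_features` helper are hypothetical. In PySpark the same computations would typically be written as `groupBy(...).agg(...)` calls on a DataFrame.

```python
from collections import defaultdict

# Hypothetical sales rows: (customer_id, order_date as YYYY-MM-DD, amount).
sales = [
    ("c1", "2023-01-05", 120.0),
    ("c2", "2023-01-17", 80.0),
    ("c1", "2023-02-02", 50.0),
    ("c3", "2023-02-20", 150.0),
]

def sales_features(rows):
    """Aggregate raw rows into total, per-customer average, and monthly trend."""
    per_customer = defaultdict(float)
    per_month = defaultdict(float)
    for customer, date, amount in rows:
        per_customer[customer] += amount
        per_month[date[:7]] += amount  # "YYYY-MM" prefix acts as the month key
    total = sum(per_customer.values())
    return {
        "total_sales": total,
        "avg_sales_per_customer": total / len(per_customer) if per_customer else 0.0,
        "monthly_sales": dict(sorted(per_month.items())),
    }

features = sales_features(sales)
print(features)
```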