Project – 1) Spark Streaming Application [GitHub Link]
- Technologies used: PySpark, Apache Airflow, Kafka, Cassandra, Docker Compose, Git
- Relevant Coursework: Big Data for Business
- Designed and built an end-to-end Kafka-based data pipeline scheduled with Apache Airflow
- Used Spark Structured Streaming to process and transform incoming API data in real time and ingest it into a Cassandra database
- Employed Docker containerization to package and deploy pipeline components
Project – 2) Data Warehouse design and Performance Optimization [GitHub Link]
- Technologies used: Apache Spark, Oracle, AWS S3, Tableau
- Relevant Coursework: Data Warehousing
- Performed exploratory data analysis (EDA) and cleaned the data using PySpark
- Designed a data cube over a dataset of 7 million records and 28 columns
- Queried the data warehouse using OLAP grouping operations (ROLLUP, CUBE) and analytic functions (LEAD, NTILE)
- Optimized the data warehouse with indexing, partitioning, and materialized views
- Answered the business questions using Tableau
Project – 3) South St. Petersburg Community Redevelopment Area (CRA) [Confidential]
- Technologies used: Pandas, NumPy, SQLite3, Python, Tableau, Jira
- Relevant Coursework: Enterprise Information Systems Management
- Performed exploratory data analysis (EDA) on properties sold along the corridors in St. Petersburg for the years 2018-2022
- Developed an interactive Tableau dashboard to answer business questions and used Jira for monitoring
Project – 4) Bull Credit Web Application [Website Link]
- Technologies used: JavaScript, Flask, Chart.js, SQLite3, Git, HTML, CSS
- Relevant Coursework: Distributed Information Systems
- Developed a USF credit application prototype with login authentication that directs user information to the main application
- Stored user data in a database and displayed credit charts for users in the application using Chart.js
Project – 5) Starbucks System Design [GitHub Link]
- Technologies used: UML
- Relevant Coursework: Advanced System Analysis and Design
- Designed an information system for a Starbucks outlet at the University of South Florida
- Conducted SWOT analysis and requirements engineering
- Developed UML diagrams (use case, sequence, class, activity, deployment) for real-time order tracking and inventory management
- Recommended a software methodology after comparing Agile and plan-driven approaches
Project – 6) Credit Risk Analysis [GitHub Link]
- Technologies used: Python, Azure ML Studio
- Relevant Coursework: Data Science Programming
- Performed exploratory data analysis (EDA) and data cleaning using Python
- Built machine learning models (Multiclass Logistic Regression, Decision Forest)
- Classified applicants as risky or not risky