2+ years of hands-on experience in Machine Learning, Statistical Modeling, Data Analysis, Data Manipulation, and Data Mining, applied to impactful data science projects.
Utilized Google Cloud Platform (GCP) and AWS for cloud infrastructure provisioning, configuration, and management, demonstrating proficiency in leveraging diverse cloud environments to support scalable and resilient applications.
In-depth understanding of supervised and unsupervised machine learning, including classification and regression algorithms such as Linear Regression, Logistic Regression, Decision Trees, Random Forests, Naive Bayes, KNN, K-means, and CNN.
Expertise in using data manipulation and analysis packages such as NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, and TensorFlow, empowering efficient and effective handling of data.
Proficient in designing compelling visualizations using Tableau, Power BI software, and Storyline on both web and desktop platforms. Capable of publishing and presenting dashboards that enhance data-driven decision-making.
Adept at working with Relational Database Management Systems such as MySQL and SQL Server, ensuring seamless integration of data sources for comprehensive analysis.
Utilized Git for version control and collaboration, coupled with advanced data analysis skills in MS Excel, strengthening the ability to manage and document data science projects effectively.
Methodology:
SDLC, Agile, Waterfall
Languages:
Python, R, SQL, SAS
ML Algorithms:
Linear Regression, Logistic Regression, Decision Trees, Supervised Learning, Unsupervised Learning, Classification, SVM, Random Forests, Naive Bayes, KNN, K-means, CNN
IDEs:
Visual Studio Code, PyCharm
Packages:
NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow, ggplot2
Visualization Tools:
Tableau, Power BI, Microsoft Excel
Database:
SQL Server, MySQL
Cloud Technologies & Other Tools:
GCP, AWS, ETL, A/B Testing, Git, GitHub
Operating Systems:
Windows, macOS
Data Visualization of YouTube Statistics Sep 2021 - Dec 2021
· Designed geospatial visualizations in JavaScript on YouTube datasets from Kaggle and drew conclusions about trending videos.
· Created heatmaps and area charts using the Seaborn, Altair, and Bokeh libraries in Python to support decision-making.
Database Management Project on Automobile Insurance Sep 2021 - Dec 2021
· Developed a data normalization algorithm for the automobile insurance company, reducing data redundancy by 45% and improving query response time by 60% using MySQL and Microsoft SQL Server.
· Designed and implemented a database security system using encryption and access control mechanisms, resulting in zero data breaches during project implementation.
Regression Data Challenge on Kaggle Apr 2022 - May 2022
· Predicted the target variable that captures a patient's medical condition.
· Implemented regression techniques such as decision trees, random forest, and extreme gradient boosting (XGBoost), and obtained the best accuracy with random forest.
Applications of Machine Learning in Late Delivery Prediction May 2023 – Aug 2023
· Developed and implemented a machine learning-based late delivery prediction system, utilizing historical and real-time data sources. This resulted in optimized logistics operations, reducing operational costs by 15% and enhancing customer satisfaction by improving on-time delivery rates by 20%.
· Applied data preprocessing, feature engineering, and several machine learning algorithms (logistic regression, random forest, and gradient boosting) to build a predictive model. Achieved 95% accuracy in delivery predictions, surpassing industry standards.