
PRAHARSHA PRATEEK MORE

Data Engineer
New Bedford, MA

Summary

Over 2 years of hands-on experience in Machine Learning, Statistical Modeling, Data Analysis, Data Manipulation, and Data Mining, demonstrating proficiency in leveraging these skills for impactful data science projects.

Utilized Google Cloud Platform (GCP) and AWS for cloud infrastructure provisioning, configuration, and management, demonstrating proficiency in leveraging diverse cloud environments to support scalable and resilient applications.

In-depth understanding of a range of machine learning algorithms, encompassing Linear Regression, Logistic Regression, Decision Trees, Supervised Learning, Unsupervised Learning, Classification, Random Forests, Naive Bayes, KNN, K-means, and CNN.

Expertise in using data manipulation and analysis packages such as NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, and TensorFlow, empowering efficient and effective handling of data.

Proficient in designing compelling visualizations using Tableau, Power BI software, and Storyline on both web and desktop platforms. Capable of publishing and presenting dashboards that enhance data-driven decision-making.

Adept at working with Relational Database Management Systems such as MySQL and SQL Server, ensuring seamless integration of data sources for comprehensive analysis.

Utilized Git for version control and collaboration, coupled with advanced data analysis skills in MS Excel, strengthening the ability to manage and document data science projects effectively.

Overview

2 years of professional experience

Work History

Data Engineer

Psquare Technologies
Aurora, IL
10.2023 - Current
  • Engaged in the end-to-end software development lifecycle, utilizing the Agile model for requirements gathering, analysis, design, development, and testing of applications.
  • Developed and executed predictive models using machine learning algorithms, including Linear Regression, Classification, Naive Bayes, Random Forest, and K-means Clustering, enhancing the application's analytical capabilities by 100%.
  • Utilized essential data science packages: NumPy and Pandas for data manipulation, Matplotlib, SciPy, and Seaborn for visualizations, Scikit-learn for machine learning, TensorFlow for deep learning, and ggplot2 in R for comprehensive data exploration.
  • Integrated these tools seamlessly with PySpark through its API.
  • Streamlined and automated data processing pipelines using TFX, GCP Cloud AutoML, and TensorFlow Enterprise for model training and deployment.
  • Conducted data blending and preparation using SQL for Tableau consumption, and published data sources to the Tableau server, facilitating comprehensive and accessible data visualization.
  • Utilized Git for version control and collaboration, ensuring thorough documentation and traceability of code changes, contributing to a well-organized development workflow.

Data Scientist

Wipro LTD
Hyderabad, Telangana, India
01.2020 - 04.2021
  • Developed and implemented predictive models using SAS, employing techniques such as linear regression, logistic regression, decision trees, Naive Bayes, Random Forests, K-means, and KNN.
  • Utilized Python for data manipulation, loading, and extraction, leveraging libraries such as Matplotlib, NumPy, SciPy, and Pandas for robust data analysis.
  • Generated professional reports using Power BI, fully aligned with business requirements and ensuring effective data visualization.
  • Analyzed and interpreted A/B test results, providing insights to inform decision-making processes.
  • Conducted data analysis and profiling using complex SQL queries on diverse source systems, including SQL Server, ensuring data accuracy and integrity.
  • Optimized data pipelines using AWS Glue, Athena, and S3, allocating efforts to achieve 30% efficiency in extraction, 40% in transformation, and 30% in loading in the data processing lifecycle.

Education

Master's in Data Science

University of Massachusetts Dartmouth
Dartmouth, MA
05.2023

Bachelor's in Electronics and Communications

Keshav Memorial Institute of Technology
Hyderabad, Telangana
05.2019

Skills

Methodology:

SDLC, Agile, Waterfall

Languages:

Python, R, SQL, SAS

ML Algorithms:

Linear Regression, Logistic Regression, Decision Trees, Supervised Learning, Unsupervised Learning, Classification, SVM, Random Forests, Naive Bayes, KNN, K-means, CNN

IDEs:

Visual Studio Code, PyCharm

Packages:

NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow, ggplot2

Visualization Tools:

Tableau, Power BI, Microsoft Excel

Database:

SQL Server, MySQL

Cloud Technologies & Other Tools:

GCP, AWS, ETL, A/B Testing, Git, GitHub

Operating Systems:

Windows, macOS

Accomplishments

  • Successfully led and managed cross-functional projects, consistently delivering desired outcomes and meeting project objectives on time and within budget, contributing to the organization's success.
  • Provided effective training and support to both internal and external stakeholders, leading to increased knowledge sharing and overall productivity improvements.
  • Recommended and assisted in the implementation of standardized policies and procedures, ensuring a high level of compliance, efficiency, and transparency across the organization.
  • Delivered data and solutions to internal and external stakeholders in a clear and logical manner, facilitating informed decision-making and enhancing communication within the organization.
  • Leveraged strong technical and data management skills to effectively support departmental sandbox usage, contributing to the success of various projects.
  • Demonstrated the ability to apply strategic thinking and innovative approaches to address complex data challenges, contributing to a more effective and data-driven decision-making process within the organization.

Projects

Data Visualization of YouTube Statistics Sep 2021 - Dec 2021

· Designed geospatial visualizations in JavaScript on YouTube datasets from Kaggle and drew conclusions about trending videos.

· Created heatmaps and area charts using the Seaborn, Altair, and Bokeh libraries in Python to support decision-making.

Database Management Project on Automobile Insurance Sep 2021 - Dec 2021

· Developed a data normalization algorithm for the automobile insurance company, reducing data redundancy by 45% and improving query response time by 60% using MySQL and Microsoft SQL Server.

· Designed and implemented a database security system using encryption and access control mechanisms, resulting in zero data breaches during project implementation.

Regression Data Challenge on Kaggle Apr 2022 - May 2022

· Predicted the target variable that captures a patient's medical condition.

· Implemented regression techniques such as decision trees, random forest, and extreme gradient boosting (XGBoost), and obtained better accuracy with the random forest technique.

Applications of Machine Learning in Late Delivery Prediction May 2023 – Aug 2023

· Developed and implemented a machine learning-based late delivery prediction system, utilizing historical and real-time data sources. This resulted in optimized logistics operations, reducing operational costs by 15% and enhancing customer satisfaction by improving on-time delivery rates by 20%.

· Utilized data preprocessing techniques, feature engineering, and various machine learning algorithms (such as logistic regression, random forest, and gradient boosting) to construct a predictive model. Achieved an impressive accuracy rate of 95% in delivery predictions, surpassing industry standards.
