Tai Nguyen

Union City, CA

Summary

Aspiring data analytics engineer with a passion for generating data and delivering insights on sales and financial data through cloud and analytical tools. Committed to helping companies advance in data quality, data modeling, and dashboard experiences. Extensive knowledge of the data science project life cycle, web app development, and AWS/Azure cloud applications and services.

Overview

6 years of professional experience
3 years of post-secondary education

Work History

Data Engineer

Peapod
Boston, NY
09.2022 - Current
  • Developed PySpark scripts to analyze T-log transactional grocery data. Communicated results across multiple banners and channels, and delivered pattern insights to Business Intelligence teams.
  • Developing PySpark programs to generate monthly T-log transactional reports, including trend analysis and model performance, through Power BI.
  • Developed a Streamlit web app that leverages endpoint results from multiple in-house models, allowing internal team members to visually evaluate model performance.

Data Analytics Engineer

Epsilon
Chicago, IL
05.2021 - 08.2022
  • Developed PySpark scripts on the Databricks platform to process large mobile customer datasets, covering everything from transformation pipelines to modeling techniques that were previously written in SAS.
  • Optimized data sources and processing rules to enhance data quality through new designs and development phases.
  • Drafted technical documentation for internal business areas and processes, incorporating factors such as technical design, data manipulation, ETL and storage management.
  • Leveraged mathematical techniques to develop engineering and scientific solutions.

Data Scientist

ServiceNow
Cupertino, CA
03.2020 - 03.2021
  • Developed text classification models to leverage industrial domain mapping and enable business partners to obtain statistical insights for decision making.
  • Recommended significant topics through text analytics, topic clustering, and text mining.
  • Enhanced text data quality through text pre-processing techniques for better modeling results.

Data Analyst

Autodesk
San Francisco, CA
10.2019 - 03.2020
  • Developed pattern recognition through text clustering to uncover how and why an IT ticket is classified as a problem, incident, or change.
  • Modeled the software and tools used for resolving tickets through text mining (deriving high-quality information).
  • Developed SQL queries to extract, clean, and transform data from cloud storage using AWS Athena.
  • Used visualization tools (Power BI, Tableau) to construct dashboards that surface insights.
  • Developed models to predict software resolutions for ticket problems/incidents.
  • Technologies used: Python, NLTK, TextBlob, Stanford NLP, Power BI, SQL queries, AWS Athena, Lambda, Scikit-learn, NLP.

Data GTS Analyst

IBM
San Bruno, CA
01.2019 - 10.2019
  • Developed a data ingestion pipeline to process batch data using the ELK stack.
  • Created impactful dashboards that help stakeholders monitor changes across internal company areas.
  • Adaptable and proficient in learning new concepts quickly and efficiently.
  • Developed strong communication and organizational skills through working on group projects.

NLP Engineer

Wells Fargo
Fremont, CA
01.2018 - 12.2018
  • Cooperated with team partners to design part-of-speech definitions and annotations for bank documents.
  • Built a multi-dialog, state-managed smart chatbot with IBM Watson Assistant.
  • Developed deep learning (LSTM) models to deploy generated agent dialogue for multi-industry clients.
  • Built automated crawling applications that scrape text data from input documents.
  • Built text classification models based on the LSTM architecture.
  • Designed SQL schemas, tables, and views to orchestrate all data processing steps.
  • Built REST APIs to validate data quality.

Data Science Intern

Verisk Analytics Inc.
San Francisco, CA
08.2017 - 12.2017
  • Partnered with team members to propose and prepare data storytelling that addresses possible business concerns.
  • Developed web scraping processes to collect data from multiple data sources (70% of time).
  • Used data visualization and data mining techniques such as scaling and clustering to uncover insights.
  • Built a machine learning classifier with a 78% performance rate in predicting disaster events.

Education

Master of Science - Data Science

University of New Haven
West Haven, CT
01.2017 - 12.2017

Bachelor of Science - Computer Science

San Francisco State University
San Francisco, CA
08.2014 - 08.2016

Skills

  • PySpark, Python, SQL
  • Databricks, REST API, Data Factory, Git, Azure
  • EC2, EMR, RDS, S3
  • MS SQL, MySQL, MongoDB, PostgreSQL
  • Dimensional modeling, Data Analysis, Project Management
  • Regression, Random Forest, Neural Network, Decision Tree, K-Means, kNN
  • Tableau, Jupyter Notebook, Plotly, PyCharm, Atom, Power BI
  • NLP: Text similarity, topic modeling
  • HTML, JavaScript, CSS

Timeline

Data Engineer

Peapod
09.2022 - Current

Data Analytics Engineer

Epsilon
05.2021 - 08.2022

Data Scientist

ServiceNow
03.2020 - 03.2021

Data Analyst

Autodesk
10.2019 - 03.2020

Data GTS Analyst

IBM
01.2019 - 10.2019

NLP Engineer

Wells Fargo
01.2018 - 12.2018

Data Science Intern

Verisk Analytics Inc.
08.2017 - 12.2017

Master of Science - Data Science

University of New Haven
01.2017 - 12.2017

Bachelor of Science - Computer Science

San Francisco State University
08.2014 - 08.2016