Tai Nguyen

Union City, CA

Summary

Aspiring data analytics engineer with a passion for generating data and delivering insights from sales and financial data through cloud and analytical tools. Committed to helping companies advance in data quality, data modeling, and dashboard experiences. Extensive knowledge of the data science project life cycle, web app development, and AWS/Azure cloud applications and services.

Overview

years of professional experience

years of post-secondary education

Work History

Data Engineer

Peapod

Boston, NY

09.2022 - Current

Developed PySpark scripts to analyze T-log transactional grocery data. Communicated results across multiple banners and channels and delivered pattern insights to Business Intelligence teams.
Developing PySpark programs to generate monthly T-log transactional reports, including trend analysis and model performance, through Power BI.
Developed a Streamlit web app that surfaces endpoint results from multiple in-house models, allowing internal team members to visually evaluate model performance.

Data Analytics Engineer

Epsilon

Chicago, IL

05.2021 - 08.2022

Developed PySpark scripts on the Databricks platform to process large mobile customer datasets, covering work from transformation pipelines to modeling techniques that was previously written in SAS.
Optimized data sources and processing rules to enhance data quality through new designs and development phases.
Drafted technical documentation for internal business areas and processes, incorporating factors such as technical design, data manipulation, ETL and storage management.
Leveraged mathematical techniques to develop engineering and scientific solutions.

Data Scientist

ServiceNow

Cupertino, CA

03.2020 - 03.2021

Developed text classification models to leverage industrial domain mapping and enable business partners to obtain statistical insights for decision making.
Recommended significant topics through text analytics, topic clustering, and text mining.
Enhanced text data quality through text pre-processing techniques for better modeling results.

Data Analyst

Autodesk

San Francisco, CA

10.2019 - 03.2020

Developed pattern recognition through text clustering to uncover how and why an IT ticket is classified as a problem, incident, or change.
Modeled the software and tools used to resolve tickets through text mining (deriving high-quality information).
Developed SQL queries to extract, clean, and transform data from cloud storage through AWS Athena.
Used visualization tools (Power BI, Tableau) to build dashboards that surface insights.
Developed models to predict software resolutions for ticket problems/incidents.
Technologies used: Python, NLTK, TextBlob, Stanford NLP, Power BI, SQL queries, AWS Athena, Lambda, Scikit-learn, NLP.

Data GTS Analyst

IBM

San Bruno, CA

01.2019 - 10.2019

Developed a data ingestion pipeline to ingest batch data using the ELK stack.
Created impactful dashboards that let stakeholders monitor internal company changes.
Adaptable and proficient in learning new concepts quickly and efficiently.
Developed strong communication and organizational skills through working on group projects.

NLP Engineer

Wells Fargo

Fremont, CA

01.2018 - 12.2018

Cooperated with team partners to design part-of-speech definitions and annotations for bank documents.
Built a multi-dialog, state-managed smart chatbot with IBM Watson Assistant.
Developed deep learning (LSTM) models to deploy generated agent dialogue for multi-industry clients.
Built automated crawling applications that scrape text data from input documents.
Built text classification models based on the LSTM architecture.
Designed SQL schemas, tables, and views to orchestrate all data processing steps.
Built REST APIs to validate data quality.

Data Science Intern

Verisk Analytics Inc.

San Francisco, CA

08.2017 - 12.2017

Partnered with team members to propose and prepare data storytelling that addresses possible business concerns.
Developed web scraping processes to collect data from multiple data sources (70% of time).
Used data visualization and data mining techniques, such as scaling and clustering, to uncover insights.
Built a machine learning classifier with a 78% performance rate in predicting disaster events.

Education

Master of Science - Data Science

University of New Haven

West Haven, CT

01.2017 - 12.2017

Bachelor of Science - Computer Science

San Francisco State University

San Francisco, CA

08.2014 - 08.2016

Skills

PySpark, Python, SQL
Databricks, REST API, Data Factory, Git, Azure
EC2, EMR, RDS, S3
MS SQL, MySQL, MongoDB, PostgreSQL
Dimensional modeling, Data Analysis, Project Management

Regression, Random Forest, Neural Network, Decision Tree, K-Means, kNN
Tableau, Jupyter Notebook, Plotly, PyCharm, Atom, Power BI
NLP: Text similarity, topic modeling
HTML, JavaScript, CSS

Timeline

Data Engineer

Peapod

09.2022 - Current

Data Analytics Engineer

Epsilon

05.2021 - 08.2022

Data Scientist

ServiceNow

03.2020 - 03.2021

Data Analyst

Autodesk

10.2019 - 03.2020

Data GTS Analyst

IBM

01.2019 - 10.2019

NLP Engineer

Wells Fargo

01.2018 - 12.2018

Data Science Intern

Verisk Analytics Inc.

08.2017 - 12.2017

Master of Science - Data Science

University of New Haven

01.2017 - 12.2017

Bachelor of Science - Computer Science

San Francisco State University

08.2014 - 08.2016
