
Pon Chan

Santa Clara, CA

Summary

Driven software engineer specializing in data platforms and infrastructure, with several years of experience wrangling large datasets. Eager to build robust data infrastructure and platforms that lay the groundwork for analytics insights and deep learning training systems. Implements ETL tools to make datasets ready for stakeholders and develops data pipelines using SQL, Spark, Kubeflow, Airflow, and similar tools to improve data processing efficiency. Designs and develops RESTful APIs with the Python Flask framework for React web applications.

Overview

14
years of professional experience

Work History

Senior Software Engineer - Data Platform

Phantom AI
07.2022 - Current
  • Build, deploy and support batch & real-time, fault-tolerant, self-healing data pipelines
  • Create centralized automotive data management, services, tools, and APIs for data-intensive autonomy and product applications
  • Manage end-to-end data generation & consumption from onboard system to offboard applications, and define best practices and metrics for data ingestion, ETL, and processing
  • Develop continuous testing and validation systems to ensure robustness of data and data architectures
  • Lead team efforts to leverage and improve deep learning infrastructure for model development, training, and deployment, and to scale DL systems
  • Utilize Phantom AI's deep learning and machine learning technologies to implement high-performance, real-time data pipelines for Big Data solutions
  • Utilize data engineering, ETL, and AI/ML/DL technologies for training, inference, and deployment of ML/DL models in production applications.

Data Engineer III

Uber Technologies
07.2021 - 06.2022
  • Worked with marketing teams across the Growth organization to deliver creative products that help improve content sharing, user-to-user communication, and user retention
  • Ensured the right data was available and developed workflows to supply it
  • Expanded Uber's SEO experimentation framework to understand the impact of content selection and interlinking changes
  • Designed and sourced data instrumentation, optimized ETL pipelines, and implemented data models
  • Collaborated with HDFS, Hive, Presto, and Spark teams to build scalable and reliable solutions that process big data for end-users
  • Contributed to the ETL tooling framework, implementing and extending its functionality in Python
  • Automated and scheduled data pipelines by developing Python scripts that define Airflow DAG objects.

Software Engineer - Big Data

Yahoo
12.2020 - 06.2021
  • Worked on 5G real-time Edge Intelligence Platform (MEC) for Verizon Media
  • Built an end-to-end backend pipeline for training and deploying computer vision models on MEC using MLflow and Seldon Core
  • Designed and built MLOps workflows to standardize and streamline the ML lifecycle in production
  • Designed, built, and optimized data pipelines' containerization and orchestration with Docker and Kubernetes.

Data Engineer

Cisco Systems, Inc
10.2019 - 11.2020
  • Collaborated with the marketing analytics team to understand requirements; designed and developed automation scripts and batch jobs to create ETL pipelines between multiple data sources using PySpark and Hive
  • Used the Spark API to stream data from various sources in Hadoop
  • Developed Spark code in Python and used Spark SQL and DataFrames for aggregation
  • Worked with Sqoop to ingest & retrieve data from Oracle DB
  • Wrote Hive queries to transform data for further downstream processing.

Data Engineer

Walkwater Technologies, Inc
04.2019 - 09.2019
  • Ingested data to create tables in BigQuery and achieved high-level optimization of ingested data
  • Resolved data architecture compatibility issues, including data ingestion and pipeline design
  • Migrated data stores to reliable and scalable Google Cloud Storage
  • Built production-grade data backup and restore and disaster recovery solutions.

Data Scientist

KCT Food Service Inc
02.2016 - 03.2019
  • Designed and ran models with qualitative and quantitative variables; transformed variables and estimated models' parameters in Python Pandas
  • Worked on consumer data to build classification models using Python on consumer attributes and delivered different insights
  • Performed statistical analysis and built machine learning models in Python using supervised and unsupervised learning algorithms and dimensionality reduction techniques
  • Analyzed data from SQL databases to drive optimization and improvement of consumer segmentation and business strategies
  • Used different ensemble models, feature engineering and feature selection techniques to improve models' performance.

Data Analyst

D Skyline
02.2010 - 07.2013
  • Performed data extraction using SQL queries for analysis purposes
  • Performed data entry and data auditing, created data reports, and monitored all data for accuracy.

Education

Master of Science - Predictive Analytics

Northwestern University
Evanston, IL
12.2015

Bachelor of Science - Mathematics and Economics, Minor in Statistics

University of California
Los Angeles, CA
09.2009

Skills

  • Programming Languages: Python, JavaScript, Linux
  • Databases: MongoDB, Redis, MySQL, PostgreSQL, Oracle, Elasticsearch
  • Big Data Technologies & Tools: Hadoop, Spark, Hive, Presto, Airflow, Sqoop, Kafka
  • Data Warehouses: BigQuery, Redshift, Snowflake
  • ML Libraries/Platforms: PyTorch, Scikit-Learn, OpenCV, Pandas, MLflow
  • Cloud Platform: GCP, AWS
  • DevOps: Git, Docker, Kubernetes

Certifications & Training

  • Google Cloud Certified Professional Data Engineer, 08/2019
  • Google Cloud Certified Associate Cloud Engineer, 06/2019
