
SOPHIA LIAN

New York, NY

Summary

Data engineer skilled in cloud computing, process automation and scaling, and big data management. Demonstrated success in distributed computing, data architecture, and collaboration with cross-functional teams and senior decision-makers to build high-impact solutions aligned with business needs.

Overview

10 years of professional experience
1 certification

Work History

Data Engineer

DISH MEDIA SALES
12.2021 - Current


  • Build continuous integration/continuous deployment (CI/CD) pipelines that promote jobs from development to test to production (GitLab and Databricks)
  • Run ETL/ELT jobs (using technologies such as dbt, AWS Glue, Azure Data Factory, and Google Cloud Dataflow) to ingest, transform, and load data into data warehouses and data lakes
  • Orchestrate complex workflows using services such as AWS Step Functions and Airflow
  • Build and deploy environments using container technologies (Docker)
  • Work across tech stacks, including platforms such as Snowflake and Databricks
  • Architect streaming and batch pipelines from proof of concept to production
  • Design, test, and refactor jobs using Python, R, SQL, and PySpark
  • Work across cloud ecosystems (AWS, GCP, and Azure)
  • Define, document, and maintain the architecture of the data ecosystem, including data models, data flows, and data governance policies
  • Work with distributed computing platforms (AWS, Google Cloud Platform (GCP), and Oracle Cloud Infrastructure)

Data Architect Fellow

NYC DATA SCIENCE ACADEMY
07.2020 - 12.2021
  • Deployed applications using AWS Elastic Beanstalk
  • Built pipelines and orchestrated workflows using distributed platforms (AWS and GCP)
  • Used regression, clustering, and tree-based models to predict likelihood of churn and improve retention
  • Used modeling to predict likelihood of default


Energy Security Consultant

COUNCIL ON FOREIGN RELATIONS (CFR)
09.2018 - 07.2020
  • Analyzed economic impact of energy industry digitalization through modeling and data visualization

Energy Industry Analyst

FEDERAL ENERGY REGULATORY COMMISSION (FERC)
07.2016 - 09.2018
  • Investigated cases of potential market manipulation through A/B testing

Financial Compliance Analyst

INSTITUTIONAL SHAREHOLDER SERVICES (ISS)
01.2016 - 07.2016
  • Rated companies' environmental, social, and governance (ESG) profiles

Education

Master of Science - Computer Science

COLUMBIA UNIVERSITY
12.2014

Bachelor of Science - Computer Science

UNIVERSITY OF FLORIDA
05.2008

Skills

  • AWS / GCP / Azure
  • Docker / Kubernetes
  • Databricks
  • Snowflake (OLAP databases)
  • MongoDB (NoSQL databases)
  • PySpark
  • Streaming data pipelines (Kafka)
  • Batch data pipelines
  • R
  • Python
  • SQL
  • dbt
  • Orchestration technologies (Airflow, AWS Step Functions)
  • Hadoop
  • Spark
  • CI/CD
  • Big data
  • A/B testing
  • Data engineering
  • Data mining
  • REST API
  • Feature engineering
  • Google Cloud
  • AWS solutions
  • Digital advertising
  • Graph database

Certification

  • Algomap.io - Data Structures and Algorithms
  • AWS Certified Developer - Associate (summer 2025)

