Over three years of hands-on experience with data warehouse system analysis, design, development, and support in Microsoft SQL Server
Adept at writing complex T-SQL stored procedures, functions, and views that roll up large datasets efficiently for operational reporting
Highly analytical team player with an aptitude for prioritizing needs. Creative troubleshooter who enjoys challenges.
Experienced with Azure relational databases, including Azure Database for MySQL.
Quality-driven and hardworking with great communication and project management skills.
Dedicated big data industry professional with a history of meeting company goals utilizing consistent and organized practices.
Strong team collaborator, adept at working with engineering and product teams to solve data defects and improve processes. Committed to Agile practices, ensuring adaptive and efficient project management.
Proven ability to manage complex data sets, create materialized views, and conduct data modeling for fraud analysis.
Skilled in advanced data pipeline development, orchestration, and optimization with Databricks. Specialized in migrating legacy pipelines, enhancing data accuracy, and performing rigorous data validation.
Overview
9 years of professional experience
Work History
Senior Data Analyst
Capital One
07.2022 - Current
Use PySpark and SQL in Databricks to perform data validation and data orchestration, such as joining and unioning tables, creating CTEs, and running minus operations against target tables
Monitor multiple batch data pipelines in Databricks and coordinate quickly with the team when defects are found
Create automation notebooks with Databricks triggers to check the team’s bulk actions, and send well-formatted daily HTML email reports from Databricks
Convert and refactor legacy data pipelines and run performance tests in Databricks
Communicate work and reports to the upstream product team and resolve defects constructively with engineers and the product team over Slack or Zoom
Work strictly in an Agile environment, attending daily stand-ups, sprint planning, reviews, backlog refinement, and retrospectives.
Migrated legacy data pipelines and optimized their performance and reliability in a Databricks simulation environment, directly supporting two technical teams with different pipeline APIs.
Backfilled historical data, incorporating deduplication logic and extensive testing.
Developed and maintained multiple materialized views to improve data performance and accessibility, using stored procedures and tasks for regular updates.
Contributed to data modeling initiatives, focusing on ERD development for Fraud & Dispute data to improve data structure and analysis
Data Analyst and Operation Assistant
Mint Servicing
05.2017 - 10.2020
Implemented and tested web-based software products covering the whole loan management process, from marketing and underwriting to loan origination and payment tracking
Interfaced with stakeholders, including product and production teams, to understand their requirements, and provided engineering and analytical assistance
Identified and analyzed complex T-SQL stored procedures, and created the necessary indexes and SQL hints to improve query performance in both production and development/QA databases
Optimized query performance by modifying T-SQL queries, removing unnecessary columns, eliminating redundant and inconsistent data, normalizing tables, establishing joins, and creating indexes where necessary
Built and maintained the company’s data pipelines and architecture to extract, transform, and load data from various sources into the analytics DB for essential business data analysis
Worked closely with the Engineering team to prepare modeling data and to implement and validate risk and conversion models
Implemented automation scripts and tools for customer retention, collection, and promotion campaigns, integrated with third-party APIs for email and voicemail distribution
Attended team meetings for database analysis and design, as well as data mapping for a conversion from a legacy flat-file database to a relational database
Maintained and enhanced existing Data Warehouse, exports, and reports; provided business users with the data required for metrics and analysis
Data Analyst
Mint Quantum Beijing
05.2015 - 08.2016
Leveraged text, charts, and graphs to communicate findings in an understandable format
Analyzed large amounts of data using T-SQL to identify trends, patterns, signals, and hidden stories within the data
Assessed large datasets in SSMS, drew valid inferences, and prepared insights in narrative or visual form
Identified, reviewed, and evaluated data management metrics to recommend ways to strengthen data across enterprises using SAS scripts
Aggregated and cleaned data on thousands of customers' credit attributes using Python and T-SQL
Performed missing value imputation using the population median, checked population distributions of numerical and categorical variables to screen for outliers, and ensured data quality via Python scripts
Built a logistic regression model in Python to predict the probability of default, using stepwise selection to choose model variables
Led recruitment and development of strategic alliances to maximize utilization of existing talent and capabilities
Education
Master of Science - Business Analytics
Rensselaer Polytechnic Institute
12.2021
Bachelor of Arts - Economics, Statistics
Rutgers University
05.2019
Skills
Programming and query languages: T-SQL, Python, PySpark, R, SAS
Tools: Databricks, Snowflake SQL, Microsoft SQL Server, MySQL, Microsoft Visual Studio, Jupyter Notebook