VAMSI ANAMANENI

Edison, USA

Summary

Data Engineer specializing in analytics infrastructure for fast-moving AI products. 8+ years building end-to-end data platforms: infrastructure, ETL, modeling, and self-service tooling that scales with product growth. Driven by making data accessible and actionable for non-data stakeholders.

Overview

9 years of professional experience

Work History

Software Engineer III (Data Engineer)

GitHub
Edison, NJ
08.2021 - Current

Customer-Facing Analytics

  • Embedded as data engineering specialist on cross-functional team, contributing across infrastructure, ETL pipelines, and analysis; partnered with data producers (Microsoft IDE teams) to scope and define data contracts
  • Implemented core Scala/Spark ETL pipelines processing data for millions of users of GitHub Copilot products, powering customer-facing analytics dashboards
  • Built geo-distributed data platform across 7 Azure regions with GDPR-compliant data residency

Analytics & Insights

  • Designed a configuration-driven product analytics platform serving 65+ products, enabling self-serve metrics across Product, Sales, and Finance and building executive trust in dashboards
  • Architected custom segments system for 12+ AI products, allowing product teams to self-serve new dimensions without schema changes, accelerating time-to-insight from weeks to hours
  • Led cross-functional GTM initiative partnering with product teams across GitHub to define key metrics, instrument telemetry, and deliver leadership dashboards
  • Drove Feature Store v1 for Product Qualified Leads, reducing ML model training time by 50%; automated sales pipeline end-to-end, cutting Marketing Ops effort from 3 days/month to 1 hour

Platform & Infrastructure

  • Spearheaded Airflow 2.0 migration across all data platform customers, enabling RBAC and improved orchestration; created migration playbooks and documentation
  • Optimized critical database snapshot pipelines, reducing processing time by 47% and eliminating daily failures that caused 24+ hour downstream delays

Data Engineer

BitTorrent
San Francisco, CA
05.2019 - 08.2021
  • Owned end-to-end data pipelines for revenue and product metrics serving 100M+ users across BitTorrent products

Data Engineer

Allstate (D3)
Menlo Park, CA
07.2017 - 05.2019
  • Engineered and optimized MapReduce/Spark pipelines to process over 50 million insurance records daily across datasets with 2,000+ columns

Education

Bachelor of Arts - Mathematics, Minor: Computer Science & Economics

Rutgers University
New Brunswick, NJ
01.2017

Skills

  • Languages: SQL, Python, Scala, KQL
  • Pipelines: Spark, Airflow, dbt, Delta Lake
  • Databases: Kusto/ADX, Trino, Redshift
  • Platforms: Azure, AWS
  • Visualization: ADX Dashboards

Timeline

Software Engineer III (Data Engineer)

GitHub
08.2021 - Current

Data Engineer

BitTorrent
05.2019 - 08.2021

Data Engineer

Allstate (D3)
07.2017 - 05.2019

Bachelor of Arts - Mathematics, Minor: Computer Science & Economics

Rutgers University