Abhinav Pundir

Morgan Hill, CA

Overview

11 years of professional experience

Work History

Software Engineer 5

Meta
03.2022 - Current
  • Stability Service: Team size - 5. Led, designed, implemented and launched a stability service to streamline issue management within the Ads ML organization. The service is now the primary channel for reporting and triaging production issues (30+ orgs and 150 engineers) that hinder Ads ML iterations built on the Blueprint framework (90% of ads models). It helps document root causes, unblock issues, and develop reliability metrics to accelerate resolution. Tech stack: Async Tier, Chronos, Node APIs, Unidash, Daiquery, Smart Platform, iData.
  • Blueprint: Worked on multiple projects under the Blueprint framework aimed at improving platform reliability. Improved health monitoring applications, which initially had a low success rate, by co-designing and implementing a build-time validation framework that reduced user errors by 5%. Also resolved system revision issues to improve deployment reliability and led initiatives to retire outdated applications, improving resource efficiency and maintaining the integrity of the codebase. Tech stack: Fblearner, ML Hub, Cogwheel, TestX.
  • MLPP: Team size - 3. In the ML Productionization Platform (MLPP), I led observability, improving the platform's transparency by tracking metrics such as adoption rate and throughput. This was achieved through comprehensive logging, event monitoring, and tracing of machine learning artifacts. For monitoring, I developed a framework that oversees critical applications: it tracks latencies, supports reliability and debugging, and manages alerts. I also established an on-call process to further improve the platform's reliability. Tech stack: Wormhole, Tupperware, Scuba, Scribe, Logview, Xdb, Manifold.

Princ Software Dev Engineer

Yahoo Inc
07.2021 - 03.2022

ML Batch Processing Inference Stack

  • Developed a language- and runtime-independent, scalable system to run ML models inside Hadoop containers, powering all ML batch systems for Mail Intelligence, including pipelines that extract data from 20 million emails daily.
  • Collaborated with the Hadoop, Security and BigML teams on prototyping, resolving security issues and performance benchmarking, resulting in a generic, scalable and efficient stack.
  • Tech Stack: Hadoop, Pig, Spark, gRPC, Unix Domain Socket, TF Serving, Protobuf.

Sr Software Dev Engineer

Yahoo! Inc.
10.2018 - 07.2021
  • ML Email Extraction System: Spearheaded the build of an end-to-end, scalable email extraction system for long-tail domains using neural networks, working cross-functionally with data scientists, product and editors to generate ~$7M in revenue QoQ. Built a training pipeline that scales to millions of emails, wrote ML extraction libraries and led engineers in designing sampling, evaluation and model playground systems. Tech Stack: PySpark, TensorFlow on Spark, NER, Java.
  • ML Real Time Inference Stack: Designed and implemented ML inference services powering millions of real-time Yahoo email experiences. Used containerization to provide scalability, security and a simplified deployment process for on-premises K8s. Presented the work at an internal conference, which led to the stack becoming the standard for real-time ML inference across the company. Tech Stack: Kubernetes, TF Serving/Saved Model, Docker, Envoy Proxy, mTLS, Splunk, Grafana.
  • Graph clustering: Team Size-2. Invented a new clustering algorithm for mails that is agnostic to template changes, resulting in a ~5x reduction in the number of clusters for the top domains.
  • GDPR: Team Size-2. Led the team's GDPR effort for grid pipelines. Gathered requirements, identified workflows that required sanity checks or clean-ups, and consolidated workflows to eliminate unnecessary data. Designed and implemented fast (O(1) lookup) and memory-efficient Bloom filters to find mail that is GDPR compliant.

Tech Yahoo Software Dev Eng

Yahoo! Inc.
04.2016 - 09.2018
  • Automated rules: Team Size-3. Designed and implemented automatic creation of rules that extract relevant information from mails, based on statistical machine learning. Co-authored a paper that was accepted at the WWW 2018 conference held in France. Also hold a patent for the same work.
  • Xclusters: Team Size-4. Designed and implemented a new way of clustering unstructured data, which greatly simplified the way rules are maintained by editors. The clusters were further annotated with tags.
  • Designed, owned and maintained multiple grid pipelines that run hourly, daily, weekly and retroactively and are consumed by several teams across Yahoo.

Tech Yahoo Software Dev Eng Interm

Yahoo! Inc.
01.2015 - 03.2016
  • Rule Management Framework: Team Size-3. This forms the backend, which supports serving of rules, content and execution. Tech Stack: JPA, REST, MySQL, HBase, Pig, Jersey-client, Kryo, Guice.
  • Email Extraction: Designed and developed a scalable batch-processing email extraction pipeline that runs hourly and retroactively, processes 4B emails daily, extracts data from 800M emails and is responsible for generating $100M in revenue YoY. Tech Stack: Oozie, Pig UDF, Pig distributed cache, Druid.

Senior Software Engineer

YAHOO SOFTWARE DEVELOPMENT INDIA (P) LTD
07.2013 - 01.2015
  • License Management System (LMS): Team Size-1. I was the technical owner of LMS, a critical part of Yahoo's Content Acquisition-Processing-Serving pipeline. LMS is used to process licenses and enforce license rules for Yahoo's entire content corpus, including articles, videos, images and slide shows. I built an auto re-stamping workflow that brought the SLA down from a week to one day, developed features such as license expiry notification, and designed the REST API.
  • URL Metadata Services (UMS): Team Size-2. UMS operates in real time and returns a spam score for a URL, both synchronously and asynchronously. UMS exposes an HTTP endpoint, which in turn invokes topologies deployed via Storm.
  • Content Asset Tracking (CAT): Team Size-6. CAT is designed for Internet scale, with hundreds of millions of events tracked in it every day. I worked on building the search feature of CAT using the HBase APIs and did performance testing and tuning of CAT's HBase deployment.

Education

M.Tech - Information Technology, Computer Science

International Institute of Information Technology
Bangalore
07.2013

B.Tech - Computer Engineering

College of Technology, GBPUAT
Pantnagar
06.2011

Skills

  • Machine Learning: TensorFlow, TensorFlow Serving, DNN, ML Hub
  • Big Data: Hadoop, Pig, Tez, HBase, Spark, PySpark
  • Programming: Java, Python
  • Real time: Storm, Kubernetes, Jetty
  • Frameworks: Node API, Ent
  • Test Infra: Cogwheel, TestX
  • Messaging & Queueing: Scribe, Wormhole
  • Workflows: Fblearner, Chronos, Oozie
  • Serverless computing: Async Tier
  • Services: Tupperware, Smart Platform
  • Monitoring: ODS, Scuba, OneDetection, Logview
  • Data Analysis & Visualization: iData, Unidash, Daiquery
  • Infrastructure: Xdb, Manifold
