Dilip Rajgor

Redmond, WA

Summary

  • Seasoned Senior Data Engineer with a background in developing, testing, and maintaining data marts and data pipelines. Strong skills in Big Data processing frameworks, data modeling, and data warehousing.
  • Successfully collaborated with other teams to create innovative data solutions that improve system efficiency and business decision-making.
  • Demonstrated impact through enhanced data availability and accuracy.
  • Strong experience designing ETL/ELT solutions that handle petabytes of data using Big Data technologies and frameworks including Spark, Teradata SQL, Hive, Hadoop, Apache Kylin, UNIX shell scripting, Abinitio, Informatica, Tableau, Python, and Scala.
  • Extensive skills in dimensional modeling and in translating business requirements into dimension and fact tables.
  • Extensive experience working with users and functional teams to gather business requirements and functional specifications.
  • Strong experience in the e-commerce, insurance, banking, and financial domains.
  • Positive attitude toward challenges and always eager to learn new things, not limited to technology or computers.

Overview

19 years of professional experience
8 certifications

Work History

Member of Technical Staff

eBay
Seattle
08.2011 - Current

eBay’s Data Services and Solutions organization creates and maintains data products that improve eBay’s business. These products include reports, frameworks, and self-service tools for business users and analysts.

Provided architecture, design, and code for the data products below using technologies including Teradata, Hadoop, HDFS, Hive, Spark, Kylin, Scala, and Java:

  • Integrated Data Layer for Ads: The eBay e-commerce platform generates massive volumes of behavioral data daily. This project produces a single source of truth for all first-party Ads products at eBay.
    o Extracts tracking and behavioral data for first-party ads (Sponsored Listings), organic listings, and offsite seller-funded ads to analyze the performance of various products and features.
    o Every day, the job processes 20 TB of data to produce around 6 TB of output.
  • Brand Insights – Dashboard providing a single source of truth for brand information, with deep analytics for each of 800K brands, sliced at the seller and product levels. Designed and built a robust, low-latency data mart over more than 4.5 billion rows queried across multiple dimensions and metrics. Technologies used: Spark, Scala, Hive, Shell Scripting, Apache Kylin
  • Counterfeit Detection Model – Created a dataset from various sources and scored items with a statistical model to identify whether an item is counterfeit. Technologies used: Spark, Scala, Teradata, Hive, Shell Scripting
  • Coupons – Created a dataset using Spark that reports how many users contacted customer support in each hour of the 24-hour window after coupons were issued through multiple campaigns. Technologies used: Spark, Scala
  • Financial analytics metrics for the Internet Marketing (IM) channels, based on the last click in the last 24 hours and the first click in the first user session in the last 10 days. Technologies used: Teradata SQL, Hive, HDFS
  • Data mart for the Display IM channel – Provides a unified view of Display channel metrics, including goals, forecasts, ad viewability from external vendors such as Next Tuesday and MediaMath, and conversion and engagement metrics. Technologies used: Teradata SQL, Python, Hive, HDFS, Apache Knox, Unix Shell Scripting
  • Paid Search IM channel – Supports running campaigns and provides data points for decisions on keyword spending and bidding for text and PLA ads shown during user searches on search engines such as Google, Yahoo, and Bing. Technologies used: Teradata SQL
  • EPN (eBay Partner Network) data mart for the Affiliate IM channel (e.g., Ebates, Slickdeals) – Provides a unified view of the performance of each publisher (affiliate) and its campaigns, including spend, bonus, reversals, conversion (transactions, revenue, GMB, etc.), and engagement (clicks, impressions, etc.) metrics. Technologies used: Teradata SQL, Hive, HDFS, Apache Kylin, Unix Shell Scripting
  • Guest Checkout – Designed and implemented a solution for handling guest checkout to identify New/Reactivated/Retained users, which plays an important role in computing key eBay metrics. Technologies used: Teradata SQL, Unix Shell Scripting
  • Unified Email Program – Designed and implemented a unified view of metrics for different types of email, such as site email, marketing email (targeted and triggered), and transactional email, across dimensions such as user segment, country, and age group. Technologies used: Teradata SQL, Unix Shell Scripting, Tableau
  • Various user (buyer/seller) segmentations for targeted campaigns and support. Technologies used: Teradata SQL, Unix Shell Scripting
  • 360 Linking – Identified and linked multiple users based on multiple attributes, including data from PayPal and Acxiom, to identify customers; the results are used in user and customer segmentation. Technologies used: Teradata SQL, Unix Shell Scripting

Senior Software Engineer

Microsoft
Redmond
08.2019 - 01.2021
  • As part of Supply Chain Plan BI, our team was responsible for providing data on key measures (Forecast, Demand, Supply, and Actuals) from different sources for Microsoft devices, so that planners and data scientists could do their planning, forecasting, and AI model building.
  • For this, we used Azure HDInsight, Azure Data Factory V1 and V2, Spark SQL, Azure Data Warehouse, SQL Server, Azure Analysis Services, Power BI, ARM templates, and Logic Apps, and implemented CI/CD to streamline change management for our codebase.

Project Lead

Syntel Inc for Allstate Insurance
Chicago
01.2011 - 07.2011
  • This project involved creating a standard layer model for the producer data.
  • This included designing and developing Abinitio graphs for data acquisition from external sources to provide a standard layer for business users.
  • Tools & Technologies used: Abinitio, Oracle, UNIX.

Technology Lead

Infosys
10.2005 - 12.2010
  • Worked with clients including Citibank, Morgan Stanley Smith Barney, and WellPoint at locations including NYC, New Jersey, and India.
  • Net New Asset (Citigroup) - Calculation of total assets for each FA and identification of quality client relationships.
    o Each FA is ranked based on inflows and outflows of assets.
    o The process identifies eligible and ineligible accounts by applying complex rules to calculate net new assets for each FA.
    o Tools & Technologies used: Abinitio, Db2, Autosys, Unix
  • New Money Filter (Citigroup) - Calculation of new money balances for new product offerings that attract new money into the firm.
    o The calculated balances are stored in the data mart along with demographic information after business rules are applied.
    o Tools & Technologies used: Abinitio, Db2, Autosys, Unix shell Scripting
  • MRS (AEGON USA) - Data mart providing financial and marketing reports for AEGON USA.
    o This project involved integrating the IDEA system with the core admin system for data synchronization and validation.
    o The scope of work involved extracting data from different applications and loading it into a data mart.
    o Tools & Technologies used: Informatica, Oracle
  • MyFI Datamart (Citigroup) - myFI accounts are special accounts that form part of the normal accounts at Citigroup.
    o This project involved identifying myFI accounts, integrating their demographic and financial data, and loading them into the data mart for use by myFI users.
    o Tools & Technologies used: Abinitio, Db2, Autosys, Unix shell Scripting
  • WellChoice (WellPoint Insurance) - As part of the WELLCHOICE-EDL project, WellPoint intends to support around 300 data elements through the enterprise data layer for four prime subject areas: Claims, Membership, Revenue, and Capitation.
    o These prime subject areas are supported by the Chartfield, Product, and Provider subject areas.
    o Tools & Technologies used: UNIX, Informatica, DB2 and Teradata

Education

Bachelor of Engineering - Computer Science

University of Madras

Master of Engineering - Computer Science

Anna University

Skills

  • Spark
  • Scala
  • Python
  • NoSQL - Hbase
  • Teradata SQL
  • Apache Kylin
  • Hive
  • UNIX Shell scripting
  • Abinitio
  • Informatica
  • DB2
  • Tableau

Certification

  • Data Science & Machine Learning - University of Washington - Oct 2017 - May 2018
  • Cloudera Certified Specialist in Apache HBase (CCSHB) - License 100-002-617 - Dec 2012
  • Cloudera Certified Hadoop Developer (CCHD) - License 100-002-617 - Dec 2011
  • IBM Certified Database Associate (UDB 8.1 Fundamentals) - Completed in 2008
  • IBM Certified Solution Designer (DB2 Business Intelligence V 8) - Completed in 2008
  • IBM Certified Solution Designer (DB2 Data Warehouse edition v9.1) - Completed in 2008
  • Teradata Basics V2R5 Level 1 - Completed in 2007
  • Informatica 7.1 Certified Developer - Completed in 2006
