PAWAN SHARMA

Data Engineer
Jersey City, NJ

Summary

Data engineer experienced in ETL processes, database performance monitoring, troubleshooting, and database environment optimization. Strong understanding of cloud services, database technologies, and data pipeline optimization to support data-driven decision-making in a dynamic organization. Equally confident working independently or collaboratively, with excellent communication skills.


Overview

17 years of professional experience
1 Certification

Work History

Data Engineer

EXL Services
02.2020 - Current
  • Led the migration of on-premise data infrastructure to GCP, improving data processing speed by 40% and reducing operational costs by 35%.
  • Optimized BigQuery schemas and queries, resulting in a 50% improvement in query performance and cost reduction.
  • Implemented data governance and security best practices, ensuring compliance with GDPR and CCPA regulations.
  • Assisted in the development of ETL processes using Python and SQL, ensuring accurate and timely data loading.
  • Monitored data pipeline performance and resolved data quality issues, maintaining 99.9% data accuracy.
  • Performed data extraction, transformation, and loading using AWS services (S3, Glue, Data Catalog, Redshift, RDS) and Python.
  • The project's primary goal was to design, develop, and maintain a robust data engineering pipeline to collect, process, and deliver data for analytics, reporting, and business intelligence.
  • Enhanced data quality by performing thorough cleaning, validation, and transformation tasks.
  • Led end-to-end implementation of multiple high-impact projects from requirements gathering through deployment and post-launch support stages.
  • Managed cloud-based infrastructure to ensure optimal performance, security, and cost-efficiency of the company's data platform.

Manager

Paytm
04.2013 - 01.2020
  • ETL/Data warehouse: initially built on a small MySQL database handling all transaction types (OLTP and OLAP).
  • Later migrated the data warehouse to a big data platform on the Hadoop and Hive ecosystem.
  • Ingested data from sources including RDBMS, AWS S3, and real-time streams from the mobile app and website.
  • Improved marketing to attract new customers and promote business.
  • Reduced manual processes by automating repetitive tasks using scripting languages and workflow automation tools.
  • Streamlined data operations by consolidating disparate data sources into a centralized database.
  • Mentored junior team members in technical skills and professional development, fostering their growth within the organization.
  • Enhanced data quality by implementing rigorous data validation processes and automated error detection systems.
  • Implemented business intelligence tools to provide actionable insights, driving more informed decision-making.

Senior Data Analyst

One97 Communication
07.2010 - 03.2013
  • Executed subscription services for VAS and USSD.
  • Filtered the customer base against the DND list.
  • Streamlined data analysis workflows for increased efficiency and faster decision-making processes.
  • Improved data accuracy by implementing stringent data validation processes and quality control measures.
  • Used business objects, business intelligence and other reporting tools to extract data from data solutions and data warehouses.
  • Served as a subject matter expert on various projects, sharing valuable insights derived from extensive industry experience as a Senior Data Analyst.
  • Assisted in the development of a centralized data repository to improve accessibility and collaboration among teams.
  • Created dashboards to monitor and track key performance indicators.

Customer Care Executive

EXL Services
02.2007 - 12.2009
  • Managed the customer database in SAP.
  • Reduced customer complaints with proactive issue identification and resolution strategies.
  • Provided excellent customer service by efficiently resolving issues and responding to inquiries.
  • Maintained and managed customer files and databases.
  • Assisted in training new team members to ensure a high level of customer care expertise across the department.

Education

Bachelor of Science - Mathematics, Physics, Chemistry

HNB Garhwal University

Skills

Cloud Platforms: Google Cloud Platform (GCP), AWS

Summary

Over 15 years of experience (5+ years in Data Engineering) in ETL processes, data warehousing, GCP services (IAM, Google Cloud Storage, Cloud Composer (Airflow), Dataproc, Dataflow), AWS services (EC2, Glue, S3, Redshift, Athena, Lambda), Hive, Airflow, Sqoop, Hadoop, SQL-based technologies, Python, and PL/SQL.

Migrated on-premises data warehouses from SAS, Netezza, and Hadoop to GCP.

Strong hands-on experience with cloud technologies (GCP and AWS).

Created 100+ Airflow DAGs using operators such as PythonOperator, PythonVirtualenvOperator, TriggerDagRunOperator, BigQueryInsertJobOperator, DataprocCreateClusterOperator, and DataprocSubmitJobOperator.

Worked extensively with Git, Jenkins, and CI/CD pipelines.

Developed a Python-based ETL engine to ingest data from AWS S3, AWS CLI, and SFTP into AWS Redshift using metadata stored in AWS RDS; it replaced the existing ETL tool and reduced license costs by around $70K/year.

Developed a Python script to check for and fetch the latest available data files from SFTP and store the output in AWS S3 under date-based folders.

Managed Airflow DAG scheduling, monitoring, and error handling to ensure data pipeline reliability and data quality.

Developed Apache Airflow DAGs that combine multiple tasks into one complete end-to-end process/job.

Automated Tableau data sources and workbooks using the Python TSC module, significantly reducing manual effort.

Developed a highly optimized Hive-based fact/dimensional data model using partitioning and bucketing with fast-processing file formats (Parquet, ORC, etc.).

Developed an interactive, parameterized business-level metrics dashboard using Tableau, MSSQL, and Azure Databricks for a US healthcare RCM project.

Performed user segmentation and fraud analytics using business models such as RFM and cohort analysis.

Certification

Google Cloud Associate

Timeline

Google Cloud Associate

09-2023

Manager

Paytm
04.2013 - 01.2020

Senior Data Analyst

One97 Communication
07.2010 - 03.2013

Customer Care Executive

EXL Services
02.2007 - 12.2009

Data Engineer

EXL Services
02.2020 - Current

Bachelor of Science - Mathematics, Physics, Chemistry

HNB Garhwal University