Staff Data Engineer at Fastly with strong expertise in Google BigQuery and Python. Optimized data pipelines, achieving a 40% increase in efficiency. Experienced in architecting scalable data solutions and mentoring teams to enhance data integrity and performance. Committed to utilizing advanced technologies for innovative data engineering solutions.
Overview
13 years of professional experience
Work History
Staff Data Engineer
Fastly
08.2025 - Current
Lead the development of Fastly’s data infrastructure, optimizing data pipelines on Google Cloud Platform (GCP), scaling ingestion of complex data sources, and establishing best practices for performance and reliability.
Architect and drive the evolution of data platforms, ensuring scalability, performance, and reliability through data warehousing and optimization using Python, SQL, Airflow, BigQuery, Google Cloud Platform, Cloud Storage, and Pub/Sub.
Designed and deployed AI-driven agents using Python to automate complex data engineering workflows, reducing manual intervention and improving pipeline efficiency.
Built intelligent data pipelines integrating BigQuery and Airflow, enabling dynamic data orchestration and real-time error handling through LLM-based logic.
Provide technical leadership and strategic direction for data engineering initiatives.
Mentor other engineers, fostering technical excellence and professional growth.
Establish and enforce best practices for data architecture, pipeline development, and data engineering deployment processes.
Collaborate closely with data analysts and business analysts to understand data requirements and deliver impactful data solutions.
Manage infrastructure-as-code deployments using Terraform and containerized software applications with Kubernetes.
Continuously evaluate and implement data tools and practices to enhance team productivity and data quality.
Lead Data Engineer
Apree health
San Francisco Bay Area
09.2023 - Current
Successfully migrated 100+ TB of data from on-premise Postgres to Google Cloud Platform, ensuring 100% data integrity.
Designed and developed a data lake on GCP during the migration to store and organize data from various sources, such as flat files, APIs, and replication streams.
Built cloud infrastructure on GCP using Terraform, implementing access management and security with IAM policies provisioned through Terraform.
Led developer team in creating a real-time data pipeline with Python, Pub/Sub, BigQuery, and Airflow, reducing processing time by 40% and enhancing data availability.
Led 2-developer team to create Common File Framework, handling weekly ingestion of ~10K files from SFTP to BigQuery with real-time alerts.
Pioneered a data pipeline post-merger for seamless integration of EMR data from SQL Server to GCP Data Warehouse.
Led the migration of data warehouse from Azure to GCP, integrating real-time EMR processing and saving the organization $1.3M annually.
Streamlined data processing by developing Docker containers, deployed on Google Kubernetes Engine with Jenkins CI/CD.
Built Google Data Fusion and Datastream pipelines to sync and replicate data from application and transactional databases to the data lake.
Lead the clinical data operations team for all data ingestion and analytics pipelines related to EHR data.
Developing a proof of concept for Google Dataplex to implement data quality and data governance.
Lead Data Engineer
Castlight Health
San Francisco Bay Area
03.2018 - 09.2023
Designed and deployed an Enterprise Data Warehouse, processing millions of claims and transactions daily, with optimized ETL in Python and Informatica.
Developed and launched a data pipeline for https://www.vaccines.gov/, facilitating access for more than 20M users during the pandemic.
Streamlined incentive file deliveries by engineering a robust automation framework, ensuring daily dissemination of wellbeing data to more than 100 benefit vendors.
Enabled the internal incentive strategy team, the AI/ML team, and external customers by providing insights into wellbeing expenditure and activities through data analysis and data mining.
Built a proof of concept for a data pipeline using Spark to fetch real-time user data and build data lakes on AWS Redshift.
Worked with top-level management to understand business needs and translate them into actionable operational reports in Tableau, saving 11 hours of manual work each week.
Led the team in developing data quality dashboards for data pipeline processing and reporting across multiple projects.
Migrated the complete data platform of newly acquired wellbeing organization Jiff Inc. to Greenplum and integrated it into the data warehouse.
Senior Development Engineer
Pramati Technologies Private Limited
Hyderabad
11.2014 - 03.2018
Client: Castlight Health
Developed ETL processes in Greenplum, building the EDW for PULSE, improving daily data handling to over a million rows with SCD Type-II integration.
Built aggregated tables for reporting customer engagement and activities on the platform.
Created transformations in Informatica PowerCenter to send data incrementally to Salesforce via SFTP, using parameter files and stored-procedure transformations.
Designed and developed a data migration project that moves data from MySQL to Greenplum on a nightly basis; implemented Python code to automate and monitor the job.
Developed analytical projects that give insight into the organization's savings and user engagement.
Software Engineer
Persistent Systems
Pune Area
10.2012 - 11.2014
Client: Castlight Health
Designed and developed an incentive automation process that delivers files to customers' SFTP servers daily; the ETL and automation for this process were built in Python with MySQL.
Developed QlikView reports for internal and executive-level reporting at Castlight. Created optimized queries to fetch report data, used Google Maps in a geo-specific report, and built drill-down reports with custom charts (such as candlestick charts).
Created MySQL queries for savings- and cost-related projects for Castlight customers and automated them in Python. Worked on critical billing and claims projects, using multithreading and an incremental approach to process millions of database rows in parallel.
Education
B.E. - Information Technology
Pune University
12.2012
Diploma - Information Technology
Government Polytechnic
Jalgaon
12.2009
Skills
Google BigQuery, Pub/Sub, GCS, Google Kubernetes Engine (GKE), Amazon Redshift