Sougandhika Tera

Data Engineer

Summary

Data Engineer with 4+ years of experience designing scalable data pipelines, optimizing ETL/ELT workflows, and supporting analytics platforms across manufacturing (Filtrona), pharma (Novo Nordisk), and e-commerce/IoT (ACS). Strong hands-on expertise in SQL, Python, Scala/Java, Spark, AWS, Snowflake, Hadoop, Kafka, and orchestration tools like Airflow and ADF/Fabric. Experienced in Lakehouse architectures, CDC pipelines, real-time streaming, batch processing, and data modeling. Familiar with AI models/tools and application-integrated data flows. Adept at collaborating with business, engineering, and analytics teams to deliver high-quality, reliable data solutions in fast-paced environments.

Overview

4 years of professional experience
3 Certifications

Work History

Data Engineer

Filtrona
12.2024 - Current
  • Designed enterprise-grade ELT pipelines using Fabric Data Pipelines & Lakehouse SQL for operations, quality, and manufacturing analytics.
  • Built star schemas & semantic models for shop-floor insights, downtime analysis, batch tracking & OEE reporting.
  • Implemented dbt-style SQL transformations with incremental logic, SCD handling & validation rules.
  • Optimized query performance using partitioning, clustering & delta optimization.
  • Created Power BI dashboards for production KPIs, waste trends, and equipment efficiency.
  • Added audit logging, pipeline observability & custom error-handling for Fabric workflows.
  • Built scalable ML data pipelines using ADF to automate feature extraction, model deployment triggers, monitoring, and feedback loops, ensuring reliable CI/CD workflows for ML models.
  • Documented data flows, KPIs, data dictionary, lineage, and training materials for business users.
  • Remote, USA. Environment: Microsoft Fabric, Lakehouse, SQL Endpoint, Spark, Delta tables, Power BI, ADF, Azure DevOps, Python.

AWS Data Engineer

Novo Nordisk Pharmaceuticals
NC
12.2022 - 10.2024
  • Developed and maintained large-scale ETL/ELT pipelines extracting clinical & manufacturing datasets from SAP, Oracle, and LIMS systems into AWS and Snowflake.
  • Built PySpark jobs for transforming raw clinical trial data, enforcing validation rules & audit checkpoints.
  • Implemented CDC pipelines using DMS for near-real-time replication of LIMS and manufacturing systems.
  • Designed star/snowflake schemas for analytical marts supporting regulatory, product quality, and supply chain reporting.
  • Used Airflow to orchestrate pipelines with SLA monitoring, retries, email alerts & lineage metadata.
  • Optimized AWS Glue jobs using partitioning, bucketing & predicate pushdown, reducing run time by 40%.
  • Integrated Kafka streaming for ingesting IoT sensor data from manufacturing lines, enabling real-time alerts.
  • Collaborated with data scientists to deliver clean feature datasets for ML models predicting batch deviations.
  • NC, USA. Environment: AWS (Glue, S3, Lambda, Redshift, EMR), Snowflake, Kafka, Airflow, Python, Spark.

Data Engineer

ACS Technologies
India
05.2021 - 12.2021
  • Designed ingestion pipelines to process e-commerce clickstream & cart events using Kafka → Spark → Snowflake.
  • Developed PySpark transformations to enrich behavioral data with customer profiles for personalization workflows.
  • Built ETL pipelines to migrate legacy datasets to Snowflake for faster analytics.
  • Created Power BI dashboards for user journeys (add-to-cart, abandonment, trending products, exit-risk modeling).
  • Automated pipeline deployments using Git, CI/CD, and parameterized ADF pipelines.
  • Improved system reliability through data validation frameworks, error logging & SLA monitoring.
  • India. Environment: AWS, Kafka, Spark, MySQL, Snowflake, Python, ADF.

Education

M.S. - Computer Information Systems

New England College
01.2023

B.Tech - Electronics & Communication Engineering

Guru Nanak Institutions
01.2021

Certification

  • AWS Certified Cloud Practitioner
  • AWS Certified Data Engineer – Associate
  • Microsoft Certified: Azure Fundamentals
  • Introduction to Data Engineering by IBM (Coursera)
  • Machine Learning Specialization (Coursera)

Skills

  • Python
  • Java (Basic)
  • SQL
  • Scala
  • Apache Spark
  • PySpark
  • Hadoop
  • Hive
  • Kafka
  • Airflow
  • EMR
  • Databricks
  • AWS
  • Azure
  • Microsoft Fabric
  • Snowflake
  • Redshift
  • MySQL
  • SQL Server
  • Oracle
  • BigQuery
  • AWS Glue
  • Informatica
  • ADF
  • Fabric Pipelines
  • Star/Snowflake schema
  • Dimensional modeling
  • Semantic modeling
  • Git
  • GitHub Actions
  • Azure DevOps
  • Docker
  • Linux
  • Jenkins
  • Power BI
  • Tableau

Publications

  • Overcoming Early-Stage Adoption Challenges of Microsoft Fabric in Enterprise Environments (https://doi.org/10.63282/3050-9416.IJAIBDCMS-V6I4P108)
  • The AI-Augmented Data Engineer: How LLMs and Copilots are Redefining the Engineering Workflow (https://doi.org/10.63282/3050-9246/ICRTCSIT-116)
  • Enhanced Occupancy Detection System Using Ultrasonic/PIR Sensors: Improving Accuracy and Efficiency in Smart Environments (https://pesjournal.net/paper.php?id=653)