Highly skilled and motivated Data Engineer with a robust track record in building, scaling, and optimizing large-scale data pipelines and distributed systems at Cisco Talos, contributing to the Threat Analytics Platform (TAP) Core Development team.
Proficient in PySpark, Databricks, Go, and Python, with extensive experience in cloud technologies including AWS, Terraform, and Azure, as well as expertise in modern data lake and Delta Lake architectures.
Demonstrated success in designing and deploying comprehensive end-to-end data workflows that encompass prevalence aggregation, retention policies, SCD2 modeling, and event-driven ingestion pipelines. Adept at developing resilient ETL/ELT pipelines and automating monitoring to improve reliability for multi-terabyte datasets.
Overview
6 years of professional experience
1 Certification
Work History
Cloud Engineer (Talos: TAP Core Dev Team)
Cisco Systems, Inc.
10.2023 - Current
Built and maintained multi-terabyte ETL pipelines for threat datasets.
Designed and orchestrated first- and second-level prevalence aggregation pipelines in Databricks.
Developed retention frameworks and job workflows in Delta Lake, reducing storage costs and improving performance.
Ran data-validation initiatives using SQL notebooks to compare DEV and PROD results across observables.
Created a Go-based CLI tool to re-drive or re-run failed Step Functions executions; it proved invaluable during a major customer outage in which ingest data was missing (see the redrive sketch after this list).
Engineered and deployed the health-check system in Go.
Implemented a storage-efficient SCD2 pipeline for a dataset, reducing data redundancy by more than 70% (see the Delta MERGE sketch after this list).
Built Managed Delta tables for near-real-time API ingestion and long-term historical tracking, integrated with Databricks, S3, and ClickHouse.
Contributed to AI/LLM initiatives, including an internal TAP chatbot.
Provided on-call support for pipeline incidents, including triage, repair, RCA, and documentation.
Deployed Spark jobs, Step Functions, Lambdas, and IAM policies via Terraform across multi-region AWS.
Built a long-running Databricks job detection system for anomaly alerting (see the runs-list sketch after this list).
Contributed to TAP-wide Go packages, focusing on reusable libraries for AWS service integration.
Updated the Go and Alpine base CI images and upgraded Go and Terraform to their latest versions for the SOC 2 audit.
Updated TAP team repositories for compatibility and consistency with the latest Terraform and Go versions.
Improved system reliability by replacing static cron schedules with dependency-driven triggers in Databricks, preventing rollups on incomplete datasets (see the dependency-check sketch below).
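The following is a minimal Python/boto3 sketch of the redrive idea behind the Go CLI described above; the production tool was written in Go, the state machine ARN is a placeholder, and the fresh-execution fallback is an illustrative assumption. It also assumes a boto3 release that includes the Step Functions RedriveExecution API.

    # Sketch only: re-drive failed Step Functions executions, or re-run them from scratch.
    import boto3

    sfn = boto3.client("stepfunctions")
    STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:example"  # placeholder

    def redrive_failed_executions(state_machine_arn: str) -> None:
        """List FAILED executions and redrive each one from its point of failure."""
        paginator = sfn.get_paginator("list_executions")
        for page in paginator.paginate(stateMachineArn=state_machine_arn, statusFilter="FAILED"):
            for execution in page["executions"]:
                arn = execution["executionArn"]
                try:
                    sfn.redrive_execution(executionArn=arn)
                    print(f"re-drove {arn}")
                except sfn.exceptions.ExecutionNotRedrivable:
                    # Fallback assumption: start a fresh execution with the original input.
                    detail = sfn.describe_execution(executionArn=arn)
                    sfn.start_execution(stateMachineArn=state_machine_arn, input=detail["input"])
                    print(f"re-ran {arn} as a new execution")

    if __name__ == "__main__":
        redrive_failed_executions(STATE_MACHINE_ARN)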
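A minimal PySpark and Delta Lake sketch of the SCD2 pattern referenced above; the table names (staging_observables, dim_observables) and columns (observable_id, attrs, event_ts, valid_from, valid_to, is_current) are illustrative assumptions, not the production schema.

    # Sketch only: SCD2 upsert against a Delta table with assumed names.
    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    updates = spark.table("staging_observables")            # assumed staging source
    target = DeltaTable.forName(spark, "dim_observables")   # assumed SCD2 target

    # Step 1: close out current rows whose tracked attributes changed.
    (target.alias("t")
        .merge(updates.alias("s"), "t.observable_id = s.observable_id AND t.is_current = true")
        .whenMatchedUpdate(
            condition="t.attrs <> s.attrs",
            set={"is_current": "false", "valid_to": "s.event_ts"})
        .execute())

    # Step 2: append a new current row for every changed or brand-new key.
    still_current = spark.table("dim_observables").where("is_current = true")
    new_rows = (updates
        .join(still_current.select("observable_id"), "observable_id", "left_anti")
        .withColumn("valid_from", F.col("event_ts"))
        .withColumn("valid_to", F.lit(None).cast("timestamp"))
        .withColumn("is_current", F.lit(True))
        .select("observable_id", "attrs", "valid_from", "valid_to", "is_current"))
    new_rows.write.format("delta").mode("append").saveAsTable("dim_observables")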
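A simplified Python sketch of the long-running job detection idea above, using the Databricks Jobs API 2.1 runs/list endpoint; the environment variables and the six-hour threshold are assumptions, and the real system raises alerts rather than printing.

    # Sketch only: flag Databricks job runs active longer than a threshold.
    import os
    import time
    import requests

    HOST = os.environ["DATABRICKS_HOST"]     # e.g. https://<workspace>.cloud.databricks.com
    TOKEN = os.environ["DATABRICKS_TOKEN"]   # personal access token
    THRESHOLD_MS = 6 * 60 * 60 * 1000        # illustrative six-hour threshold

    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"active_only": "true"},
        timeout=30,
    )
    resp.raise_for_status()

    now_ms = int(time.time() * 1000)
    for run in resp.json().get("runs", []):
        if now_ms - run["start_time"] > THRESHOLD_MS:
            # In the real system this would feed an alert; here we just print.
            print(f"Long-running run {run['run_id']}: {run.get('run_name', 'unnamed')}")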
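A simplified PySpark sketch of the dependency-driven trigger pattern from the last bullet: the rollup runs only when the upstream partition for the target date looks complete, instead of firing on a fixed cron schedule. The table names, columns, and row-count threshold are illustrative assumptions.

    # Sketch only: gate a daily prevalence rollup on upstream completeness.
    import sys
    import datetime as dt
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    TARGET_DATE = str(dt.date.today() - dt.timedelta(days=1))
    MIN_EXPECTED_ROWS = 1_000_000            # illustrative completeness threshold

    upstream = spark.table("ingest_events").where(F.col("event_date") == TARGET_DATE)
    if upstream.count() < MIN_EXPECTED_ROWS:
        # Upstream ingest is not complete yet: skip instead of rolling up partial data.
        print(f"Upstream incomplete for {TARGET_DATE}; skipping rollup")
        sys.exit(0)

    # Dependency satisfied: compute the first-level prevalence rollup for the day.
    (upstream
        .groupBy("observable_id")
        .agg(F.countDistinct("endpoint_id").alias("prevalence"))
        .withColumn("event_date", F.lit(TARGET_DATE).cast("date"))
        .write.format("delta").mode("overwrite")
        .option("replaceWhere", f"event_date = '{TARGET_DATE}'")
        .saveAsTable("prevalence_daily"))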
Data Scientist - Full-Time CPT
Syntactech
01.2023 - 05.2023
Worked on marketing analytics initiatives using statistical modeling, ML techniques, and time-series forecasting.
Delivered actionable insights via automated dashboards, KPI tracking, A/B testing, and campaign analysis to support data-driven decisions.
Built forecasting models that enabled accurate sales planning and strategic goal setting.
Analyzed customer behavior to assess product impact, reduce churn, and enhance engagement strategies.
Developed strategies to optimize client channel placement and improve commercial account performance.
Built scalable revenue prediction models, helping drive long-term business planning.
Data Engineer - Part-Time On-Campus Role
GEP Worldwide
09.2022 - 12.2022
Automated data ingestion and parsing guide generation using Azure Data Factory and Databricks, enabling same-day client onboarding (down from 4 days).
Built end-to-end monitoring and error logging system with Azure Log Analytics and Power BI for real-time visibility.
Improved data processing efficiency through optimized ingestion logic using Databricks and Apache Kafka.
Revamped ETL with an automated framework, increasing data accuracy and reducing processing time.
Delivered data cube reports via Azure Data Factory and triggered Spark jobs within ADF pipelines for scalable processing.
Technology Consultant/Data Engineer
PricewaterhouseCoopers SDC
08.2019 - 08.2021
Worked on end-to-end development of real-time and batch data pipelines using Snowflake, Spark, Azure, and AWS, supporting user analytics, content recommendations, and enterprise reporting at scale.
Automated data ingestion, ETL frameworks, and monitoring systems across Azure Data Factory, Logic Apps, and AWS services, reducing processing time, improving data accuracy, and cutting operational costs.
Migrated critical pipelines from third-party tools (e.g., Informatica to native AWS) and built reusable frameworks for Salesforce, HR, and media data feeds, enabling secure, scalable, and cost-efficient ingestion.
Designed internal tools for query optimization, job monitoring, and real-time analytics using Hive, Elasticsearch, Cassandra, Django, and QuickSight, improving performance, data governance, and developer productivity.
Education
Master of Science - Data Science Computational Track