· Data engineer experienced across diverse domains including telecom, healthcare, and insurance; led complex end-to-end ETL pipeline development, data orchestration, and large-scale data migration projects, leveraging AWS-native services such as Glue, S3, Redshift, and Lambda for secure, scalable data processing.
· Adept at building interactive workflow UIs within Foundry to manage data versions and real-time analytics, enabling end users to perform complex transformations and writeback operations directly in the application layer.
· Experienced in designing and deploying data pipelines using PySpark and Python, building high-scale ETL workflows aligned with the Foundry ontology to support analytics and operational systems (see the transform sketch at the end of this summary).
· Expert in implementing data visualizations leveraging Foundry Contour and Power BI, translating complex datasets into clear business intelligence dashboards that support strategic decisions.
· Skilled at collaborating with data engineering teams to integrate ontology-driven data models with application logic, ensuring data consistency, lineage tracking, and governance compliance.
· Strong experience with Foundry Expectations and Health Checks, performing automated validations to catch anomalies early and mitigate data staleness or quality issues before they impact business KPIs.
· Highly proficient in PyTest, developing unit, integration, and end-to-end tests to maintain reliability in production systems and align with TDD practices.
· Delivered optimized backend APIs that support high concurrency, leveraging microservice principles, caching strategies, and distributed systems design.
· Applied Agile methodologies to manage the full SDLC, from gathering requirements with stakeholders to delivering iterative features with continuous integration pipelines via Jenkins and Git.
· Built end-to-end data pipelines using Palantir Foundry’s Code Workbooks and Spark integrations, leveraging the Ontology to define semantic layers over operational datasets and collaborating with business users through clean data interfaces for self-service analytics.
· Contributed to cloud-native modernization by implementing serverless architectures for data microservices using AWS Lambda and API Gateway, significantly reducing cost and complexity while enhancing scalability.
· As a certified Palantir Foundry user, I have developed and deployed modular data pipelines and transformation logic using Code Workbooks, Contour, and Ontology objects, while maintaining rigorous lineage, access control, and governance frameworks.
· Passionate about continuous learning and staying updated on emerging technologies and industry best practices to deliver innovative solutions. Skilled in managing cross-functional teams and making strategic decisions that drive both short- and long-term business success.
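A minimal sketch of the Foundry transforms-python pattern referenced above; the dataset paths, column names, and cleansing logic are illustrative assumptions, not production code:

```python
from pyspark.sql import functions as F
from transforms.api import transform_df, Input, Output


@transform_df(
    Output("/Telecom/clean/call_records"),      # hypothetical dataset paths
    raw_calls=Input("/Telecom/raw/call_records"),
)
def clean_call_records(raw_calls):
    """Cleanse raw call records and align them with the ontology schema."""
    return (
        raw_calls
        .dropDuplicates(["call_id"])                              # assumed key column
        .withColumn("call_start", F.to_timestamp("call_start"))
        .withColumn("duration_sec", F.col("duration_sec").cast("long"))
        .filter(F.col("duration_sec") >= 0)                       # drop malformed rows
    )
```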
· Spearheaded the design and development of high-throughput Spark jobs for real-time telecom data ingestion and transformation, processing billions of events daily in a microservices-driven architecture on AWS.
· Led the migration of legacy BTEQ workloads from Teradata to modern Spark/Glue-based pipelines on AWS, leveraging S3, Glue Jobs, and Redshift Spectrum for unified data lake and warehousing solutions.
· Designed ontology-driven data models for telecom datasets, enabling quick integration into Foundry analytics pipelines.
· Developed interactive workflow UIs within Foundry Workshop, allowing business teams to manipulate large-scale datasets and visualize trends.
· Built PySpark ETL pipelines to ingest, cleanse, and transform high-volume call data, integrating with Foundry Quiver for analytics.
· Implemented Foundry writeback functionality to synchronize processed datasets with upstream billing systems.
· Created Power BI dashboards for network performance KPIs, leveraging Foundry Contour visualizations for real-time data feeds.
· Collaborated with AI teams to embed large language model (LLM) based natural language processing features into customer support tools.
· Used PyTest and CI/CD pipelines for automated validation of ETL pipelines, reducing production incidents by 30% (see the test sketch at the end of this role).
· Ensured end-to-end data integrity and lineage tracking by integrating AWS Glue Data Catalog and schema validation at ingestion.
· Defined scalable DynamoDB designs for storing semi-structured call log metadata, ensuring fast, indexed access for downstream services (see the table-design sketch at the end of this role).
· Designed Java/Python-based APIs for secure, high-concurrency telecom data services, integrated with Foundry writeback pipelines and downstream billing systems.
· Built React/TypeScript dashboards integrated with Palantir Quiver/Contour to visualize call volumes, dropped-call KPIs, and customer churn insights in near real time.
· Designed secure data APIs with token-based authentication and integrated them with internal business dashboards to deliver on-demand insights.
· Leveraged Redshift and Snowflake for analytical workloads and built automated data exports into Tableau and Power BI using scheduled Lambda functions.
· Conducted POCs for adopting container-based ML model deployment workflows via SageMaker endpoints and integrated them into production-grade telecom analysis pipelines.
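A minimal sketch of the PyTest validation approach used in this role: a local SparkSession fixture exercising one transformation step so the suite can run in CI without a cluster. The function and column names are hypothetical:

```python
import pytest
from pyspark.sql import SparkSession


def dedupe_calls(df):
    """Transformation under test (illustrative)."""
    return df.dropDuplicates(["call_id"])


@pytest.fixture(scope="session")
def spark():
    # Local SparkSession keeps the test self-contained for CI runners.
    return SparkSession.builder.master("local[2]").appName("etl-tests").getOrCreate()


def test_dedupe_calls_removes_duplicates(spark):
    df = spark.createDataFrame(
        [("c1", 120), ("c1", 120), ("c2", 30)],
        ["call_id", "duration_sec"],
    )
    assert dedupe_calls(df).count() == 2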
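A sketch of the DynamoDB table-design pattern described above, assuming a subscriber/timestamp composite key with a tower-based secondary index for alternate lookups; table and attribute names are illustrative:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Partition on subscriber, sort by call timestamp; a GSI supports
# cell-tower lookups. On-demand billing avoids capacity planning.
dynamodb.create_table(
    TableName="call_log_metadata",
    AttributeDefinitions=[
        {"AttributeName": "subscriber_id", "AttributeType": "S"},
        {"AttributeName": "call_ts", "AttributeType": "S"},
        {"AttributeName": "tower_id", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "subscriber_id", "KeyType": "HASH"},
        {"AttributeName": "call_ts", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "tower-index",
            "KeySchema": [
                {"AttributeName": "tower_id", "KeyType": "HASH"},
                {"AttributeName": "call_ts", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    BillingMode="PAY_PER_REQUEST",
)
```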
· Engineered scalable, HIPAA-compliant ETL pipelines in PySpark for claims processing, transforming structured/unstructured healthcare data from on-prem to AWS S3 and Glue.
· Integrated TensorFlow-based patient risk scoring models into batch pipelines that auto-triggered based on data readiness, providing actionable insights for case managers.
· Architected near real-time streaming pipelines using Kafka and Spark Structured Streaming to ingest high-frequency patient vitals from external IoT devices (see the streaming sketch at the end of this role).
· Designed secure ontology mappings to integrate patient records across disparate data systems in compliance with HIPAA.
· Built Foundry Quiver dashboards to monitor claims approval SLAs in near real time.
· Automated ingestion of clinical data from multiple sources, applying cleansing logic in PySpark pipelines.
· Integrated AI-powered analytics for fraud detection, embedding ML models into Foundry workflows.
· Developed writeback capabilities in Foundry to update claim status directly in operational systems.
· Built PyTest suites to validate data accuracy for every ETL stage.
· Developed AIP-powered fraud detection workflows, embedding ML models into PySpark pipelines and surfacing outcomes through Foundry APIs and dashboards.
· Applied Foundry Health Checks to proactively flag anomalies in patient record feeds.
· Worked with frontend teams to improve data visualization usability for healthcare administrators.
· Participated in MLOps initiatives for version control and A/B testing of ML models using MLflow integrated with Foundry and SageMaker pipelines.
· Worked extensively on Palantir Foundry, designing Code Workbooks to transform, filter, and aggregate patient, provider, and claims data; leveraged Ontology to link raw sources with business-contextual layers.
· Used Palantir Contour to develop governed data products with lineage tracking and automated documentation, supporting both audit compliance and reproducibility of analytics pipelines.
· Integrated MLflow-managed models into Foundry pipelines and set up batch inference jobs that triggered on new data availability, with outputs published directly to Foundry APIs and dashboards (see the scoring sketch at the end of this role).
· Led an initiative to implement schema validation and automated quality checks using the Great Expectations framework for claims processing data.
· Integrated Redshift-based analytical reporting with Looker dashboards for executive reporting, with automated refresh pipelines managed through Airflow.
· Collaborated with multiple stakeholders including business analysts, data scientists, and compliance teams to ensure data workflows aligned with federal health data guidelines.
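A minimal Structured Streaming sketch of the vitals-ingestion pattern described above; the broker address, topic name, payload schema, and output paths are placeholders:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("vitals-stream").getOrCreate()

# Assumed schema for the JSON payload emitted by the IoT gateway.
vitals_schema = StructType([
    StructField("patient_id", StringType()),
    StructField("heart_rate", DoubleType()),
    StructField("recorded_at", TimestampType()),
])

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "patient-vitals")              # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), vitals_schema).alias("v"))
    .select("v.*")
)

# Micro-batch writes to a landing zone for downstream pipelines.
query = (
    stream.writeStream.format("parquet")
    .option("path", "s3://lake/vitals/")                # placeholder paths
    .option("checkpointLocation", "s3://lake/_chk/vitals/")
    .trigger(processingTime="30 seconds")
    .start()
)
```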
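A sketch of the MLflow batch-scoring step referenced above, assuming a registry entry named claims_fraud_model; in the actual pipelines the job was triggered by data-availability checks rather than run ad hoc:

```python
import mlflow.pyfunc
import pandas as pd

# Load the registered model by stage; model name and stage are illustrative.
model = mlflow.pyfunc.load_model("models:/claims_fraud_model/Production")


def score_batch(claims_df: pd.DataFrame) -> pd.DataFrame:
    """Attach fraud scores to a batch of newly arrived claims."""
    scored = claims_df.copy()
    scored["fraud_score"] = model.predict(claims_df)
    return scored
```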
· Took ownership of designing enterprise-level data models, ensuring alignment with business functions such as customer churn, network performance, and call data record (CDR) analytics.
· Architected and implemented robust Azure Synapse pipelines to ingest, process, and transform multi-source telecom datasets including logs, CRM records, and real-time call metadata.
· Orchestrated data movement from Snowflake and on-prem Oracle systems into Azure using hybrid cloud strategies and Azure ExpressRoute for secure, high-speed transfers.
· Designed governance workflows using Azure Purview and Unity Catalog to document metadata, access policies, and ensure regulatory compliance across subscriber data.
· Collaborated with the security and risk team to integrate Azure RBAC and Key Vault, implementing robust data masking policies for sensitive customer attributes.
· Delivered React/TypeScript-powered analytics dashboards for end users, integrating directly with Foundry Contour and Azure Synapse reporting layers.
· Designed CI/CD pipelines with Docker, Kubernetes, and GitLab to automate deployments of ontology changes, Spark pipelines, and dashboard configurations.
· Developed semantic models in Power BI and Tableau, used for self-service reporting on call volumes, latency patterns, and mobile data consumption.
· Conducted intensive training workshops for data stewards and engineers to ramp up on Palantir's Code Workbooks and Object Explorer, helping establish platform-wide best practices.
· Utilized GitLab CI/CD pipelines for versioned deployments of ontology changes, data transformations, and UI configuration artifacts in Foundry.
· Modeled high-volume data for 5G rollout projects by integrating Kafka Streams into Azure Event Hubs, ensuring seamless streaming and batch sync with downstream systems.
· Pioneered the cloud data mesh design by leveraging Azure’s hub-spoke VNet topology, creating domain-specific data products for customer engagement, billing, and service quality.
· Developed a custom data lifecycle management solution using Azure Data Lake Storage Gen2 and Azure Functions, automating data tiering and archival processes and reducing long-term storage costs by 35%.
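A minimal sketch of the tiering logic behind this solution using the azure-storage-blob SDK; the connection string, container name, and 180-day threshold are illustrative, and in the actual solution this logic ran inside a scheduled Azure Function:

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import BlobServiceClient

ARCHIVE_AFTER = timedelta(days=180)  # assumed retention threshold

service = BlobServiceClient.from_connection_string("<connection-string>")  # placeholder
container = service.get_container_client("raw-telemetry")                  # placeholder

cutoff = datetime.now(timezone.utc) - ARCHIVE_AFTER
for blob in container.list_blobs():
    if blob.last_modified < cutoff:
        # Move cold blobs to the Archive tier to cut long-term storage costs.
        container.get_blob_client(blob.name).set_standard_blob_tier("Archive")
```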
· Microsoft Certified Fabric Analytics Engineer Associate (DP-600)
· Microsoft Certified Azure Data Engineer
· Implementing a Data Warehouse in Microsoft Fabric
· Palantir Foundry & AIP Builder
· Databricks Fundamentals
· Implementing a Data Lakehouse in Microsoft Fabric