Gaeyanmayee Meghana Janarajupalli

Summary

Data engineering professional poised to add significant value through comprehensive experience in developing scalable data solutions. Noted for strong team collaboration and adaptability in fast-paced environments. Reliable in driving results with key skills in data modeling, ETL processes, and cloud-based data platforms.

Senior engineering professional with deep expertise in data architecture, pipeline development, and big data technologies. Proven track record in optimizing data workflows, enhancing system efficiency, and driving business intelligence initiatives. Strong collaborator, adaptable to evolving project demands, with a focus on delivering impactful results through teamwork and innovation. Skilled in SQL, Python, Spark, and cloud platforms, with a strategic approach to data management and problem-solving.

Overview

13 years of professional experience

Work History

Senior Data Engineer

JPMorgan Chase
New York, NY
11.2024 - Current
  • Designed and developed scalable data pipelines using AWS Glue, Step Functions, and Lambda to efficiently process high-volume financial data across structured and unstructured sources.
  • Engineered data lake architecture on Amazon S3 with strong partitioning and versioning strategies to support cost-effective long-term data retention and auditability.
  • Implemented advanced ETL/ELT workflows using Apache Airflow and AWS Glue, integrating data from multiple banking systems into centralized repositories.
  • Developed and optimized complex SQL queries and stored procedures in Snowflake for robust data transformation and business intelligence reporting.
  • Spearheaded the migration of legacy data systems to modern cloud-native solutions using AWS Redshift, Snowflake, and serverless computing for improved performance and scalability.
  • Built real-time data ingestion frameworks with Kinesis Data Streams and Kafka, enabling near-instant fraud detection and transaction monitoring systems.
  • Integrated dbt (data build tool) for transformation as code, ensuring modular and testable pipeline logic aligned with modern data engineering best practices.
  • Designed secure and compliant data models aligning with PCI-DSS and SOC2 requirements, ensuring high levels of data governance for financial applications.
  • Developed robust CI/CD pipelines using Terraform and AWS CodePipeline, automating data infrastructure deployment and pipeline updates with strong version control.
  • Collaborated with data scientists and analytics teams to deploy feature stores and prepare high-quality datasets for advanced machine learning models in production.
  • Utilized AWS Athena and QuickSight to deliver serverless analytical insights with dashboarding for real-time decision-making in risk and claims management.
  • Enforced data quality through Great Expectations framework and custom validation layers to ensure high fidelity of critical financial datasets.
  • Conducted thorough performance tuning on Snowflake including clustering keys, result caching, and warehouse scaling for high-throughput analytical queries.
  • Established data access controls using AWS Lake Formation and IAM policies, ensuring secure and auditable access to sensitive financial data.
  • Automated data cataloging and lineage tracking using AWS Glue Data Catalog and integrated with Apache Atlas for enterprise-wide metadata management.
  • Championed the use of Delta Lake on AWS EMR for managing large-scale transactional data, providing ACID compliance and efficient upserts.
  • Mentored junior engineers in adopting data mesh principles, enabling domain-driven ownership and scalable data product design across business units.
  • Supported business continuity with disaster recovery strategies across multi-region AWS deployments, ensuring data availability and system resilience.

Senior Data Engineer

Premier Inc
Charlotte, NC
08.2021 - 10.2024
  • Designed and developed scalable and secure AWS-based data pipelines to ingest, transform, and process large volumes of healthcare data with optimized performance and low latency.
  • Engineered robust data lakes on Amazon S3, integrating structured and semi-structured data from multiple healthcare plan sources using AWS Glue and Lambda.
  • Spearheaded the migration of legacy ETL processes to Snowflake on AWS, enabling faster data querying, simplified maintenance, and improved scalability for claims and member datasets.
  • Implemented real-time data streaming using Amazon Kinesis and Kafka for continuous ingestion of eligibility, provider, and claim data, aligning with operational reporting needs.
  • Developed and maintained data models and warehouse schemas in Snowflake to support analytics, regulatory reporting, and actuarial forecasting within the healthcare domain.
  • Created automated CI/CD pipelines using AWS CodePipeline and Terraform, ensuring seamless deployment and version control of data integration code across environments.
  • Built and orchestrated complex workflows using Apache Airflow and AWS Step Functions, reducing data latency and improving data availability across downstream systems.
  • Collaborated with cross-functional stakeholders to implement data quality frameworks using AWS Deequ and custom validation scripts, ensuring compliance with HIPAA and data governance standards.
  • Enabled efficient query performance and cost optimization through Snowflake clustering, materialized views, and warehouse tuning techniques.
  • Designed high-availability and disaster recovery strategies for critical data services hosted on Amazon RDS, Redshift, and S3.
  • Conducted data profiling, lineage tracking, and metadata management to support data catalog initiatives using AWS Glue Data Catalog and Alation.
  • Worked closely with data scientists and business analysts to enable self-service analytics by delivering curated, certified datasets in Snowflake and Amazon Redshift.
  • Optimized ETL performance using PySpark on AWS EMR and Snowflake Snowpipe, reducing batch processing time by over 40%.
  • Designed reusable components and data ingestion templates for claims adjudication, plan member eligibility, and provider hierarchy data.
  • Implemented security best practices across the data pipeline including IAM roles, KMS encryption, and VPC configurations in AWS.
  • Participated in architecture reviews, vendor evaluations, and capacity planning to scale enterprise data platforms for future business growth.
  • Mentored junior data engineers and conducted knowledge transfer sessions on AWS ecosystem, Snowflake architecture, and healthcare data models.
  • Delivered end-to-end ownership of data engineering projects, ensuring alignment with business objectives, cost-efficiency, and system performance in healthcare plan analytics.

Senior Data Engineer

Simmons Bank
Pine Bluff, AR
09.2018 - 07.2021
  • Led the end-to-end design and implementation of data pipelines on Google Cloud Platform (GCP), ensuring scalable ingestion, transformation, and loading of large volumes of banking data.
  • Architected cloud-native solutions using BigQuery, Cloud Composer, and Dataflow to streamline data workflows, ensuring low-latency access for analytics teams and business users.
  • Designed and maintained Terraform infrastructure-as-code to automate provisioning of GCP services, optimizing deployment cycles and reducing manual configuration efforts.
  • Developed and orchestrated ETL/ELT pipelines using Informatica PowerCenter and Informatica Cloud, integrating heterogeneous data sources across retail and commercial banking applications.
  • Created reusable data models and semantic layers to support regulatory reporting, credit risk analytics, and customer 360 initiatives, aligned with banking compliance standards.
  • Collaborated with cross-functional teams to translate business requirements into efficient technical solutions, using agile methodologies and continuous delivery best practices.
  • Led the migration of legacy SQL Server and Oracle data warehouses to GCP BigQuery, reducing operational costs and improving query performance by over 60%.
  • Implemented robust data governance frameworks using Google Data Catalog, ensuring data lineage, classification, and compliance across critical banking datasets.
  • Designed scalable data lake architecture using GCS (Google Cloud Storage), enabling archival, raw, and refined zone segregation for better data lifecycle management.
  • Conducted detailed performance tuning and SQL optimization of large-scale queries, improving data processing times and supporting real-time dashboarding in banking operations.
  • Built CI/CD pipelines for data engineering projects using Cloud Build, Git, and Terraform, streamlining release processes and reducing deployment errors.
  • Led initiatives to modernize ETL infrastructure, replacing batch processes with streaming pipelines using Apache Beam and Cloud Pub/Sub to meet real-time banking demands.
  • Integrated market-leading data quality frameworks to monitor and alert on anomalies, ensuring high trust in data used for fraud detection and loan risk models.
  • Mentored junior engineers and conducted code reviews to ensure best practices in Python, SQL, and Terraform, fostering a high-performing data engineering team.
  • Partnered with security and compliance teams to enforce data encryption, access control policies, and PII masking across the cloud data infrastructure.
  • Contributed to metadata management and lineage tracking initiatives, ensuring full transparency and traceability of data transformations in compliance with FDIC guidelines.
  • Participated in data strategy planning to align Simmons Bank’s digital transformation goals with modern data stack adoption and cloud-first architecture.
  • Implemented alerting and monitoring for data jobs using Stackdriver (now Google Cloud Monitoring), ensuring proactive management of SLA-bound workflows.

Data Engineer

AutoZone Inc
Memphis, TN
12.2015 - 08.2018
  • Designed and implemented scalable data pipelines using Azure Data Factory and Azure Databricks to support enterprise data integration from point-of-sale (POS), inventory, and customer systems.
  • Developed ETL processes with optimized transformations using Azure Data Lake Storage Gen2, reducing processing time and improving data freshness for reporting and analytics.
  • Created and maintained SQL Server Integration Services (SSIS) packages for periodic and real-time data loading, enabling faster retail trend analysis and operational reporting.
  • Leveraged Azure Synapse Analytics to construct and manage high-performance data warehouses, integrating data from store-level transactions to support forecasting models.
  • Orchestrated data ingestion workflows from various sources including REST APIs, flat files, and cloud-based endpoints, ensuring seamless integration into centralized storage systems.
  • Implemented data validation frameworks using PySpark on Azure Databricks, ensuring high data quality standards across batch and streaming pipelines.
  • Collaborated with business analysts and product managers to translate retail KPIs into robust data models, enhancing decision-making with accurate and timely insights.
  • Developed and deployed Azure Logic Apps and Azure Functions for automation of data synchronization processes across retail systems and cloud services.
  • Conducted detailed data profiling and cleansing using tools such as SQL, Pandas, and Azure-native utilities to standardize data from multiple franchises and suppliers.
  • Ensured compliance with data governance policies through the use of Azure Purview, maintaining proper metadata cataloging and access controls.
  • Monitored and optimized performance of data workflows with Azure Monitor and Log Analytics, reducing bottlenecks in data delivery and supporting SLA adherence.
  • Designed star schema and snowflake schema data models in Azure SQL Database and Synapse, enabling efficient BI reporting via Power BI and Tableau.
  • Partnered with cross-functional agile teams to continuously deliver production-ready data solutions that enhanced AutoZone’s loyalty program and promotional targeting.

Data Engineer

Lumen Technologies
Monroe, LA
02.2013 - 11.2015
  • Designed and developed end-to-end data pipelines using structured and unstructured data from network systems to support telecom analytics and reporting.
  • Implemented and optimized ETL workflows using Python and Apache Spark to process high-volume call and signal data efficiently.
  • Built scalable data storage solutions using Amazon S3, ensuring secure and cost-effective data archiving and retrieval.
  • Utilized AWS Glue for automated metadata cataloging and serverless data preparation to accelerate ingestion processes across telecom datasets.
  • Worked closely with data architects and business analysts to define data models and schema designs aligned with telecom performance KPIs.
  • Developed Redshift-based data warehouses and managed clusters for high-performance query execution on billions of telecom records.
  • Designed real-time data streaming solutions with Apache Kafka and AWS Kinesis for proactive fault detection and network monitoring.
  • Employed SQL (PostgreSQL, Redshift Spectrum) to build reusable and efficient queries that power downstream BI dashboards and telecom insights.
  • Leveraged AWS Lambda functions for lightweight data transformations and scheduling across various telecom platforms.
  • Created robust data validation layers using PySpark and Pandas, ensuring data quality across ingestion and processing stages.
  • Collaborated with DevOps teams to deploy CI/CD pipelines for data applications using AWS CodePipeline and CloudFormation.
  • Maintained detailed data lineage and audit trails by integrating AWS CloudTrail with internal logging systems to meet telecom compliance standards.
  • Improved data accessibility by integrating BI tools like Tableau and QuickSight, enabling interactive visualizations for network health and usage trends.
  • Participated in the migration of on-premise data warehouses to AWS cloud infrastructure, improving scalability and reducing latency for global telecom services.
  • Provided knowledge transfer and documentation on telecom-specific data flows, enabling internal teams to efficiently troubleshoot and maintain systems.

Education

Bachelor’s - computer science

LP University
05.2011

Skills

  • Cloud Platforms: AWS, Azure, Google Cloud Platform (GCP)
  • Data Warehousing: Snowflake, AWS Redshift, GCP BigQuery, Azure Synapse
  • ETL/Data Pipelines: AWS Glue, Apache Airflow, dbt (Data Build Tool), Azure Data Factory, Informatica
  • Streaming: Apache Kafka, Amazon Kinesis, Google Cloud Pub/Sub
  • Data Lakes: Amazon S3, Google Cloud Storage, Delta Lake
  • Big Data Processing: PySpark, AWS EMR, Databricks
  • Orchestration: Apache Airflow, Google Cloud Composer, AWS Step Functions
  • Infrastructure as Code: Terraform, AWS CloudFormation
  • CI/CD: AWS CodePipeline, Azure DevOps
  • Data Quality: Great Expectations, AWS Deequ
  • SQL & Databases: Snowflake SQL, Azure Synapse, Stored Procedures, Complex SQL Transformations
  • Metadata Management: AWS Glue Catalog, Apache Atlas, Google Data Catalog
  • BI & Visualization: Power BI, AWS QuickSight, Tableau
  • Serverless: AWS Lambda, Azure Functions
  • Security & Compliance: PCI-DSS, HIPAA, SOC2, FDIC (standards implementation)
  • ML Feature Stores: Feature store design for ML models
  • Version Control: Git
  • Scripting: Python, SQL, PySpark

Languages

English

Timeline

Senior Data Engineer

JPMorgan Chase
11.2024 - Current

Senior Data Engineer

Premier Inc
08.2021 - 10.2024

Senior Data Engineer

Simmons Bank
09.2018 - 07.2021

Data Engineer

AutoZone Inc
12.2015 - 08.2018

Data Engineer

Lumen Technologies
02.2013 - 11.2015

Bachelor’s - computer science

LP University
