PAVAN KRISHNA VUNNAM

New Haven, USA

Summary

Data Engineer with 3+ years of professional experience designing, orchestrating, and optimizing end-to-end data pipelines across AWS and Azure ecosystems. Expert in building serverless, scalable data architectures using AWS Glue, Lambda, Step Functions, and Azure Data Factory for batch and real-time data workflows. Strong expertise in ETL automation, orchestration, and data modeling using Databricks (PySpark), Redshift, Snowflake, and Synapse Analytics. Skilled in AI-powered knowledge retrieval, integrating Amazon Kendra, Bedrock, and OpenSearch for semantic document search and automated Q&A. Proficient in Python (Pandas, boto3, PySpark) and SQL for developing reusable data pipelines, API integrations, and workflow automation. Hands-on experience in metadata tagging, compliance enforcement (ABAC), and dynamic IAM policies for ITAR/EAR governance. Adept at data migration, transformation, and integration across multiple cloud environments using Glue Jobs, Step Functions, and Azure Data Factory pipelines. Experienced in delivering data visualization solutions with Power BI, Tableau, and QuickSight that drive actionable business insights. Experienced in implementing CI/CD pipelines and infrastructure-as-code automation using Terraform, GitHub Actions, and AWS CloudFormation. Collaborates with data scientists and AI engineers to embed ML models into production ETL pipelines using SageMaker and Databricks. Experienced in data security, lineage tracking, and pipeline observability using CloudWatch, Athena, and tagging frameworks. Passionate about building intelligent, automated, and secure data ecosystems that bridge analytics, AI, and business decision-making.

Positive, analytical problem-solver with a strong foundation in data systems and processes. Possesses a solid understanding of data modeling and database design, coupled with skills in SQL and Python. Capable of driving data-driven decision-making and improving data infrastructure.

Overview

4 years of professional experience

Work History

Data Engineer

Aionx11
11.2024 - Current
  • Designed, developed, and deployed serverless ETL pipelines using AWS Glue, Lambda, Step Functions, and Azure Data Factory for structured and unstructured data ingestion.
  • Built and maintained enterprise data lakes in S3 and ADLS Gen2 with lifecycle policies, data partitioning, versioning, and encryption for security and performance optimization.
  • Integrated Databricks (PySpark) for data transformation, feature extraction, and advanced analytics, optimizing job performance through partitioning and caching.
  • Created and automated data ingestion workflows from REST APIs and flat files into Redshift, Snowflake, and Synapse for analytics consumption.
  • Implemented Attribute-Based Access Control (ABAC) using S3 object tags and IAM principal tags, enabling ITAR/EAR-compliant data access across departments.
  • Developed an AI Knowledge Search Platform combining Amazon Kendra, Bedrock, and OpenSearch for intelligent search and semantic Q&A over research and enterprise data.
  • Automated metadata tagging, logging, and audit trails using Python (boto3) and CloudWatch for observability and compliance monitoring.
  • Engineered data models in Redshift, Snowflake, and Synapse supporting Power BI and QuickSight dashboards for executive reporting.
  • Integrated CI/CD pipelines using Terraform and GitHub Actions for automated infrastructure provisioning and workflow deployment.
  • Collaborated with analysts to replace manual Excel reports with Power BI dashboards, achieving a 70% reduction in reporting effort.
  • Worked with the architecture team to design cross-cloud pipelines connecting AWS S3, Azure Synapse, and Databricks for unified analytics.
  • Partnered with AI developers to embed NLP models via Bedrock Agents into analytics pipelines for document classification and search enhancement.
  • Environment: AWS (S3, Glue, Lambda, Step Functions, Redshift, Bedrock, Kendra, OpenSearch), Azure (Data Factory, Databricks, Synapse), Python (boto3, Pandas), Power BI, Terraform, GitHub

Statistics Tutor

University of Bridgeport
09.2023 - 06.2024
  • Conducted hands-on tutoring in data analytics, regression, and forecasting using Python, R, and Excel.
  • Designed interactive lab exercises demonstrating reproducible analytics workflows and model evaluation using Jupyter Notebooks.
  • Guided students in data preparation, feature selection, and visualization using pandas, seaborn, and ggplot2.
  • Supported graduate projects that used AWS and Python automation for academic data analysis and research.
  • Helped students apply machine learning concepts (regression and classification) for academic simulations and business case studies.
  • Provided detailed feedback and individualized learning plans to enhance student comprehension and performance in statistics.

Junior Data Analyst

Compsoft Technologies
03.2022 - 07.2023
  • Designed and deployed ETL pipelines for ingesting customer, sales, and transaction data into centralized MySQL and Power BI systems.
  • Automated data transformations and report generation using Python and SQL scripts, cutting manual tasks by 40%.
  • Developed a Sales Performance Dashboard integrating SQL queries, Power BI visuals, and KPIs for daily business monitoring.
  • Created Customer Purchase Behavior Analysis workflows using Python (pandas, matplotlib) and SQL joins to identify key purchasing trends.
  • Built predictive models using regression and time-series techniques to forecast inventory and sales, improving business forecasting accuracy by 18%.
  • Partnered with marketing and finance teams to align analytics results with performance metrics and strategic goals.
  • Enhanced data accuracy and consistency through validation scripts and reconciliation reports using SQL stored procedures.
  • Documented ETL flow diagrams, SQL logic, and metadata dictionaries for easier pipeline maintenance and scalability.

Education

Master of Science - Computer Science

University of Bridgeport
Bridgeport, CT
01.2025

Bachelor of Engineering - Electronics & Communication

Atria Institute of Technology
India
01.2022

Skills

  • Python programming with data libraries
  • Cloud platform expertise: AWS and Azure
  • Data Engineering Tools: EMR, Terraform, Airflow (MWAA), dbt, EventBridge, Snowflake
  • Database management: Redshift, Snowflake, MySQL, PostgreSQL, Synapse Analytics, DynamoDB
  • Visualization: Power BI, Tableau, Amazon QuickSight
  • Machine Learning & AI: Regression, Forecasting, NLP, Bedrock Agents, MLflow (basic)
  • Other Tools: Excel (Pivot Tables, Macros), GitHub, JIRA, Agile/DevOps
  • Serverless ETL pipelines
  • AWS Glue
  • Azure Data Factory
  • Data lake management
  • PySpark integration
