Ravi Teja Puvvala

Summary

Results-oriented Cloud Data Engineer with experience designing and implementing modern data architectures on AWS and Azure. Specialized in building scalable ETL/ELT pipelines with AWS Glue, Azure Data Factory, Databricks, Airflow, and Apache Spark for batch and real-time processing. Proven track record of reducing data processing time by 30% and enabling data lakehouse architectures with Delta Lake and the Medallion model. Proficient in Snowflake and Power BI (DAX) for delivering business-ready insights. Adept at using CI/CD pipelines and monitoring tools to build robust data solutions in agile, cross-functional environments.

Overview

11
years of professional experience
1
Certification

Work History

Sr Data Engineer

Concentrix
01.2024 - Current
  • Designed end-to-end cloud data platform on AWS, enhancing data accessibility by 40% and reducing query latency.
  • Extracted and transformed source data from S3, Snowflake, and Delta tables, improving accuracy for analysis.
  • Segregated data into bronze, silver, and gold layers based on medallion architecture for tailored usability.
  • Applied PySpark transformation logic in Databricks, enabling time travel and incremental processing with Delta Tables.
  • Executed client-specific Datavant processes to streamline data integration and enhance flow efficiency.
  • Crafted complex queries in Databricks SQL for validation and comparison across Delta tables, boosting accuracy.
  • Deployed multi-node job clusters in Databricks, optimizing large-scale processing and reducing runtime by 30%.
  • Implemented CI/CD pipelines with Jenkins, automating deployment processes and reducing release cycles by 40%.

Sr Data Engineer

Cleveland-Cliffs
07.2023 - 12.2023
  • Designed and implemented real-time website traffic analysis pipeline on AWS for efficient data handling.
  • Developed Kafka producers on EC2 to capture website event data, ensuring reliable ingestion.
  • Orchestrated ETL workflows with Airflow DAGs, optimizing S3 storage costs by 20% through partitioning and transformation.
  • Utilized Snowflake for centralized data warehousing, enabling effective querying and reporting on traffic metrics.
  • Built scalable Spark jobs on EMR, enhancing performance and ensuring data reliability.
  • Created interactive Power BI dashboards with drill-through capabilities for actionable insights into key metrics.
  • Established monitoring and alerting systems using CloudWatch and SNS to maintain pipeline stability.
  • Collaborated across teams to support delivery and educate end users on complex data products.

Data Engineer

HCL Technologies
04.2022 - 05.2023
  • Administered Genesys Cloud data by integrating external storage systems into the Databricks workspace.
  • Established access controls and enforced security protocols within Databricks Notebooks for sensitive ServiceNow ticket data.
  • Designed and implemented a scalable data warehouse using Azure Fabric and Medallion Architecture.
  • Developed scalable PySpark solutions to ingest, transform, and analyze structured and unstructured data from ADLS.
  • Applied strong data modeling techniques in data warehouse environments, implementing star and snowflake schemas.
  • Created data integration and transformation solutions using Azure Synapse Analytics to optimize performance across large datasets.
  • Leveraged Databricks for managing data pipelines, optimizing workflows, and facilitating team collaboration.
  • Integrated Azure Key Vault to secure sensitive credentials used in PySpark applications, enhancing compliance.

Data Engineer

Hyundai Motor India Engineering
05.2020 - 04.2022
  • Designed and maintained scalable ETL pipelines using Databricks, Azure Data Factory, Hive, and Pig for efficient data transformation.
  • Managed data pipelines for batch and real-time processing with Azure Stream Analytics and Hadoop integration.
  • Integrated data from SQL Server, Cosmos DB, and HDFS into unified formats for analytics.
  • Optimized Azure Data Lake, Blob Storage, and HDFS for secure data storage and access.
  • Leveraged Apache Hadoop ecosystem tools to enhance distributed data processing capabilities.
  • Collaborated with technology teams to ensure data integrity during extraction, transformation, and loading.
  • Implemented CI/CD pipelines with Azure DevOps for streamlined deployment of data solutions.
  • Performed rigorous validation checks on transformed data to ensure accuracy and consistency.

Data Engineer/Power BI Developer

Hyundai Motor India Engineering
06.2016 - 04.2020
  • Orchestrated data ingestion from diverse sources, utilizing Amazon S3 and Amazon Kinesis for seamless analytics in AWS data lake.
  • Implemented data governance policies with AWS Glue Data Catalog, ensuring data quality and compliance throughout lifecycle.
  • Architected scalable data warehouse on AWS using Medallion Architecture in Amazon S3 for efficient data organization.
  • Developed PySpark solutions in Databricks to transform and analyze structured/unstructured data, optimizing query execution.
  • Designed star and snowflake schemas in Amazon Redshift, enhancing OLAP query performance for business intelligence.
  • Created data integration pipelines using AWS Glue and Databricks, reducing latency by 30% while enabling comprehensive analysis.
  • Analyzed automotive data to provide insights and developed interactive dashboards in Power BI for visual representation.
  • Used diverse Power BI visualizations to present complex datasets and improve user engagement.

Research Engineer

Hyundai Motor India Engineering
09.2014 - 05.2016
  • Employed Power BI to analyze various data formats, including Excel and manufacturing databases.
  • Developed visual-level and page-level filters to create detailed reports for car segments.
  • Generated database objects such as indexes and views using T-SQL for SQL Server environments.
  • Created diverse report types, including drill-down, drill-through, and synced-slicer reports.
  • Conducted unit testing to validate Power BI data against source databases.
  • Utilized DAX measures with aggregate functions to enhance data analysis efficiency.
  • Wrote and executed T-SQL scripts for data retrieval and manipulation from relational databases.
  • Assisted in designing and implementing SQL Server database objects such as tables and stored procedures.

Education

Bachelor of Technology (B.Tech)

Jawaharlal Nehru Technological University
India
01.2014

Skills

  • AWS and Microsoft Azure
  • PySpark and Spark SQL
  • Data lake and storage solutions
  • Serverless computing
  • Data integration and transformation
  • Cloud monitoring and management
  • Database technologies
  • Business intelligence tools
  • Version control and CI/CD
  • Data orchestration frameworks
  • Apache Spark

Certification

  • Databricks Certified Data Engineer Associate
  • Microsoft Certified: Power BI Data Analyst Associate (PL-300)
