
Pawan Narasimha Avvari

North Bergen, NJ

Summary

  • 5+ years of experience in the software industry, including 3+ years with AWS cloud services and 2 years in data warehousing.
  • Highly skilled data engineer with expertise in AWS, Snowflake, Databricks, and various big data technologies. Throughout my career, I have gained extensive experience in analyzing, designing, developing, testing, maintaining, and implementing complex data warehousing applications for clients in the banking and financial sectors. Proficiency with the ETL tool Informatica PowerCenter in both OLAP and OLTP environments has allowed me to deliver efficient and reliable solutions.
  • Proficient in Databricks, PySpark, EMR, Hive, Glue, Snowflake, and MySQL for big data processing, which strengthens my ability to handle large volumes of data efficiently. With a strong problem-solving mindset and a passion for driving data insights and informed decision-making, I am committed to delivering exceptional results in the field of data engineering.

Overview

5 years of professional experience

Work History

Data Engineer

Saven Technology Private Limited
Hyderabad, Telangana
10.2020 - 08.2022
  • Interpreted and analyzed data for business needs, applying data mining, data warehousing, data modeling, and data visualization techniques with various statistical and analytical tools.
  • Worked as a Business Data Analyst: conducted requirement-gathering interviews, developed UML diagrams, wrote SQL queries for ETL processes, created Power BI reports to drive business decisions, wrote Business Requirement Documents, and gave executive presentations.
  • Designed and implemented scalable and efficient ETL extract/load strategies using AWS tools in development and production environments.
  • Collaborated with stakeholders on science and engineering teams to build ML platforms, data ingestion processes, and service integrations.
  • Developed code to acquire and transform datasets for machine learning algorithms, analysis, and reporting using PySpark and Snowflake.
  • Supported field operations by improving data analytics capabilities for associates, integrating new data sources, creating and optimizing pipelines, and enabling reporting.
  • Performed ETL using Databricks on AWS. Migrated an on-premises Oracle ETL process to Azure Synapse Analytics.
  • Wrote Python scripts to automate script generation. Curated data using Databricks on AWS.
  • Used Databricks Delta Lake, an optimized data lake solution, to efficiently store and manage large volumes of structured and semi-structured data, ensuring data integrity, reliability, and ACID compliance.
  • Migrated data processing workflows from on-premises to the cloud, using Databricks notebooks and clusters for interactive data exploration, prototyping, and job scheduling.

Data Engineer

Wintech Information Service
06.2019 - 09.2020
  • Led a successful migration project from AWS Redshift to Snowflake, ensuring a seamless transition of data and analytics processes to the new platform.
  • Conducted a thorough assessment of the existing AWS Redshift infrastructure and identified opportunities for optimization and improvement in the migration process.
  • Developed and executed data migration scripts and processes, using Snowflake's data loading capabilities, such as the COPY command and Snowpipe, to efficiently transfer data from AWS Redshift to Snowflake.
  • Used Apache Spark and Python libraries to perform advanced data processing tasks, including machine learning algorithms.
  • Optimized query performance in Snowflake for complex data processing scenarios by analyzing query execution plans, leveraging query hints, and applying optimization techniques such as clustering and partitioning.
  • Designed and implemented end-to-end data solutions using AWS S3 as a data lake for storing raw and processed data, EMR for big data processing, Snowflake as data warehouse, and Tableau for data visualization and reporting.
  • Implemented data quality checks and validation using AWS Lambda and Airflow to ensure integrity and accuracy of data in S3, EMR, and Snowflake.
  • Developed data ingestion processes using AWS S3 and EMR, leveraging technologies such as Apache Spark to extract, transform, and load data from various sources into Snowflake for further analysis.
  • Designed and implemented complex data processing workflows in Snowflake, leveraging its powerful SQL capabilities and scalable architecture to handle large volumes of data.
  • Designed and implemented data archiving and backup strategies using S3 Glacier.

Software Engineer

Wintech Information Service
07.2017 - 05.2019
  • Collaborated with data scientists and business stakeholders to define data requirements and developed data models to support advanced analytics initiatives.
  • Updated old code bases to modern development standards, improving functionality.
  • Designed and implemented MapReduce jobs using Apache Hadoop ecosystem components, such as Hadoop MapReduce, HDFS, and YARN, to process large-scale datasets efficiently.
  • Optimized MapReduce jobs by applying techniques like data partitioning, combiners, and custom partitioners, resulting in significant performance improvements and reduced execution time.
  • Extracted, transformed, and loaded data between Hadoop and relational databases using Sqoop, enabling seamless integration and bidirectional data flow.
  • Developed complex SQL queries, stored procedures, and views to extract, transform, and analyze data stored in SQL Server databases.
  • Tuned SQL Server performance by analyzing query execution plans, optimizing indexes, and fine-tuning database configurations.
  • Developed and maintained scalable and efficient data pipelines using AWS services.

Education

Master of Science - Computer and Information Sciences

Pace University
New York, NY
12.2023

Skills

  • Apache Spark
  • Unix Shell
  • AWS (S3, EC2, EMR, Lambda, Glue, Athena, Redshift)
  • Data warehouse (Snowflake, Redshift)
  • AWS Glue
  • RDBMS - Microsoft SQL Server
  • Databricks
  • Visualization (Tableau, Python libraries)
  • Hadoop Ecosystem (MapReduce, Hive, Sqoop)
  • Programming languages (Scala, Python, Go)
  • Version Control (GitHub, Bitbucket)
  • Amazon Web Services architecture, covering resources such as S3, EC2, IAM, databases (DynamoDB, Redshift), VPC, Lambda, Glue, SQS, SNS, SES, API Gateway, and Kinesis
  • Airflow
