Veenaja Eleti

Summary

  • 3+ years of experience as a Data Engineer in database development, data warehousing, and Big Data technologies.
  • Engaged in the development of an enterprise-level solution using batch processing with Apache Hive and Apache Spark.
  • Understanding of Big Data algorithms using Hadoop, HDFS, MapReduce, HiveQL, and Apache Spark (PySpark).
  • Strong knowledge of data modeling (facts and dimensions, star/snowflake schemas), data migration, data cleansing, data transformation, ETL processes, and strategic data architecture design.
  • Extensive experience with Amazon Web Services (AWS) cloud services such as EC2, S3, and Redshift.
  • Utilized Databricks on Apache Spark to design and optimize data processing workflows.
  • Analyzed various reports and dashboards using Tableau and Power BI visualizations.
  • Developed and implemented ETL processes to ensure efficient data extraction, transformation, and loading.
  • Implemented CI/CD pipelines on AWS using Terraform, Jenkins, and shell scripting for automated builds.
  • Proficient with version control systems such as GitHub.

Overview

6 years of professional experience

Work History

Data Engineer

Nike
Oregon, USA
06.2023 - Current
  • Implemented simple-to-complex transformations on streaming data and static datasets. Analyzed the Hadoop cluster and various big data analytic tools, including Hive, Spark, and Python.
  • Developed PySpark Streaming applications that consume static and streaming data from different sources.
  • Developed Airflow DAGs in Python by importing the Airflow libraries. Utilized Airflow to schedule, automatically trigger, and execute data ingestion pipelines.
  • Led the migration of scheduling and orchestration workflows from Apache Airflow to Databricks Workflows, improving maintainability and reducing infrastructure overhead.
  • Reduced operational costs by deprecating Airflow's external infrastructure and utilizing Databricks' managed scheduling and auto-scaling clusters.
  • Successfully restructured ETL pipelines originally orchestrated in Airflow to run natively within Databricks, leveraging Delta Lake and Spark optimizations.
  • Developed and maintained reports, dashboards, and metrics to track key performance indicators (KPIs).
  • Designed both 3NF data models for OLTP systems and dimensional data models using star and snowflake schemas.
  • Cleansed, preprocessed, and transformed raw data into usable formats.
  • Collaborated with cross-functional teams to understand requirements and deliver data-driven solutions.

Data Engineer

Amazon
Bangalore, India
09.2020 - 08.2021
  • Evaluated ETL requirements to identify areas for enhancement and change, ensuring comprehensive analysis and understanding of project needs.
  • Designed ETL/ELT integration patterns using Python with PySpark on AWS infrastructure and orchestrated migration of on-premises SQL Server data to Amazon S3 using AWS Glue.
  • Implemented SQL views atop fact and dimension tables in Amazon Redshift for streamlined reporting, and addressed performance issues in Spark and SQL scripts involving joins, grouping, and aggregation.
  • Implemented real-time data processing solutions by integrating Elasticsearch with EMR clusters and Amazon Redshift, enabling efficient indexing and search capabilities for streaming data. Leveraged Python and PySpark to transform and enrich data streams before storing them in Elasticsearch, enhancing data discovery and analysis capabilities.
  • Worked with unstructured, semi-structured, and structured data, including file formats such as Parquet, CSV, XML, and JSON, in different scenarios while building data pipelines using AWS Glue and Amazon Kinesis.
  • Developed SQL scripts to securely access database credentials from AWS Secrets Manager and IAM, maintaining strict data security standards within AWS Glue, and addressing data validation concerns by analyzing complex SQL queries.

Data Analyst

Assetmonk Pvt Ltd.
Hyderabad, India
08.2019 - 07.2020
  • Analyzed and tracked data to prepare forecasts and identify trends.
  • Collected business intelligence data from industry reports, public information or purchased sources.
  • Generated standard or custom reports summarizing business, financial or economic data.
  • Developed data visualizations and dashboards to track business metrics using Tableau.
  • Assisted in identifying opportunities for improving data collection and analysis processes.
  • Extracted data to build metrics or create dashboards, tracking success toward KPIs.
  • Conducted comprehensive research and data analysis to support strategic planning and informed decision-making.

Education

Master of Science - Computer Science

University of Dayton
Dayton, OH
05.2023

Skills

  • Python, SQL, Shell Scripting, HiveQL
  • Hadoop, HDFS, YARN, MapReduce, Hive
  • Apache Spark, Apache Airflow
  • AWS S3, EMR, Redshift, Athena
  • Databricks, Delta Lake
  • Snowflake, Teradata
  • PyCharm, VS Code
  • Data modeling, data warehousing, data quality assurance, data compliance
