Sowmya Penumarthi

MO

Summary

Detail-oriented data engineer who designs, develops, and maintains highly scalable, secure, and reliable data structures. Accustomed to working closely with system architects, software architects, and design analysts to understand business or industry requirements and develop comprehensive data models. Proficient at developing database architectural strategies at the modeling, design, and implementation stages.

Responsive expert experienced in monitoring database performance, troubleshooting issues, and optimizing database environments. Possesses strong analytical skills, excellent problem-solving abilities, and a deep understanding of database technologies and systems. Equally confident working independently and collaboratively, with excellent communication skills. Organized, dependable, and successful at managing multiple priorities with a positive attitude. Willing to take on added responsibilities to meet team goals.

Overview

5 years of professional experience

Work History

Big Data Developer

Charter Communications, Spectrum
07.2023 - Current
  • Developed and maintained Hadoop-based applications to process and analyze large datasets, improving data accessibility and optimizing query performance
  • Designed, implemented, and optimized complex ETL processes using Apache Spark, resulting in a significant reduction in data processing time
  • Created Hive tables and managed metadata, enabling efficient querying and reporting for business analysts and stakeholders
  • Utilized HDFS to store and manage vast volumes of data, implementing data replication and backup strategies for high data availability
  • Collaborated with cross-functional teams to gather requirements and translate business needs into technical solutions
  • Implemented shell scripts for automating data ingestion, processing, and deployment tasks, reducing manual intervention and improving efficiency
  • Played a key role in performance tuning and optimization of Hadoop and Spark jobs
  • Assisted in troubleshooting and resolving production issues, ensuring the reliability and stability of data processing pipelines
  • Integrated the Tidal scheduling tool into ETL workflows, orchestrating data processing and ensuring timely execution of jobs
  • Participated in on-call rotation
  • Proficient in SQL, with demonstrated expertise in writing and troubleshooting complex queries for data mining purposes
  • Skillful problem solver, proactively identifying opportunities for improvement through error detection, correction, and root cause analysis
  • Collaborated with data and analytics experts to optimize functionality within data systems
  • Built processes supporting data transformation, data structures, metadata, and dependencies

Data Engineer

Genesis Financial Solutions
09.2022 - 07.2023
  • Developed and tested data pipelines in cloud environments, leveraging tools like Databricks, Scala, and Spark, capturing data from Delta tables in Delta Lake
  • Migrated applications from internal data storage to Azure, ensuring efficient data management
  • Created pipelines in Azure Data Factory (ADF) to extract, transform, and load data from diverse sources like Azure SQL, Blob Storage, and Azure SQL Data Warehouse
  • Implemented proofs of concept for REST APIs, enhancing data integration capabilities
  • Collaborated with users to create and modify data visualization dashboards in Power BI
  • Assisted in deployments from dev to UAT and prod, ensuring smooth project progression
  • Developed and optimized complex SQL queries and PL/SQL procedures, and converted them into ETL tasks
  • Structured required infrastructure for optimal data extraction, transformation, and loading using Databricks, ADF, notebooks, and SQL technologies
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Conducted extensive troubleshooting to identify root causes of issues and implement effective resolutions in a timely manner.
  • Developed database architectural strategies at modeling, design, and implementation stages to address business or industry requirements.
  • Visualized, managed, and monitored data pipelines using Alation.
  • Reviewed project requests describing database user needs to estimate time and cost required to accomplish projects.

Jr. Data Engineer

Dataken Technologies
08.2020 - 01.2021
  • Designed and implemented data pipelines, utilizing Python and SSIS, to extract, transform, and load (ETL) data from diverse sources, ensuring accuracy and efficiency
  • Leveraged SQL queries to manipulate and transform raw data into meaningful insights, improving data quality and usability
  • Collaborated with cross-functional teams to understand data requirements and develop solutions that align with business goals
  • Utilized Google Sheets and other Google suite tools for collaborative data analysis and reporting, enhancing team productivity
  • Developed and maintained data integration processes, optimizing data flow and reducing processing time by 30%
  • Collaborated closely with cross-functional teams to gather requirements and translate them into actionable insights using advanced analytics methodologies.
  • Enhanced data quality for analysis by cleansing, validating, and standardizing raw datasets.
  • Identified and resolved data inconsistencies, ensuring data integrity throughout the pipeline.

Intern

Research Wallet
07.2019 - 12.2019
  • Assisted in building data pipelines using Hadoop.
  • Sorted and organized files, spreadsheets, and reports.
  • Participated in data processing tasks, including data cleansing and transformation.
  • Contributed to the design and development of assigned tasks.
  • Collaborated with senior team members to troubleshoot and optimize data workflows.
  • Gained valuable experience working within a specific industry, applying learned concepts directly into relevant work situations.
  • Collaborated with senior management on new initiatives to build confidence.

Education

Master of Science - Computer Science

Kent State University
Kent, OH
05.2022

Bachelor of Science - Computer Science and Programming

KL University
Vijayawada, AP
05.2020

Skills

  • Hadoop Ecosystem
  • Java, Python, Scala, PySpark
  • Node.js
  • Git Version Control
  • Analytical Skills
  • Data operations
  • ETL development
  • Big Data Processing
  • Data Modeling
  • SQL and Databases
  • Data programming
  • Data Visualization
  • Apache Spark
  • Scripting: shell scripting (Bash), Python scripting
  • Data integration: Tidal, Apache NiFi
  • Data Pipeline Design
  • Database systems: MySQL, PostgreSQL, Oracle, Teradata, Azure SQL
  • Operating systems: Unix, Windows
  • Cloud environment: AWS, Azure (ADF)
