Hi, I’m

SHIVAPRASAD SANDRALA

Data Engineer
Powell, OH

Summary

Highly skilled and dedicated Data Engineer with over 5 years of experience in software analysis, design, development, and implementation of cloud and big data solutions. Proficient in leveraging technologies such as BigQuery, Spark, Scala, Hadoop, and Oracle Database to build and maintain robust data pipelines.

Extensive expertise in developing data models and pipeline architectures and providing ETL solutions for project models. Managed end-to-end operations of ETL data pipelines using Matillion on AWS Cloud Services and Azure Data Factory on Azure Cloud Services, ensuring seamless data ingestion, processing/transformation, and curation.

Proven ability to design and specify Informatica ETL processes, optimizing schema loading and performance. Skilled in ETL architecture design and implementation, consistently delivering high-performance solutions.

Certified in software engineering concepts, with hands-on experience in system design, application development, testing, and operational stability. Proficient in coding using modern programming languages and database querying languages, ensuring efficient and maintainable code.

Proficient in utilizing Python for data manipulation, analysis, and scripting, enabling efficient data processing and transformation. Skilled in PySpark, leveraging the power of Apache Spark for distributed data processing, machine learning, and real-time analytics.

Strong troubleshooting and problem-solving skills, capable of identifying and resolving complex software issues. Exhibits excellent communication and collaboration skills, enabling effective teamwork and coordination.

A driven and motivated professional, consistently striving for excellence in data engineering, delivering scalable and high-quality solutions to meet business needs.

Overview

5
years of professional experience

Work History

New York Life Insurance Co
Powell, OH

Data Engineer
2023.08 - Current (2 years & 1 month)

Job overview

• Design and set up an Enterprise Data Lake to support various use cases, including analytics, processing, storing, and reporting of voluminous, rapidly changing data.
• Responsible for maintaining transactional data in the source by performing operations such as cleaning and transformation and ensuring integrity in a relational environment, working closely with stakeholders and the solution architect.
• Creating SQL*Plus scripts and packages to generate comprehensive reports.
• Developing and automating Shell scripts to streamline processes and eliminate manual tasks.
• Adapting existing logic or developing new logic to meet evolving customer requirements. Managing monthly data transfers from mainframe systems to Oracle databases.
• Collaborated with business stakeholders to analyze requirements and develop customized SQL logic, ensuring system alignment with evolving business needs.
• Proficient in Databricks data-streaming architecture, with a strong understanding of building Analysis Services reporting models.
• Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
• Experienced in connecting to various data sources in Databricks, importing data, and transforming it for Business Intelligence purposes.
• Leverage Azure Databricks to migrate on-premises data to the cloud, optimizing data processing and analytics capabilities.
• Develop and execute data pipelines using Azure Databricks to transform and load data into cloud-based data warehouses or data lakes.

  • Streamlined workflow processes by automating repetitive tasks using Unix shell scripting.
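The cleaning and integrity checks described above can be sketched in plain Python; the record fields and validation rules here are hypothetical stand-ins, not the actual production logic:

```python
# Illustrative record-cleaning step: trim strings, normalize empty markers to
# None, and reject records missing required keys. Field names are hypothetical.
REQUIRED = {"policy_id", "holder_name"}

def clean_record(rec):
    cleaned = {}
    for key, value in rec.items():
        if isinstance(value, str):
            value = value.strip()
            if value == "" or value.upper() in {"NULL", "N/A"}:
                value = None
        cleaned[key] = value
    missing = REQUIRED - {k for k, v in cleaned.items() if v is not None}
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return cleaned
```

In a real pipeline this step would run per batch before loading into the relational target, so integrity violations surface at ingestion time rather than at query time.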

Salesforce

Cloud Data Engineer
08.2022 - 08.2023

Job overview

  • Currently leveraging Spark Context, Spark SQL, DataFrames, and pair RDDs to perform large-scale data processing and analysis, ensuring efficient data manipulation and transformation.
  • Proficient in Spark Streaming and Kafka integration, enabling real-time data processing and analysis for streaming applications, and implementing data pipelines for continuous data ingestion.
  • Utilizing HBase as a NoSQL database, ensuring high-speed data storage and retrieval for real-time and big data applications.
  • Extensively working with Spark Streaming APIs to develop and deploy real-time data processing applications, enabling timely insights and decision-making.
  • Experienced in using Alteryx and Matillion for data integration and ETL processes, ensuring seamless data flow and transformation across various data sources and systems.
  • Implementing data warehousing solutions using Amazon Redshift and Snowflake data warehouse, providing scalable and high-performance storage and retrieval capabilities for analytics and reporting.
  • Proficient in Python and PySpark for data engineering tasks, including data processing, data cleaning, and feature engineering, ensuring efficient and scalable data analysis.
  • Expertise in developing and optimizing SQL queries, ensuring efficient data retrieval and manipulation for various data engineering tasks.
  • Experienced in working with BigQuery, AWS S3, and Azure Blob Storage for data storage and retrieval, enabling seamless integration and accessibility of data across different cloud platforms.
  • Proficient in Scala programming language for developing Spark applications, providing strong functional programming capabilities for distributed data processing.
  • Skilled in migrating data from Oracle to Snowflake, ensuring smooth data transfer and maintaining data integrity throughout the process.
  • Experienced in Continuous Integration and Continuous Deployment (CI/CD) practices, ensuring streamlined and automated deployment of data engineering solutions.
  • Familiarity with containerization using Docker, enabling efficient deployment and management of data engineering applications and environments.
  • Demonstrated experience in designing and implementing scalable and efficient data engineering solutions, optimizing performance and ensuring high data quality standards.
  • Passionate about staying up to date with the latest advancements in data engineering and continuously expanding knowledge and skillset in Python, PySpark, and Scala.
  • Collaborated with cross-functional teams to gather and analyze business requirements, translating them into technical specifications for data engineering solutions, and delivering projects on time and within budget.
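The stateful micro-batch pattern behind the Spark Streaming work above can be illustrated in plain Python (a conceptual sketch only; a real job would use Structured Streaming, and the batch size and keys here are hypothetical):

```python
from collections import Counter
from itertools import islice

def micro_batches(events, batch_size):
    """Yield fixed-size batches from an event iterator, mimicking micro-batching."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def running_counts(events, batch_size=3):
    """Keep a running per-key count across batches (stateful aggregation,
    conceptually similar to updateStateByKey in Spark Streaming)."""
    state = Counter()
    for batch in micro_batches(events, batch_size):
        state.update(batch)
        yield dict(state)  # snapshot of accumulated state after each batch
```

The design point is that state lives outside any single batch, which is exactly what lets a streaming job answer "counts so far" rather than "counts in this batch".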

Cloud Infra IT Solutions

Data Engineer
01.2022 - 07.2022

Job overview

  • Utilized advanced data processing and cleansing techniques to ensure data quality and integrity, leveraging tools such as Informatica and SQL Server (on-prem) for efficient data transformation and manipulation.
  • Designed and implemented data pipelines using AWS Lambda functions and AWS S3, enabling seamless and automated data ingestion from various sources into the data lake.
  • Implemented data warehousing solutions using AWS Redshift, enabling scalable and high-performance storage and retrieval of structured and semi-structured data for analytics and reporting purposes.
  • Collaborated with cross-functional teams to gather and understand business requirements, translating them into technical specifications for data engineering solutions.
  • Experienced in working with SQL databases, using Python and Scala to write efficient SQL queries for data retrieval and manipulation.
  • Worked closely with data scientists and analysts to enable advanced analytics and machine learning capabilities on data engineering platforms, leveraging the power of big data technologies.
  • Collaborated with stakeholders to define data architecture strategies and roadmaps, providing technical expertise and guidance on the selection and implementation of appropriate data engineering technologies.
  • Conducted performance tuning and optimization activities to enhance the speed and efficiency of data processing and analysis workflows.
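A minimal sketch of the Lambda-triggered S3 ingestion described above; it only parses the S3 event payload (the actual boto3 download/load calls are omitted, and the bucket and key names in the usage are illustrative):

```python
import urllib.parse

def lambda_handler(event, context=None):
    """Extract (bucket, key) pairs from an S3 put-event payload.

    A real handler would then fetch each object with boto3 and load it into
    the data lake; that side-effecting code is intentionally left out.
    """
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # S3 event keys arrive URL-encoded (e.g. spaces become '+').
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return {"count": len(objects), "objects": objects}
```

Keeping the parsing pure like this makes the handler testable without AWS credentials, which is why ingestion logic is usually separated from the S3/Redshift calls.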

DXC Technology

Data Engineer
08.2018 - 01.2021

Job overview

  • Conducted thorough analysis of business requirements, documenting and translating them into actionable insights. Created comprehensive process and system flow charts to visualize the implementation plan.
  • Designed and implemented a cutting-edge Big Data Analysis system, leveraging Tableau as the primary dashboarding tool. Architected the system to efficiently handle large volumes of data, ensuring optimal performance and data integrity.
  • Led the redevelopment of a centralized enterprise data warehouse by reverse engineering existing reports. Streamlined data storage and retrieval processes, enhancing overall system efficiency and reliability.
  • Developed multiple complex Extract, Transform, Load (ETL) processes and Cubes to extract data from diverse sources using tools such as SSIS, SSAS, and .Net. Ensured seamless data integration and maintained data consistency throughout the system.
  • Spearheaded the design and implementation of end-to-end data solutions on the Azure platform, leveraging services such as Azure Data Factory, Azure Databricks, and Azure Synapse Analytics.
  • Developed scalable and efficient data pipelines using Azure Data Factory, ensuring the smooth and reliable movement of data from various sources to the target systems.
  • Designed and implemented data ingestion processes, including data extraction, transformation, and loading (ETL), using Azure Data Factory and Azure Databricks.
  • Utilized Azure Synapse Analytics (formerly SQL Data Warehouse) to build high-performance data warehousing solutions, enabling advanced analytics and reporting capabilities for the organization.
  • Successfully developed and optimized data pipelines using Python, PySpark, and Scala, ensuring seamless data integration and transformation across various sources and formats.
  • Leveraged Azure Data Lake Storage and Azure Blob Storage to efficiently store and manage large volumes of structured and unstructured data, enabling seamless data access and retrieval.
  • Developed and maintained data pipelines using the Azure ecosystem, including Azure Databricks, Azure Data Lake Storage, and Azure Synapse Analytics, for seamless and scalable data processing and analysis.
  • Demonstrated strong problem-solving skills and the ability to troubleshoot complex data issues, ensuring the stability and reliability of Azure data solutions.
  • Actively kept up to date with the latest developments in Azure data engineering, continuously expanding knowledge and skillset through training and certifications.
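The extract-transform-load pattern those pipelines follow can be reduced to a small stdlib sketch (the CSV columns and the transform rule are hypothetical, and a list stands in for the warehouse table that Synapse or SSIS would target):

```python
import csv
import io

def extract(csv_text):
    """Extract: parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: normalize names and cast amounts to float."""
    return [
        {"name": r["name"].strip().title(), "amount": float(r["amount"])}
        for r in rows
    ]

def load(rows, sink):
    """Load: append rows to a target (a list standing in for a warehouse table)."""
    sink.extend(rows)
    return len(rows)
```

Each stage has a single responsibility, which is what lets orchestration tools such as Azure Data Factory retry or parallelize stages independently.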

Education

Chicago State University
Chicago, IL

Master of Science in Computer Science
05.2022

University Overview

GPA: 4.0/4.0

  • Received Graduate award for academic excellence.

Mahatma Gandhi University
Telangana, India

Bachelor of Technology in Electronics and Communication Engineering
05.2018

University Overview

GPA: 71.72


Skills

  • Programming: Python, PySpark, Shell Scripting
  • Big Data: Apache Spark, Hadoop, HDFS, MapReduce, Hive, Oozie, HBase, Cloudera
  • Database Technologies: MySQL, PostgreSQL, Oracle, NoSQL
  • Data Warehousing: Amazon Redshift, Microsoft Azure, Snowflake, Amazon DynamoDB, PostgreSQL, Amazon S3, Teradata, Amazon RDS
  • Data Visualization and Reporting Tools: Tableau, Power BI, Amazon QuickSight
  • ETL and Developer Tools: Informatica, Apache Airflow, SSIS, Hadoop, AWS Glue, Azure Data Factory, Matillion, Alteryx, Terraform, GitHub, JIRA, Rally, Confluence, Jenkins, JupyterLab, IntelliJ, Databricks, Palantir Foundry
  • Operating Systems: Windows, Linux/macOS

Timeline

Data Engineer

New York Life Insurance Co
2023.08 - Current (2 years & 1 month)

Cloud Data Engineer

Salesforce
08.2022 - 08.2023

Data Engineer

Cloud Infra IT Solutions
01.2022 - 07.2022

Data Engineer

DXC Technology
08.2018 - 01.2021

Chicago State University

Master of Science in Computer Science

Mahatma Gandhi University

Bachelor of Technology in Electronics and Communication Engineering