VAMSHI SAKINALA

TX, USA

Summary

Experienced Data Engineer with 5+ years of expertise in SQL Server development, Data Engineering, and ETL processes. Proficient in managing large-scale datasets, developing and optimizing data pipelines, and automating processes using Python, PySpark, Databricks, and PowerShell. Strong understanding of the full development lifecycle, including design, testing, and documentation. Adept at improving data processes for enhanced scalability and efficiency on both on-premises and cloud platforms.

Overview

6 years of professional experience

Work History

Data Engineer

Capgemini
01.2023 - Current
  • Supported SQL Server databases, optimizing stored procedures to enhance data processing and retrieval efficiency in line with system compliance policies
  • Developed and maintained Databricks notebooks for data loading into Delta tables, integrating Azure Linked Services for seamless data migration and transformation
  • Leveraged PySpark and Python to optimize ETL pipelines, ensuring efficient extraction, transformation, and loading of data from PostgreSQL to Snowflake
  • Implemented PowerShell scripts to automate data integration tasks within Azure Data Factory, improving operational efficiency and compliance adherence
  • Utilized Azure Data Factory to orchestrate and monitor data pipelines, ensuring high reliability and responsiveness in data processing workflows
  • Conducted design review sessions and provided leadership in testing/validation, ensuring the integrity and scalability of data solutions across the enterprise
  • Managed data extraction and integration using REST and ODBC protocols, facilitating seamless data flow between SQL Server, Databricks, and other data sources
  • Collaborated with cross-functional teams to design and implement infrastructure improvements, automating processes and optimizing data delivery for business needs

Data Engineer / Data Analyst

Legato Technologies
07.2020 - 07.2021
  • Migrated legacy SQL Server databases to Azure, improving scalability, security, and data accessibility across the organization
  • Developed Python and PySpark scripts to automate data extraction and transformation, enhancing the performance and reliability of data pipelines
  • Created and optimized stored procedures in SQL Server, significantly improving query performance and data retrieval speed
  • Integrated Databricks with Azure SQL databases, streamlining data processing and enabling real-time analytics through optimized ETL workflows
  • Designed and implemented complex data models in Databricks, ensuring efficient handling of large, complex datasets for machine learning applications
  • Utilized PowerShell scripting for automation, enhancing the efficiency of data transfer processes and system maintenance tasks
  • Conducted extensive testing and validation of data pipelines, ensuring compliance with internal data governance and quality standards
  • Developed Power BI dashboards to visualize data trends and insights, supporting strategic decision-making within the company

Data Engineer / Data Analyst

Genpact
10.2018 - 06.2020
  • Developed and maintained SQL Server databases, creating efficient stored procedures and scripts to support robust data warehousing systems
  • Built and optimized ETL processes using SSIS, integrating SQL Server with Azure Databricks for enhanced data transformation and analytics capabilities
  • Leveraged PySpark within Databricks to process large-scale datasets, improving the performance and scalability of data pipelines
  • Utilized PowerShell to automate routine database maintenance and data migration tasks, reducing manual intervention and error rates
  • Implemented advanced SQL querying techniques to support complex data analysis, improving the accuracy and timeliness of business intelligence reports
  • Designed and managed data models in SQL Server, employing best practices in database design to support efficient data retrieval and processing
  • Developed and deployed Power BI dashboards, visualizing key business metrics and supporting data-driven decision-making across departments
  • Collaborated with IT and business teams to identify and implement process improvements, enhancing data pipeline efficiency and reliability

Education

Master of Science - Business Analytics

University of New Haven
New Haven, CT
12.2022

Bachelor of Commerce - Accounting & Finance

Osmania University
Hyderabad, TG
12.2019

Skills

  • Languages: Python, SQL
  • Python Libraries: NumPy, Pandas, Matplotlib, Seaborn
  • Cloud Platforms: Azure (Azure Data Factory, Databricks, Logic Apps), AWS (Glue, S3, EMR)
  • Data Visualization: Power BI, Tableau
  • Big Data Technologies: Hadoop, Spark, Kafka, Snowflake
  • Development & Tools: Git, GitHub, Agile Methodology, CI/CD pipelines, API Development
  • Relational Database Technologies: SQL Server, Azure SQL Data Warehouse, Snowflake
  • Data Integration: ETL Pipelines (PySpark, Spark SQL, AWS Glue)
