SWAROOP MAMIDIPALLY

Summary

Experienced Data Engineer specializing in building and optimizing scalable data pipelines and managing cloud-based architectures. Proficient in technologies like Hadoop, Spark, Kafka, and Azure, with a proven track record of automating data workflows using SQL and Python to enhance processing efficiency. Successfully collaborated with cross-functional teams to transform raw data into actionable insights, supporting key business decisions. Committed to leveraging expertise to develop high-performance data systems that meet enterprise-level requirements.

Overview

6 years of professional experience

Work History

Senior Data Engineer

HMS Insurance
10.2023 - Current
  • Developed and optimized ETL processes using IBM InfoSphere, improving data transformation and movement across multiple data sources to enhance decision-making
  • Utilized Python libraries (NumPy, Pandas, Scikit-learn) to clean, analyze, and model data, delivering actionable insights and enhancing data-driven strategies
  • Automated workflows and data pipelines using Autosys and PySpark, improving data availability and reducing processing time by 40%
  • Designed and implemented scalable data solutions on AWS (EC2, S3, Redshift), ensuring efficient big data processing and cloud-based storage solutions
  • Built and maintained complex SQL queries in MySQL and SQL Server, optimizing performance and ensuring data accuracy across applications
  • Worked with Hadoop ecosystem tools (Hive, HDFS, SparkSQL) to store and process large datasets, reducing processing time and enhancing data accessibility
  • Enhanced reporting capabilities using SSIS and Microsoft Suite, improving cross-departmental collaboration and increasing reporting efficiency
  • Implemented version control and collaborated on projects using Git, JIRA, and Confluence, ensuring smooth development cycles and code quality
  • Conducted data modeling and validation in R and Python, ensuring data consistency and integrity for analytics and machine learning models
  • Collaborated with cross-functional teams to develop and test data solutions, aligning engineering efforts with business goals and improving data availability
  • Supported data analysis using T-SQL, optimizing complex SQL queries to manage and manipulate large datasets, improving business insights.

Data Engineer

Autodesk
09.2022 - 10.2023
  • Designed and optimized scalable data pipelines for efficient ETL processes using Azure Data Factory, Apache Spark, and AWS Glue, reducing processing time by 50% (from 6 hours to 3 hours)
  • Built and maintained cloud-based data systems on Azure, including Azure SQL Database and Azure Data Lake, to ensure scalable data storage and processing solutions
  • Architected cloud data infrastructure using Azure Synapse Analytics for seamless data integration and analysis, optimizing cloud-based data solutions
  • Developed and optimized complex SQL queries for data transformation and aggregation, improving query performance by 40% and ensuring data quality for reporting
  • Managed containerized data workflows with Kubernetes, ensuring scalability, availability, and high performance of data pipelines
  • Integrated third-party APIs using Python for data transformation and loading into cloud systems, streamlining data ingestion processes
  • Worked with SQL and NoSQL databases (e.g., MongoDB, Cassandra) to design efficient data models for diverse business use cases
  • Processed large datasets using Hadoop and Spark, implementing distributed computing models to accelerate data processing and improve handling efficiency.

Associate Data Engineer

Fission Labs
09.2018 - 06.2021
  • Developed complex SQL queries to extract, clean, and transform large datasets, ensuring data accuracy for reporting and analytics
  • Designed and optimized data pipelines to ingest, process, and store large datasets from multiple sources, ensuring efficient data workflows
  • Optimized database performance through indexing and query tuning, reducing execution times and enhancing data retrieval
  • Integrated Docker with CI/CD pipelines (using Jenkins) for automating testing, building, and deployment of containerized data applications
  • Created and managed database schemas for structured data storage in MySQL, PostgreSQL, and SQL Server, improving querying efficiency
  • Implemented data security and governance practices on Azure, ensuring encryption, access control, and compliance with GDPR and HIPAA
  • Utilized Hadoop HDFS for data storage and SparkSQL for querying, enabling fast and scalable analytics across large datasets
  • Built real-time data integration workflows using Apache Kafka to process live data streams for immediate analysis
  • Collaborated with data scientists and engineers to implement machine learning models on large datasets using Spark MLlib for predictive analytics
  • Automated data reporting workflows with SQL and SSIS, reducing report generation time and providing real-time insights to stakeholders
  • Performed data extraction and transformation tasks in SQL Server and MySQL, supporting data-driven decisions and enhancing reporting quality
  • Collaborated with cross-functional teams to gather data requirements and ensure solutions aligned with business goals, providing actionable insights.

Education

Master of Science - Computer Science

Valparaiso University
Valparaiso, Indiana

Bachelor of Technology - Information Technology

Sreenidhi Institute of Science and Technology

Skills

  • Programming Languages: Python (NumPy, Pandas, PySpark, Dash), R, SQL, T-SQL, Scala, Java
  • ETL Tools: Apache Airflow, Apache NiFi, IBM InfoSphere Information Server, Talend, Stitch
  • Database Management: MySQL, PL/SQL, Microsoft SQL Server, MongoDB, Cassandra
  • CI/CD & Scripting: Jenkins, GitLab CI, Bash, Shell Scripting
  • Cloud Technologies: Microsoft Azure, Amazon Web Services, Google Cloud Platform
  • Hadoop/Big Data Technologies: Hadoop, HDFS, Hive, PySpark, SparkSQL, Kafka
  • Tools: PyCharm, Autosys, Teradata SQL Assistant, JIRA, Confluence, Postman, Tableau

Accomplishments

  • Certifications: IBM Data Engineering Professional, Microsoft Certified: Azure Data Engineer Associate, SAS Certified Big Data Professional
