Susmitha

Lakeville, MN

Summary

Seasoned IT professional with over 10 years of experience in Data Engineering, adept at developing data-intensive applications across diverse sectors including Healthcare, Telecommunications, and Finance. Strong cloud and on-premise experience spanning Data Engineering, Data Analytics, Data Modeling, Data Warehouses, Data Marts, Big Data, and ETL development. Skilled in Python, SQL, PySpark, AWS, Azure Data Lake, GCP, Snowflake, Databricks, Tableau, and BI reporting, with a strong emphasis on data engineering and data analysis using Big Data tools. History of mining, warehousing, and analyzing data at the company-wide level. Knowledgeable about the principles and implementation of machine learning and deep learning. Results-oriented and proactive, with strong project management and communication skills.

Overview

11 years of professional experience

Work History

Sr. Data Engineer

Elevance Health
06.2022 - Current
  • Used Apache Spark on Databricks and Snowflake for data ingestion, transformation, and loading (a representative sketch follows this list).
  • Architected and maintained a data warehouse solution on Snowflake, ensuring efficient storage, query performance, and data security.
  • Managed cloud data infrastructure on AWS, including S3 for data storage, Redshift for data warehousing, and Lambda for serverless data processing.
  • Integrated diverse data sources including relational databases, APIs, and unstructured data into centralized data lakes on AWS.
  • Utilized Databricks for large-scale data processing, leveraging Spark for distributed computing to handle big data workloads efficiently.
  • Designed and optimized data models in Snowflake, focusing on performance, scalability, and ease of use for analytical queries.
  • Created interactive and insightful dashboards and reports using Power BI, enabling data-driven decision-making across the organization.
  • Implemented data quality checks, cleansing routines, and governance frameworks to ensure data accuracy, consistency, and compliance.
  • Developed real-time data processing workflows using AWS Kinesis and Databricks, enabling timely data insights and actions.
  • Optimized the performance of data pipelines and queries in Snowflake and Databricks, reducing processing times and improving resource utilization.
  • Automated data workflows using AWS Step Functions, Apache Airflow, and custom scripting to ensure reliable and repeatable data processing.
  • Collaborated with cross-functional teams to define requirements and develop end-to-end solutions for complex data engineering projects.
  • Worked closely with data scientists, analysts, and business stakeholders to understand data requirements and deliver tailored data solutions.
  • Implemented data security measures and compliance protocols in AWS and Snowflake, ensuring data protection and regulatory adherence.
  • Set up monitoring and alerting for data pipelines and infrastructure using AWS CloudWatch and Databricks monitoring tools, promptly addressing any issues.
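
A minimal sketch of the Databricks-to-Snowflake ingestion pattern described above, assuming the Spark-Snowflake connector is available on the cluster; the bucket path, table name, and sf* connection options are hypothetical placeholders, not the actual configuration:

  # Minimal PySpark sketch: read raw files from S3, cleanse, load to Snowflake.
  # All paths, credentials, and table names are hypothetical placeholders.
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.appName("claims_ingest").getOrCreate()

  # Read raw claim files landed in S3.
  raw = spark.read.json("s3a://example-raw-bucket/claims/2024/*.json")

  # Basic cleansing: drop malformed rows and normalize the service date.
  clean = (raw
           .dropna(subset=["claim_id", "member_id"])
           .withColumn("service_date", F.to_date("service_date")))

  # Write to Snowflake via the Spark connector (options are illustrative).
  sf_options = {
      "sfURL": "example_account.snowflakecomputing.com",
      "sfUser": "etl_user",
      "sfPassword": "********",
      "sfDatabase": "ANALYTICS",
      "sfSchema": "CLAIMS",
      "sfWarehouse": "ETL_WH",
  }
  (clean.write
        .format("snowflake")
        .options(**sf_options)
        .option("dbtable", "CLAIMS_STG")
        .mode("append")
        .save())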

Data Engineer/ Data Analyst

Verizon
03.2022 - 05.2023
  • Used a broad spectrum of Azure services including HDInsight, Data Lake, Databricks, Blob Storage, Data Factory, Synapse, SQL, SQL DB, DWH, and Data Storage Explorer.
  • Architected and deployed sophisticated data pipelines leveraging Data Lake, Databricks, and Apache Airflow to enable seamless data integration, transformation, and orchestration.
  • Implemented continuous integration and continuous delivery process using GitLab along with Python and shell scripts to automate routine jobs, which includes synchronizing installers, configuration modules, packages, and requirements for the applications.
  • Developed multiple applications required for transforming data across multiple layers of Enterprise Analytics Platform and implemented Big Data solutions to support distributed processing using Big Data technologies.
  • Responsible for data identification and extraction using third-party ETL and data-transformation tools or scripts (e.g., SQL, Python). Worked on migration of data from on-prem SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB).
  • Developed and managed Azure Data Factory pipelines that extracted data from various data sources, and transformed it according to business rules, using Python scripts that utilized Pyspark and consumed APIs to move data into an Azure SQL database.
  • Created a new data quality check framework project in Python that utilized pandas.
  • Promotion Sales Project (Amazon Web Services, Python, Spark, Airflow, Snowflake): loaded data locally into MySQL, then extracted it with Airflow-scheduled Python scripts into Amazon Web Services S3 buckets (see the sketch after this list).
  • Transformed the data in S3 and loaded it into Snowflake for analysis, covering the full extract, transform, and load (ETL) process for ingesting data into the cloud for analytical use.
  • Developed and maintained complex data models and schemas in Tableau to support dynamic reporting and self-service analytics.
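
A condensed sketch of the Promotion Sales pipeline pattern above (MySQL extract scheduled by Airflow into S3, then a COPY into Snowflake), written against the Airflow 2 TaskFlow API; the connection IDs, bucket, stage, and table names are hypothetical placeholders:

  # Illustrative Airflow DAG: extract from MySQL, stage in S3, load to Snowflake.
  # Connection IDs, bucket, stage, and table names are hypothetical placeholders.
  from datetime import datetime
  from airflow.decorators import dag, task

  @dag(schedule="@daily", start_date=datetime(2023, 1, 1), catchup=False)
  def promotion_sales():

      @task
      def extract_to_s3() -> str:
          # Pull the day's promotion sales from MySQL and write a CSV to S3.
          from airflow.providers.mysql.hooks.mysql import MySqlHook
          from airflow.providers.amazon.aws.hooks.s3 import S3Hook
          df = MySqlHook(mysql_conn_id="mysql_sales").get_pandas_df(
              "SELECT * FROM promotion_sales WHERE sale_date = CURDATE()")
          key = "promotion_sales/daily.csv"
          S3Hook(aws_conn_id="aws_default").load_string(
              df.to_csv(index=False), key=key,
              bucket_name="example-stage-bucket", replace=True)
          return key

      @task
      def load_to_snowflake(key: str) -> None:
          # COPY the staged file into a Snowflake table via an external stage.
          from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook
          SnowflakeHook(snowflake_conn_id="snowflake_default").run(
              f"COPY INTO promotion_sales FROM @s3_stage/{key} "
              "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)")

      load_to_snowflake(extract_to_s3())

  promotion_sales()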

Data Engineer

Navy Federal
01.2021 - 02.2022
  • Designed and developed analytical solutions for gaining insights into large data sets by ingesting and transforming them in the Big Data environment using technologies like Spark, Azure, Sqoop, and Hive.
  • Successfully integrated data from on-premises (MySQL, Cassandra) and cloud sources (Blob Storage, Azure SQL DB) using Azure Data Factory, transforming data for insights in Azure Synapse.
  • Engineered Spark Scala functions for real-time data mining and reporting. Configured Spark Streaming with Apache Flume for data storage in Azure Table Storage using Scala.
  • Utilized Azure Data Lake and Databricks for data processing, employing Spark Scala scripts and UDFs (a PySpark rendering of this pattern follows this list).
  • Facilitated data movement from Azure Data Lake to Blob Storage and Snowflake, using SnowSQL scripts for business analysis.
  • Designed and deployed engaging Tableau dashboards and established DataMarts and Data Warehouses to support BI and reporting, enhancing decision-making processes.
  • Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining, machine learning, and advanced data processing.
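
The UDF-driven transformation work in this role was done in Spark Scala; the sketch below renders the same pattern in PySpark for brevity, assuming hypothetical storage paths and column names:

  # Illustrative PySpark job on Databricks: read from Azure Data Lake Storage,
  # apply a masking UDF, and land a curated set for downstream loading.
  # Storage paths and column names are hypothetical placeholders.
  from pyspark.sql import SparkSession, functions as F
  from pyspark.sql.types import StringType

  spark = SparkSession.builder.appName("member_txn_clean").getOrCreate()

  # Read raw transactions from Azure Data Lake Storage.
  txns = spark.read.parquet(
      "abfss://raw@exampleaccount.dfs.core.windows.net/transactions/")

  # UDF that masks all but the last four digits of an account number.
  @F.udf(returnType=StringType())
  def mask_account(acct: str) -> str:
      return None if acct is None else "*" * (len(acct) - 4) + acct[-4:]

  curated = (txns
             .withColumn("account_masked", mask_account("account_number"))
             .drop("account_number")
             .filter(F.col("amount") > 0))

  # Land the curated set for downstream Snowflake loading / BI reporting.
  curated.write.mode("overwrite").parquet(
      "abfss://curated@exampleaccount.dfs.core.windows.net/transactions/")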

Data Engineer/Data Analyst

Customer Analytics
06.2017 - 07.2019
  • Used GCP Services like BigQuery, Cloud Storage, Dataflow, Dataproc, Pub/Sub, Bigtable, Cloud SQL, Looker, Cloud Functions, Cloud Composer, Data Catalog, Cloud Dataprep.
  • Developed comprehensive reporting dashboards with Google Cloud Platform (GCP) tools to enhance marketing strategies and sales performance.
  • Created real-time analytics dashboards using Looker and BigQuery for immediate visual insights and efficient A/B test analysis.
  • Fostered collaboration with marketing teams using GCP services for campaign effectiveness analysis, increasing ROI.
  • Utilized Python libraries like Pandas and NumPy for querying and data validation against BigQuery datasets (illustrated in the sketch after this list).
  • Engineered end-to-end data processing pipelines with Dataflow and Apache Beam for efficient data transfer.
  • Developed interactive Tableau dashboards to visualize complex data sets, facilitating easier understanding and decision-making processes.
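
A small example of the pandas-against-BigQuery validation pattern noted above; the project, dataset, table, and check thresholds are hypothetical placeholders:

  # Illustrative data-quality check: query BigQuery into a pandas DataFrame
  # and assert basic expectations. All names below are hypothetical.
  from google.cloud import bigquery

  client = bigquery.Client(project="example-analytics-project")

  # Pull daily event counts for the last week into a DataFrame.
  sql = """
      SELECT DATE(event_ts) AS event_date, COUNT(*) AS events
      FROM `example-analytics-project.marketing.web_events`
      WHERE event_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
      GROUP BY event_date
      ORDER BY event_date
  """
  daily = client.query(sql).to_dataframe()

  # Simple validation: one row per day, and no day with zero events.
  assert len(daily) == 7, "expected one row per day for the last 7 days"
  assert (daily["events"] > 0).all(), "found a day with zero events"
  print(daily.describe())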

Data Analyst

Anblicks
08.2013 - 05.2017
  • Analyzed customer behavior data and generated insightful reports to drive business decisions and enhance marketing strategies for Customer Behavior Analysis and Reporting.
  • Utilized AWS S3 to store raw customer data from various sources, ensuring scalable and secure data storage.
  • Employed AWS Lambda with Python scripts to automate the extraction, transformation, and loading (ETL) processes, cleaning and structuring the data for analysis (see the sketch after this list).
  • Managed and optimized the data storage using AWS RDS for structured query capabilities, ensuring efficient data retrieval and manipulation.
  • Conducted comprehensive data analysis using Python libraries (Pandas, NumPy, Matplotlib) to uncover customer behavior patterns and trends, providing actionable insights.
  • Generated detailed reports and presentations in Power BI to communicate findings and recommendations to the marketing and sales teams, enhancing their strategic planning and execution.
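
A minimal sketch of the Lambda-based ETL pattern described above, assuming an S3 put trigger; the bucket prefixes and column names are hypothetical placeholders:

  # Illustrative AWS Lambda handler: triggered by an S3 put, it cleans a raw
  # customer-behavior CSV and writes the result to a curated prefix.
  # Bucket prefixes and column names are hypothetical placeholders.
  import io
  import boto3
  import pandas as pd

  s3 = boto3.client("s3")

  def lambda_handler(event, context):
      # Locate the newly uploaded raw file from the S3 event payload.
      record = event["Records"][0]["s3"]
      bucket, key = record["bucket"]["name"], record["object"]["key"]
      body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

      # Clean and structure the data: drop duplicates, standardize timestamps.
      df = pd.read_csv(io.BytesIO(body))
      df = df.drop_duplicates(subset=["customer_id", "event_time"])
      df["event_time"] = pd.to_datetime(df["event_time"], errors="coerce")
      df = df.dropna(subset=["event_time"])

      # Write the cleaned file to a curated prefix for downstream use.
      out = io.BytesIO()
      df.to_csv(out, index=False)
      s3.put_object(Bucket=bucket,
                    Key=key.replace("raw/", "curated/"),
                    Body=out.getvalue())
      return {"rows": len(df)}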

Education

Master of Science - Computer Science

Concordia University, St. Paul
Saint Paul, MN
12.2020

Skills

  • Python 3.x, PySpark, Scala, SQL
  • AWS, Azure, GCP
  • Teradata, Metadata
  • SQL Server, MySQL, PostgreSQL, Oracle Database, MongoDB (NoSQL), HBase, Cassandra
  • Snowflake, Databricks
  • Hadoop (HDFS, MapReduce, Hive, HBase), Apache Kafka, Sqoop, Apache Spark
  • Talend, Informatica, Apache Airflow, Apache NiFi
  • Git, Jenkins
  • Big Data Processing, Data Analysis, Data Migration, ETL development

Timeline

Sr. Data Engineer

Elevance Health
06.2022 - Current

Data Engineer/ Data Analyst

Verizon
03.2022 - 05.2023

Data Engineer

Navy Federal
01.2021 - 02.2022

Data Engineer/Data Analyst

Customer Analytics
06.2017 - 07.2019

Data Analyst

Anblicks
08.2013 - 05.2017

Master of Science - Computer Science

Concordia University, St. Paul