Sankari Danesh

Summary

Certified AWS Cloud Practitioner with over 9 years of hands-on experience in data and quality engineering, specializing in on-premises and big data technologies. Expertise includes designing dimensional and fact models, developing data acquisition and quality frameworks, and executing cloud data migrations across commercial insurance, entertainment and media, and digital advertising sectors. Proven track record as a primary contact for analytics and business teams, successfully driving data quality initiatives while supporting senior management in making informed, data-driven decisions. Committed to fostering a collaborative environment by enhancing co-workers' business acumen and understanding of data, while adhering to agile methodologies for effective product and program delivery.

Overview

10 years of professional experience
2 Certifications

Work History

Senior Data Engineer

Paccar Solutions
07.2024 - Current
  • Project: PowerTrain – Fleet Health Management
  • Domain: AWS, Snowflake, MS-SQL
  • Technology: AWS, SQL, Python, PySpark
  • Roles and Responsibilities:
  • Data Architecture & Design:
  • Designed scalable, cloud-based data architecture for Fleet Health Management application, leveraging AWS services such as Redshift, Glue, and S3.
  • Developed schema and optimized data models in Amazon Redshift to support analytics, real-time metrics, and historical data comparisons.
  • Implemented ETL pipelines for extracting, transforming, and loading high-volume fleet data from diverse sources into Redshift.
  • Glue Job Development:
  • Built and orchestrated Glue jobs using Apache Spark to process trillions of records, ensuring efficient handling of fleet information, failure modes, and dealer locations.
  • Optimized Glue job performance by configuring Data Processing Units (DPUs) and partitioning large datasets for parallel processing.
  • Ensured data integrity by implementing error-handling mechanisms and data validation in Glue pipelines.
  • API Development:
  • Designed and implemented RESTful APIs to serve data to the Fleet Health Management application, enabling real-time fleet information access for end users.
  • Integrated APIs to provide insights on frequently failing components, recommended dealer locations for repairs, and metrics comparisons with similar fleets.
  • Secured API endpoints using IAM roles and implemented robust authentication and authorization mechanisms.
  • Fleet Analytics & Insights:
  • Developed data pipelines to aggregate and analyze fleet performance metrics, identifying patterns and trends in component failures and vehicle performance.
  • Generated reports and dashboards to compare metrics such as mileage, idle time, and fuel efficiency across similar fleets for benchmarking.
  • Collaboration & Documentation:
  • Worked closely with product managers, data scientists, and DevOps teams to define data requirements and system specifications.
  • Documented architecture diagrams, ETL workflows, and API design for internal and external stakeholders.
  • Conducted knowledge-sharing sessions to onboard team members to the Fleet Health Management architecture and workflows.
  • Roles: Senior Data Engineer

Data Solution Engineer

American Family Insurance
09.2023 - 12.2023

  • Designed, implemented, and maintained data pipelines for efficient data ingestion, transformation, and loading using PySpark on EC2 instances after cluster configuration was completed in the EMR environment.
  • Developed an automated data validation framework to identify and resolve data quality issues during and after the ingestion process, before data is delivered to business users.
  • Created and read DynamoDB configuration parameters for ETL execution and automation script execution.
  • Monitored AWS infrastructure and pipelines to ensure smooth and reliable data flow, promptly addressing any issues or bottlenecks.
  • Validated ETL data pipelines to ensure accuracy, completeness, and adherence to business requirements and data standards.
  • Created event-driven data pipelines using AWS Lambda.
  • Created data models and data mappings for different stages of data within the data warehouse, including the creation of dimensional and fact tables using SQL and Python.
  • Used AWS Glue, Step Functions, EC2, and data pipelines for efficient data processing and management.
  • Generated ad hoc reports using AWS Glue to meet the data needs of end users, providing them with actionable insights.
  • Ensured data validation across all layers of the data ecosystem to maintain data integrity and consistency.
  • Collaborated with teams outside of the Commercial Data Platform (CDP) to support end-to-end testing, spanning application systems to Tableau system integration.
  • Played a key role in the architectural design of the CDP data model, providing expertise and insights for optimal data management and performance.
  • Prepared a comprehensive list of scenarios covering all layers of data to support thorough testing and validation processes.
  • Conducted report validations in Tableau, ensuring the accuracy and integrity of the visualized data.
  • Created and published live data sources and extracts in Tableau, utilizing custom SQL and data tables to meet the reporting requirements of end users.

Data Engineer - Automation

Disney Streaming Services
10.2021 - 03.2023
  • Project: Data Activation - DAF
  • Domain: AWS
  • Technology: AWS, Python, Snowflake
  • Responsibilities:
  • Enhanced data feeds and supported workflows in Informatica and ActiveBatch schedules.
  • Developed SQL queries to support data integration within the existing framework, ensuring compatibility with the new DAF framework.
  • Created JSON files and formulated SQL statements to facilitate seamless data feeds aligned with the new DAF framework.
  • Designed and implemented Directed Acyclic Graphs (DAGs) in Airflow to schedule and manage data processing jobs and streamline data feeds.
  • Utilized Databricks notebooks to compare files across different S3 locations, enabling efficient data analysis and identification of discrepancies.
  • Worked with a MySQL database to manage watermark settings.
  • Designed PySpark SQL queries in Databricks to be executed against Snowflake datasets.
  • Developed an automation framework using Python and Pytest to validate data feeds migrated via a Java application, ensuring accuracy and efficiency in the validation process.
  • Defined test strategies and test plans for highly complex scenarios, aiming to automate the testing process and streamline operations.
  • Demonstrated expertise in writing SQL queries, ranging from simple to complex, to extract and analyze data in Snowflake, ensuring efficient data retrieval.
  • Optimized Snowflake SQL queries for effective data processing.
  • Utilized Python for loading and processing large files in S3, enabling effective data validation and verification procedures.
  • Hands-on experience with automated code deployment to lower and higher environments using Jenkins and GitLab CI/CD pipelines; used Git repositories to manage and track code changes.
  • Experience in automatically deploying code changes to staging and pre-production environments.
  • Roles: Data Engineer - Automation

Data Solutions Engineer

Homesite Insurance
10.2019 - 10.2021
  • Project: Commercial Data Platform
  • Domain: AWS
  • Technology: AWS, SQL, Python, Tableau
  • Responsibilities:
  • Designed, implemented, and maintained data pipelines for efficient data ingestion, transformation, and loading using PySpark on EC2 instances after cluster configuration was completed in the EMR environment.
  • Developed an automated data validation framework to identify and resolve data quality issues during and after the ingestion process, before data is delivered to business users.
  • Created and read DynamoDB configuration parameters for ETL execution and automation script execution.
  • Monitored AWS infrastructure and pipelines to ensure smooth and reliable data flow, promptly addressing any issues or bottlenecks.
  • Validated ETL data pipelines to ensure accuracy, completeness, and adherence to business requirements and data standards.
  • Created event-driven data pipelines using AWS Lambda.
  • Created data models and data mappings for different stages of data within the data warehouse, including the creation of dimensional and fact tables using SQL and Python.
  • Used AWS Glue, Step Functions, EC2, and data pipelines for efficient data processing and management.
  • Generated ad hoc reports using AWS Glue to meet the data needs of end users, providing them with actionable insights.
  • Ensured data validation across all layers of the data ecosystem to maintain data integrity and consistency.
  • Collaborated with teams outside of the Commercial Data Platform (CDP) to support end-to-end testing, spanning application systems to Tableau system integration.
  • Played a key role in the architectural design of the CDP data model, providing expertise and insights for optimal data management and performance.
  • Prepared a comprehensive list of scenarios covering all layers of data to support thorough testing and validation processes.
  • Conducted report validations in Tableau, ensuring the accuracy and integrity of the visualized data.
  • Created and published live data sources and extracts in Tableau, utilizing custom SQL and data tables to meet the reporting requirements of end users.
  • Roles: Data Solutions Engineer

Big Data Engineer

Avvo Inc
07.2015 - 10.2019
  • Project: Avvo Business Services
  • Domain: ETL & Big Data
  • Technology: Hadoop (Hive, Impala), Datadog, Python
  • Responsibilities:
  • Created and maintained data pipelines.
  • Extracted and loaded data into the Hadoop data warehouse environment.
  • Helped analyze complex data sets and ideate on the underlying business problems.
  • Performed end-to-end data migration testing from Netezza to Hadoop.
  • Tested a variety of reports, including those that informed major business decisions.
  • Served as Scrum Master for some of the sprints.
  • Monitored data against business metrics for errors and data mismatches using Datadog.
  • Created and deployed workflows for the monitoring metrics.
  • Participated in design meetings and contributed inputs that shaped design changes.
  • Emerged as an SME across the business services data.
  • Tested across all layers of the data model.
  • Prepared automated scripts to run regression tests.
  • Monitored business metrics using Tableau reports.
  • Helped improve product processes through efficient data monitoring techniques.
  • Tested the Avvo app on mobile devices, including the data flow from the mobile app to the warehouse.
  • Roles: Big Data Engineer

Education

Master's - Computer Science

University of Madras

Bachelor of Computer Science

University of Madras

Skills

  • Experienced with AWS services including Glue, DynamoDB, and Lambda
  • Skilled in data processing with Hive, Impala, and Sqoop
  • Proficient in ETL development and data transformation
  • Snowflake data warehouse expertise
  • DAG job creation and scheduling
  • Proficient in CI/CD pipeline deployment
  • Streamlined job execution using DAGs
  • Experienced with Parquet, JSON, and CSV files
  • Tableau data visualization expertise
  • Expertise in maintaining data quality
  • Proficient in Google Analytics
  • Experienced in collaborative team environments
  • Analytical problem-solving skills
  • Experienced in optimizing data workflows with Snowflake SQL
  • Adept at generating JSON files and formulating SQL statements
  • Data quality assurance
  • Automation framework development using Python and Pytest
  • Test strategy development
  • Data loading and processing in S3 using Python
  • Data warehouse development and support
  • Expertise in big data technologies including Hadoop, Spark, and Google BigQuery
  • Data governance implementation
  • Proficient in using defect tracking tools like JIRA and Bugzilla
  • Skilled in aligning project inter-dependencies and managing milestones

Certification

  • AWS Cloud Practitioner, 2021-01-01
  • Certified Scrum Master (CSM), 2022-01-01

Timeline

Senior Data Engineer

Paccar Solutions
07.2024 - Current

Data Solution Engineer

American Family Insurance
09.2023 - 12.2023

Data Engineer - Automation

Disney Streaming Services
10.2021 - 03.2023

Data Solutions Engineer

Homesite Insurance
10.2019 - 10.2021

Big Data Engineer

Avvo Inc
07.2015 - 10.2019

Master's - Computer Science

University of Madras

Bachelor of Computer Science

University of Madras