19+ years of experience in Data Engineering, Data Analytics, and Cloud Technologies. Proven expertise in designing and optimizing data pipelines, ETL workflows, and analytical solutions that drive business insights. Skilled in Python, PySpark, AWS, and Hadoop, with a strong background in data warehousing, cloud migration, and DevOps. Adept at managing large-scale data processing, stakeholder engagement, and vendor relationships while delivering high-performance, scalable, and cost-effective solutions. Passionate about leveraging data to enable decision-making and innovation.
• Designed and deployed end-to-end ETL workflows leveraging AWS Glue, S3, and Athena to support real-time and batch data processing, improving data availability for analytics and reporting (a simplified Glue job sketch follows this list).
• Developed reusable ETL scripts and automation logic to reduce manual intervention, streamline data ingestion processes, and ensure consistent data quality.
• Led AWS Glue job optimization and Spark performance-tuning initiatives, reducing ETL processing time by 40% and improving job reliability (a representative tuning sketch follows this list).
• Collaborated with data analysts and data scientists to deliver curated datasets, enabling advanced analytics and machine learning model development.
• Implemented data quality checks and monitoring frameworks to proactively identify anomalies and bottlenecks in data pipelines (an illustrative check sketch follows this list).
• Delivered performance dashboards and operational KPIs using Amazon QuickSight and Amazon Redshift, enabling data-driven decision-making for senior leadership.
• Conducted root cause analysis for data pipeline failures, driving the resolution of recurring issues and implementing long-term improvements in data architecture.
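The sketches below are simplified, hypothetical illustrations of the kinds of work summarized above, not excerpts from any employer's codebase; all database, table, and S3 names are assumed. This first sketch shows the general shape of a Glue ETL job like those referenced in the first bullet: read from an assumed Data Catalog table, standardize columns, and write Parquet to an assumed S3 path so Athena can query it.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Resolve the job name passed in by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw records from a Glue Data Catalog table (assumed database/table names).
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_zone",   # assumed catalog database
    table_name="orders",   # assumed source table
)

# Standardize column names before landing the data in the curated zone.
curated = ApplyMapping.apply(
    frame=raw,
    mappings=[
        ("order_id", "string", "order_id", "string"),
        ("order_ts", "string", "order_ts", "string"),
        ("amount", "double", "amount", "double"),
    ],
)

# Write Parquet to S3 (assumed bucket) so Athena can query it directly.
glue_context.write_dynamic_frame.from_options(
    frame=curated,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/orders/"},
    format="parquet",
)

job.commit()
```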
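This second sketch illustrates the style of Spark performance tuning mentioned in the fourth bullet: right-sizing shuffles, enabling adaptive query execution, pruning early, and coalescing output files. The configuration values and paths are illustrative assumptions; real settings depend on data volume and executor/DPU sizing.

```python
from pyspark.sql import SparkSession

# Illustrative tuning values only; actual settings depend on workload size.
spark = (
    SparkSession.builder
    .appName("etl-tuning-sketch")
    .config("spark.sql.shuffle.partitions", "200")       # right-size shuffles for joins/aggregations
    .config("spark.sql.adaptive.enabled", "true")        # let AQE coalesce small or skewed partitions
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

events = spark.read.parquet("s3://example-raw-bucket/events/")  # assumed input path

# Filter early and repartition on the aggregation key to reduce shuffle volume.
recent = events.where("event_date >= '2024-01-01'").repartition("customer_id")

daily = recent.groupBy("event_date", "customer_id").count()

# Coalesce before writing to avoid producing thousands of small S3 objects.
daily.coalesce(50).write.mode("overwrite").parquet(
    "s3://example-curated-bucket/daily_counts/"
)
```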
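This final sketch shows a minimal version of the pipeline data quality checks referenced in the sixth bullet: verify the dataset is non-empty, required columns contain no nulls, and the key column is unique, then fail fast so the orchestrator can alert before bad data reaches downstream consumers. Column names and the input path are assumptions for illustration.

```python
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def run_quality_checks(df: DataFrame, key_col: str, required_cols: list) -> list:
    """Return a list of human-readable failures for basic pipeline checks."""
    failures = []

    # 1. Non-empty dataset.
    row_count = df.count()
    if row_count == 0:
        failures.append("dataset is empty")

    # 2. Required columns must not contain nulls.
    for col in required_cols:
        null_count = df.filter(F.col(col).isNull()).count()
        if null_count > 0:
            failures.append(f"{null_count} null values in required column '{col}'")

    # 3. Key column must be unique.
    duplicate_count = row_count - df.select(key_col).distinct().count()
    if duplicate_count > 0:
        failures.append(f"{duplicate_count} duplicate values in key column '{key_col}'")

    return failures


if __name__ == "__main__":
    spark = SparkSession.builder.appName("dq-checks-sketch").getOrCreate()
    orders = spark.read.parquet("s3://example-curated-bucket/orders/")  # assumed path
    problems = run_quality_checks(orders, key_col="order_id", required_cols=["order_id", "order_ts"])
    if problems:
        # Fail fast so the scheduler surfaces the issue before downstream jobs run.
        raise ValueError("Data quality checks failed: " + "; ".join(problems))
```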