Nishanth Sudhaharan - Data & Strategy Analyst

Summary

· Analytical and process-oriented Data Engineer with 3 years of experience in building scalable data solutions and real-time analytics pipelines, with expertise in Azure Data Services, Kafka, Spark, PySpark, SQL, and Power BI, leveraging Agile and Waterfall methodologies to deliver actionable insights.

· Built and orchestrated containerized data infrastructure using Docker Compose with services like Zookeeper, Kafka, Spark (Master/Workers), and Cassandra, enabling efficient development, testing, and streaming of event-driven data pipelines.

· Developed streaming data pipelines using Apache Kafka and Azure Event Hubs, integrating with Spark Structured Streaming and Azure Functions to process and transform high-throughput event data in real time for downstream analytics and storage.

· Engineered scalable ETL workflows on Azure Databricks using PySpark and SQL, performing large-scale batch and streaming transformations on structured and semi-structured data, and storing results in Azure Synapse and Data Lake Storage.

· Delivered real-time dashboards in Power BI by integrating with Microsoft Fabric and streaming sources like Kafka and Event Hubs, allowing business stakeholders to monitor operational metrics with minimal latency and drive timely decision-making.

· Worked through the entire data science lifecycle by performing exploratory data analysis (EDA), feature engineering, and dimensionality reduction using PCA, and then training and evaluating machine learning models using Apache Spark MLlib, effectively bridging data engineering with model deployment in distributed environments.

Overview

4 years of professional experience

1 Certification

Work History

Data & Strategy Analyst

InPower

03.2024 - 08.2024

Conducted competitor pricing analysis and user segmentation by researching market leaders in online wellness and counseling platforms, influencing a three-tier subscription strategy tailored to InPower’s user base.
Developed a component classification model using Python to evaluate and cluster InPower’s digital services (podcasts, workshops, counseling) into strategic subscription bundles based on feature usage trends, perceived value, and target user engagement metrics.
Estimated per-feature cost and profitability projections using benchmarked data, enabling stakeholders to assess break-even points and refine monetization strategies.
Supported product development by identifying innovative engagement features from web-based competitor research, aiding feature prioritization for future platform enhancements.
Created visually engaging presentations and data visualizations in Tableau to communicate strategic pricing models and user growth forecasts to InPower leadership.
Collaborated with product leadership and executive team members during sprint reviews and backlog grooming to refine user stories, prioritize feature development, and iteratively align project deliverables with InPower’s strategic goals and stakeholder feedback.
Technology Stack: PySpark, Spark MLlib, Python (pandas, scikit-learn), Tableau, Microsoft Excel, Agile (Scrum)

Data Engineer

Kaashiv Infotech

04.2020 - 07.2022

Designed metadata-driven ingestion pipelines that dynamically adapt to schema changes in source systems, reducing maintenance overhead by 30%.
Implemented alerting and incident workflows using Azure Monitor, Log Analytics, and custom KQL queries to proactively identify data latency or integrity issues.
Managed and monitored the migration of ETL pipelines from on-prem SQL-based workflows to Azure-native architectures, improving scalability and reducing processing costs by 40%.
Collaborated with product teams to define data SLAs, DQ rules, and KPIs, embedding governance into the design of data assets early in the lifecycle.
Implemented data partitioning and file optimization techniques to enhance query performance and minimize read latency for analytical workloads.
Diagnosed and resolved indexing issues by identifying non-indexed pages to improve discoverability and performance.
Led internal workshops to train analysts and junior engineers on best practices for using Power BI with composite models on top of Synapse and Fabric datasets.
Technology Stack: Azure Data Factory, Azure Monitor, Azure Synapse Analytics, Azure Databricks, PySpark, SQL, KQL, Power BI

Education

Master of Science - Business Analytics

Seattle University

Seattle, Washington

12.2024

Bachelor of Engineering - Computer Science

Anna University

06.2022

Skills

Programming & Scripting: Python (NumPy, pandas, SciPy, scikit-learn, Matplotlib, Plotly, regular expressions), R (tidyverse, dplyr, ggplot2, caret, forecast, shiny), Scala, SQL, T-SQL, KQL
Cloud and Big Data Platforms: Microsoft Azure (Data Factory Studio, Databricks, Data Lake, Blob Storage, Event Hubs, Azure Functions, Synapse Analytics, Azure ML), Google Cloud Platform (BigQuery)
Data Engineering & Orchestration: Apache Spark (PySpark, Spark SQL, MLlib), Apache Kafka, Apache Airflow, Cassandra, Docker & Docker Compose, Zookeeper, Azure Data Factory, Microsoft Fabric (Event Stream, Eventhouse, Data Activator, OneLake)
Machine Learning: Multiple Linear Regression, Random Forest (Regressor & Classifier), Ensemble Methods (Bagging, Boosting), Neural Networks (ANN, CNN, RNN), Natural Language Processing, Principal Component Analysis, EDA, Feature Engineering, Spark MLlib, Azure ML

Visualization & Business Intelligence: Power BI, Tableau, Kibana
Databases & Storage: Relational Databases (SQL Server, MySQL, Oracle), NoSQL (Cassandra)
Reporting & Analytical Tools: SQL Server Reporting Services, Adobe Analytics, Microsoft Excel (Pivot Tables, XLOOKUP, VLOOKUP), Google Sheets, Microsoft PowerPoint, Google Slides
Development Methodologies: Agile (Scrum), Hybrid Project Management