Harshitha Thodathara

Summary

Results-driven Software Engineer with expertise in Python, data analysis, and machine learning. Developed automation that reduced manual workload by 60% and built data pipelines with a data flow success rate above 95%.

Overview

3 years of professional experience
1 Certification

Work History

Software Engineer

CGI
01.2025 - 06.2025
  • Developed automation for converting BIM Prolog to SWI-Prolog using Python on Azure VMs, reducing manual workload by 60%.
  • Created data pipelines with Azure Data Factory (ADF), SQL DB, Databricks, and Azure Storage, achieving a data flow success rate above 95%.
  • Optimized Snowflake objects including tables, views, procedures, and functions to enhance query performance.
  • Built Power BI dashboards and Excel models to support RFPs and client proposals.
  • Delivered interactive dashboards that provided key insights for strategic decision-making.

Research Assistant

Amrita Vishwa Vidyapeetham University
02.2024 - 12.2024
  • Published three research papers in IEEE and Springer focused on healthcare security, traffic optimization, and medical machine learning applications using Python.
  • Developed and deployed a speech-based machine learning model for Parkinson’s diagnosis, achieving 95% accuracy with Random Forest and XGBoost.
  • Processed over 50,000 medical records using PySpark and SQL, enhancing model performance by 12% through effective feature engineering.
  • Implemented secure data handling protocols and cryptographic measures to protect healthcare information, adhering to industry best practices.
  • Collaborated with research teams, using Git for code management, conducting code reviews, and participating in peer review for academic publications.

Machine Learning Engineer

Cybrowse Digital Pvt Ltd
01.2023 - 06.2023
  • Engineered credit risk machine learning models with SQL, PL/SQL, and Python, analyzing 50K+ records and achieving a 25% reduction in analysis time through query optimization.
  • Established comprehensive data cleaning workflows using Pandas and NumPy, improving model reliability by effectively managing missing values.
  • Designed and implemented automated Excel dashboards with Python pipelines, monitoring 15 KPIs for portfolio performance across three teams.
  • Collaborated on code reviews while applying Git for version control to ensure secure, maintainable software solutions.

Data Analyst

M3 Solutions
02.2022 - 07.2022
  • Engineered secure data pipelines for 350K+ patient records using PySpark, Python, and SQL.
  • Implemented data governance protocols, enhancing data quality by 15% through systematic validation.
  • Designed automated monitoring systems for 12 KPIs with Python and statistical analysis techniques.
  • Reduced data errors by 18% through technical troubleshooting and quality assurance frameworks.
  • Applied modern software development practices, including Git version control and collaborative code reviews.
  • Utilized Agile methodologies for healthcare data modeling and reporting solutions.
  • Improved system performance by optimizing databases through SQL query tuning and indexing strategies.

Education

Bachelor of Science - Computer Science

Amrita Vishwa Vidyapeetham University

Skills

Programming languages: Python, SQL, Shell scripting

Big data technologies: PySpark, Spark, Kafka, SnowSQL

Database management: MySQL, SQL Server, PostgreSQL, Snowflake, MongoDB

Workflow orchestration: Airflow

Machine learning techniques: Supervised learning, k-nearest neighbors (KNN), random forests, XGBoost, linear regression, decision trees

Cloud services:

AWS: Lambda, AWS Glue, Athena, Redshift, EMR, EC2, S3

Azure: Azure Data Factory, Databricks, Blob storage, Azure Data Lake

GCP: Composer, BigQuery, Cloud Storage (GCS), Spanner, Dataflow, Dataproc

Development environments: Databricks, Visual Studio Code, PyCharm, dbt, Anaconda, IntelliJ

Data visualization tools: Tableau, Power BI

Python libraries: NumPy, Pandas, Matplotlib, BeautifulSoup, Scikit-learn

Certification

  • Databricks Data Engineer Associate
  • Snowflake SnowPro Core
  • Microsoft Certified: Azure Fundamentals
  • Microsoft Certified: Azure Data Fundamentals
