
Sahithya Mannaru

Jersey City

Summary

  • 7+ years of experience in AI/ML engineering, data analytics, and Python development across healthcare, finance, and retail domains.
  • Expertise in developing and deploying Generative AI solutions using LLMs (GPT-4, LangChain, Hugging Face).
  • Successfully built and productionized recommendation systems, underwriting risk models, and advisory automation tools.
  • Proficient in cloud platforms: Google Cloud Platform (Vertex AI, BigQuery), AWS (SageMaker, S3, Lambda), and Azure AI.
  • Designed and maintained scalable ETL workflows using Apache Airflow, Talend, Apache Nifi, and Apache Spark.
  • Strong experience in NLP, sentiment analysis, text classification, and embedding models using Transformers and NLTK.
  • Led CI/CD implementations using Jenkins, Git, Docker, and Kubernetes for automated ML lifecycle management.
  • Developed interactive dashboards and real-time reports using Tableau, Power BI, and Jupyter Notebook for strategic decision-making.
  • Applied advanced statistical and machine learning techniques (Random Forest, Boosting, SVM, ARIMA, Deep Learning).
  • Collaborated closely with stakeholders to translate complex requirements into technical solutions and model pipelines.
  • Skilled in model evaluation, hyperparameter tuning (using Optuna/Hyperopt), and A/B testing for ML optimization.
  • Implemented robust data engineering pipelines integrated with BigQuery, Snowflake, PostgreSQL, and Redshift.
  • Demonstrated ability to lead cross-functional teams, mentor junior engineers, and drive Agile project delivery.
  • Developed and integrated RESTful APIs using Flask and FastAPI for seamless ML model serving and application integration.
  • Improved model explainability and compliance through monitoring, logging, and attention visualization tools.
  • Experienced in deploying large-scale ML applications with robust monitoring, logging, and real-time error resolution.
  • Reduced AI system downtime and latency by implementing real-time monitoring and intelligent alerting systems.
  • Delivered impactful business outcomes by aligning AI initiatives with organizational goals and customer experience strategies.
  • Strong communicator with the ability to present complex AI concepts to both technical and non-technical stakeholders.

Overview

8 years of professional experience
4 Certifications

Work History

AI/ML Data Scientist

Synechron
10.2023 - Current
  • Designed and deployed a GenAI-based advisory assistant using LangChain, GPT-4, and LLaMA2 models.
  • Implemented RAG pipelines using FAISS and ChromaDB to surface insights from unstructured financial PDFs.
  • Built document Q&A systems for wealth management reports with real-time vector search capabilities.
  • Developed scalable APIs using FastAPI for LLM deployment and chatbot integration.
  • Containerized applications using Docker and orchestrated deployment on AWS Lambda and EKS.
  • Integrated OpenAI and Azure AI to automate sentiment analysis on market news feeds.
  • Fine-tuned LLMs for portfolio recommendation scenarios using real customer prompts and advisor input.
  • Constructed CI/CD workflows with Jenkins and Git for automated model updates and prompt versioning.
  • Conducted prompt engineering optimization using LangChain Agents and few-shot learning techniques.
  • Reduced response latency by 30% by caching top-k document embeddings from Pinecone/ChromaDB.
  • Led integration of Generative AI into the bank’s digital channels (IVR, Chat, and Dashboard tools).
  • Used PandasAI and Whisper for multi-modal input support (voice and tabular).
  • Developed auto-tagging and classification systems for financial documents using Hugging Face.
  • Built explainable AI modules with attention visualizers to assist compliance and transparency checks.
  • Collaborated with wealth managers and risk teams to ensure alignment with advisory logic.
  • Developed dashboards with Power BI to visualize user-agent interactions and LLM usage metrics.
  • Implemented monitoring/logging for inference latency, token usage, and hallucination detection.
  • Mentored a team of junior engineers on GenAI implementation best practices and data governance.
  • Participated in internal GenAI research groups to evaluate emerging models (e.g., Mistral, Gemini).
  • Achieved 40% automation of previously manual advisory workflows, increasing advisor capacity.

Data Engineer

General Motors
04.2022 - 04.2023
  • Built real-time ETL pipelines using Informatica and Oracle Exadata for investment data ingestion.
  • Optimized queries and data joins in Snowflake to support faster report generation.
  • Automated KPI dashboards for financial forecasting using Tableau and Power BI.
  • Designed audit-ready transformation pipelines using parameterized SQL and data masking.
  • Orchestrated data workflows with Airflow and AutoSys to reduce latency in BI pipeline refreshes.
  • Enabled predictive analysis on fund performance using Python-based backtesting libraries.
  • Supported model risk management efforts by automating validation of historical market data.
  • Collaborated with product teams to centralize data assets for advisory and R&D divisions.
  • Migrated legacy ETL jobs to cloud-native AWS Glue pipelines for operational savings.
  • Integrated S3-based archival storage with analytics reporting for historical trend tracking.
  • Improved metadata documentation and lineage with custom data cataloging scripts.
  • Created parameter-driven reporting pipelines to automate campaign-level fund evaluations.
  • Developed shell scripts to monitor ETL failures and trigger alerting for key investment reports.
  • Maintained high-volume data tables used in executive fund performance summaries.
  • Conducted root-cause analyses on delays in reporting processes across departments.
  • Partnered with Tableau developers to optimize workbook load times by improving extract design.
  • Built investment compliance check dashboards to identify anomalies pre-submission.
  • Created onboarding documentation for new data engineers and analysts.
  • Reviewed data pipeline security and implemented encryption-at-rest in Snowflake and AWS.
  • Reduced report turnaround time by 50% by restructuring business-critical dashboards.

Data Engineer

Quantum Technologies Private Limited
08.2017 - 05.2021
  • Migrated on-premise data systems to AWS, improving scalability and reducing operational overhead.
  • Built Informatica ETL pipelines to handle data ingestion from third-party investment platforms.
  • Automated batch processing of customer financial records using shell scripts and Python.
  • Developed investment portfolio risk dashboards using Power BI and SQL Server.
  • Integrated financial sentiment analysis into reporting pipelines using NLTK and Scikit-learn.
  • Maintained and optimized 20+ daily ingestion workflows feeding internal risk assessment models.
  • Collaborated with front-end teams to design APIs for on-demand insights into market data.
  • Built real-time stock performance alerting system using AWS Lambda and SNS.
  • Constructed modular data models that enabled flexible what-if scenario reporting.
  • Developed validation scripts for pre-submission compliance checks across advisory systems.
  • Streamlined historical fund performance retrieval using partitioned Snowflake tables.
  • Integrated document summarization engines using early Transformer models.
  • Created dashboards to highlight underperforming assets and trigger rebalancing alerts.
  • Maintained robust documentation of ingestion jobs and data model specifications.
  • Participated in regular governance audits and improved logging coverage across pipelines.
  • Conducted internal workshops on best practices for scalable Python-based ETL.
  • Implemented version control and rollback features for ETL mappings using Git and Jenkins.
  • Worked directly with compliance officers to ensure SOC2-aligned data pipeline practices.
  • Reduced ETL failure rate by 40% via retry logic and outlier detection preprocessing.
  • Enabled flexible, low-code dashboard creation by provisioning certified datasets.

Education

Master of Science - Computer Science

Stevens Institute of Technology
05.2023

Skills

    Programming Languages : Python, Java, R, SQL, PLSQL, NoSQL

    Databases : MySQL, MongoDB, Oracle, PostgreSQL, Snowflake, Oracle Exadata

    Cloud Technology : Amazon Web Services (AWS), Azure AI

    Big Data : Apache Hadoop, HDFS, MapReduce, Hive, HBase, Spark (PySpark)

    Python Libraries & ML Frameworks : NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch, Matplotlib, Seaborn, Flask

    Web Development : HTML5, CSS3

    Scheduling & CI/CD Tools : Airflow, GitLab, Jenkins, Kubernetes, Jira, Ansible

    Data Warehouse : Prism, data mapping, Informatica ETL

    Large Language Models : HuggingFace, OpenAI, Llama

    Other Tools : Power BI, Tableau, Excel, AutoSys

Certification

  • AWS Certified Solutions Architect - Associate
  • Azure AI Engineer Associate
  • HackerRank Python Certification
  • IBM Python Certification
