
Suvarna Gadiraju

Data Scientist
McKinney, TX

Summary

  • Over 8 years of experience designing, developing, and implementing enterprise applications in Python across diverse domains.
  • Skilled in managing the full Software Development Life Cycle (SDLC) using both Agile and Waterfall methodologies to deliver high-quality solutions.
  • Proficient in developing applications in Python 3.x, following best practices for performance, scalability, and security.
  • Hands-on experience building machine learning models using K-Means, XGBoost, feature engineering, and model evaluation pipelines for predictive analytics, segmentation, and risk scoring.
  • Experienced in implementing Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) for AI-driven solutions in the healthcare, legal, and financial domains.
  • Skilled in fine-tuning pre-trained LLMs for domain-specific applications such as medical transcription, legal document review, and financial report summarization.
  • Proficient in low-code platforms such as OutSystems to accelerate development of web and mobile applications, including API integration and custom connector development.
  • Strong expertise in Python libraries including NumPy, Pandas, Matplotlib, SciPy, wxPython, and scikit-learn for data analysis, machine learning, and visualization.
  • Experienced in web application development using Django, Flask, HTML5, CSS3, JavaScript, jQuery, and Bootstrap.
  • Extensive experience with relational databases (Oracle, PostgreSQL, MySQL, SQLite) and NoSQL databases (MongoDB, Cassandra).
  • Hands-on experience with AWS services including EC2, S3, CloudWatch, IAM, and Auto Scaling to build scalable, cost-effective, cloud-native solutions optimized for security and performance.
  • Skilled in test automation and continuous integration/continuous deployment (CI/CD) with Jenkins, and in version control with Git and GitLab for efficient team collaboration and code management.
  • Skilled in creating complex data visualizations and performing graphical analysis using Matplotlib, Pandas, and NumPy.
  • Strong problem-solving abilities, quick adaptability to new technologies, and a proven track record of delivering solutions both independently and as part of a team.

Overview

8 years of professional experience

Work History

Data Scientist / Generative AI Engineer

Fort Sill National Bank
05.2024 - Current
  • Engineered and deployed a centralized enterprise-grade data warehouse on AWS Redshift, integrating multi-source financial data, including customer transactions, loan applications, credit histories, and account management systems.
  • Designed, orchestrated, and maintained scalable ETL pipelines using Apache Airflow, transforming raw customer and transaction data into analytics-ready assets to support RAG-powered credit scoring, fraud detection, and risk evaluation models.
  • Built and optimized data models for customer segmentation and risk analytics using KMeans clustering, grouping customers by financial behavior and credit profiles to provide personalized banking insights and LLM-powered recommendations (a minimal sketch of this segmentation approach follows this list).
  • Developed predictive models using Linear Regression to forecast loan default probabilities, customer lifetime value (CLV), and income-based credit limits, integrating statistical outputs with RAG-enhanced real-time transaction intelligence.
  • Created and deployed extended machine learning pipelines in Python for credit risk assessment, fraud propensity scoring, and churn prediction, incorporating LLMs for document understanding and RAG frameworks for dynamic decision-making.
  • Implemented real-time streaming data pipelines with Kafka/Kinesis to ingest transactional, behavioral, and risk signals, enabling adaptive RAG-powered credit assessments and fraud alerts with millisecond-level latency.
  • Leveraged AWS Glue to automate ETL transformations, enforce schema consistency, and deliver structured datasets for downstream ML workflows and LLM-based financial advisory systems.
  • Engineered custom financial data connectors to integrate external sources such as credit bureaus, payment gateways, and loan servicing systems, ensuring high-quality data ingestion for RAG-enabled real-time scoring engines.
  • Wrote advanced SQL queries and developed financial dashboards for loan approval trends, CLV calculations, revenue forecasting, and customer portfolio analysis, enriched with LLM-generated insights for risk and compliance teams.
  • Built solutions based on the Model Context Protocol (MCP) to enhance model context management, interpretability, and performance across distributed AI systems.
  • Created interactive Tableau dashboards to visualize real-time financial KPIs such as loan portfolio health, fraud risk levels, customer segmentation (via KMeans), and behavioral trends derived from streaming data.
  • Optimized Redshift performance by applying distribution keys, sort keys, and partitioning strategies, reducing query latency and improving response times for RAG-supported analytics and regulatory reporting.
  • Designed and enforced comprehensive data governance policies to ensure GDPR/PCI DSS compliance, secure LLM-based document processing, audit logging, metadata management, and end-to-end encryption.
  • Conducted exploratory data analysis (EDA) to identify patterns, anomalies, and emerging fraud signals, applying statistical testing and visualization to guide model development and enhance risk management strategies.
  • Integrated customer feedback and sentiment data from surveys, reviews, and support tickets, using LLM-driven sentiment classification and clustering techniques to improve digital banking experiences and personalize products.
  • Collaborated cross-functionally with product, compliance, and risk teams to define KPIs, validate ML outputs, and refine fraud prevention strategies using insights from RAG-powered behavioral monitoring systems.
  • Developed automated alerting and anomaly detection systems using streaming analytics and ML models, enabling rapid identification of suspicious transactions, unusual credit activity, and operational risks during peak banking hours.
  • Enhanced platform security by implementing RBAC, IAM policies, CI/CD guardrails, and encryption standards to secure sensitive financial data flowing through LLM and RAG pipelines.
  • Created and implemented new forecasting models to increase company productivity.
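
Illustrative only: a minimal sketch of a KMeans segmentation pipeline of the kind described above, assuming per-customer transaction aggregates in a pandas DataFrame. The feature names are hypothetical placeholders, not the bank's actual schema.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-customer aggregates; the real column names differ.
FEATURES = ["avg_monthly_balance", "txn_count_90d", "credit_utilization"]

def segment_customers(df: pd.DataFrame, n_clusters: int = 5) -> pd.DataFrame:
    """Assign each customer a behavioral segment label via KMeans."""
    # Standardize features so no single scale dominates the distance metric.
    X = StandardScaler().fit_transform(df[FEATURES])
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=42)
    out = df.copy()
    out["segment"] = model.fit_predict(X)
    return out
```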

AI/ML Engineer / Data Scientist

GameStop
10.2022 - 04.2024
  • Led the end-to-end design, development, and deployment of enterprise-scale AI, ML, and data engineering solutions, ensuring alignment between business goals and scalable cloud-native architectures within an Agile/Scrum environment.
  • Architected and optimized LLM workflows using GPT and RAG, enabling automated decision support, improved recommendation accuracy, and real-time customer insights through retrieval-enhanced reasoning.
  • Developed and deployed production-grade Python applications and microservices using AWS Lambda, Django REST Framework, and Boto3, providing seamless backend automation and high-availability enterprise integration.
  • Designed, orchestrated, and optimized cloud-native ETL and data pipelines using AWS Glue, Redshift, S3, and Snowflake, delivering high-performance data ingestion and transformation for both structured and unstructured datasets.
  • Implemented LangChain and LangGraph to build intelligent agent workflows, improving automation, contextual reasoning, and operational efficiency in AI-driven business processes.
  • Utilized NumPy, Pandas, and SQL for data preparation, feature engineering, and analytical transformations, supporting downstream LLM-based and machine learning predictive models.
  • Developed and deployed ML models such as KMeans clustering for customer segmentation and XGBoost for classification and risk prediction, incorporating rigorous model evaluation techniques including AUC, F1 score, RMSE, Precision/Recall, and cross-validation (a sketch of this workflow follows this list).
  • Applied advanced feature engineering strategies (scaling, encoding, binning, domain-driven variable creation) to improve model accuracy, interpretability, and real-time scoring performance.
  • Built automated ML workflows integrated with AWS services, enabling seamless deployment and monitoring of models including XGBoost-based risk estimations and clustering-driven customer analytics.
  • Implemented robust model evaluation pipelines, including drift detection, performance tracking, validation frameworks, and continuous improvement loops within CI/CD workflows.
  • Created real-time operational dashboards using AWS Kinesis, Redshift, and Tableau to track ML/LLM system metrics, ensure model reliability, and optimize business outcomes based on live performance data.
  • Built and maintained CI/CD pipelines using Jenkins, supporting continuous integration, automated model deployment, and version-controlled infrastructure updates.
  • Applied TDD (Test-Driven Development) with PyTest, Mock, and Patch to ensure high reliability, testing coverage, and maintainability across AI and data engineering solutions.
  • Collaborated with cross-functional teams (engineering, business, compliance) to deliver integrated, secure, and regulatory-compliant AI/ML solutions aligned with enterprise governance.
  • Mentored junior engineers and data scientists on data pipeline architecture, ML model development, model evaluation, and cloud best practices, fostering technical excellence and innovation.
  • Streamlined data collection methods to minimize analysis errors.
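
Illustrative only: a sketch of an XGBoost classification-and-evaluation workflow of the kind described above, assuming a prepared feature matrix X and binary labels y. Hyperparameters and the decision threshold are placeholders.

```python
import xgboost as xgb
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

def train_and_evaluate(X, y):
    """Fit an XGBoost classifier and report AUC, F1, and cross-validated AUC."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    model = xgb.XGBClassifier(
        n_estimators=300, max_depth=6, learning_rate=0.1, eval_metric="auc"
    )
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, (proba >= 0.5).astype(int)),
        # 5-fold cross-validation on the full data for a stability check.
        "cv_auc_mean": cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean(),
    }
```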

Sr. Data Engineer (Machine Learning)

Gateway First Bank
08.2020 - 09.2022
  • Engineered and implemented real-time claims processing pipelines with Apache Kafka, enabling immediate processing upon claim submission (a minimal consumer sketch follows this list).
  • Developed data ingestion and transformation pipelines in Apache Spark, processing claims data for risk assessments and fraud detection.
  • Built data storage solutions using AWS S3 and DynamoDB, ensuring rapid data retrieval during the claims evaluation process.
  • Optimized ETL processes to streamline the extraction of data from legacy systems and integrate claims databases into real-time analytics workflows.
  • Created custom connectors to integrate claims data with external systems, including fraud detection tools and third-party insurance databases.
  • Implemented risk evaluation models using Apache Spark and Python to assess claim validity and calculate risk levels in real time.
  • Built and deployed fraud detection systems using machine learning models trained on historical claims data, reducing fraudulent claims.
  • Developed automated claim categorization systems using NLP techniques to analyze claim descriptions and categorize them by damage type, coverage, etc.
  • Integrated AWS Lambda for serverless processing, enabling automatic scalability with increasing claim volumes.
  • Created interactive dashboards in Tableau to monitor and visualize key claim metrics, such as approval rates, distribution, and fraud detection.
  • Optimized real-time data processing workflows using Apache Kafka and AWS Kinesis, ensuring low-latency claim evaluations.
  • Collaborated with business analysts to ensure claim validation rules were met, maintaining regulatory compliance and improving process efficiency.
  • Built automated notifications using AWS SNS to alert users about claim status updates, reducing manual interventions.
  • Implemented security measures like data encryption and role-based access control with AWS IAM to ensure secure handling of sensitive claim data.
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
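
Illustrative only: a minimal kafka-python consumer sketch for real-time claim events, assuming JSON-encoded messages on a hypothetical claims.submitted topic. The scoring function is a placeholder for the actual Spark/Python risk models.

```python
import json
from kafka import KafkaConsumer  # kafka-python

# Hypothetical topic, broker address, and consumer group; actual names differ.
consumer = KafkaConsumer(
    "claims.submitted",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
    group_id="claims-risk-scoring",
)

def score_claim(claim: dict) -> float:
    """Placeholder risk score; the production models used Spark and Python."""
    return min(1.0, claim.get("amount", 0) / 100_000)

for message in consumer:
    claim = message.value
    risk = score_claim(claim)
    if risk > 0.8:  # hypothetical alerting threshold
        print(f"High-risk claim {claim.get('claim_id')}: score {risk:.2f}")
```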

Data Engineer

Jasco Products
10.2018 - 07.2020
  • Designed and implemented a centralized data warehouse using AWS Redshift, consolidating data from multiple customer touchpoints like e-commerce and in-store purchases.
  • Developed ETL pipelines using Apache Airflow, handling the extraction, transformation, and loading of customer data into the data warehouse (a minimal DAG sketch follows this list).
  • Created and optimized data models for customer segmentation and sales analysis, ensuring fast query performance in AWS Redshift.
  • Built custom data connectors to integrate with third-party systems such as customer loyalty programs, payment processors, and inventory management tools.
  • Engineered real-time data pipelines to stream customer transaction data, supporting real-time analytics for improved decision-making.
  • Automated ETL workflows with AWS Glue, improving scalability and reducing manual processing time by transforming raw data into structured formats.
  • Wrote SQL queries to generate reports and insights, including sales trends, customer behavior, and inventory performance.
  • Developed interactive dashboards in Tableau, providing actionable insights for retail managers to track inventory, sales performance, and customer preferences.
  • Optimized performance of AWS Redshift queries through partitioning and indexing large datasets, reducing report generation times.
  • Implemented data governance practices to ensure compliance with regulations such as GDPR, focusing on data privacy and security.
  • Integrated customer feedback from surveys and reviews into the data pipeline, enhancing customer profiles for better decision-making and personalization.
  • Collaborated with business teams to define KPIs and metrics for tracking the success of promotional campaigns, sales strategies, and product launches.
  • Developed automated alerting systems for inventory management, using sales data to notify teams when stock levels reached low thresholds.
  • Ensured data security by implementing role-based access control (RBAC) to restrict access to sensitive customer data.
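
Illustrative only: a minimal Apache Airflow DAG (2.x style) sketching an extract-transform-load flow like the one described above. The DAG name, schedule, and task bodies are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    pass  # pull raw customer data from source systems (placeholder)

def transform():
    pass  # clean and aggregate into analytics-ready form (placeholder)

def load():
    pass  # copy structured data into the warehouse (placeholder)

with DAG(
    dag_id="customer_etl",  # hypothetical name
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_transform >> t_load  # linear extract -> transform -> load
```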

Python Developer

Aggieland Outfitters
05.2017 - 09.2018
  • Applied Agile Methodology throughout the Software Development Life Cycle (SDLC) to analyze, specify, design, and implement business-driven applications and data solutions.
  • Developed Python-based scripts to automate data processing tasks and enhance data analysis workflows, improving overall productivity and efficiency (a small example script follows this list).
  • Enhanced the user interface by adding features like selection criteria, filters, and navigation options, ensuring an improved user experience for data-driven applications.
  • Optimized data analysis processes by implementing design patterns in Python code, improving performance and increasing reusability for future projects.
  • Designed and implemented dynamic data views, allowing users to customize their data analysis results by adding/removing columns and applying filters.
  • Developed and executed SQL queries, stored procedures, and triggers for data extraction, transformation, and reporting, ensuring accurate and timely insights.
  • Updated and optimized website performance by using AJAX to dynamically load data, reducing page reload times and enhancing user interaction.
  • Modified existing Python/Django modules to support various data formats, improving data import/export functionality for business users.
  • Collaborated with cross-functional teams using Git for version control and issue resolution, ensuring smooth collaboration and integration of data-related tasks.
  • Contributed to continuous integration by integrating data workflows into Jenkins pipelines, ensuring streamlined deployment and regular testing.
  • Ensured efficient data processing and accuracy by handling and analyzing large datasets, helping in decision-making and reporting processes.
  • Applied data visualization best practices to present insights clearly, using tools like Tableau or Excel for effective data storytelling.
  • Managed data security protocols and ensured compliance by ensuring sensitive data was protected during analysis and reporting tasks.
  • Provided support for managing Linux environments for running data analysis scripts, including installation and configuration of necessary tools.
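
Illustrative only: a small pandas automation script of the kind described above. File paths and column names are hypothetical.

```python
import pandas as pd

def summarize_sales(input_csv: str, output_csv: str) -> None:
    """Load raw sales data, apply a filter, and write a daily summary report."""
    df = pd.read_csv(input_csv, parse_dates=["order_date"])  # hypothetical columns
    df = df[df["status"] == "completed"]  # example selection criterion
    summary = (
        df.groupby(df["order_date"].dt.date)["total"]
        .agg(["count", "sum"])
        .rename(columns={"count": "orders", "sum": "revenue"})
    )
    summary.to_csv(output_csv)

if __name__ == "__main__":
    summarize_sales("raw_sales.csv", "daily_summary.csv")
```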

Education

Bachelor's - Computer Science

Andhra University
Andhra Pradesh, India

Master's - Computer Science

UNVA
Virginia, United States

Skills

Machine Learning & AI: TensorFlow, Keras, scikit-learn, PyTorch, Apache Spark MLlib, K-Means Clustering, Linear Regression, Logistic Regression, Random Forest, XGBoost, Gradient Boosting, LLMs (GPT-based models), RAG
