Himabindu Pulijala

Arlington, TX

Summary

Innovative Data Scientist with 4+ years of experience leveraging machine learning, statistical analysis, and predictive modeling to solve complex business problems. Proficient in Python, R, SQL, and big data technologies like Hadoop and Spark for scalable data processing. Skilled in developing and deploying machine learning models (Scikit-learn, TensorFlow, PyTorch) and creating dynamic visualizations with Tableau, Power BI, and Matplotlib. Hands-on expertise in NLP, deep learning, time-series forecasting, and real-time data processing. Strong experience in cloud platforms (AWS, Azure, GCP), API development, and containerization tools (Docker, Kubernetes). Adept at building ETL pipelines, ensuring data privacy compliance (GDPR, CCPA), and delivering actionable insights that drive business growth and operational efficiency.

Overview

5 years of professional experience

Work History

Data Scientist

Santander Consumer USA
Dallas, USA
04.2024 - Current
  • Build machine learning models (Python: scikit-learn, TensorFlow, Keras) for financial trend forecasting, credit scoring, and loan risk assessment.
  • Use statistical techniques, regression analysis, and time-series models (ARIMA, Prophet, LSTM) to predict customer behavior, market changes, and financial risks.
  • Automate data pipelines with Apache Spark, Hadoop, and Airflow for ingestion, transformation, and training.
  • Conduct NLP analysis (spaCy, NLTK, Transformers) on unstructured financial data and develop fraud detection systems.
  • Develop data lakes and warehouses (Amazon Redshift, Snowflake) and create dashboards with Tableau, Power BI, and Matplotlib.
  • Stream and analyze real-time financial data using Apache Kafka and Amazon Kinesis.
  • Manage large-scale datasets in SQL and NoSQL (MongoDB, Cassandra) and optimize models with XGBoost and deep learning.
  • Explore AI technologies like reinforcement learning for advanced financial modeling and risk management.

Data Scientist

JCPenney
Plano, USA
02.2023 - 03.2024
  • Analyzed customer behavior and transaction patterns to enhance product assortment and store layout (SQL, Hadoop).
  • Built demand forecasting models and inventory replenishment systems using Python, R, ARIMA, and Prophet.
  • Performed price optimization and developed customer lifetime value models, integrating transactional, demographic, and behavioral data.
  • Leveraged geospatial analysis (GIS tools) to identify optimal store locations and regional demand.
  • Developed real-time analytics platforms (Apache Kafka, AWS Lambda) to monitor operations and sales metrics.
  • Used Google Analytics and SQL to analyze web traffic and engagement, guiding promotions and website optimization.
  • Created dashboards for senior management with Power BI and Tableau to track sales, inventory, and customer trends.
  • Built fraud detection models using machine learning to secure retail transactions.

Data Scientist

Comcast
Mumbai, India
09.2021 - 12.2022
  • Analyzed large datasets to optimize business strategies using SQL, Excel, and Tableau.
  • Designed automated dashboards and reports to track KPIs across customer service, operations, and marketing.
  • Performed advanced data analysis, cleaning, and statistical modeling using Python and R.
  • Conducted customer segmentation with machine learning to identify trends and enhance targeting strategies.
  • Developed and optimized ETL pipelines to integrate data from CRM, billing, and operational systems.
  • Conducted A/B testing and analyzed product and campaign performance using Google Analytics and Excel.
  • Analyzed customer feedback and satisfaction surveys with SQL and Python to identify improvement areas.
  • Monitored operational performance using Hadoop, other big data technologies, and Amazon Redshift.
  • Created financial models for pricing strategies and market share analysis with Python and Excel.
  • Prepared executive reports and presentations for stakeholders using Tableau and PowerPoint.

Data Analyst

Molina Healthcare
Mumbai, India
06.2020 - 08.2021
  • Analyzed large healthcare datasets using SQL, Python, and R to identify trends, support decision-making, and forecast healthcare demands.
  • Built and maintained dashboards in Power BI and Tableau to visualize key metrics for diverse stakeholders.
  • Developed and optimized SQL queries and ETL pipelines for data extraction, transformation, and loading across SQL Server and Oracle.
  • Utilized machine learning algorithms (scikit-learn) to build predictive models for patient readmissions and care planning.
  • Conducted advanced analytics using SAS to assess healthcare costs, claims data, and patient outcomes.
  • Automated data extraction and reporting processes with Excel macros, Python, and SQL, improving operational efficiency.
  • Integrated data from multiple sources via APIs and ensured compliance with HIPAA and regulatory standards.
  • Applied deep learning (TensorFlow, PyTorch) to analyze imaging data, detecting anomalies in X-rays and MRIs.
  • Supported data governance protocols to maintain data quality and integrity across systems.
  • Developed decision-support tools for clinicians and administrators to aid in care delivery and planning.

Education

Master of Science - Data Science

University of Texas at Arlington
USA
01.2024

Skills

  • Programming Languages: Python, R, SQL, Java, Scala, Julia
  • Machine Learning: Supervised and unsupervised learning, Scikit-learn, TensorFlow, PyTorch, Keras
  • Data Analysis: Pandas, NumPy, Matplotlib, Seaborn, SciPy
  • Data Visualization Tools: Tableau, Power BI, D3.js, Plotly, Dash
  • Big Data Technologies: Hadoop, Apache Spark, Hive, AWS EMR
  • Databases: PostgreSQL, MySQL, MongoDB, Cassandra, Redis
  • Deep Learning Frameworks: TensorFlow, PyTorch, Keras, OpenCV
  • Model Deployment: Flask, FastAPI, Docker, Kubernetes, AWS Lambda
  • Real-Time Data Processing: Apache Kafka, Spark Streaming, RabbitMQ
  • Data Pipeline Tools: Apache Airflow, Luigi, Prefect
  • Version Control and CI/CD: Git, Jenkins, GitHub Actions, Bitbucket Pipelines
  • Statistical Tools: SAS, MATLAB, SPSS, Stata
  • Feature Engineering and Selection: PCA, LASSO, Recursive Feature Elimination
  • ETL Tools: Talend, Alteryx, Pentaho, Informatica
  • Geospatial Analysis: GeoPandas, QGIS, Google Earth Engine
  • Time-Series Analysis: ARIMA, SARIMA, Prophet, LSTM
  • LLM and NLP Environments: NLTK, spaCy, LangChain, OpenAI, Hugging Face, GroqAI, Ollama, LangChain Community

Timeline

Data Scientist

Santander Consumer USA
04.2024 - Current

Data Scientist

JCPenney
02.2023 - 03.2024

Data Scientist

Comcast
09.2021 - 12.2022

Data Analyst

Molina Healthcare
06.2020 - 08.2021

Master of Science - Data Science

University of Texas at Arlington