DILIP KUMAR

Summary

4+ years of experience as a Data Analyst with strong proficiency in Python, SQL, and R for data analysis, automation, and modeling. Skilled in building interactive dashboards and reports using Power BI, Tableau, and advanced Excel (VLOOKUP, PivotTables, VBA). Hands-on experience in ETL development and data integration using SSIS, Apache Airflow, and Informatica. Proficient in working with databases including MySQL, PostgreSQL, MongoDB, SQL Server, AWS Redshift, and Snowflake. Experienced in cloud technologies such as AWS (S3, EC2, Lambda, Glue), GCP (BigQuery, Dataflow), and Azure Data Lake. Applied machine learning models such as Logistic Regression, Random Forest, and LSTM to support predictive analytics. Strong in data wrangling, EDA, and visualization using Pandas, NumPy, Seaborn, Matplotlib, and Plotly. Knowledgeable in data warehousing, data governance, and data modeling for structured and unstructured datasets. Familiar with compliance and regulatory standards including HIPAA, GDPR, and SOX. Effective communicator with experience working in Agile/Scrum environments using tools such as Jira. Adaptable across Windows, Linux, and macOS with solid version control and scripting experience.

Overview

6 years of professional experience

Work History

Data Analyst

Globe Life
TX
08.2024 - Current
  • Analyzed health insurance claims and policy data using SQL and Python (Pandas, NumPy), generating cost trend reports that identified a 15% increase in chronic care claims over a 12-month period.
  • Designed ETL pipelines using Apache Airflow and AWS Glue, automating ingestion and transformation of multi-source datasets (claims, provider, EHR) into Amazon Redshift for downstream analytics.
  • Created dynamic, real-time dashboards in Power BI and Tableau, visualizing KPIs such as claim turnaround time, denial rates, and risk scores, improving visibility and response time for underwriters by 40%.
  • Developed and optimized PostgreSQL queries and stored procedures to improve claims data aggregation performance, reducing report latency by 35%.
  • Conducted predictive analytics using scikit-learn and Spark MLlib, forecasting high-cost patients and potential policy lapses to support proactive intervention strategies.
  • Used Matillion to build modular, reusable ETL components, standardizing data ingestion from legacy insurance systems and improving data consistency across lines of business.
  • Performed data cleansing and profiling in Talend to standardize demographic and policyholder attributes, resulting in a 28% increase in match accuracy across reporting datasets.
  • Integrated external public health datasets (CDC, CMS) with internal claims data using Python and AWS Lambda, enriching patient profiles and supporting risk adjustment modeling.
  • Led the development of HIPAA-compliant data pipelines on AWS S3 and Redshift, ensuring secure handling of PHI and supporting internal data governance initiatives.
  • Collaborated with actuarial and underwriting teams to deliver ad hoc and scheduled reports using Excel Power Query, SSRS, and R, aiding in mortality and morbidity trend analysis for product pricing reviews.

Data Analyst

Arvest Bank
TX
07.2023 - 07.2024
  • Built ETL pipelines using Python (Pandas, NumPy) to clean, merge, and transform structured financial data from MySQL and Excel, automating daily ingestion workflows.
  • Developed automated scripts using Python to extract data from internal APIs and databases, validate schema consistency, and load into a centralized data repository for analysis.
  • Optimized SQL queries for large financial datasets using MySQL, improving data retrieval speed for Tableau reports by over 35%, enabling near real-time performance dashboards.
  • Performed A/B testing and hypothesis analysis using SciPy and StatsModels in Python, supporting data-driven experimentation in financial communication strategies.
  • Built visualizations with Matplotlib, Seaborn, and Plotly to explore trends in client transactions and behavior, supporting marketing and outreach segmentation decisions.
  • Engineered automated reporting workflows in Tableau, sourcing data directly from MySQL and processed Excel outputs, reducing manual reporting effort by 50%.
  • Used Excel VBA to automate financial report generation and integrate source files with Tableau dashboards, minimizing manual reconciliation errors.
  • Implemented data quality checks and anomaly detection scripts in Python to flag inconsistencies and missing values in transaction logs and client engagement data.
  • Connected Python scripts to MySQL databases using SQLAlchemy for scalable querying and transformation, used in batch and streaming workflows.
  • Collaborated via Git, Jupyter Notebooks, and Google Suite, maintaining well-documented analysis codebases and presenting findings to non-technical stakeholders with clarity.

Data Analyst

Legato Health Technologies
India
05.2019 - 07.2021
  • Designed data models and developed Tableau dashboards to monitor drug performance, inventory levels, and distribution efficiency, ensuring timely availability of essential medications across healthcare facilities.
  • Analyzed production and supply chain data to identify bottlenecks and optimize inventory management, directly supporting patient safety and regulatory service-level agreements (SLAs).
  • Built and maintained ETL pipelines using Python, streamlining the flow of manufacturing and logistics data to ensure timely, accurate reporting for quality control and compliance audits.
  • Wrote and optimized SQL queries to extract and transform large datasets from clinical supply systems and pharmaceutical production databases, improving reporting efficiency and data reliability.
  • Implemented rigorous data cleansing and transformation processes using Python, SQL, and Excel, enhancing data consistency for reporting on drug batch quality, distribution timelines, and packaging metrics.
  • Collaborated with cross-functional teams including production managers, healthcare compliance officers, and IT staff to ensure secure and consistent data integration across systems supporting drug manufacturing and distribution.
  • Performed exploratory data analysis (EDA) and statistical modeling (e.g., regression, trend analysis) to identify risk patterns in delayed shipments and production variances and to forecast healthcare product demand.

Education

Master of Science - Information System and Technology

University of North Texas
Texas, US
05.2023

Bachelor of Technology - Electronics & Instrumentation Engineering

CVR College of Engineering
Hyderabad, India
05.2019

Skills

  • Python and R
  • Machine learning and predictive analytics
  • Data cleaning and ETL processes
  • SQL databases (MySQL, PostgreSQL, SQL Server)
  • Data visualization (Tableau, Power BI)
  • Cloud platforms (AWS, Google Cloud Platform, Azure)
  • Data warehousing (Snowflake, Redshift)
  • Automation scripts and tools (Apache Airflow, FastAPI)
  • Statistical analysis (NumPy, Pandas, SciPy)
  • Predictive modeling and data mining
  • Data governance and quality management
  • Agile methodologies (Scrum, SDLC)
  • Development environments (Visual Studio Code, PyCharm)
  • Operating systems (Linux, Windows, Mac)

Projects

Stock Market Prediction Using Machine Learning
  • Conducted in-depth exploratory data analysis (EDA) on historical stock market data, identifying relevant features and trends.
  • Implemented machine learning models such as Random Forest, Gradient Boosting, and Long Short-Term Memory (LSTM) networks to predict stock prices and market movements.

Nation's Mental Health Using Twitter Sentiment Analysis
  • Analyzed the sentiment of user-generated tweets using Python libraries such as Tweepy, NLTK, and TextBlob.
  • Implemented a Naive Bayes classifier to produce results and performed analysis and other operations on the large dataset using RapidMiner.

Personal Information

Title: DATA ANALYST
