
Data Scientist specializing in advanced analytics and machine learning solutions that drive business impact. Engineered models that optimized inventory management, saving over $2.3M in annual costs, and built interactive dashboards for executive-level insights. Strong background in cloud platforms and big data technologies.
Programming Languages: Python, R, SQL, T-SQL, PowerShell, UNIX Shell Scripting, Java, Scala
Machine Learning & AI Frameworks: Scikit-learn, XGBoost, TensorFlow, Keras, PyTorch, Prophet, SAS, PySpark, MLflow, Hugging Face
Cloud Platforms & Services: AWS (S3, Glue, Redshift, EC2, EMR, SageMaker, Lambda), Azure, Google Cloud Platform (GCP), Snowflake
Big Data Technologies: Hadoop, Apache Spark, Apache Kafka, Apache Beam, Databricks, Dask, Apache Hive, Apache Flink
Database Management Systems: MySQL, PostgreSQL, Oracle, MS SQL Server, HBase, Teradata, MongoDB, Cassandra, Google BigQuery, Azure SQL
Data Visualization & BI Tools: Tableau, Power BI, Matplotlib, Seaborn, Plotly, D3.js, Looker, QlikView
ETL & Data Pipeline Tools: Apache Airflow, AWS Glue, Apache NiFi, Talend, Informatica, SSIS, Azure Data Factory
Version Control & DevOps: Git, GitHub, Bitbucket, Jenkins, Docker, Kubernetes, CI/CD Pipelines, GitLab
Statistical Analysis & Testing: A/B Testing, Hypothesis Testing, Statistical Modeling, Time Series Analysis, Bayesian Statistics, ANOVA
Operating Systems & Development Practices: Linux, Windows, macOS, Jupyter Notebooks, Virtual Environments, Agile/Scrum Methodologies, Model Deployment, Cross-functional Stakeholder Engagement
Autonomous Data Analyst Agent | LangChain, OpenAI, SQLite, Streamlit
• Developed an AI-powered autonomous data analyst agent using LangChain and OpenAI GPT models that enables users to query structured data using natural language without SQL expertise.
• Designed and implemented an intelligent text-to-SQL pipeline that automatically generates, executes, and validates SQL queries against SQLite databases, delivering accurate business insights in real time.
• Built conversational memory capabilities to support multi-turn interactions and contextual follow-up questions, improving user experience and analytical workflow efficiency.
• Developed an interactive Streamlit web application with dynamic data visualization and formatted analytical responses for business users.
• Leveraged Python, Pandas, SQLAlchemy, and OpenAI APIs to automate data analysis workflows, reducing manual querying effort and enabling self-service analytics.
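
Illustrative sketch of the generate, validate, execute flow described in the project above. It is not the project's actual code: `generate_sql` is a hypothetical stand-in that returns a canned query where the LangChain/OpenAI call would go, and the `sales` table and its rows are made-up demo data. Only Python's standard `sqlite3` module is used.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a canned query."""
    return (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY region"
    )


def validate_sql(conn: sqlite3.Connection, sql: str) -> bool:
    """Accept only a single SELECT statement that SQLite can compile."""
    stripped = sql.strip().rstrip(";")
    # Conservative checks: one statement only, and it must be a SELECT.
    if ";" in stripped or not stripped.lower().startswith("select"):
        return False
    try:
        # EXPLAIN compiles the statement without running the query itself,
        # so unknown tables or syntax errors surface here.
        conn.execute(f"EXPLAIN {stripped}")
    except sqlite3.Error:
        return False
    return True


def answer(conn: sqlite3.Connection, question: str) -> list:
    """Generate SQL for a question, validate it, then execute it."""
    sql = generate_sql(question)
    if not validate_sql(conn, sql):
        raise ValueError(f"Rejected generated SQL: {sql!r}")
    return conn.execute(sql).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("East", 100.0), ("West", 250.0), ("East", 50.0)],
    )
    return conn


if __name__ == "__main__":
    print(answer(demo_connection(), "What are total sales by region?"))
```

Validating before executing keeps malformed or non-SELECT statements from reaching the database; a production version would likely also run against a read-only connection.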