Experienced data scientist and engineer with a strong background in building scalable data platforms, end-to-end machine learning pipelines, and cloud-based ETL solutions. Skilled in Python, SQL, and PySpark with hands-on expertise in Databricks, AWS, and Snowflake. Proven ability to translate complex data into actionable insights across healthcare, finance, and scientific domains. Passionate about technical mentorship and education, with experience leading review sessions, guiding projects, and refining data science curricula.
Collaborated with cross-functional teams to deliver data-driven solutions to clients in healthcare, finance, and scientific domains.
Languages & Tools: Python, SQL, R, PySpark, Bash, Git, Jupyter, Tableau
Cloud & Platforms: AWS (Glue, S3, QuickSight), Databricks, Snowflake
Data Engineering: ETL pipelines, metadata-driven architecture, data profiling, Spark, data quality validation
Data Science & ML: Supervised/unsupervised learning, NLP (GuidedLDA, CorEx, Hugging Face), sentiment analysis, topic modeling, model evaluation
Visualization & Reporting: Tableau, matplotlib, seaborn, QuickSight
Other: Cross-functional collaboration, technical mentorship, proposal writing, healthcare & financial data, experimental physics (ARPES, angle-resolved photoemission spectroscopy)