Sohel Rana

Sacramento, CA

Summary

Research Data Specialist motivated to uncover the story behind data by applying statistical knowledge and multiple programming languages. Enjoys the painstaking work of cleaning and pre-processing data into a workable format for analysis or storage. After fitting and comparing models on both structured and unstructured data, presents the findings through simple, informative visualizations that are understandable even to audiences with no background in statistics or programming.

Overview

6 years of professional experience

Work History

Research Data Specialist I (RDS I)

CDPH | State of California
12.2022 - Current
  • Weekly Birth Validation and Verification
  • Weekly Infant Crossmatch
  • Download/upload files daily from/to the Social Security Administration, CDC, and other states
  • Ensure the quality of data reported to the Social Security Administration
  • Make follow-up calls to facilities/counties to confirm data quality on birth certificates
  • Identify duplicate birth records and work on sealing duplicate birth certificates
  • Report to the National Center for Health Statistics on birth data quality and birth duplicates
  • Present birth data quality issues to different stakeholders
  • Develop Python scripts to automate emails to stakeholders to ensure data quality on birth certificates
  • Develop Excel VBA code and Excel Power Queries to automate loading data into multiple reports
  • Replace SAS code with Excel Power Queries and VBA code
  • Develop a Power BI dashboard to visualize trends in stakeholders’ responses on birth data quality, using over 3 years of historical data
  • Link birth and death certificates for infants and report to the National Center for Health Statistics
  • Make follow-up calls to funeral homes/birth facilities to identify data discrepancies on birth and death certificates
  • Work with funeral homes to create fetal death certificates
  • Develop a Power BI dashboard to track unlinked birth certificates
  • Contact other states to request birth data
  • Respond to out-of-state requests for birth and death data
  • Facilitate the DQA unit’s weekly check-in meeting
  • Help onboard new staff and make sure proper access is granted to the relevant accounts/sites
  • Train new RDSs and work closely with them to ensure a smooth transition of tasks
  • Work with other RDSs to make sure they can run the developed code with the necessary software/applications
  • Support other RDSs and RDAs in decision making on various unit projects
  • Work on the 2021 birth cohort

Research Data Analyst II (RDA II)

HCD | State of California
05.2021 - 12.2022
  • State Bond Contract Balance Reconciliation
  • Monthly GO Bond Cashflow Report
  • Monthly Financial Management Report
  • Maintain Contract Balance Reconciliation Public Dashboard
  • Modify existing Power BI Data Analysis Expressions (DAX) for Published Dashboard
  • Maintain Database (MySQL, Access)
  • Use SQL queries to run Weekly/Monthly Contract Balance Reconciliation Reports
  • Use SQL queries to Automate Source data for Monthly Reports
  • Use Excel Power Queries, Pivot Table, Excel Functions to Automate Source Data into Monthly Reports
  • Develop Excel VBA/Macros to Automate data/formatting into Monthly/Reconciliation Reports
  • Extract/Structure/Combine data from MySQL/Access databases for Monthly Reports using Power BI queries and Excel Power Queries
  • Extract and Clean data from FI$Cal/CAPES/SharePoint for further use
  • Use Altair Monarch to extract data from PDF reports

Instructional Specialist (TA)

2U
03.2022 - 07.2023
  • Help professional students with non-technical backgrounds solve coding issues during class and office hours

Graduate Research Assistant II (RA II)

Bowling Green State University
08.2019 - 01.2021
  • Imported resident students' interaction and housing data from a Microsoft SharePoint server and an Access database
  • Developed and maintained the students' Microsoft Access database by modifying, joining, and creating tables using complex SQL queries
  • Performed data cleaning, data manipulation, predictive model fitting and comparison, and reporting
  • Established a new database of students' past information; created a new dashboard using resident students' past interaction data; enhanced the existing dashboard with new graphs, charts, tables, etc.

Graduate Teaching Assistant

University of Nevada, Reno
08.2017 - 08.2019
  • Taught college-level Intermediate Algebra, Pre-Calculus, and Introduction to Statistics courses to over 60 students
  • Mentored students through office hours and one-on-one communication
  • Improved presentation and communication skills by working with students from different backgrounds and cultures

Education

Master of Science - Statistics and Data Science

University of Nevada, Reno
Reno, NV
12.2019

Bachelor of Science - Statistics, Biostatistics & Informatics

University of Dhaka
Dhaka, Bangladesh
05.2015

No Degree - Data Science

Bowling Green State University
Bowling Green, OH

Skills

  • R, Python, SQL, SPSS, SAS, MATLAB, Apache Hadoop, PySpark, Altair Monarch
  • MySQL (Database), Microsoft Access (Database), Tableau, Microsoft Word, Microsoft Outlook, Microsoft Teams
  • Power BI Queries, Power BI Functions, Power BI Visuals
  • Excel VBA/Macros, Excel Power Queries, Excel Functions, Excel Pivot
  • Predictive/Forecasting Models, Dashboards Building/Data Visualization, Database
  • Statistics, Data Mining, Big Data Analytics, Machine Learning Algorithms

Additional Information

A temporal and spatial study of crime committed in Chicago | CS 6010 (Data Science Programming), Fall 2019, BGSU

  • Data cleaning, data manipulation, visualization, checking data quality
  • Feature selection using correlation matrix, chi-squared test, PCA, Extra Trees classifier, recursive feature elimination, etc.; mapped crime hot spots
  • Fit neural network, random forest, logistic regression, SARIMA, and other models; used oversampling; compared models by confusion matrix, ROC, AUC, RMSE, etc.; predicted the odds of arrest and future arrest counts
  • Research paper review, research methodology development, research report writing, poster creation, presentation

Computational Issues and Hyperparameter Optimization in LSTM | CS 7200 (Machine Learning), Spring 2020, BGSU

  • Data preprocessing; split data into training and validation sets to optimize hyperparameters using k-fold cross-validation
  • Implemented single LSTM, stacked LSTM, bidirectional LSTM, and CNN-LSTM models separately for hyperparameter search
  • Compared model performance and computational time across different optimizers, hidden sizes, learning rates, dropout rates, batch sizes, embedding vector sizes, etc.; paper review, report writing, poster creation, presentation

SARIMA Forecasting and Analysis (Chicago Crime Data) | STAT 758 (Time Series Analysis), Fall 2018, UNR

  • Data collection and preprocessing (e.g., filling missing data, visualizing arrest counts to see the seasonal dip, and trimming the data for further analysis); split the data into training and testing sets
  • Autocorrelation (ACF) and partial autocorrelation (PACF) plots and grid search, using the Akaike information criterion (AIC) to select the orders of the forecasting models
  • Fit SARIMA, ARIMA, Simple Exponential Smoothing (SES), and other forecasting models; checked stationarity and applied further differencing where the series was non-stationary
  • Short-term (one-week, two-week, etc.) and long-term (three-month, six-month, etc.) forecasts of arrest counts
  • Used the testing data, squared Pearson correlation coefficient (r²), mean absolute error (MAE), and residual plots to assess the forecast quality of the fitted models
  • Research paper review, develop the research methodology, research report writing, presentation

Variational Bayesian Inference for Multivariate Normal Distribution | MATH 629 (Topics Applied Analysis), Spring 2019, UNR

  • Generate random data points from standard normal distribution with known covariance
  • Generate mean and covariance for conjugate prior normal distribution, obtain the true posterior mean analytically
  • Employ automatic differentiation variational inference (ADVI) using RStan
  • Approximate the posterior, calculate means and bias, and compute summary statistics for the bias
  • Research paper review, develop the research methodology, poster presentation

How do Sepal Width, Petal Length, and Petal Width explain Sepal Length in different species of iris plant? | STAT 757 (Applied Regression Analysis), Spring 2019, UNR

  • Checking/Removing Outliers, Influential Observations, Collinearity, Residual Analysis, Sensitivity Analysis
  • Finding the model that explains the most variability, judged by R-squared and adjusted R-squared

Factors that determine the Gasoline Consumption | STAT 652 (Intro: Regression/Linear Models), Spring 2018, UNR

  • Checked the assumptions of the multiple linear regression model and transformed the data where assumptions were violated

A Study on Improving Livelihood of Rural Women Through Income Generating Activities in Bangladesh | STAT H - 408 (Research Methodology and Survey Project), 2014, DU

  • Hands-on experience collecting primary data, designing the questionnaire framework, and analyzing the data to draw valid inferences about the overall status of rural women

Accomplishments

  • Certificate of recognition for Inspire, Passion, and Collaboration efforts from CDPH.
  • Certificate for completing Machine Learning with Tree-Based Models in R from DataCamp
  • Certificate for completing Unsupervised Learning in R from DataCamp
  • Certificate for completing Fundamentals of Bayesian Data Analysis in R from DataCamp
  • Certificate for completing Fundamentals of Visualization with Tableau from Coursera
  • Certificate for completing SQL for Data Science from Coursera
  • Certificate for completing Excel Skills for Business: Essentials from Coursera
  • Graduate Record Examination (GRE): 316 [Verbal: 152 (55th Percentile), Quantitative Reasoning: 164 (87th Percentile), Analytical Writing: 3.0 (17th Percentile)]
