Strategic Data Innovator & Senior Biostatistician specializing in bridging complex data science with actionable clinical insights. Delivered impactful clinical evidence generation by architecting AI-driven foundations and integrating ML/AI tools with multimodal data. Focused on enhancing regulatory-aligned decision-making and optimizing trial designs to accelerate drug development pipelines.
Overview
13 years of professional experience
Work History
Data Innovation Lead, Product Development
Roche/Genentech
South San Francisco, California
01.2025 - Current
Strategic AI Foundations & Target Discovery | Trial Innovation, Evidence Generation & Financial Stewardship | Organizational AI Transformation & Leadership
Architected the AI Data Foundation for gRED, integrating RWD from 100,000+ patients across three oncology TAs; delivered 9 Analysis-Ready Datasets (ARDs) that serve as the backbone for foundational disease models.
Directly enabled target validation for 15+ RWE studies (NSCLC, CRC, BC), serving as the primary RWD expert for early-stage drug development decisions and stratification factor confirmation.
Defined Phase 3 obesity trial criteria (CT-388) by providing the definitive evidence on GLP-1 discontinuation; work directly shaped the 6-month inclusion criteria for a high-priority global program.
Co-led development of PROGEN, an AI-driven protocol generator that reduced development time significantly.
Led technical feasibility assessment of the IQVIA PharMetrics Plus dataset and advised against renewal based on ROI analysis, with projected annual savings of $1M.
Technical Lead for AI-centric platform evaluations (AtroposHealth, hopeAI), serving as the bridge between clinical leaders and AI practitioners to integrate Generative AI tools that automate clinical evidence generation, reducing technical debt and manual oversight.
Spearheaded "AI Unconference," convening 100+ practitioners and executives to define enterprise roadmap for "Big Idea" AI innovations.
Led the RAAN SSF Chapter to close knowledge gaps between early-research and late-stage organizations, promoting a unified AI value proposition across the organization.
Senior Data Scientist/Real World Data, Product Development
Roche/Genentech
South San Francisco, CA
03.2022 - 01.2025
Directed a Real-World Data (RWD) safety signal detection study in KRAS G12C NSCLC patients to evaluate liver toxicity of a competitor drug (sotorasib), directly informing the clinical design and safety monitoring strategy of an upcoming Phase 3 trial to preemptively mitigate delays.
Led statistical design and analysis of oncology RWD studies integrating Caris NGS molecular diagnostics with clinical outcomes for KRAS G12C NSCLC and HR+/HER2- breast cancer, informing biomarker stratification strategy for Roche Phase 1 and Phase 2 oncology trials.
Spearheaded methodology study using multiple imputation and inverse probability of censoring weights to address missing data in longitudinal outcomes (weight & A1C) within claims data, informing washout period for Roche CT-388 obesity trial and presenting findings to FDA at Statbolic 2026.
Led a comparative effectiveness study using Target Trial Emulation (TTE) on Flatiron Health real-world data, leveraging clone-censoring weights to eliminate immortal time bias and estimate survival outcomes (BMFS, OS, rwPFS) for metastatic breast cancer treatments.
Led transportability study for Herceptin in HER2+ metastatic breast cancer, utilizing U.S. Flatiron Health data to transport efficacy estimates to the UK with ESTHER data, demonstrating survival benefit in UK population and supporting HTA reimbursement discussions.
Co-led the building of a clinical trial simulation platform that evaluates and integrates historical trial data and RWD to calculate the probability of success of late-stage Roche trials, accelerating internal decision-making for clinical programs.
Built analysis-ready datasets (ARDs) combining Caris NGS profiling, clinical outcomes, and RWD across breast, colorectal, and NSCLC programs to support Roche's Foundational AI models, enabling diagnostic performance modeling, biomarker subgroup analyses, and predictive modeling at scale.
Developed frameworks for external control arms using causal adjustment methods and RWD-based estimators.
Published innovative work on transportability and bias-adjusted estimation supporting regulatory decision-making.
Scientist/Biostatistician in Research Unit
23andMe
Sunnyvale, CA
09.2020 - 02.2022
Developed predictive models integrating polygenic risk scores from GWAS analyses with non-genetic data for oncology and metabolic disease, enhancing insights from GWAS, environmental, and sensor data.
Conducted statistical analyses to address missingness, selection bias, and confounding in large observational cohorts, bolstering validity of studies on polygenic risk scores for diagnostic predictions.
Data Science Intern in US Medical Affairs Unit
Roche/Genentech
South San Francisco, CA
06.2020 - 09.2020
Developed predictive survival models with claims data to inform treatment strategies for oncology and infectious disease programs.
Applied Targeted Maximum Likelihood Estimation (TMLE) and Super Learner ensemble models to evaluate the impact of early surgery on healthcare burden in periocular basal cell carcinoma patients.
Junior Researcher & Course Instructor
Stanford University School of Medicine
Stanford, CA
09.2016 - 09.2019
Taught Stanford Principles of Statistics in Health Research to 50 students; planned lessons and tutorials, graded evaluations, and achieved 98% pass rate.
Applied Bayesian and machine-learning methods to bridge treatment effects from clinical trials to real-world populations, contributing methodological innovations relevant to diagnostic generalizability and clinical utility.
Estimated impact of longitudinal drug adherence on CD4 count and viral suppression among adult HIV-infected patients using Kaiser Permanente Northern California electronic health records.
Directed project to evaluate efficacy of cognitive tests predicting cognitive decline in individuals with Alzheimer disease.
Authored three first-author publications in peer-reviewed journals: AIDS, Cognitive Behavioral Neurology, and PM&R.
Access Operations & Emerging Markets Intern
Gilead Sciences
Foster City, CA
06.2018 - 09.2018
Developed statistical and epidemiological models assessing Hepatitis B incidence and diagnostic needs in 50+ countries, informing public health strategies.
Junior Investigator, Clinical Trials Unit
New York Presbyterian Hospital
New York, NY
05.2015 - 05.2016
Recruited and randomized 90+ human subjects for AIDS Clinical Trials Group (ACTG) studies to advance understanding of treatment efficacy.
Conducted a case-control study of 521 patients, assessing the effect of antiretroviral therapy regimens on body mass index of HIV-infected patients.
Intern
World Health Organization
Geneva
04.2013 - 06.2013
Developed mathematical model to monitor access to HIV medications in low-income countries, enhancing data-driven decision-making.
Education
PhD - Epidemiology & Biostatistics
Stanford University
Palo Alto, CA
11-2020
MPH - Infectious Disease Epidemiology
Columbia University
New York, NY
05-2016
MSc - Clinical Immunology
University of Cape Town
Cape Town, South Africa
01-2014
BSc - Chemistry & Molecular Cell Biology
University of Cape Town
Cape Town, South Africa
12-2011
Skills
Product development
Product development strategy
New product development
Healthcare strategy
Product strategy
Clinical trial design
Real-world evidence frameworks
Biostatistics
Selection bias adjustment
Causal Inference Methods
Survival models
Real-world evidence
Time-to-event analyses
Targeted maximum likelihood estimation (TMLE)
Bayesian methods
Propensity weighting
Transportability
FDA methodology
Oncology EHR
Molecular profiling
Claims datasets
Genomics
Regulatory Evidence
SAP development
Machine learning
R
Python
SQL
Git
Stan
RAG
Statistical modeling
Selected Publications
Renal Flares and Health-Related Quality of Life Among Patients with Lupus Nephritis: A Post Hoc Analysis of Control Arm Data from the LUNAR Phase 3 Clinical Trial, Rheumatology and Therapy, 2026
Estimating the effect of initiating early maintenance endocrine therapy on brain metastases-free survival and other clinical outcomes in patients with HER2+/HR+ mBC without brain metastases: A target trial emulation, Clinical Cancer Research, 2026
Transportability of Overall Survival Estimates in Metastatic Breast Cancer, Value in Health, 2024
Transporting Real-World Evidence Across Populations in HER2+ mBC, Value in Health, 2024
Adherence & Viral Suppression Modeling Using EHR Data, PLOS One, 2022
Empirical Bayes Approaches to Transportability, JSM, 2022
Semantic Memory in the Clinical Progression of Alzheimer Disease, Cognitive Behavioral Neurology, 2017
Certifications And Awards
LangChain for LLM Application Development by DEEPLEARNING.AI, 2025
Building Agentic RAG with Llamaindex by DEEPLEARNING.AI, 2025
Certificate in Mathematical Modeling, University of Washington, 2013
Clinical Trial & Vaccine Design, HVTN, University of California San Francisco, 2012