Data Scientist with 3+ years of experience building and deploying production computer vision systems in healthcare. Specialized in GPU-accelerated video pipelines, real-time inference, and end-to-end ML infrastructure. Proven track record of delivering measurable latency, cost, and accuracy improvements across 15+ production models serving 20+ hospitals.
Overview
3 years of professional experience
Work History
Data Scientist
Surgical Safety Technologies Inc.
02.2023 - Current
Built an end-to-end face deidentification system - created a custom SAM2-powered annotation tool with precomputed GPU embeddings for CPU-only analyst machines, enabling 50,000+ image labels in 3 weeks; trained an RT-DETR v2 model; and deployed it to production, reducing analyst time from 40 min to 1-2 min per clip and enabling same-day video retrieval. Extended the system to CPR/emergency detection by applying rapid-increase thresholds to temporal people-count patterns
Optimized inference for multi-target deployment - rebuilt the GPU batch pipeline on NVIDIA PyNvVideoCodec for a zero-copy GPU decode-inference-encode loop, achieving a 4x speedup (processing time 1.8x → 0.4x realtime), and compressed the model via distillation and pruning for real-time Lambda inference (11 s → 2.8 s, 1024 MB footprint) at minimal cost increase
Developed an end-to-end audio analysis system for surgical safety checklist identification - optimized parakeet-tdt-0.6b-v2 via custom ONNX inference and deployed a LightGBM segment classifier with temporal positioning, achieving 0.85-0.95 recall and reducing analyst annotation time by 24% across thousands of cases per month
Developed a two-stage video deidentification pipeline for endoscopic surgical feeds - used K-fold cross-validation for data cleaning (flagging mislabeled frames via high log-loss disagreement), raising model F1 from 0.93 to 0.96; applied bidirectional, AutoGluon-tuned XGBoost temporal smoothing to model logits, reaching 0.99 F1; and deployed with a coarse-to-fine inference strategy (2 s interval scan + 0.1 s precision blurring)
Refactored training and inference pipelines with the NVIDIA DALI framework - replaced the PyTorch DataLoader, achieving a 5x training speedup (20% → 100% GPU utilization), and replaced the cv2 video decoding loop with the DALI video loader, enabling near-zero-overhead multi-model inference