Computer Science student seeking to learn and apply both soft and technical skills in a professional environment and to contribute to the company's goals.
ML for Diabetes
Problem & Motivation
Standard gated RNNs (LSTMs and GRUs) allocate a separate bias parameter for every gate and hidden unit, which over-parameterizes the model and encourages overfitting, especially in low-data or resource-constrained settings.
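For concreteness, the snippet below (illustrative only, not taken from the project) counts the bias parameters in a single stock PyTorch LSTM layer:

```python
import torch.nn as nn

# A stock LSTM layer keeps two bias vectors of length 4 * hidden_size,
# one for input-to-hidden and one for hidden-to-hidden connections.
lstm = nn.LSTM(input_size=32, hidden_size=128, num_layers=1)

n_bias = sum(p.numel() for name, p in lstm.named_parameters() if "bias" in name)
print(n_bias)  # 1024 = 2 * 4 * 128, for this one layer alone
```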
Innovative Approach
• Gate-level & layer-level grouping: share a single bias across all units of the input, forget, or output gate, or across a whole layer, rather than keeping one bias per unit (see the sketch after this list).
• Learned clustering: introduce a regularization term that automatically ties similar biases into a small number of groups.
• Soft vs. hard tying: explore both gradual (soft) and strict (hard) parameter sharing during backpropagation.
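To make the grouping and tying ideas concrete, here is a minimal PyTorch sketch; the cell class, the scalar-per-gate bias, and the clustering_penalty form are illustrative assumptions, not the project's released toolkit:

```python
import torch
import torch.nn as nn

class SharedBiasLSTMCell(nn.Module):
    """LSTM cell with gate-level bias grouping: one scalar bias per gate,
    broadcast across all hidden units (a hypothetical sketch)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Weight matrices for the four gates (input, forget, cell, output).
        self.w_ih = nn.Linear(input_size, 4 * hidden_size, bias=False)
        self.w_hh = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        # 4 scalar biases instead of 2 * 4 * hidden_size per-unit biases.
        self.gate_bias = nn.Parameter(torch.zeros(4))

    def forward(self, x, state):
        h, c = state
        gates = self.w_ih(x) + self.w_hh(h)
        i, f, g, o = gates.chunk(4, dim=-1)
        # Each scalar bias broadcasts across its gate's hidden units.
        i = torch.sigmoid(i + self.gate_bias[0])
        f = torch.sigmoid(f + self.gate_bias[1])
        g = torch.tanh(g + self.gate_bias[2])
        o = torch.sigmoid(o + self.gate_bias[3])
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

def clustering_penalty(biases, centers, strength=1e-3):
    # Crude stand-in for the learned-clustering regularizer: pull each
    # bias toward its nearest of k learned centers (k-means style), so
    # biases gradually collapse into a small number of groups (soft tying).
    d = (biases.unsqueeze(-1) - centers) ** 2  # shape: (num_biases, k)
    return strength * d.min(dim=-1).values.sum()

cell = SharedBiasLSTMCell(input_size=16, hidden_size=128)
h = c = torch.zeros(1, 128)
h, c = cell(torch.randn(1, 16), (h, c))

centers = nn.Parameter(torch.zeros(2))  # k = 2 candidate bias groups
reg = clustering_penalty(cell.gate_bias, centers)  # add to the training loss
```

With hidden_size=128, the cell above keeps 4 bias parameters where a stock LSTM layer keeps 2 * 4 * 128 = 1024; hard tying corresponds to fixing the group assignments outright and sharing one parameter per group.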
Key Contributions
• Compact RNNs: achieve substantial reductions in bias-parameter count with minimal impact on perplexity (language modeling) or MSE (time-series forecasting).
• Generalization Gains: improved robustness under limited training data by avoiding overfitting.
• Open-Source Toolkit: PyTorch and TensorFlow implementations, complete with scripts for automatic grouping and benchmarking.
Impact & Applications
Enables deployment of high-performance sequence models on edge devices and in any scenario where model size and generalization are critical.
Built a website for CloudTenX
Skills
C
Java
Node.js
Git
Python
SQL