
Nachiket Subbaraman

San Jose, CA

Summary

As a software development engineer at Amazon Web Services, I develop and optimize machine learning solutions for personalization and recommendation systems. I work with PySpark, Glue, Java, and Python to create scalable and reliable cloud-based applications that serve thousands of customers worldwide.

I graduated from the University of Illinois at Urbana-Champaign with a bachelor's degree in Mathematics and Computer Science, where I gained a solid foundation in data structures, algorithms, and software engineering. I also completed several internships at Qolsys and Loadstar Sensors, where I automated testing, architected scripts, and improved metrics. I am passionate about learning new technologies and solving challenging problems that have a positive impact.

Overview

4 years of professional experience

Work History

Software Development Engineer

AWS Personalize
01.2022 - Current
  • https://aws.amazon.com/personalize/
  • Improved ML model performance metrics for customers with over 50 item metadata columns by implementing feature compression in a PySpark AWS Glue script
  • Increased test coverage of console UI pages by developing automated canary test infrastructure that mimics customer navigations using Webpack server for dev testing, Puppeteer, TypeScript, and NodeJS
  • Fixed throttling bugs raised when enabling UI canaries by setting TPS configuration in control service for console canaries
  • Solved availability risks caused by CPU utilization above 90% and achieved a 4-5x runtime improvement for two APIs that manage data in a Redis cache (AWS ElastiCache) by migrating them from AWS SWF Java classes to serverless AWS Glue ETL PySpark scripts
  • Employed OOP principles by creating an SWF parent class for both APIs to invoke the Glue script and refactored the delete API to use it
  • Reduced latency spikes in high-traffic API requests from 10 seconds to below 1 second by changing the Java AWS S3 ListObjects request to retrieve a constant number of objects instead of all objects in the S3 directory
  • Lowered DPU utilization by implementing incremental featurization via DynamoDB filter expression
  • Employed object-oriented programming ideas by refactoring a monolithic Python class into smaller classes
  • Implemented embedding outlier removal in PySpark to improve recommendation quality on multilingual text, model convergence, training time, and metrics
  • Benchmarked the cost of migrating to a new language model against the current model’s cost using SageMaker Batch Transform jobs, identifying the EC2 instance type and number of instances needed to meet cost and runtime SLAs
  • Fixed internal server error bug by explicitly throwing user error when a customer calls DescribeSolution API with a deprecated recipe
  • Fixed a training bug triggered when the customer provides insufficient textual data by no longer passing the textual data path to the training job
  • Fixed canary throttling bug by adding sleep time between consecutive API requests
  • Backlogged a longer-term fix to replace the numerous API requests with a single request
  • Created runbook to help oncall identify and mitigate red hosts - hosts that have not received security patches by SLA

Software Development Intern

Amazon Web Services
05.2021 - 08.2021

○ Developed train/test split and metrics computation algorithms used in Personalize user-segmentation recipes.

○ Evaluated migration to new EC2 training instances using SageMaker Debugger.

SDE Intern

Qolsys Software
05.2020 - 08.2020

○ Increased automated test case coverage by writing a Python script that replaced automated test method names in the GitHub repository with manual test method names via prefix matching

○ Displayed the percentage of GitHub test cases with and without a specific prefix and the percentage of manual test cases covered and not covered in GitHub via Python matplotlib pie charts

○ Automated RESTful HTTP requests in Java using RestAssured and TestNG Data Providers.

Software Development Engineer Intern

Loadstar Sensors SDE
06.2019 - 08.2019

○ Facilitated customizable purchases by creating two shopping cart websites using JQuery, Ajax, HTML, and Bulma.

○ Built a Python program from open-source face recognition libraries that captures users’ faces from a webcam, saves them to a local directory, and then draws a box around the saved faces in the live webcam feed.

○ Created a watchdog program that kills another Python program by taking its name as input.

Education

UIUC - Mathematics and Computer Science

Arts and Sciences

Software Design Studio -

Skills

  • Self Starter
  • API Design Knowledge
  • Project Oversight
  • Continuous Deployment
  • Apache Spark
  • AWS DynamoDB
  • Design Patterns and Principles
  • Application Development

Accomplishments

  • Apache Spark, Large Scale Infrastructure, Machine Learning, Operations, Code and System Health, Testing, Debugging, Documentation, Concurrency, Multi-Threading, Data Processing, Eager to Learn and Curious, Ownership

Research

  

Research Assistant with Professor Richard Sowers, Math Department, UIUC                                                           

○ Constructed algorithm to store large amounts of hierarchical web scraped data in a tree data structure

○ Implemented KMeans clustering algorithm to plot wildlife data and compute density metrics

○ Parallelized a time-consuming function that computes entropy and density, cutting runtime by a factor of roughly the number of CPU cores

○ Documented important milestones from GitLab repo commit messages and email threads

Research Assistant at Forward Data Lab, UIUC (https://github.com/nachsub/Keywords-Forward)                                                    

○ Created a searchable index of a corpus of computer science research papers via Whoosh, a search engine Python library

○ Populated another searchable index from which clients can enter search phrases on a Flask web application and retrieve a dictionary ranking of related keywords based on NLTK, a natural language processing Python library

Illinois Geometry Lab Research Project (https://tinyurl.com/yapdbyrj)                                                                                                              

○ Collected data sent for storage in an AWS S3 bucket using AWS Firehose as a delivery mechanism.

○ Transferred the S3 data to AWS Redshift for analytical processing using the SQL copy command. 

○ Displayed water depth vs. time data on a web browser using a local Node.js server connection to the Redshift database.

Projects

PCA Classifier (https://github.com/nachsub/cs361sp20/blob/master/cs361_final_project_part2.ipynb)  

○ Plotted the sorted eigenvalues and the top two principal components of sample data using PCA

MNIST Classifier (https://github.com/nachsub/cs361sp20/blob/master/cs361_final_project_part2.ipynb)  

○ Implemented MNIST classifier via PyTorch trained neural network 

Optimization (https://github.com/nachsub/cs361sp20/blob/master/cs361%20project%20code.ipynb) 

○ Implemented the stochastic gradient-based optimization algorithms Adam, SGD, and AdaGrad in Python using NumPy

○ Plotted training loss of each gradient descent algorithm using Matplotlib error bars

Shell (https://github.com/nachsub/cs241_sp2020/tree/master/nks5-master/shell)    

○ Created a Linux shell in C that executes built-in and external commands using fork/exec/wait and signal handling

Brick Breaker (https://github.com/nachsub/cs126-final-project)    

○ Developed a C++ OpenFrameworks game that updates a Firebase database via HTTP JSON and GET requests.

Hack Illinois (https://github.com/openreferral/hsds-transformer)    

○ Added functionality to open source repository for users to input a file path for the output directory in Ruby

Diner Finder (https://github.com/nachsub/DinerFinder)    

○ Finds nearby restaurants using the Google Maps API in an Android application built with Android Studio

Tick Task (https://github.com/ERiverIllini/TickTask) 

○ Prioritizes assignments by estimated time to complete and due date using MongoDB, Express, React, and Node.js

○ Updates events from users’ calendars to their website via Node.js script 

Naïve Bayes (https://github.com/uiuc-sp19-cs126/naivebayes-nachsub)   

○ Classifies a text file of digit images, each 28x28 in size, as digits 0-9 and prints the results to the Windows console, using a Naïve Bayes machine learning classifier in C++
