Implemented workflows to enable cross-cluster search (CCS), keeping the changes robust and simple to implement. Identified solutions for using dependent services' APIs both within a region and across regions. Created extensive test plans and troubleshooting docs for debugging CCS issues.
Identified, designed, and implemented changes to enable billing for the service. Coordinated with multiple teams to understand the billing process. Worked with the Product Owner to obtain meter definitions for customers. Carried out end-to-end SKU testing.
Created a Spark distribution for v2.4.4 and v3.0 for use by different services within OCI. Supported a sister team by adapting the distribution to their requirements. Enabled other services to adopt the distribution by defining branching methodologies and contribution guidelines.
Developed a Spark 2.4.4 and Hadoop 3.0 distribution as a service to run on big data clusters alongside Hive, YARN, ZooKeeper, Trino, etc.
Developed a health-checker Kubernetes cron application that monitors the Spark Thrift Server driver and sends metrics to the UI console so customers can monitor the health of their clusters.
Improved the control plane codebase so that configuration files are realm-agnostic, which removed redundant code and made the codebase more extensible, more scalable, and less error-prone.
Created Backup/Restore V2 and improved the backup and restore workflows by using Block Volume backup functionality, which lets customers clean up old backups when a cluster is deleted and ensures correct billing for the resources used.
Big Data Software Engineer
Clarivate Analytics
09.2017 - 01.2020
Developed and deployed microservices for the parser and harvester platform in a containerized Docker environment.
Built secure data APIs using OAuth 2.0 (JWT) to interact with NoSQL databases such as MongoDB and Cassandra.
Built real-time data pipelines with RabbitMQ and the Java Spring Framework.
Worked with the Elasticsearch aggregations framework and built ES queries using Elasticsearch's Java API to analyze logs and related data when debugging issues.
Education
Master of Science - Computer Engineering
Stony Brook University
STONY BROOK, NY
12.2016
Bachelor of Engineering - Electrical Engineering
Jabalpur Engineering College
JABALPUR, MP
07.2014
Skills
PROGRAMMING LANGUAGES: Java, Python, JavaScript
ENGINES: OpenSearch, Kubernetes, Docker, Rancher
TOOLS: Terraform, Postman, ELK, Grafana
DATABASES: MongoDB, Apache Cassandra, MySQL, SQL Server