● Building and Supporting Scalable Data Pipelines: Supported and maintained data pipelines on the organization's data platform. Implemented internal Big Data processing projects using Scala, Parquet, Delta, Databricks Spark, Kafka, MongoDB, AWS, and Jenkins.
● Generated entities (data products) in a clean data lake for use across the organization and by downstream consumers.
● Executed a horizontal scaling strategy by creating replica AWS data servers, enhancing system availability, preventing data loss, and handling growing data volumes efficiently.
● Implemented a cost-optimization strategy that reduced the business unit's AWS spend.
● Ensured synchronization, eventual consistency, and fault tolerance between the main AWS data servers and their replicas, improving the reliability of the data replication infrastructure.
● Building Scalable Data Pipelines and Web Dashboards: Developed scalable, automated data pipelines and web dashboards using Scala, Java, PySpark, SQL, and DynamoDB on top of Apache Spark, with Extract-Transform-Load (ETL) mechanisms processing billions of shopping records; optimized the existing data load process by 25%.
● Enhanced customers’ value-for-money shopping experience by building a trending-products widget service in Java; containerized it with Docker and deployed it to the beta environment via a CI/CD pipeline.
● Automation of Bill Settlement: Slashed legacy intercarrier settlement costs by 25% by automating voice bill settlements using Blockchain and Python, while ensuring continuity and enhancement of services.
● Fraud Detection System: Achieved a 20x reduction in false positives and improved turnaround time by 2 hours by conceptualizing and developing a fraud detection system using Hadoop, Python, Spark, REST APIs, MongoDB, Node.js, React.js, and machine learning to detect unauthorized activity across more than 30 million calls.
● Distributed Database Query Appliance (IBM Netezza): Incorporated query optimizations using C++, Linux, SQL, and Shell, and resolved critical bugs across different hardware specifications of the data warehouse appliance, reducing severity-1 and severity-2 defect backlogs by 90% and beating the inflow rate.
● Cloud-Based Machine Learning Applications: Replaced the existing manual anomaly-detection approach with machine learning in Python, discovering new anomalies across 3 categories of demographic distribution data.
● Cloud Edge Computing: Hosted pretrained machine learning models on AWS using Amazon SageMaker, enabling end-to-end machine learning and significantly reducing model-building complexity.
Results-oriented software engineer with 9+ years of experience in software development and a track record of delivering high-quality solutions. Skilled in problem-solving, collaboration, and thriving in fast-paced environments, with a proven ability to translate complex business requirements into technical solutions.