HARATHI SURYA PATCHIPALA

Fremont, CA

Summary

With over 7 years of experience in Data Engineering and Data Science technologies, I offer expertise in the following areas:


  • Experience in developing data pipelines and providing ETL solutions to process and manipulate large datasets.
  • Experience in delivering and supporting large scale data-management and migration projects.
  • Experience in designing and implementing scalable architectures that can accommodate increasing data volumes and user loads.
  • Thorough knowledge of selecting the right tools and technologies to process large sets of structured and semi-structured data.
  • Profound experience in analyzing and interpreting complex datasets to derive actionable insights.
  • Proven track record in building and training machine learning and deep learning algorithms, along with expertise in Natural Language Processing (NLP).
  • Extensive experience working within the Banking-Finance and Healthcare domains, ensuring a comprehensive understanding of industry-specific requirements and challenges.

Overview

7 years of professional experience
1 Certification

Work History

Senior Data Engineer

Capital One
05.2022 - Current


Project: Lineage Datamart

Description: Lineage Datamart serves as the curated relational database for tracking data movements and transformations across the enterprise, ensuring a robust data ecosystem.


Roles and Responsibilities:

  • Architected an AWS microservices framework using multiple Lambda functions with SQS queues and DynamoDB datastores.
  • Gathered project requirements with product managers and led the overall project.
  • Handled data in cloud databases such as AWS DynamoDB.
  • Queried relational data with SQL, including on Snowflake.
  • Developed applications in Scala.
  • Implemented integration tests to detect defects during real-time pipeline execution.
  • Designed data structures to efficiently store curated data within the database.
  • Used CI/CD pipelines to test and deploy microservices to production.


Project: Data Monitoring And Alerting

Description: The Data Monitoring and Alerting system tracks dataset timeliness and quality, promptly alerting producers and consumers of any deviations from defined data quality rules.


Roles and Responsibilities:

  • Contributed to the architectural design of the monitoring and alerting system.
  • Developed Scala microservices to assess dataset quality and timeliness against predefined rules.
  • Utilized JEXL (Java Expression Language) for runtime evaluation of expressions and scripts.
  • Implemented integration tests to identify real-time pipeline issues.
  • Managed multiple DynamoDB tables to store design-time and runtime data across microservices.

Senior Data Scientist

Friendly Health Technologies
01.2018 - 05.2022
  • Object detection: used a semantic segmentation model (FCN) to localize handwritten text, signatures, and hand-checked areas in scanned documents.
  • Image denoising: used CNNs to classify the noise in an image and U-Net to remove it.
  • Simulated different types of image noise to create a dataset for image denoising.
  • Preprocessed images using Python with OpenCV and NumPy.
  • Handwriting OCR: used CNNs and LSTMs to digitize handwritten text in documents.
  • Applied transfer learning to fine-tune trained models on customized data.
  • Used Google Tesseract to extract machine-printed text from documents and fine-tuned Tesseract for specific document types.
  • Classified document text using NLP.
  • Extracted entities from documents using deep learning-based NLP models.
  • Worked with stakeholders to develop roadmaps based on impact, effort, and test coordination.

Freelance Python Developer

Guru.com
04.2017 - 01.2018
  • Developed Python libraries for AWS S3 read/write operations using the boto library.
  • Optimized Python code for better performance.
  • Debugged and fixed bugs in existing code bases.
  • Worked with file formats such as JSON and CSV.

Education

Bachelor of Technology - Electronics & Instrumentation Engineering

Jawaharlal Technological University
04.2013

Skills

  • Programming Skills: Python, Scala, SQL
  • Database: Snowflake, SQL
  • Technologies: Data Engineering, Machine Learning, NLP, Deep Learning
  • Version Control: Git
  • AWS Services: Lambda, DynamoDB, SQS, CloudWatch, EC2, S3, SNS
  • Containerization & Orchestration tools: Docker & Kubernetes
  • Data Analysis & Visualization tools: Python (Pandas, NumPy, SciPy, Scikit-learn, TensorFlow, Keras), Spark, Tableau, Matplotlib

Certification

  • Certified AWS Solutions Architect
  • Certified Machine Learning Engineer by Stanford
