
Divya Kaki

Haslet, TX

Summary

Data engineer with over 12 years of demonstrated experience across the retail, banking, and healthcare industries in Big Data, Azure Databricks, Scala, Spark, and NiFi development, including 6+ years developing and supporting Hadoop ecosystem components. Dedicated and skilled Azure Databricks data engineer with extensive experience designing, implementing, and optimizing data pipelines and analytics solutions, and proficient in using the Azure Databricks platform to transform raw data into actionable insights. Experienced with on-premise Python/Scala/Spark and Hive development, including 4+ years of Python with Spark in healthcare. Seeking to leverage expertise in data engineering and Azure technologies to contribute to innovative projects and drive business success.

Overview

12 years of professional experience

1 Certification

Work History

Senior Data Engineer

Avaap
06.2023 - 02.2024
  • The Ohio Department of Medicaid’s Demographic and Expenditure Dashboard is populated using enrollment, capitation, claims, and provider data from the department’s Enterprise Data Warehouse (EDW).

Responsibilities:

  • Designed and implemented scalable data pipelines using Azure Databricks, Apache Spark, and other related technologies to process large volumes of data efficiently.
  • Collaborated with cross-functional teams to understand business requirements and translate them into technical solutions.
  • Developed and maintained ETL processes to ingest, clean, transform, and load data from various sources into Azure Databricks.
  • Optimized data workflows for performance, reliability, and cost-effectiveness, leveraging cluster tuning, partitioning strategies, and caching techniques (see the sketch after this list).
  • Implemented data governance and security measures to ensure compliance with regulatory requirements and protect sensitive data.
  • Conducted performance monitoring, troubleshooting, and optimization of Spark jobs to meet SLAs and maintain high availability.
  • Provided technical guidance and mentorship to junior team members, fostering a culture of knowledge sharing and continuous learning.
  • Collaborated with data scientists and analysts to support advanced analytics and machine learning initiatives on the Azure Databricks platform.
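
A minimal PySpark sketch of the kind of Databricks pipeline described above; the storage paths, table names, and partition column are hypothetical placeholders, not the actual project's:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession already exists as `spark`; getOrCreate() reuses it.
spark = SparkSession.builder.appName("enrollment-etl").getOrCreate()

# Hypothetical raw zone in Azure Data Lake Storage.
raw = spark.read.parquet("abfss://raw@examplelake.dfs.core.windows.net/enrollment/")

# Clean and transform: drop rows missing the key, derive a month column.
cleaned = (
    raw.dropna(subset=["member_id"])
       .withColumn("enroll_month", F.trunc("enrollment_date", "month"))
)

# Cache only if the frame is reused by several downstream aggregations.
cleaned.cache()

# Partition output by month so downstream queries can prune files.
(
    cleaned.write.mode("overwrite")
           .partitionBy("enroll_month")
           .format("delta")
           .saveAsTable("edw.enrollment_clean")
)
```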

Technical Skills:

Big Data Technologies: Azure Databricks, Apache Spark, Apache Hive, Apache Impala, HDFS

Programming Languages: Python, Scala, SQL, PySpark

ETL Tools: Apache NiFi, StreamSets

Database Systems: MySQL, Oracle

Data Serialization Formats: Avro, Parquet, JSON, XML

Version Control: Git

Client Reference:

Megan Glenn

773-750-0647

E-mail: megan.glenn@avaap.com

Senior Hadoop Developer

Cotiviti Inc.
01.2022 - 05.2023

Responsibilities:

  • Senior Hadoop Developer at Cotiviti, involved in the backend ingestion platform for the NILE application
  • Responsible for development, enhancement, monitoring, and automation of the Big Data/Hadoop platform, and for creating scripts and programs to keep the application up to date
  • Designed and developed real-time and batch data processing solutions using Azure Databricks, Apache Spark Streaming, and Azure Event Hubs.
  • Implemented data integration solutions to connect Azure Databricks with various data sources and destinations, including Azure Data Lake Storage and Azure SQL Database.
  • Collaborated with the DevOps team to implement CI/CD pipelines for continuous integration and delivery of data engineering artifacts.
  • Conducted performance tuning and optimization of Spark jobs to improve data processing throughput and reduce latency.
  • Designed and implemented data lake architectures to support data exploration, analytics, and reporting requirements.
  • Developed custom UDFs and libraries in Python and Scala to extend the functionality of Azure Databricks and Apache Spark.
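
As an illustration of the custom-UDF work mentioned in the last bullet, here is a minimal PySpark sketch; the masking logic and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-demo").getOrCreate()

def mask_member_id(member_id):
    """Mask all but the last four characters of an identifier."""
    if member_id is None:
        return None
    return "*" * max(len(member_id) - 4, 0) + member_id[-4:]

# Register the Python function as a Spark UDF that returns a string.
mask_udf = F.udf(mask_member_id, StringType())

df = spark.createDataFrame([("M123456789",)], ["member_id"])
df.withColumn("member_id_masked", mask_udf("member_id")).show(truncate=False)
```

Built-in Spark functions are preferred where they exist, since Python UDFs serialize rows out of the JVM; a UDF like this is for logic the built-ins cannot express.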

Client Reference:

Vasudeva Rao Vellala

470-394-7735

E-mail: v.vasudevarao@gmail.com

HMS Healthcare, Cloudera Hadoop Engineer

Cotiviti Inc.
01.2020 - 01.2022

Responsibilities:

● Responsibilities included extensive development migrating the legacy Vitreous application, which resided on PostgreSQL/Java platforms, to on-premise Hadoop, with Python and Spark as the data ingestion platforms, NiFi as the job scheduler, and Hive, the open-source data warehouse, used for data validation.

● Completed a POC on API calls to the geocoder and Google matrix A2C (Access 2 Care) applications; designed an end-to-end workflow in NiFi to call the source API and retrieve geocodes for patient IDs and their locations.

● Developed the PySpark code to perform the complete migration from the legacy systems to Hadoop.
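
A minimal sketch of that migration pattern, assuming a JDBC read from the legacy PostgreSQL system and a write into a Hive staging table; hosts, credentials, and table names are placeholders:

```python
from pyspark.sql import SparkSession

# enableHiveSupport() lets saveAsTable() register the output in the Hive metastore.
spark = (
    SparkSession.builder.appName("vitreous-migration")
    .enableHiveSupport()
    .getOrCreate()
)

# Read a legacy table over JDBC; the PostgreSQL driver jar must be on the classpath.
legacy = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://legacy-host:5432/vitreous")  # placeholder host
    .option("dbtable", "public.claims")                            # placeholder table
    .option("user", "etl_user")
    .option("password", "***")
    .option("driver", "org.postgresql.Driver")
    .load()
)

# Land the data in Hive as Parquet for the downstream validation queries.
legacy.write.mode("overwrite").format("parquet").saveAsTable("staging.claims")
```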

● Developed Python scripts for quality testing and validation of the application.

● Hands-on with Git repositories for code push/pull and committing code to master, and with Azure DevOps pipelines to deploy the code to DEV and TEST.

● Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Hive.

● Developed workflows in NiFi to automate loading data into HDFS and pre-processing, analyzing, and training the classifier using MapReduce and Hive jobs; created a framework in NiFi to pull data from RDBMS sources

● Developed workflows for all new and existing data sets ingested into a new data lake

● Experience using NiFi to orchestrate the flow of data between different software systems

● Worked with NiFi admins to test and validate new NiFi features and functionality post-upgrade

● Designed workflows to send messages to Azure Service Bus

● Developed workflows to bulk-extract data from multiple sources and load it into the data lake and Hadoop HDFS

● Worked extensively on Hive UDFs and fine-tuning.

● Responsible for building scalable distributed data solutions using Hadoop

● Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data. Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation and queries, writing data back into the RDBMS through Sqoop.
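
A rough PySpark analogue of the aggregation step described above, with a JDBC write standing in for the Sqoop export; the source table, columns, and connection details are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-aggregates").getOrCreate()

events = spark.table("edw.events")  # hypothetical Hive source table

# DataFrame aggregation: one row per day and event type.
daily = (
    events.groupBy("event_date", "event_type")
          .agg(F.count("*").alias("event_count"),
               F.countDistinct("user_id").alias("unique_users"))
)

# Write the result back to the RDBMS over JDBC (the original pipeline
# used Sqoop for this export step).
(
    daily.write.format("jdbc")
         .option("url", "jdbc:oracle:thin:@db-host:1521/ORCL")  # placeholder
         .option("dbtable", "ANALYTICS.DAILY_EVENTS")
         .option("user", "etl_user")
         .option("password", "***")
         .mode("append")
         .save()
)
```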

● Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build the common learner data model, which receives data in real time.

● Worked on improving the performance and optimization of existing algorithms in Hadoop using Spark context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.

● Involved in importing and exporting data from local/external file system and RDBMS to HDFS.

Client Reference:

Santhi Nutakki

Contact title: Tech lead

510-456-8476

Hadoop, NiFi & Scala-Java/Kafka Developer

AAP (Advance Auto Parts)
01.2020 - 01.2021
Responsibilities:

  • My responsibilities included Management Information System (MIS) enhancements and sustenance of the data lakes and pipelines for better insight creation from the data.
  • Ownership of the design and development of Data pipeline jobs from different source systems.
  • Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Pig, Hive, and HBase.
  • Developed workflows in NiFi to automate loading data into HDFS and pre-processing, analyzing, and training the classifier using MapReduce, Pig, and Hive jobs; created a reusable framework in NiFi to pull data from RDBMS, NoSQL, S3, and other sources
  • Developed workflows for all new and existing data sets ingested into a new data lake
  • Experience using NiFi to orchestrate the flow of data between different software systems
  • Worked with NiFi admins to test and validate new NiFi features and functionality post-upgrade
  • Designed workflows to send messages to Azure Service Bus
  • Developed workflows to bulk-extract data from multiple sources and load it into the data lake and Hadoop HDFS
  • Worked extensively on Hive UDFs and fine-tuning.
  • Knowledge of Amazon EC2 Spot integration and Amazon S3 integration
  • Responsible for building scalable distributed data solutions using Hadoop
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data. Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation and queries, writing data back into the RDBMS through Sqoop
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build the common learner data model, which receives data from Kafka in real time (see the sketch after this list).
  • Worked on improving the performance and optimization of existing algorithms in Hadoop using Spark context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed multiple Kafka producers and consumers as per the software requirement specifications
  • Worked on Storm to handle parallelization, partitioning, and retrying on failures, and developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Involved in importing and exporting data from local/external file system and RDBMS to HDFS.
  • Tested the AWS Kafka automation deployment process.
  • Developed the code for dynamic routing (orders and locations): understood the schema and routed orders according to the matching rules
  • Gained an understanding of the Drools rule-engine concept.
  • Set up the environment (AWS, Git) and proper access to the AAP repository
  • Performed code analysis and traced the flow of data to understand the upstream and downstream systems of the project.
  • Created a cluster in the lower environment to work on the data-generation part for code development
  • Completed the Kafka build on the AWS instance, successfully built the bootstrap host/build host, and performed a successful install for the host to come up.
  • Created views/stored procedures in PostgreSQL for the retry framework so that the rules trigger in a particular order.
  • Involved in the design of the DLQ and New Relic monitoring, creating the topics and testing
  • Performed testing and code analysis
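
A minimal sketch of the Kafka-to-HDFS path described in this list, using Spark Structured Streaming (the current counterpart of the streaming APIs mentioned above); brokers, topic, and paths are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Subscribe to a Kafka topic; needs the spark-sql-kafka connector on the classpath.
orders = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder broker
    .option("subscribe", "orders")                      # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers bytes; cast key and value to strings before parsing downstream.
parsed = orders.select(
    F.col("key").cast("string"),
    F.col("value").cast("string").alias("order_json"),
    "timestamp",
)

# Continuously append the stream to HDFS as Parquet; the checkpoint
# directory lets the query recover after failures.
query = (
    parsed.writeStream.format("parquet")
    .option("path", "hdfs:///data/orders")              # placeholder path
    .option("checkpointLocation", "hdfs:///chk/orders")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```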

Java & Tibco Developer

UCSD
03.2019 - 08.2020

Java & NiFi Developer

Responsibilities:

● Designed and implemented workload distribution in NiFi using Remote Processor Groups for parallelism

● Used NiFi to schedule, automate and monitor Hive, Spark, and Shell scripts

● Created a framework in NiFi to pull data from RDBMS, NoSQL, S3, and other sources for reusability

● Developed workflows for all new and existing data sets ingested into a new data lake

● Experience using NiFi to orchestrate the flow of data between different software systems

● Worked with NiFi admins to test and validate new NiFi features and functionality post-upgrade

● Designed workflows to send messages to Azure Service Bus

● Developed workflows to bulk-extract data from multiple sources and load it into the data lake and Hadoop HDFS

● Developed and enhanced web applications using Core Java, J2EE, JSP, Servlets, JDBC, Struts, Spring, Hibernate, JMS, and XML

● Implemented reusable NiFi flows to process XML, CSV, log, and other file formats and store them in Hive, HDFS, and HBase

● Optimized scripts to improve performance

● Good experience analyzing and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.

● Involved in loading data from the edge node to HDFS using shell scripting.

● Configured Oracle Database to store Hive metadata.

● Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
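
A sketch of the Hive DDL behind those concepts, issued through PySpark; the database, columns, bucket count, and location are hypothetical:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hive-ddl")
    .enableHiveSupport()
    .getOrCreate()
)

# External table: Hive tracks only metadata, so dropping the table keeps the files.
# Partitioning by event_date prunes whole directories at query time; bucketing by
# user_id clusters rows into a fixed number of files, helping joins on that key.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS logs.web_events (
        user_id  STRING,
        url      STRING,
        duration INT
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
    LOCATION 'hdfs:///data/web_events'
""")
```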

Data Engineer

Pepsico
06.2019 - 02.2020

Key responsibilities:

● Designed workflows to send messages to Azure Service Bus

● Developed and enhanced web applications using Core Java, J2EE, JSP, Servlets, JDBC, Struts, Spring, Hibernate, JMS, and XML

● Implemented reusable NiFi flows to process XML, CSV, log, and other file formats and store them in Hive, HDFS, and HBase

● Worked on web applications with both SOAP and RESTful web services to provide backend support for applications led by cross-functional teams

● Provided constant on-call/remote support to maintain and resolve production problems of medium to critical complexity

● Coordinated with team members, clients, and business analysts for timely delivery of functionality

● Configured additional levels of development and test environments for greater code quality; designed, developed, and implemented solutions to meet business objectives

● Collaborated across teams to analyze and develop system requirements

● Consulted business clients on any clarifications needed on requirements and received user-acceptance feedback

● Performed code reviews, organized daily status calls with the offshore team, and provided timely support for development continuity

● Involved in unit testing code, building and deploying onto cloud platforms, and supporting additional levels of testing, including integration and system testing

● Assisted onsite production support teams when necessary

Client References:

Name: Karen Small

Title: Senior manager

Phone #: 972-965-1746

Company Name: TCS / Pepsico

Java Engineer

Intuit
08.2017 - 02.2019

Responsibilities:

● Designed workflows to send messages to Azure Service Bus

● Developed and enhanced web applications using Core Java, J2EE, JSP, Servlets, JDBC, Struts, Spring, Hibernate, JMS, and XML

● Implemented reusable NiFi flows to process XML, CSV, log, and other file formats and store them in Hive, HDFS, and HBase

● Worked on web applications with both SOAP and RESTful web services to provide backend support for applications led by cross-functional teams

● Provided constant on-call/remote support to maintain and resolve production problems of medium to critical complexity

● Coordinated with team members, clients, and business analysts for timely delivery of functionality

● Configured additional levels of development and test environments for greater code quality; designed, developed, and implemented solutions to meet business objectives

● Collaborated across teams to analyze and develop system requirements

● Consulted business clients on any clarifications needed on requirements and received user-acceptance feedback

● Performed code reviews, organized daily status calls with the offshore team, and provided timely support for development continuity

● Involved in unit testing code, building and deploying onto cloud platforms, and supporting additional levels of testing, including integration and system testing

● Assisted onsite production support teams when necessary.

Java & Tibco Developer

Citi Private Banking (CPB)
11.2015 - 07.2016

Responsibilities:

● Involved in understanding requirements, and in production, UAT, and implementation support

● Involved in development of Java, JSP pages, JavaScript, and Oracle

● Implemented MVC architecture using Hibernate value objects and mapping XML files

● Used the Commons and Log4j logging frameworks

● Worked on unit and integration testing

● Used JavaScript for client-side validations in the JSP and HTML pages

● Used Spring for bean instantiation, annotations, controllers, and request mapping to handle web service requests and responses

● Involved in front-end development using Struts, JSPs, and JSTL

● Used JAXB for marshalling and unmarshalling of work order and billing XML documents, and JAXP for processing

● Developed REST web services to make web service calls simple and easy for the client to access via standard HTTP URIs

● Developed service code using the Apache Camel framework in Java/J2EE

● Designed and developed request and response XML Schema (XSD) documents for web service operations such as Retrieve History

● Developed an intranet web application using J2EE architecture, using JSP to design the user interfaces and Hibernate for database connectivity

● Developed SQL queries in Oracle

● Managed major TIBCO products such as BW/ActiveMatrix BusinessWorks and iProcess

● Developed scripts and automation tools used to build, integrate, and deploy software releases for TIBCO enterprise resources and applications

● Performed administration and configuration of TIBCO topics and queues

Tibco Developer and Administrator

CITI, WINS & HSBC
10.2011 - 08.2015

Responsibilities:

● Involved in understanding requirements, and in production, UAT, and implementation support

● Involved in development of Java, JSP pages, JavaScript, and Oracle

● Implemented MVC architecture using Hibernate value objects and mapping XML files

● Used the Commons and Log4j logging frameworks

● Worked on unit and integration testing

● Used JavaScript for client-side validations in the JSP and HTML pages

● Used Spring for bean instantiation, annotations, controllers, and request mapping to handle web service requests and responses

● Involved in front-end development using Struts, JSPs, JSF, and JSTL

● Used JAXB for marshalling and unmarshalling of work order and billing XML documents, and JAXP for processing

● Developed REST web services to make web service calls simple and easy for the client to access via standard HTTP URIs

● Developed service code using the Apache Camel framework in Java/J2EE

● Designed and developed request and response XML Schema (XSD) documents for web service operations such as Retrieve History

● Developed an intranet web application using J2EE architecture, using JSP to design the user interfaces and Hibernate for database connectivity

● Developed SQL queries in Oracle

● Handled TIBCO administration and configuration activities as part of CTP

● Maintained Staffware-related servers and deployed web services

● Developed web services for new enhancements

● Gathered requirements on handling TIBCO iProcess activities, and worked on their design and development

● Performed administration and configuration of TIBCO topics and queues

Education

Bachelor's in Information Technology

JNTUK
05.2011

Skills

  • Proficient in Azure Databricks, Apache Spark, SQL, Python, Scala, and PySpark
  • Experience with Azure services such as Azure Data Factory, Azure Data Lake Storage, Azure SQL Database, Azure Synapse
  • NiFi architecture, Hive, Impala, Sqoop, HBase
  • Strong understanding of data engineering concepts and best practices
  • Familiarity with cloud-native architectures and microservices
  • Excellent problem-solving and troubleshooting skills
  • Effective communication and collaboration skills
  • Ability to work independently and in a team environment

Certification

Academy Accreditation - Databricks Lakehouse Fundamentals

Timeline

Senior Data Engineer

Avaap
06.2023 - 02.2024

Senior Hadoop Developer

Cotiviti Inc.
01.2022 - 05.2023

HMS Healthcare, Cloudera Hadoop Engineer

Cotiviti Inc.
01.2020 - 01.2022

Hadoop, NiFi & Scala-Java/Kafka Developer

AAP (Advance Auto Parts)
01.2020 - 01.2021

Data Engineer

Pepsico
06.2019 - 02.2020

Java & Tibco Developer

UCSD
03.2019 - 08.2020

Java Engineer

Intuit
08.2017 - 02.2019

Java & Tibco Developer

Citi Private Banking (CPB)
11.2015 - 07.2016

Tibco Developer and Administrator

CITI, WINS & HSBC
10.2011 - 08.2015

Bachelor's in Information Technology

JNTUK

Academy Accreditation - Databricks Lakehouse Fundamentals
