Chandrakanth Lekkala

Mason, OH

Summary

Senior Data Engineer with 9+ years of experience designing and delivering large-scale data solutions across cloud and big data ecosystems. Proven expertise in building high-performance ETL/ELT pipelines, optimizing cloud infrastructures (AWS, Azure, GCP), and enabling ML-driven analytics using Spark, Databricks, Snowflake, and Kafka. Certified in SnowPro, Terraform, CKAD, and AWS Developer, with a strong track record of driving automation, cost efficiency, and cross-functional collaboration. Passionate about turning complex data into actionable insights and scalable products.

Work History

Sr Data Platform Engineer

84.51°
05.2023 - Current
  • Designed and implemented a seasonal product recommender system using Meta's Prophet model, driving up to 9,053 daily product cart additions and boosting seasonal product sales by ~72% YoY across 1,242 Kroger stores in 16 states.
  • Engineered scalable, high-throughput data pipelines using Databricks Notebooks (Python), Azure Data Factory (ADF), and Datadog, ensuring efficient, reliable processing of massive datasets for seasonal trend analytics and customer personalization.
  • Spearheaded the implementation of ADF in the Feast feature store project, significantly improving ETL integration, automation, and data workflow efficiency.
  • Developed and shared a custom flake8 linter integrated with GitHub Actions (GHA) and Docker for the Feast feature store, enforcing standardized, high-quality feature engineering practices.
  • Migrated major P&LS assets, including the YNB and Popular Items pipelines, to Databricks Unity Catalog, centralizing governance, improving security, and streamlining data access across internal and external teams.
  • Leveraged Databricks Delta Sharing and Asset Bundles (Workflows, Jobs) to automate pipeline execution, enhance cross-team data collaboration, and support large-scale ML and analytics workloads.
  • Enhanced personalization strategies by integrating YNB pipelines powering Yellow Tags, Coupons, and product recommendations, and the Popular Items pipeline, driving personalized promotions and improving store-level engagement.
  • Integrated Unity Catalog with Alation to establish a unified metadata and data governance framework, improving compliance, discoverability, and organizational documentation.
  • Contributed critical technical insights during the evaluation and selection of GCP Vertex AI’s Matching Engine (vector database), influencing key architecture decisions for the new ML platform.
  • Implemented OpenAI-driven recommendation models using prompt engineering to improve category and UPC recommendations, boosting product diversity and relevance.
  • Developed CI/CD pipelines using GitHub Actions to automate Databricks workflow deployments, improving delivery reliability, speed, and consistency.
  • Authored detailed support documentation for ADF and Datadog, providing access instructions, troubleshooting guidelines, and incident response protocols.
  • Streamlined workflows for category derivation, OpenAI-based recommendations, and batch complement generation, delivering production-ready datasets for deployment.
  • Designed and implemented data movement strategies using Federated Queries, GCP Data Streams, and JDBC connectors with DataProc; developed Spark jobs for BigQuery, optimizing ingestion and transformation for scalable ML applications.
  • Championed validation protocols (including Blue Ribbon checks and rcmmndr integrations), ensuring data accuracy and consistency in final Kroger-facing outputs.
  • Dedicated >5% of work time to mastering new technologies (Azure, vector databases) and actively shared learnings through team knowledge sessions and cross-functional workshops.
  • Collaborated closely with engineering and data science teams to troubleshoot issues, align workflows, and optimize pipelines for seamless production deployment.

Sr Cloud Architect

Fidelity Information Services
06.2021 - 05.2023
  • Design, develop, and deploy scalable, high-performance ETL/ELT pipelines for data lakes and warehouses using Airflow, Python, SQL, and dbt on cloud platforms such as Snowflake, Databricks, and AWS (EKS)
  • Collaborate with data scientists and analysts to understand their data requirements and help design and implement the appropriate data architecture and infrastructure
  • Build and maintain infrastructure as code using Terraform, ensuring the stability and scalability of the data pipeline
  • Implement performance optimization techniques such as query tuning, indexing, and caching to ensure data pipeline performance and reliability
  • Ensure data pipelines are secure, compliant, and meet regulatory requirements
  • Utilize Airflow to manage and orchestrate the ETL/ELT pipelines and other scheduled jobs
  • Document design decisions, code, and processes for reference and knowledge sharing
  • Stay up-to-date with emerging cloud technologies and tools in the industry and recommend their adoption as appropriate
  • Provided technical leadership and delivered innovative products and services to address customer-specific requirements
  • Implement a data lake solution using Spark and Databricks, ingesting data from various sources including streaming data from Kafka
  • Worked with cloud architects to generate assessments and to develop and implement actionable recommendations based on results and reviews
  • Developed generic stored procedures using SnowSQL and JavaScript to transform and ingest CDC data into Snowflake relational tables from external S3 stages
  • Built monitoring and alerting systems through Prometheus, Grafana, and Slack using Kubernetes and Helm
  • Built and maintained ETL/ELT pipelines for a data lake and warehouse using Python, SQL, and dbt, resulting in a 20% increase in efficiency and a 30% reduction in errors
  • Implemented infrastructure as code using Terraform to create a scalable and cost-effective infrastructure for the data pipeline, resulting in a 25% reduction in infrastructure costs
  • Ensured compliance with regulatory requirements by implementing security and privacy measures in the data pipeline, resulting in successful completion of the audit
  • Collaborated with the team to integrate Docker, Airflow, and Kubernetes into the data pipeline, resulting in a streamlined and automated data processing and analysis workflow
  • Worked in a startup environment, collaborating with cross-functional teams to design and implement data solutions that met business needs in a fast-paced and rapidly changing environment
  • Served as subject matter expert on Snowflake, Terraform, Kafka and Kubernetes for both clients and internal team members
  • Built a prototype Snowflake external function to call a remote service implemented in AWS Lambda
  • Designed and built multiple self-managed, high-volume HA Kafka clusters in a hybrid environment spanning bare-metal servers, AWS EC2, and Kubernetes
  • Drove cost optimization by identifying resource usage patterns

Senior Data Engineer

Worldpay from FIS
12.2018 - 05.2021
  • Involved in the complete big data flow of multiple applications, from ingesting upstream data into HDFS to processing and analyzing data in HDFS
  • Responsible for developing prototypes of selected solutions and implementing complex big data projects focusing on collecting, parsing, managing, analyzing, and visualizing large sets of data using multiple platforms
  • Understand how to apply new technologies to solve big data problems and to develop innovative big data solutions
  • Developed various data loading strategies and performed various transformations for analyzing datasets by using Cloudera Distribution for the Hadoop ecosystem
  • Worked extensively on designing and developing multiple Spark Scala ingestion pipelines, both real-time and batch
  • Responsible for handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations during the ingestion process itself
  • Worked on importing metadata into Hive/Impala and migrated existing legacy tables and applications to Hadoop using Spark, Hive, and Impala
  • Worked on POCs to perform change data capture (CDC) and Slowly Changing Dimension (SCD) processing in HDFS using Spark and Delta Lake, an open-source storage layer that brings ACID transactions to Apache Spark
  • Worked extensively on a POC to ingest data from S3 buckets into Snowflake using external stages
  • Developed multiple POCs using Spark, deployed them on a YARN cluster, and compared Spark performance with Hive and Impala
  • Responsible for performance tuning of Spark Scala batch ETL jobs by changing configuration properties and using broadcast variables
  • Worked on batch processing for history loads and real-time processing of live data with Spark Streaming using the Lambda architecture
  • Developed Streaming pipeline to consume data from Kafka and ingest into HDFS in near real-time
  • Tuned Spark Streaming applications by setting the right batch interval, the correct level of parallelism, and appropriate memory configurations
  • Implemented Spark SQL optimized joins to gather data from different sources and run ad-hoc queries on top of them
  • Wrote Spark Scala Generic UDFs to perform business logic operations at the record level
  • Developed Spark code in Scala and the Spark SQL/Streaming environment for faster testing and processing of data, loading data into Spark RDDs and performing in-memory computation to generate output with lower memory usage
  • Analyzed large amounts of data sets to determine the optimal way to aggregate and report on it
  • Worked on parsing and converting JSON/XML formatted files to tabular format in Hive/Impala using Spark Scala, Spark SQL, and the DataFrame API
  • Worked with various file formats (Text, JSON, XML, Avro, Parquet) and compression codecs (Snappy, Bzip2, Gzip)
  • Worked on performing transformations & actions on RDDs and Spark Streaming data
  • Involved in converting HiveQL queries into Spark transformations using Spark RDDs, Spark SQL, and Scala
  • Installed the open-source Zeppelin Notebook to use the Spark Scala, PySpark, Spark SQL, and SparkR APIs interactively via a web interface
  • Worked on integrating Zeppelin with LDAP for multiuser support in all environments
  • Responsible for estimating Zeppelin resource usage and configuring interpreters for optimal use
  • Developed Oozie workflows to automate data loading into HDFS and data pre-processing, and used ZooKeeper to coordinate clusters
  • Supported the Hadoop platform in identifying, communicating, and resolving data integration and consumption issues
  • Worked on Root Cause Analysis and Problem Management processes and assisted the support team in resolving issues in a data ingestion solution developed on the Hadoop platform
  • Used ZooKeeper for various types of centralized configurations
  • Met with key stakeholders to discuss and understand all significant aspects of the project, including scope, tasks required, and deadlines
  • Supervised Big Data projects and offered assistance and guidance to junior developers
  • Multi-tasked to keep all assigned projects running effectively and efficiently
  • Consistently achieved challenging production goals through ongoing optimization
  • Environment: Hadoop, Cloudera Distribution, Scala, Python, Spark Core, Spark SQL, Spark Streaming, Hive, HBase, Pig, Sqoop, Kafka, ZooKeeper, Java 8, UNIX Shell Scripting, Zeppelin Notebook, Delta Lake, AWS S3, AWS Lambda, Snowflake, and SnowSQL.

Hadoop/Spark Developer

Wells Fargo
06.2017 - 12.2018
  • Involved in the complete big data flow of multiple applications, from ingesting upstream data into HDFS to processing and analyzing data in HDFS
  • Hands-on experience designing, developing, and maintaining software solutions in Hadoop clusters
  • Explored Spark for improving performance and optimizing existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and Spark on YARN
  • Worked on POCs with Apache Spark using Scala to implement Spark in the project
  • Built scalable distributed data solutions using the Hadoop Cloudera Distribution
  • Load and transform large sets of structured, semi-structured, and unstructured data
  • Implemented partitioning, dynamic partitions, and bucketing in Hive
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregate Functions (UDAFs) written in Python
  • Developed Hadoop streaming jobs to process terabytes of JSON/XML format data
  • Developed complex MapReduce streaming jobs and MapReduce programs in Java, along with Hive and Pig, to perform various ETL, cleaning, and scrubbing tasks
  • Developed and ran MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports per users' needs
  • Develop code in Hadoop technologies and perform Unit Testing
  • Involved in creating Hive tables, loading structured data, and writing Hive queries that run internally as MapReduce jobs
  • Designed the ETL run performance tracking sheet for different phases of the project and shared it with the production team
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala
  • Involved in using Sqoop for importing and exporting data between RDBMS and HDFS
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting, applying Hive optimization techniques during joins and best practices in writing Hive scripts with HiveQL
  • Involved in developing Hive DDLs to create, alter, and drop Hive tables
  • Involved in loading data from the Linux file system to HDFS
  • Created Pig scripts to load, transform, and store data from various sources into the Hive metastore
  • Importing and exporting data into HDFS and Hive using Sqoop
  • Experienced in managing and reviewing Hadoop log files
  • Identify and design the most efficient and cost-effective solution through research and evaluation of alternatives
  • Demonstrated Hadoop practices and broad knowledge of technical solutions, design patterns, and code for medium/complex applications deployed in Hadoop production
  • Ingested semi-structured data using Flume and transformed it using Pig
  • Inspected and analyzed existing Hadoop environments for proposed product launches, producing cost/benefit analyses on using included legacy assets
  • Developed highly maintainable Hadoop code and followed all coding best practices
  • Environment: Hadoop, MapReduce, Java, Scala, Spark, Hive, Pig, Spark SQL, Spark Streaming, Sqoop, Python, Kafka, Cloudera, DB2, Scala IDE (Eclipse), Maven, HDFS.

Java Developer

IT Keysource
07.2016 - 05.2017
  • Involved in Various Stages of Software Development Life Cycle (SDLC) deliverables of the project using the AGILE Software development methodology
  • Implemented the application using the Spring MVC Framework and handled security using Spring Security
  • Involved in batch processing using the Spring Batch framework to extract data from the database and load it into the corresponding application tables
  • Developed controller classes using the Spring MVC and Spring AOP frameworks
  • Designed and developed an end-to-end customer self-service module using annotation-based Spring MVC, Hibernate, Java Beans, and jQuery
  • Developed the user interface using JSP, jQuery, HTML5, CSS3, Bootstrap, and AngularJS
  • Implemented functionality such as searching, filtering, sorting, categorization, and validation using the Angular framework
  • Used Angular directives, working with attribute-level, element-level, and class-level directives
  • Implemented Bean classes and configured them in the Spring configuration file for dependency injection
  • Implemented the persistence layer using Hibernate, with POJOs representing the persisted database entities
  • Created mappings among the relations and wrote named HQL queries using Hibernate
  • Designed a common framework for REST API consumption using Spring RestTemplate
  • Used Design Patterns like Facade, Data Transfer Object (DTO), MVC, Singleton, and Data Access Object (DAO)
  • Wrote SQL and PL/SQL code such as stored procedures, triggers, indexes, and views
  • Used Log4j for logging and JUnit for testing
  • Documented all the SQL queries for future testing purposes
  • Prepared test case scenarios and internal documentation for validation and reporting
  • Coordinating with the QA team and resolving the QA defects
  • Wrote services to store and retrieve user data from MongoDB
  • Worked with a WebSphere application server that handles various requests from the client
  • Deployed fixes and updates using the IBM WebSphere application server
  • Experienced in developing automated unit tests using the JUnit framework
  • Used Git controls to track and maintain the different versions of the project
  • Reviewed code and debugged errors to improve performance
  • Reworked applications to meet changing market trends and individual customer demands
  • Researched new technologies, software packages, and hardware products for use in website projects
  • Worked with business users and operations teams to understand business needs and address production questions
  • Wrote, modified, and maintained software documentation and specifications
  • Environment: Java, Spring MVC, Spring Batch, Hibernate, Web Services, HTML5, CSS3, JavaScript, Bootstrap, Maven, WebSphere, Eclipse, JUnit, jQuery, Log4j, Windows, Git.

SQL Developer

IT Keysource
01.2016 - 06.2016
  • Involved in complete Software Development Lifecycle (SDLC)
  • Interpret written business requirements and technical specification documents
  • Contribute technical content to as-built documents
  • Create, document, and implement unit test plans, scripts, and test harnesses
  • Wrote complex SQL queries, stored procedures, triggers, views, and indexes using DML/DDL commands and user-defined functions to implement the business logic
  • Advised on query optimization by examining execution plans for better database tuning
  • Worked on the creation of tables, indexes, sequences, constraints, and procedures
  • Prepared and executed proper data validations
  • Documented procedures and obtained approvals for various data fetching methods
  • Worked on gathering the proper data requirements for the SQL code to be written
  • Performed Normalization & De-normalization on existing tables for faster query results
  • Wrote T-SQL Queries and procedures to generate DML Scripts that modified database objects dynamically based on inputs
  • Maintained positive communication and working relationship with all business levels
  • Reviewed, analyzed and implemented necessary changes in appropriate areas to enhance and improve existing systems.

Education

Master of Science - Computer Information Systems

Florida Institute of Technology
May 2016

Bachelor of Science - Computer Science And Engineering

Jawaharlal Nehru Technological University
2014

Skills

  • Programming Languages: Python, Scala, SQL, Java, Shell Scripting
  • Big Data & Analytics: Hadoop (HDFS, MapReduce), Spark, Kafka, Hive, Pig, Impala, Flume, Delta Lake, Databricks, Snowflake, dbt
  • Cloud Platforms: AWS (EC2, S3, Lambda, Redshift, EMR), Azure (ADF, Databricks, Synapse), GCP (BigQuery, DataProc, Vertex AI)
  • Data Orchestration & Pipeline Tools: Airflow, Oozie, Kafka Streams, Azure Data Factory
  • Databases: Relational (Oracle, MySQL, SQL Server), NoSQL (HBase, Cassandra, MongoDB)
  • Web Technologies: XML, HTML, CSS, JavaScript
  • DevOps & Infrastructure: Terraform, Kubernetes, Docker, CI/CD (GitHub Actions), Serverless Computing, Load Balancing, Cloud Infrastructure Management, Infrastructure as Code (IaC), Prometheus, Grafana
  • Testing & Tools: PyCharm, VS Code, Eclipse, JUnit, pytest, unittest, QTP, JIRA, Quality Center (QC), Tableau

Accomplishments

    Trainings:

  • Edureka trained and certified Apache Spark and Scala developer

Certification

  • CKAD Certification
  • SnowPro Core Certification
  • Terraform Associate Certification
  • AWS Developer Associate Certification
  • Apache Spark and Apache Hadoop certification, IBM Big Data University
  • Big Data Analysis with Apache Spark and Distributed Machine Learning with Apache Spark, edX BerkeleyX (sponsored by Databricks)
