Chandrakanth Lekkala

Mason, OH

Summary

Senior Data Engineer with 9+ years of experience designing and delivering large-scale data solutions across cloud and big data ecosystems. Proven expertise in building high-performance ETL/ELT pipelines, optimizing cloud infrastructures (AWS, Azure, GCP), and enabling ML-driven analytics using Spark, Databricks, Snowflake, and Kafka. Certified in SnowPro, Terraform, CKAD, and AWS Developer, with a strong track record of driving automation, cost efficiency, and cross-functional collaboration. Passionate about turning complex data into actionable insights and scalable products.

Work History

Sr Data Platform Engineer

84.51°
05.2023 - Current
  • Designed and implemented a seasonal product recommender system using Meta's Prophet model, driving up to 9,053 daily product cart additions and boosting seasonal product sales by ~72% YoY across 1,242 Kroger stores in 16 states.
  • Engineered scalable, high-throughput data pipelines using Databricks Notebooks (Python), Azure Data Factory (ADF), and Datadog, ensuring efficient, reliable processing of massive datasets for seasonal trend analytics and customer personalization.
  • Spearheaded the implementation of ADF in the Feast feature store project, significantly improving ETL integration, automation, and data workflow efficiency.
  • Developed and shared a custom flake8 linter integrated with GitHub Actions (GHA) and Docker for the Feast feature store, enforcing standardized, high-quality feature engineering practices.
  • Migrated major P&LS assets, including the YNB and Popular Items pipelines, to Databricks Unity Catalog, centralizing governance, improving security, and streamlining data access across internal and external teams.
  • Leveraged Databricks Delta Sharing and Asset Bundles (Workflows, Jobs) to automate pipeline execution, enhance cross-team data collaboration, and support large-scale ML and analytics workloads.
  • Enhanced personalization strategies by integrating YNB pipelines powering Yellow Tags, Coupons, and product recommendations, and the Popular Items pipeline, driving personalized promotions and improving store-level engagement.
  • Integrated Unity Catalog with Alation to establish a unified metadata and data governance framework, improving compliance, discoverability, and organizational documentation.
  • Contributed critical technical insights during the evaluation and selection of GCP Vertex AI’s Matching Engine (vector database), influencing key architecture decisions for the new ML platform.
  • Implemented OpenAI-driven recommendation models using prompt engineering to improve category and UPC recommendations, boosting product diversity and relevance.
  • Developed CI/CD pipelines using GitHub Actions to automate Databricks workflow deployments, improving delivery reliability, speed, and consistency.
  • Authored detailed support documentation for ADF and Datadog, providing access instructions, troubleshooting guidelines, and incident response protocols.
  • Streamlined workflows for category derivation, OpenAI-based recommendations, and batch complement generation, delivering production-ready datasets for deployment.
  • Designed and implemented data movement strategies using Federated Queries, GCP Data Streams, and JDBC connectors with DataProc; developed Spark jobs for BigQuery, optimizing ingestion and transformation for scalable ML applications.
  • Championed validation protocols (including Blue Ribbon checks and rcmmndr integrations), ensuring data accuracy and consistency in final Kroger-facing outputs.
  • Dedicated >5% of work time to mastering new technologies (Azure, vector databases) and actively shared learnings through team knowledge sessions and cross-functional workshops.
  • Collaborated closely with engineering and data science teams to troubleshoot issues, align workflows, and optimize pipelines for seamless production deployment.

Sr Cloud Architect

Fidelity Information Services
06.2021 - 05.2023
  • Design, develop, and deploy scalable, high-performance ETL/ELT pipelines for data lakes and warehouses using Airflow, Python, SQL, and dbt on cloud platforms such as Snowflake, Databricks, and AWS (EKS)
  • Collaborate with data scientists and analysts to understand their data requirements and help design and implement the appropriate data architecture and infrastructure
  • Build and maintain infrastructure as code using Terraform, ensuring the stability and scalability of the data pipeline
  • Implement performance optimization techniques such as query tuning, indexing, and caching to ensure data pipeline performance and reliability
  • Ensure data pipelines are secure, compliant, and meet regulatory requirements
  • Utilize Airflow to manage and orchestrate the ETL/ELT pipelines and other scheduled jobs
  • Document design decisions, code, and processes for reference and knowledge sharing
  • Stay up-to-date with emerging cloud technologies and tools in the industry and recommend their adoption as appropriate
  • Provided technical leadership and delivered innovative products and services to address customer-specific requirements
  • Implement a data lake solution using Spark and Databricks, ingesting data from various sources including streaming data from Kafka
  • Worked with cloud architects to generate assessments and to develop and implement actionable recommendations based on results and reviews
  • Developed generic stored procedures using SnowSQL and JavaScript to transform and ingest CDC data into Snowflake relational tables from external S3 stages
  • Built monitoring and alerting systems through Prometheus, Grafana, and Slack using Kubernetes and Helm
  • Built and maintained ETL/ELT pipelines for a data lake and warehouse using Python, SQL, and dbt, resulting in a 20% increase in efficiency and a 30% reduction in errors
  • Implemented infrastructure as code using Terraform to create a scalable and cost-effective infrastructure for the data pipeline, resulting in a 25% reduction in infrastructure costs
  • Ensured compliance with regulatory requirements by implementing security and privacy measures in the data pipeline, resulting in successful completion of the audit
  • Collaborated with the team to integrate Docker, Airflow, and Kubernetes into the data pipeline, resulting in a streamlined and automated data processing and analysis workflow
  • Worked in a startup environment, collaborating with cross-functional teams to design and implement data solutions that met business needs in a fast-paced and rapidly changing environment
  • Served as subject matter expert on Snowflake, Terraform, Kafka and Kubernetes for both clients and internal team members
  • Built a prototype Snowflake external function to call a remote service implemented in AWS Lambda
  • Designed and built multiple self-managed, high-volume HA Kafka clusters in a hybrid environment spanning bare-metal servers, AWS EC2, and Kubernetes
  • Drove cost optimization by identifying resource usage patterns

Senior Data Engineer

Worldpay from FIS
12.2018 - 05.2021
  • Involved in the complete big data flow of multiple applications, from ingesting upstream data into HDFS to processing and analyzing data in HDFS
  • Responsible for developing prototypes of selected solutions and implementing complex big data projects focusing on collecting, parsing, managing, analyzing, and visualizing large sets of data using multiple platforms
  • Understand how to apply new technologies to solve big data problems and to develop innovative big data solutions
  • Developed various data loading strategies and performed various transformations for analyzing datasets by using Cloudera Distribution for the Hadoop ecosystem
  • Worked extensively on designing and developing multiple Spark Scala ingestion pipelines, both real-time and batch
  • Responsible for handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations during the ingestion process itself
  • Worked on importing metadata into Hive/Impala and migrated existing legacy tables and applications to Hadoop using Spark, Hive, and Impala
  • Worked on POCs to perform change data capture (CDC) and Slowly Changing Dimension (SCD) processing in HDFS using Spark and Delta Lake, an open-source storage layer that brings ACID transactions to Apache Spark
  • Worked extensively on a POC to ingest data from S3 buckets into Snowflake using external stages
  • Developed multiple POCs using Spark, deployed them on a YARN cluster, and compared Spark performance with Hive and Impala
  • Responsible for performance tuning of Spark Scala batch ETL jobs by changing configuration properties and using broadcast variables
  • Worked on batch processing for history loads and real-time processing of live data with Spark Streaming using the Lambda architecture
  • Developed Streaming pipeline to consume data from Kafka and ingest into HDFS in near real-time
  • Tuned Spark Streaming applications by setting the right batch interval, the correct level of parallelism, and appropriate memory configurations
  • Implemented Spark SQL optimized joins to gather data from different sources and run ad-hoc queries on top of them
  • Wrote Spark Scala Generic UDFs to perform business logic operations at the record level
  • Developed Spark code in Scala and the Spark SQL/Streaming environment for faster testing and processing of data, loading data into Spark RDDs and performing in-memory computation to generate output with lower memory usage
  • Analyzed large amounts of data sets to determine the optimal way to aggregate and report on it
  • Worked on parsing and converting JSON/XML formatted files to tabular format in Hive/Impala using Spark Scala, Spark SQL, and the DataFrame API
  • Worked with various file formats (Text, JSON, XML, Avro, Parquet) and compression codecs (Snappy, Bzip2, Gzip)
  • Worked on performing transformations & actions on RDDs and Spark Streaming data
  • Involved in converting HiveQL queries into Spark transformations using Spark RDDs, Spark SQL, and Scala
  • Installed the open-source Zeppelin Notebook to use the Spark Scala, PySpark, Spark SQL, and SparkR APIs interactively via a web interface
  • Worked on integrating Zeppelin with LDAP for multiuser support in all environments
  • Responsible for estimating Zeppelin resource usage and configuring interpreters for optimal use
  • Developed Oozie workflows to automate data loading into HDFS and data pre-processing, and used ZooKeeper to coordinate clusters
  • Supported the Hadoop platform in identifying, communicating, and resolving data integration and consumption issues
  • Worked on Root Cause Analysis and Problem Management processes and assisted the support team in resolving issues in a data ingestion solution developed on the Hadoop platform
  • Used ZooKeeper for various types of centralized configurations
  • Met with key stakeholders to discuss and understand all significant aspects of the project, including scope, tasks required, and deadlines
  • Supervised Big Data projects and offered assistance and guidance to junior developers
  • Multi-tasked to keep all assigned projects running effectively and efficiently
  • Consistently achieved challenging production goals through ongoing optimization
  • Environment: Hadoop, Cloudera Distribution, Scala, Python, Spark Core, Spark SQL, Spark Streaming, Hive, HBase, Pig, Sqoop, Kafka, ZooKeeper, Java 8, UNIX Shell Scripting, Zeppelin Notebook, Delta Lake, AWS S3, AWS Lambda, Snowflake, and SnowSQL.

Hadoop/Spark Developer

Wells Fargo
06.2017 - 12.2018
  • Involved in the complete big data flow of multiple applications, from ingesting upstream data into HDFS to processing and analyzing data in HDFS
  • Hands-on experience designing, developing, and maintaining software solutions in Hadoop clusters
  • Explored Spark for improving performance and optimizing existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and Spark on YARN
  • Worked on POCs with Apache Spark using Scala to implement Spark in the project
  • Built scalable distributed data solutions using the Hadoop Cloudera Distribution
  • Load and transform large sets of structured, semi-structured, and unstructured data
  • Implemented partitioning, dynamic partitions, and bucketing in Hive
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregate Functions (UDAFs) written in Python
  • Developed Hadoop streaming jobs to process terabytes of JSON/XML format data
  • Developed complex MapReduce streaming jobs and MapReduce programs in Java, along with Hive and Pig, to perform various ETL, cleaning, and scrubbing tasks
  • Developed and ran MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports per users' needs
  • Develop code in Hadoop technologies and perform Unit Testing
  • Involved in creating Hive tables, loading structured data, and writing Hive queries that run internally as MapReduce jobs
  • Designed the ETL run performance tracking sheet for different phases of the project and shared it with the production team
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala
  • Involved in using Sqoop for importing and exporting data between RDBMS and HDFS
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting, applying Hive optimization techniques during joins and best practices in writing Hive scripts with HiveQL
  • Involved in developing Hive DDLs to create, alter, and drop Hive tables
  • Involved in loading data from the Linux file system to HDFS
  • Created Pig scripts to load, transform, and store data from various sources into the Hive metastore
  • Importing and exporting data into HDFS and Hive using Sqoop
  • Experienced in managing and reviewing Hadoop log files
  • Identify and design the most efficient and cost-effective solution through research and evaluation of alternatives
  • Demonstrated Hadoop practices and broad knowledge of technical solutions, design patterns, and code for medium/complex applications deployed in Hadoop production
  • Ingested semi-structured data using Flume and transformed it using Pig
  • Inspected and analyzed existing Hadoop environments for proposed product launches, producing cost/benefit analyses on using included legacy assets
  • Developed highly maintainable Hadoop code and followed all coding best practices
  • Environment: Hadoop, MapReduce, Java, Scala, Spark, Hive, Pig, Spark SQL, Spark Streaming, Sqoop, Python, Kafka, Cloudera, DB2, Scala IDE (Eclipse), Maven, HDFS.

Java Developer

IT Keysource
07.2016 - 05.2017
  • Involved in Various Stages of Software Development Life Cycle (SDLC) deliverables of the project using the AGILE Software development methodology
  • Implemented the application using the Spring MVC Framework and handled security using Spring Security
  • Involved in batch processing using the Spring Batch framework to extract data from the database and load it into the corresponding application tables
  • Developed controller classes using the Spring MVC and Spring AOP frameworks
  • Designed and developed an end-to-end customer self-service module using annotation-based Spring MVC, Hibernate, Java Beans, and jQuery
  • Developed the user interface using JSP, jQuery, HTML5, CSS3, Bootstrap, and AngularJS
  • Implemented functionality such as searching, filtering, sorting, categorization, and validation using the Angular framework
  • Used Angular directives, working with attribute-level, element-level, and class-level directives
  • Implemented Bean classes and configured them in the Spring configuration file for dependency injection
  • Implemented the persistence layer using Hibernate, with POJOs representing the persisted database entities
  • Created mappings among the relations and wrote named HQL queries using Hibernate
  • Designed a common framework for REST API consumption using Spring RestTemplate
  • Used Design Patterns like Facade, Data Transfer Object (DTO), MVC, Singleton, and Data Access Object (DAO)
  • Wrote SQL and PL/SQL code such as stored procedures, triggers, indexes, and views
  • Used Log4j for logging and JUnit for testing
  • Documented all the SQL queries for future testing purposes
  • Prepared test case scenarios and internal documentation for validation and reporting
  • Coordinating with the QA team and resolving the QA defects
  • Wrote services to store and retrieve user data from MongoDB
  • Worked with a WebSphere application server that handles various requests from the client
  • Deployed fixes and updates using the IBM WebSphere application server
  • Experienced in developing automated unit tests using the JUnit framework
  • Used Git controls to track and maintain the different versions of the project
  • Reviewed code and debugged errors to improve performance
  • Reworked applications to meet changing market trends and individual customer demands
  • Researched new technologies, software packages, and hardware products for use in website projects
  • Worked with business users and operations teams to understand business needs and address production questions
  • Wrote, modified, and maintained software documentation and specifications
  • Environment: Java, Spring MVC, Spring Batch, Hibernate, Web Services, HTML5, CSS3, JavaScript, Bootstrap, Maven, WebSphere, Eclipse, JUnit, jQuery, Log4j, Windows, Git.

SQL Developer

IT Keysource
01.2016 - 06.2016
  • Involved in complete Software Development Lifecycle (SDLC)
  • Interpret written business requirements and technical specification documents
  • Contribute technical content to as-built documents
  • Create, document, and implement unit test plans, scripts, and test harnesses
  • Wrote complex SQL queries, stored procedures, triggers, views, and indexes using DML/DDL commands and user-defined functions to implement the business logic
  • Advised on query optimization by examining execution plans for better database tuning
  • Worked on the creation of tables, indexes, sequences, constraints, and procedures
  • Prepared and executed proper data validations
  • Documented procedures and obtained approvals for various data fetching methods
  • Worked on gathering the proper data requirements for the SQL code to be written
  • Performed Normalization & De-normalization on existing tables for faster query results
  • Wrote T-SQL Queries and procedures to generate DML Scripts that modified database objects dynamically based on inputs
  • Maintained positive communication and working relationship with all business levels
  • Reviewed, analyzed and implemented necessary changes in appropriate areas to enhance and improve existing systems.

Education

Master of Science - Computer Information Systems

Florida Institute of Technology
May 2016

Bachelor of Science - Computer Science And Engineering

Jawaharlal Nehru Technological University
2014

Skills

  • Programming Languages: Python, Scala, SQL, Java, Shell Scripting
  • Big Data & Analytics: Hadoop (HDFS, MapReduce), Spark, Kafka, Hive, Pig, Impala, Flume, Delta Lake, Databricks, Snowflake, dbt
  • Cloud Platforms: AWS (EC2, S3, Lambda, Redshift, EMR), Azure (ADF, Databricks, Synapse), GCP (BigQuery, DataProc, Vertex AI)
  • Data Orchestration & Pipeline Tools: Airflow, Oozie, Kafka Streams, Azure Data Factory
  • Databases: Relational (Oracle, MySQL, SQL Server), NoSQL (HBase, Cassandra, MongoDB)
  • Web Technologies: XML, HTML, CSS, JavaScript
  • DevOps & Infrastructure: Terraform, Kubernetes, Docker, CI/CD (GitHub Actions), Serverless Computing, Load Balancing, Cloud Infrastructure Management, Infrastructure as Code (IaC), Prometheus, Grafana
  • Testing & Tools: PyCharm, VS Code, Eclipse, JUnit, pytest, unittest, QTP, JIRA, Quality Center (QC), Tableau

Accomplishments

    Trainings:

  • Edureka trained and certified Apache Spark and Scala developer

Certification

  • CKAD Certification
  • SnowPro Core Certification
  • Terraform Associate Certification
  • AWS Developer Associate Certification
  • Apache Spark and Apache Hadoop certification, IBM Big Data University
  • Big Data Analysis with Apache Spark and Distributed Machine Learning with Apache Spark, edX BerkeleyX (sponsored by Databricks)
