Kishore Kotaru

Arlington, TX, USA

Summary

Data Engineer with 7 years of experience in designing and optimizing scalable data pipelines and ETL workflows for real-time and batch processing. Proficient in big data technologies such as Spark and Hadoop, and cloud platforms including AWS and Azure, enabling efficient processing of large datasets. Expertise in Python and SQL for developing reliable data processing scripts, alongside strong skills in data modeling and query optimization for both relational and NoSQL databases. Proven ability to implement CI/CD pipelines and monitoring solutions to enhance data workflow efficiency and ensure data governance compliance.

Overview

8 years of professional experience
1 Certification

Work History

Data Engineer

7-Eleven
Arlington, Texas
02.2023 - Current
  • Designed and deployed real-time inventory data pipeline using Kafka and Spark Streaming, processing over 5 million daily POS transactions.
  • Built AWS Glue ETL workflows to transform raw data into optimized Parquet/Delta Lake formats, reducing storage costs by 35%.
  • Developed PySpark scripts for data cleansing and validation, enhancing accuracy by 25% through automated anomaly detection.
  • Implemented Change Data Capture with Debezium and Kafka to sync inventory updates across over 10,000 stores in near real-time.
  • Optimized Redshift clusters via partitioning and distribution keys, cutting query times by 50% for business reporting.
  • Created Grafana dashboards to monitor pipeline health, tracking latency, throughput, and error rates.
  • Automated data lineage tracking using OpenLineage to ensure compliance with audit requirements.
  • Monitored data systems performance, identifying bottlenecks and implementing solutions to maintain system efficiency.
  • Automated data quality checks and error handling processes to ensure the integrity and reliability of datasets.
  • Managed version control and deployment of data applications using Git, Docker, and Jenkins.
  • Migrated legacy batch jobs to serverless AWS Lambda, reducing runtime by 60%.
  • Worked as part of project teams to coordinate database development and determine project scopes and limitations.
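The automated data-quality checks described above can be sketched in plain Python. This is a minimal illustration, not the production pipeline; the field names (`store_id`, `amount`) and validation rules are hypothetical stand-ins for POS-transaction records.

```python
# Sketch of an automated data-quality check with per-row error collection;
# field names and rules are illustrative assumptions, not the real schema.
from dataclasses import dataclass, field

@dataclass
class QualityReport:
    passed: int = 0
    errors: list = field(default_factory=list)

def check_transactions(rows):
    """Validate each record and collect errors instead of failing fast."""
    report = QualityReport()
    for i, row in enumerate(rows):
        problems = []
        if not row.get("store_id"):
            problems.append("missing store_id")
        if not isinstance(row.get("amount"), (int, float)) or row["amount"] < 0:
            problems.append("invalid amount")
        if problems:
            report.errors.append((i, problems))
        else:
            report.passed += 1
    return report

rows = [
    {"store_id": "S001", "amount": 12.50},
    {"store_id": "", "amount": 3.99},       # fails: missing store_id
    {"store_id": "S002", "amount": -1.00},  # fails: negative amount
]
report = check_transactions(rows)
print(report.passed, len(report.errors))  # → 1 2
```

Collecting errors row by row, rather than aborting on the first failure, lets a pipeline quarantine bad records while the rest of the batch proceeds.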

Data Engineer

ABC Fitness
09.2017 - 03.2023
  • Developed and implemented data models, database designs, data access and table maintenance codes.
  • Developed Python scripts for extracting data from web service APIs and loading it into databases.
  • Optimized existing queries to improve query performance by creating indexes on tables.
  • Analyzed user requirements, designed and developed ETL processes to load enterprise data into the Data Warehouse.
  • Created stored procedures for automating periodic tasks in SQL Server.
  • Researched and integrated new data technologies and tools to keep the data architecture modern and efficient.
  • Provided technical mentorship to junior data engineers, guiding them on best practices and project execution.
  • Established and enforced data governance policies and procedures to comply with regulatory requirements and ensure data privacy.
  • Participated in agile development processes, contributing to sprint planning, stand-ups, and reviews to ensure timely delivery of data projects.
  • Collaborated with cross-functional teams to gather requirements and translate business needs into technical specifications for data solutions.
  • Conducted rigorous testing and validation of data pipelines to ensure accuracy and completeness of data.
  • Collaborated with data scientists and analysts to understand data needs and implement appropriate data models and structures.
  • Designed data warehousing solutions, applying dimensional modeling techniques for optimized data retrieval.
  • Implemented data visualization tools like Tableau and Power BI to create dashboards and reports for business stakeholders.
  • Configured and maintained cloud-based data infrastructure on platforms like AWS, Azure, and Google Cloud to enhance data storage and computation capabilities.
  • Implemented and optimized big data storage solutions, including Hadoop and NoSQL databases, to improve data accessibility and efficiency.
  • Optimized Spark jobs for improved performance, scalability, and reliability.
  • Created Hive tables and optimized queries, stored procedures, functions, and views on Hadoop clusters.
  • Developed and implemented Spark applications using Python and Scala.
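The extract-and-load pattern described above can be sketched as follows. This is a hedged illustration: the extractor is a stub standing in for a real web-service call, and the `members` table, its columns, and the sample records are invented for the example.

```python
# Sketch of an extract-and-load step; extract_members() is a stub for an API
# call (e.g. requests.get(...).json()), and the schema is illustrative only.
import sqlite3

def extract_members():
    """Stand-in for a web-service API call."""
    return [
        {"id": 1, "name": "Alice", "plan": "gold"},
        {"id": 2, "name": "Bob", "plan": "basic"},
    ]

def load_members(conn, records):
    """Idempotent load: INSERT OR REPLACE keyed on the primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS members "
        "(id INTEGER PRIMARY KEY, name TEXT, plan TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO members (id, name, plan) "
        "VALUES (:id, :name, :plan)",
        records,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load_members(conn, extract_members())
print(conn.execute("SELECT COUNT(*) FROM members").fetchone()[0])  # → 2
```

Using `INSERT OR REPLACE` keyed on the primary key makes the load safe to re-run, a common requirement when scheduled extractions may overlap or retry.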

Education

Master of Science - Data Science

University of the Cumberlands
Williamsburg, KY
05.2024

Skills

  • Python (Pandas, PySpark)
  • SQL (Query Optimization, Window Functions)
  • Scala (Spark)
  • Bash Scripting
  • Spark, Kafka, Airflow
  • Hadoop (HDFS, Hive)
  • AWS (S3, Redshift, Lambda, Kinesis, EMR, Glue)
  • Azure (Data Factory, Synapse, Databricks)
  • GCP (BigQuery, Dataflow)
  • Snowflake / Redshift / BigQuery
  • PostgreSQL / MySQL / Oracle
  • Cassandra / MongoDB / DynamoDB
  • GDPR/CCPA Compliance
  • CI/CD (Jenkins, GitLab)
  • Terraform / CloudFormation
  • Docker / Kubernetes
  • Git / Bitbucket
  • Prometheus / Grafana
  • Big data processing
  • ETL development
  • Data modeling
  • Data governance
  • Data analysis
  • Agile methodologies
  • Data visualization
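As an illustration of the SQL window-function skill listed above, the following self-contained snippet runs a `RANK()` query through Python's built-in sqlite3 module; the `sales` table and its values are made up for the example.

```python
# Demo of a SQL window function (RANK over an ORDER BY); the table and
# data are illustrative, run in-memory via sqlite3 for self-containment.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 100.0), ("B", 250.0), ("C", 250.0), ("D", 80.0)],
)
rows = conn.execute(
    """
    SELECT store, amount,
           RANK() OVER (ORDER BY amount DESC) AS rnk
    FROM sales
    """
).fetchall()
# Tied amounts (B and C) share rank 1, and the next rank is skipped.
print(rows)
```

`RANK()` leaves gaps after ties (1, 1, 3, 4); `DENSE_RANK()` would not, which is the usual interview-level distinction between the two.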

Certification

  • AWS Certified Solutions Architect - Associate
