
Kalyan Veluri

Ashburn, VA

Summary

  • Results-oriented Cloud Data Engineer/Scientist with a stakeholder focus, delivering robust, tailored solutions for Cloud Migrations, Data Lakes, Machine Learning Models, and BI applications across Marketing, Telecom, and Gaming clients
  • In-depth technical and business knowledge from 6 years of progressive professional experience in IT and Data Engineering (structured and unstructured/Big Data) delivery consulting
  • Built ground-up Data Lake, Data Warehousing, and Machine Learning solutions leveraging various frameworks (Hadoop, Spark, Docker, PyTorch, TensorFlow), cloud services (AWS S3, Data Pipeline, Glue, EMR, Athena, Lambda, and others), and programming languages (SQL, Python, Shell Scripting, Java)
  • Extensive knowledge of Amazon Web Services (AWS) EC2, S3, Elastic MapReduce (EMR), Redshift, and Identity and Access Management (IAM)
  • Designed and developed solutions for On-Premise Enterprise Data Warehouse and Business Intelligence Solutions leveraging databases (Oracle, SQL Server, PostgreSQL) and tools (Informatica, Power BI, Tableau, and others)
  • Proficient in SQL and PL/SQL programming, including triggers, stored procedures, functions, and packages, for application development
  • Experience with cloud databases and data warehouses (Confidential Redshift/RDS)
  • Hands-on experience with AWS cloud services (VPC, EC2, S3, RDS, Redshift, Data Pipeline, EMR, SNS, SQS)
  • Developed Spark code using Python/Scala and Spark-SQL for faster testing and processing of data.
  • Experience developing and maintaining applications built on Amazon S3, AWS Elastic MapReduce, and AWS CloudWatch
  • Experience writing downstream and upstream pipelines using Python
  • Good exposure to automating ETL processes using Python and shell scripts
  • Adept with Agile/Scrum and SDLC methodologies

Overview

6 years of professional experience
1 Certification

Work History

Data Analytics Engineer

WB Games
07.2022 - Current
  • Designed and developed ETL processes in AWS Glue to source disparate gaming datasets from external sources and ingest them into the Data Lake and the AWS Redshift data warehouse
  • Created, debugged, and optimized ETL data pipelines using Apache Airflow to load data into Data Lake stores (Landing/Raw, Integration, and Curated) and into Redshift for data analytics workloads
  • Stored and retrieved data from data warehouses using Amazon Redshift
  • Scheduled jobs using Airflow with Python scripts
  • Added tasks to DAGs and defined dependencies between tasks
  • Wrote PostgreSQL stored procedures, functions, packages, and triggers to implement business rules at the application level
  • Configured Spark Streaming to receive real-time data from Kafka and persist the streaming data to AWS S3
  • Collaborated with BI analysts and DBA in curating gameplay data into intuitive and efficient assets that drive insights
  • Created data visualization dashboards and reports through Looker
  • Designed and implemented ETL/Data Pipelines to capture reports into Redshift using Looker API to pull KPIs
  • Leveraged JIRA for Scrum and project tracking, Git/GitHub for source code control, Jenkins for CI/CD, and Confluence for documentation
  • Participated in code reviews with peers to ensure proper test coverage and consistent code standards
  • Collaborated with internal and external stakeholders to optimize data sourcing pipelines and verify data quality
  • Worked within Agile software lifecycle methodologies to deliver high-quality work in bi-weekly iterations; created design documents as required and performed coding, debugging, and testing

Data Engineer

Kroger
01.2022 - 05.2022
  • Engineered Programmatic Advertising Data and API integration solutions as part of the self-service SPMP (Smart Private Market Place) for campaigning and measuring media investments across Kroger inventory channels
  • Designed and implemented ETL/Data Pipelines to ingest disparate datasets from external SSP/DSP partners into Data Lake hosted in AWS cloud
  • Orchestrated ETL/Data Pipelines using AWS SNS and Lambda functions to automate campaign/deal configuration steps like invoke API calls, trigger AWS ECS tasks, Athena queries for sharing deals data between Audigent and SSP partners and impression/measurement data with Kroger client
  • Developed and optimized ETL Data Pipelines using technologies like AWS Data Pipelines, Airflow, Unix Shell Scripts, Python modules, Athena queries, AWS Lambda Functions, ECS tasks, JSON config, AWS DMS, Marketplace Qlik Attunity Replicator, and others
  • Migrated API, storage, and query engines from AWS Beanstalk, S3, and Athena to their Azure counterparts: API Services, Storage Account, and Synapse Analytics
  • Created pods and managed them using Kubernetes, using Jenkins pipelines to push all microservice builds to the Docker registry and then deploy to Kubernetes
  • Designed compliance frameworks for multi-site data warehousing efforts to verify conformity with state and federal data security guidelines

Big Data Engineer / Data Scientist

CENTREPOINT INFORMATICS PVT LTD.
05.2018 - 04.2021
  • Developed and optimized Spark Python (PySpark) ETL jobs using EMR/Glue to ingest disparate external data sources into the Data Lake and store transformed data in Redshift, speeding up complex analytical query workloads
  • Designed and deployed high performance data pipelines for Data Lake and Analytical applications
  • Familiar with Hadoop file system, AWS S3 storage, and big data formats such as Parquet, Avro, and JSON
  • Created AWS Athena tables and queries for ad hoc data analysis
  • Designed, developed, and tested ETL mappings and workflows using Informatica and analyzed systems for accuracy
  • Monitored and adjusted end-to-end Informatica workflows to enhance productivity
  • Developed Informatica PowerCenter mappings and workflows to convert legacy SSIS ETL
  • Executed data analysis and data visualization on survey data using Tableau Desktop, and compared respondents' demographic data with univariate analysis using Python (Pandas, NumPy, Seaborn, Sklearn, and Matplotlib)
  • Automated Informatica processes to update status tables after maps run successfully
  • Worked on Tableau to build customized interactive reports, worksheets, and dashboards
  • Reviewed basic SQL queries and edited inner, left, & right joins in Tableau by connecting live/dynamic and static datasets
  • Used Git version control to share project updates with team members
  • Worked in an Agile methodology to ensure delivery of high-quality work in monthly iterations
  • Performed all necessary day-to-day Git support for different projects; responsible for maintaining Git repositories and access control strategies

SQL Developer Intern

Eduquity Career Technologies Pvt Ltd
10.2017 - 04.2018
  • Worked in an Agile environment to ensure delivery of high-quality work
  • Designed database schemas and ensured their stability, reliability, and performance
  • Wrote and optimized complex SQL queries in SQL Server 2012
  • Applied RDBMS concepts including views, triggers, stored procedures, and indexes
  • Assisted in developing and implementing new technologies
  • Created and improved existing reports and data analytics visualizations using Power BI.

Education

Master's - Computer Science

Cleveland State University
Cleveland, OH
12.2022

Bachelor's - Computer Science

GITAM
Bangalore, India
06.2018

Skills

  • Big Data - Hadoop, Spark, Hive, NoSQL (DynamoDB), etc
  • AWS Services - S3, EC2, Glue, Data Pipeline, Lambda, SageMaker, CloudWatch, CloudTrail, SNS, SQS, Redshift, EMR, etc
  • Oracle, SQL Server, MySQL
  • ETL - Informatica
  • Power BI, Tableau, Looker
  • Python, SQL, Shell Scripting, Hive/Spark SQL, PySpark, Java
  • GitHub, JIRA, JSON, CI/CD (Jenkins), Apache Airflow, Jupyter, Agile/Scrum, Kafka
  • Cloud and On-Premise Data Engineering
  • ETL, Data Warehouse and Data Lake solutions
  • Big Data Integration and Cloud Migration
  • Data Analytics and Data Science
  • Data Pipelines/Workflows and ML Pipelines
  • Agile/Scrum and Waterfall Project Lifecycles

Projects

American Sign Language 

  • Implemented real-time American Sign Language detection using a convolutional neural network.
  • Used the ASL letter hand-gesture dataset from Kaggle; trained with the ReLU activation function and the Adam optimizer.
  • Used OpenCV (cv2) to capture frames from a laptop webcam and run them through the model to predict the class of each frame.


Facial Expression Recognition Challenge

  • Predicted facial expressions using data from a CSV file.
  • Retrieved the Facial Expression dataset from Kaggle via the Kaggle API and saved it to Google Drive.
  • Trained and tested SVM, Decision Tree, and KNN classification algorithms on the dataset, computing the F1 score for each model.
  • Performed a grid search to find the best hyperparameters for each algorithm and displayed each model's F1 score using data visualization.


Business Intelligence and Visualization

  • Built a business analytics data mining model using the Microsoft BI data mining tool and OLAP cubes.
  • Used the pre-built AdventureWorks database to design and build data warehouse cubes for the BI project.
  • Used Visual Studio SSDT to build and deploy the cube to SQL Server, and Microsoft SQL Server Management Studio to write MDX queries to retrieve data.
  • Visualized the data retrieved from each MDX query and from the data mining results.


Web Log Parser (Python, Flask, AWS)

  • Created a Python web application in which the parser reads time-log data files and determines the time each author spent on each file.
  • Used AWS as the cloud platform and Flask to develop the web application.


Intentionally vulnerable web-application 

  • A web application developed with PHP, MySQL, and JavaScript that is made vulnerable intentionally.
  • Many students with attacking skills have nowhere to test or practice them.
  • This project provides a platform to practice those skills legally and builds a better understanding of securing web applications.

Certification

AWS Certified Data Analytics – Specialty
