Hi, I’m

Vivek Kotha

Data Engineer

Prosper, TX

It's fine to celebrate success but it is more important to heed the lessons of failure.

Bill Gates

Summary

Data Engineer with 7+ years of experience designing scalable cloud-native data platforms using PySpark, Snowflake, and AWS. Strong expertise in dimensional modeling, performance optimization, distributed data processing, and AI-enabled data platforms. Experienced in delivering end-to-end data solutions integrated with modern front-end applications.

Work History

FINRA
Rockville, MD

Data Engineer

08.2024 - Current

Job overview

Developed data-driven front-end applications using React.js and Angular, integrating REST APIs with Snowflake-backed data services to deliver real-time analytics dashboards.
Designed a cloud data lake and analytics layer on AWS using S3, Glue Catalog, and Athena to enable governed storage and fast interactive queries. Integrated on-demand Athena querying directly into the UI and scheduled ETL with AWS Glue to keep datasets fresh and reliable.
Designed and implemented Snowflake dimensional data models (Star/Snowflake schemas) supporting analytics use cases across regulatory datasets, improving query performance by 35%.
Led Snowflake performance tuning initiatives using clustering keys, micro-partition pruning, query profile analysis, and warehouse scaling strategies.
Built scalable ELT pipelines using Snowflake, S3, and Glue, optimizing compute costs and reducing pipeline runtime by 40%.
Implemented secure data sharing and RBAC in Snowflake, enforcing governance and compliance standards for sensitive financial datasets.
Integrated Snowflake Cortex AI capabilities to enable AI-powered insights directly within the data warehouse environment.
Implemented data governance frameworks including encryption at rest/in transit, masking policies, secure views, and audit monitoring to ensure regulatory compliance.
Delivered end-to-end cloud-native data engineering solutions across AWS and Snowflake, from ingestion and modeling to API exposure and front-end integration.

Vista Applied Solutions Group
Herndon, VA

Data Engineer

06.2023 - 07.2024

Job overview

Designed, tested, validated, and implemented predictive models for pricing, operational efficiency, and fraud prevention using statistical analysis and data mining techniques. Demonstrated expertise in transforming raw data into actionable insights to drive profitable growth and support sound business decisions.
Used SAS, SQL, Python, and R for data analysis and modeling. Collaborated with cross-functional teams to improve data quality and model accuracy, contributing to the development and optimization of mathematical ratemaking models that met business and product line objectives.
Created detailed documentation of analytics projects and developed user-friendly dashboards to visualize cause-and-effect relationships. Presented complex analytical findings and recommendations to internal management to support decision-making.
Committed to professional growth within the field of predictive modeling, maintaining up-to-date knowledge of industry research, developments, and trends. Demonstrated ability to research and apply new programming and data mining techniques, working well under supervision and within team environments to support multiple projects simultaneously.

Amazon Web Services
Herndon, VA

SDE-I (Data Engineer)

09.2022 - 04.2023

Job overview

Built ETL pipelines for cleaning and preprocessing data using AWS Glue, Kafka, Databricks, PySpark, SQL, and ML techniques, and streamlined predictive modeling on product-based data.
Designed and implemented a real-time data pipeline for semi-structured data, ingesting 100 million+ raw records from various data sources using PySpark, Scala, SQL, and Pandas in Databricks.
Collaborated with cross-functional teams to identify and resolve data quality issues, improving overall data accuracy by 15-40%.
Modernized legacy machine learning models: implemented linear regression and decision trees and tuned them using Hyperopt with SparkTrials, reducing development time by 80% through parallelization.

University of South Florida
Tampa, FL

Teaching/Student Assistant

05.2021 - 05.2022

Job overview

Managed product data integration into multiple AWS storage services, including S3, Redshift, RDS, and DynamoDB, resulting in a 50% reduction in data retrieval time.
Performed in-depth SQL query analysis and implemented database normalization techniques, resulting in an improvement in system performance.
Worked on text mining and NLP components such as NLU and NLG for text analytics using NLTK. Implemented a generative model (Generative Adversarial Network) for data sampling in predictive analysis with neural networks.

TCS
Kolkata, India

Data Engineer

11.2018 - 01.2021

Job overview

Collaborated with a team of 2 data engineers to develop and implement a PySpark-based data ingestion pipeline, resulting in a 30% increase in processing speed.
Created Tableau dashboards to visualize and analyze large datasets, leading to an increase in data-driven decision-making across the organization.
Managed product data integration into multiple AWS storage services, including S3, Redshift, RDS, and DynamoDB, resulting in a 50% reduction in data retrieval time.
Performed in-depth SQL query analysis and implemented database normalization techniques, resulting in an improvement in system performance. Developed complex SQL transformations, window functions, and CTE-based logic to support scalable data marts and KPI reporting layers.
Engineered distributed PySpark data pipelines processing 500M+ records daily using optimized partitioning, broadcast joins, and memory tuning. Implemented Spark performance optimization strategies including caching, skew mitigation, and adaptive query execution.
Designed and implemented scalable ELT pipelines using PySpark and SQL to ingest structured and semi-structured data into Snowflake, improving data availability for analytics teams.
Built dimensional data models (Star and Snowflake schemas) to support enterprise reporting and advanced analytics use cases. Optimized Snowflake workloads by tuning virtual warehouses, implementing clustering keys, and analyzing query execution plans, reducing processing time by 30%.

Education

University of South Florida
Tampa, FL

Master of Science in Business Analytics and Information Systems

01.2021 - 08.2022

University Overview

GPA: 3.75/4.0

Amrita Vishwa Vidyapeetham
Bangalore, India

Bachelor of Technology in Electronics and Communication Engineering

06.2014 - 06.2018

Skills

Python
Scala
SQL
Linux/Unix
MS SQL
SSIS
MySQL
Snowflake
Apache Spark
Databricks
NoSQL
DBFS
Parquet
Avro
ORC
JSON
Hive
HBase
Presto
Zeppelin
Hue
Splunk
Flume
Git
GitHub
Jira
Docker
Tableau

Apache Airflow
Azure
Apache Kafka
Agile
Kinesis
SQS
S3
DynamoDB
Lambda
Glue
EMR
Athena
Redshift
TensorFlow
Scikit-Learn
Pandas
NumPy
SciPy
Seaborn
XGBoost
Linear Models
PCA
GLMs
T-SNE
NLTK
LLMs

Certification

AWS Certified Cloud Solutions Architect - Associate

Google Cloud Certified Professional Data Engineer

Technical Experience

Survival Analysis of Breast Cancer Patients using Data Mining

07.2021 - 09.2021

Evaluated early detection and recurrence risk using machine learning (Two-class Neural Networks and Two-class Decision Jungle). Estimated the marginal effects of predictors such as age, treatment, cell types, nodes, and diagnosis on survival and likelihood of recurrence.

National Park Service (Full-Stack Web Application)

01.2021 - 05.2021

Developed an MVC-style full-stack application that ingests real-time data from an external API and persists it to a backend data store, using JavaScript, C#, HTML, and CSS. Hosted the solution on Microsoft Azure, enabling reliable access and streamlined operations for end users.
