Hi, I’m

Raj Katta

(Raja Sekhar Katta)
New York, NY

Summary

Data leader with over 10 years of experience in data modeling, ETL management, and data quality monitoring. Proven track record of optimizing data architectures and improving data integrity across complex systems. Collaborates with cross-functional teams to improve data transfer and analysis processes. Passionate about implementing solutions with Snowflake, Airflow, DBT, and Elementary, and about training and mentoring to foster team development and knowledge sharing.

Overview

10 years of professional experience

Work History

Roots Automation
New York, NY

Director of Data Engineering
03.2025 - Current

Job overview

AI & Product Analytics

  • Architected embedded product analytics infrastructure enabling B2B customers to self-serve operational insights on their own product usage
  • Defined and standardized accuracy measurement methodology for Claude/OpenAI-powered document processing models, aligning engineering, product, and customer success behind a unified accuracy narrative
  • Designed low-confidence score detection analytics to surface model errors, improving customer trust and model iteration cycles
  • Delivered customer-facing operational metrics dashboards covering real-time product usage and AI model performance

Architecture & Leadership

  • Built entire data architecture from the ground up at an AI-native company leveraging Claude and OpenAI for insurance document processing
  • Modernized data stack by consolidating all data operations into a unified platform featuring Snowflake, Fivetran, and Omni

Team Building & Governance

  • Pioneered "Data University," a company-wide upskilling program driving a 3x increase in daily active users and a shift toward self-service analytics
  • Mentored cross-functional team members in DBT, SQL, and Git best practices
  • Established formal data governance via RFC processes and On-Call SLA runbooks

Environment: DBT (Data Build Tool), Snowflake, SQL, Python, Azure, Blob Storage, Event Hubs, Airflow, Elementary, Openflow

Thrive Market
New York, NY

Sr Data Architect
06.2022 - 03.2025

Job overview

Product & Experimentation Analytics

  • Designed and executed A/B test frameworks for new B2C product launches, delivering hit/miss analysis that directly informed go/no-go decisions
  • Conducted time series analyses on customer behavior to identify usage trends, enabling data-driven product feature prioritization

Data Architecture & Infrastructure

  • Optimized UTM validation pipelines and data refresh cadences, reducing processing time by 80% and generating significant infrastructure cost savings
  • Developed standardized EDW architecture with automated data integration processes, improving reliability and cross-system data management
  • Integrated a robust data quality framework into existing architecture, improving data integrity and reliability across teams

Cost & Efficiency

  • Implemented data retention optimization and query execution scheduling strategies, driving a projected $80–90K in annual cost savings
  • Designed scalable query architecture that significantly reduced processing times and improved resource utilization across multiple concurrent projects

Leadership & Team Building

  • Scaled the data team from 2 contractors to a lean, high-impact 4-person team over 2 years, establishing hiring processes and driving team culture from the ground up
  • Led end-to-end data initiatives as the primary analytics stakeholder, partnering with product and marketing teams to define KPIs and deliver insights

Environment: DBT (Data Build Tool), Snowflake, SQL, Python, AWS, Redshift, DMS, Airflow, Elementary

Sema4
New York

Sr Data Architect
03.2021 - 06.2022

Job overview

Key Achievements:

  • Automated Data Transformation Architecture: Established a robust pipeline using Airflow to automate over 20 manual processes, streamlining data transformation and enhancing workflow efficiency.
  • Modular Design for Cancer Data: Developed new modules to integrate various cancer types and test results from diverse health systems, improving data accessibility for analysis.
  • Standardization with OMOP Model: Transformed cancer data into the OMOP common data model, ensuring compatibility and acceptance across the healthcare industry for better data utilization.
  • SQL Query Optimization: Increased efficiency by optimizing SQL queries, significantly reducing the time required to execute data processing modules and improving overall performance.
  • Enhanced Data Analysis Capabilities: Provided structured and standardized data to support doctors and researchers in analyzing cancer metrics, facilitating more informed decision-making.
  • Improved Collaboration Across Systems: Designed an architecture that enhances collaboration and data sharing between healthcare systems, promoting comprehensive cancer research and insights.

Environment: SQL, Python, Airflow, Redshift, AWS services

ClickUp
San Diego

Lead Data Engineer
10.2020 - 03.2021

Job overview

Key Achievements:

  • Built Comprehensive Data Architecture: Developed the data architecture from the ground up based on diverse business use cases across the organization, ensuring alignment with company objectives.
  • Automated Data Transformation: Utilized Apache Airflow to schedule daily jobs, streamlining data transformation processes and enhancing operational efficiency.
  • Implemented Data Vault Modeling: Created data vault schemas that enable easy analysis by teams, promoting data accessibility and collaboration across ClickUp.
  • Integrated Diverse Data Sources: Successfully migrated large volumes of data from Redshift and RDS to Snowflake, consolidating data from eight sources into a common data lake with optimized schemas.

Environment: SQL, Python, Snowflake, Redshift, Postgres, Fivetran, RDS, AWS services

FreeWheel Media Inc
New York

Data Developer/Analyst Consultant
08.2018 - 06.2020

Job overview

Key Achievements:

  • Developed Visual Analytics Dashboards: Created Looker and QuickSight dashboards to effectively visualize business use cases, enhancing client engagement and insights.
  • Delivered Tailored Data Solutions: Designed and implemented solutions to meet client data requests using AWS resources such as Lambda, along with Databricks, and Datadog for monitoring.
  • Engineered Snowflake Data Architecture: Built a robust data architecture in Snowflake to support internal business intelligence analytics across FreeWheel.
  • Managed Client Relationships: Successfully managed two clients over eight months, providing customized visualizations and data solutions tailored to their specific requirements.

Environment: Databricks, SQL, Looker, QuickSight, Lambda, ECS, Spark, Datadog, Snowflake, Excel

FreeWheel Media Inc
New York

Big Data Developer
09.2017 - 07.2018

Job overview

Key Achievements:

  • Led Large-Scale Data Projects: Managed comprehensive data projects encompassing data modeling, ETL development, and data warehousing to support organizational goals.
  • Designed Cloud-Focused Data Architecture: Developed an efficient and scalable data architecture that is GDPR compliant, facilitating targeted customer analysis for analysts.
  • Implemented Robust Security Measures: Planned and executed security protocols to safeguard sensitive data, ensuring compliance with industry regulations.
  • Established Data Accuracy Verification: Developed processes to verify data accuracy, enhancing the reliability and integrity of analytics across the organization.

Environment: Go, Bash, SQL, Python, Airflow, AWS (EC2, Athena, S3, QuickSight), Azkaban

Bradley University
Peoria

Graduate Research Assistant
10.2016 - 12.2016

Job overview

  • Developed machine learning models to predict the riskiest node from identified metrics
  • Researched methods to predict and formulate the resilience of supply chain nodes based on various factors
  • Designed and developed risk metrics and tested their efficiency
  • Environment: C#, Python, Neural Network model, Confidence Factor model, Decision tree model

Education

Bradley University
Peoria

Master of Science in Computer Science

Sastra University
Tamil Nadu

Bachelor of Science in Electronics and Communication

Skills

  • Data Warehousing: Snowflake, Redshift, Athena, RDS, DynamoDB
  • Data Modeling & Transformation: DBT, Data Vault, Data Lake Architecture
  • Data Quality & Monitoring: Elementary, Metaplane
  • Cloud Services (AWS): EC2, IAM, Lambda, Glue, S3, QuickSight, DMS
  • ETL & Workflow Orchestration: Apache Airflow, Azkaban, Databricks
  • Data Visualization: Looker, Domo
  • Programming Languages: SQL, Python, Go, Bash, Scala, C, Presto, PostgreSQL, NoSQL

Projects

Ad Skipping Analytics

Objective: Analyzed ad receptivity metrics on traditional and addressable platforms.

Achievements:

  • Formed a team that won the FreeWheel Hackathon among 50 teams, earning a prize to attend AWS re:Invent.
  • Received visibility and input from various business leaders, resulting in a white paper and personal appreciation from the CEO.
  • Conducted comprehensive analysis of ad skippers using linear set-top box and digital data.
  • Established correlation between linear ad skipping habits and digital ad receptivity, and analyzed user ad viewing habits by vertical.
  • Technologies: Databricks, Scala, S3, Athena, Looker.

Volunteering

Minds Matter
Mentor & Team Lead | 9 years

  • Mentorship: Served as a mentor for three years, attending Saturday sessions to support my mentee, Britney, in her college journey, leading to her acceptance at Boston University and securing four scholarships.
  • Personal Growth: Witnessed significant growth in Britney over the years, fostering her academic and personal development.
  • Team Leadership: Currently in my sixth year as a Team Lead, managing eight mentees and sixteen mentors.
  • Mentor-Mentee Pairing: Analyze interests and personalities to effectively pair mentors with mentees, enhancing the mentorship experience.
  • Team Building: Organize bonding activities to strengthen relationships among mentors and mentees, promoting a supportive community.

Timeline

Director of Data Engineering

Roots Automation
03.2025 - Current

Sr Data Architect

Thrive Market
06.2022 - 03.2025

Sr Data Architect

Sema4
03.2021 - 06.2022

Lead Data Engineer

ClickUp
10.2020 - 03.2021

Data Developer/Analyst Consultant

FreeWheel Media Inc
08.2018 - 06.2020

Big Data Developer

FreeWheel Media Inc
09.2017 - 07.2018

Graduate Research Assistant

Bradley University
10.2016 - 12.2016

Sastra University

Bachelor of Science in Electronics and Communication

Bradley University

Master of Science in Computer Science