KIRAN CHOWDOJU

Hoffman Estates, IL

Summary

Seasoned Data Engineer/Architect with over a decade of experience developing end-to-end data solutions on cloud platforms. Specializes in modern data lakehouse architectures, data governance, and AI-driven analytics, with proficiency in tools such as PySpark. Demonstrated success in optimizing data performance through advanced techniques and managing data lakes on AWS and Azure. Holds multiple certifications in data engineering and cloud technologies, showcasing a commitment to continuous professional development.

Overview

13 years of professional experience
5 Certifications

Work History

Solutions Architect/Lead Data Engineer

Motorola Solutions
Chicago, IL
03.2020 - Current
  • Architected Alation as a central metadata hub, leading foundational work for an integrated data quality framework and enhancing data discovery for applications through Python-automated metadata enrichment.
  • Engineered optimal framework design for Big Data platforms (AWS Redshift, EMR, S3 data lakes), employing schema evaluation, partitioning, bucketing, and S3 optimizations to achieve significant performance gains (e.g., Redshift ETL >80% faster).
  • Defined load testing scope through rigorous UAT performance validation for Big Data Redshift deployments (SC&P cluster), ensuring stability and high performance for critical analytical capabilities.
  • Supported data scientists by delivering a reliable Alation data catalog and enabling Athena for ad-hoc querying on S3 data lakes, providing accurate lineage/metadata crucial for model test and validation groundwork.
  • Demonstrated strong ability to build new data pipelines end-to-end, leveraging AWS Glue for ETL, EMR with PySpark for complex transformations, and AWS Lambda for event-driven processing and automation.
  • Proactively identified existing data gaps and performance bottlenecks in AWS Redshift and Big Data systems (OMAR source analysis), rectifying schema inefficiencies to boost query performance and data integrity.
  • Delivered automated solutions using Python, Lambda, and AWS Glue to process, transform, and deliver enriched data to applications, significantly enhancing analytical capabilities and operational efficiency.
  • Architected and managed robust Big Data solutions on AWS, utilizing Redshift (petabyte-scale), EMR clusters (with Hive & PySpark), S3 data lakes (with Hudi/Iceberg/Delta understanding), EC2 (compute), and Athena for analytics.
  • Orchestrated and automated complex, event-driven data pipelines using AWS Step Functions and Amazon EventBridge, managing and deploying underlying infrastructure components reliably through Infrastructure as Code (IaC) practices.
  • Extensively used SNS notifications and CloudWatch alerts for robust logging, monitoring, and proactive issue resolution across all AWS data pipelines and services, ensuring high availability.
  • Primarily utilized Python (with Pandas and PySpark) as a core language for Big Data processing on EMR, ETL scripting in AWS Glue, and Lambda automation.
  • Implemented and optimized various data pipelines using AWS Lambda for serverless processing, integrated with S3 events and other native AWS technologies for efficient data flow and transformation.
  • Delivered end-to-end lifecycle management for databases including AWS Redshift (advanced tuning, upgrades), MySQL, and AWS DynamoDB, ensuring high performance and availability for diverse application needs.
  • Led multiple migration projects and cloud modernization initiatives (e.g., Redshift L0/SC&P workload isolations), showcasing capabilities directly applicable to transitioning on-premise big data platforms to AWS.
  • Championed cost optimization across AWS services by expertly provisioning and right-sizing EC2 instances, optimizing S3 storage, and refining Redshift/EMR configurations, alongside leading various impactful POCs (Redshift Serverless, Alation Cloud, Azure Fabric).

Data Engineer

Asurion
02.2018 - 03.2020

  • ATLAS ingestion is the process of moving data from on-premises (RDBMS) systems to the cloud.
  • Data can be streamed in real time or loaded in batches.
  • In the ATLAS framework, MS SQL Server data is ingested into the L2 layer of Amazon Redshift and finally into the L1 and L2 layers of Hive.
  • The data in Hive is consumed by business users.
  • Reports are generated with Spotfire on top of the Hive data to derive insights and support forecasting/budgeting.
  • Informatica PowerCenter is used to ingest the data into Redshift and Hive.
  • An Informatica workflow is developed per table; once the workflow is deployed to production, a data catch-up activity is performed to load the complete historical data into the cloud.
  • Once the historical data is caught up, ActiveBatch (a job scheduling tool) triggers the workflows on a daily basis.
  • Designed, developed, and tested ETL processes in the AWS environment.
  • Worked on data ingestion, encryption, and related tasks.
  • Converted Excel files to CSV using Python.
  • Developed Python scripts to load data from SharePoint into SQL Server and Aurora.
  • Developed Python scripts to load data from SQL Server into Oracle databases.
  • Worked on password encryption using Python scripts.
  • Worked on file ingestion.
  • Loaded flat files into S3, Hive, and Redshift.
  • Worked on ActiveBatch templates and event-based schedules in ActiveBatch.
  • Optimized and refactored existing ETL processes from the SQL Server environment to the AWS environment.
  • Worked on the AWS Redshift database: distribution keys, sort keys, and compression analysis.
  • Worked on unloading data into S3 buckets.
  • Worked on external Hive tables.
  • Hands-on experience processing data from AWS S3 buckets using Python modules such as s3fs and boto3.
  • Extensively worked with database modules such as SQLAlchemy and pyodbc to load/retrieve data from on-premises and cloud databases.
  • Worked on Python lists and dictionaries.
  • Worked on IICS scheduling using third-party tools.
  • Worked on AWS Lambda functions.

ETL Consultant

Honeywell Aerospace
Tempe, USA
03.2017 - 02.2018
  • Worked in sprint-based releases and participated in scrum meetings and daily stand-ups
  • Created Hive external tables and accessed data from S3 data lake
  • Worked on Hudi framework POC
  • Environment: Informatica 10.1, Informatica Cloud, Amazon Redshift, Hive, ActiveBatch, Hudi framework

Application Development Senior Analyst

Accenture Solutions Private
Hyderabad, India
12.2014 - 02.2017
  • Project Description: This project focuses on Life Insurance applications and the information required for issuing new Individual Life Insurance business
  • It assists various business units with system enhancements, maintenance, and support
  • It is the method by which Prudential writes and issues individual life insurance
  • Involved in gathering requirements from the business and converting them into technical documents
  • Extensively worked on Power Center Designer, Workflow Manager
  • Developed ETL technical specs, ETL execution plan, Test cases, Test scripts etc
  • Created ETL mappings and mapplets to extract data from Golden Gate, Stage area and load into EDW (Teradata 14)
  • Queried the target Teradata warehouse using BTEQ
  • Worked on Programming using PL/SQL, Stored Procedures, Functions, Packages, Database triggers for Oracle and SQL Server
  • Wrote Teradata SQL queries for joins and table modifications
  • Designed and developed table structures, stored procedures and functions to implement business rules
  • Coordinated daily scrum meetings, sprint planning, sprint reviews, and sprint retrospectives
  • Involved in various phases of SDLC from requirement gathering, analysis, design, development and testing to production
  • Developed Java classes conforming to J2EE design patterns such as Business Delegate, Service Locator, Session Façade, and Value Object; packaged per the J2EE specification and deployed in BEA WebLogic 6.x application server
  • Experienced with transformations such as Source Qualifier, Lookup, Expression, Update Strategy, Filter, Router, Joiner, and Sequence Generator
  • Used the Aggregator transformation to load the summarized data for Sales and Finance departments
  • Extensive experience using connected and unconnected lookups in several mapping using different caches like Static cache, Dynamic cache and Persistent cache
  • Designed, developed, implemented and maintained Informatica Power center
  • Worked on data cleansing, data matching, data conversion
  • Developed an error-handling process for cases where a fact record is submitted without a corresponding dimension record
  • Created multiple sessions for all the existing mappings
  • Scheduled Informatica Workflows/Sessions to execute on a timely basis and create backup for source files
  • Contribute in estimations for the project, identify and log project milestones and periodically report to management throughout the life of the project
  • Developed UNIX shell scripts to control the parameter files and copy dates from one parameter file to another
  • Created various resources (Informatica, Teradata, Erwin, and reporting) and loaded them into the Metadata Manager warehouse using Informatica
  • Company Overview: Liberty Mutual Insurance, Hyderabad, India

Software Engineer

Tata Consultancy Services
Hyderabad, Telangana
03.2012 - 12.2014
  • JPMC RFS Migration Wave3: The project is a migration project which involves migration of an existing Data warehouse maintained by Ab-Initio as ETL and DB2 as the primary database to Informatica as ETL and Teradata as the primary database
  • The migration is carried out with the intention of providing better analytical capabilities and a deeper drill-down of information for the end user, and to get a single view of the customer, i.e., a 360-degree view of the customer
  • JPMC uses mainframe as their enterprise platform
  • The core enterprise data is maintained in DB2 database
  • Initially performed reverse engineering and documented existing logic by analyzing Ab Initio graphs, translating them into technical specifications
  • As part of reverse engineering, discussed issues/complex code to be resolved, translated them into Informatica logic, and prepared ETL design documents
  • Experienced working with the team and lead developers; interfaced with business analysts, coordinated with management, and understood the end-user experience
  • Used Informatica Designer to create complex mappings using different transformations to move data to a Data Warehouse
  • Developed mappings in Informatica to load the data from various sources into the Data Warehouse using different transformations like Source Qualifier, Expression, Lookup, aggregate, Update Strategy and Joiner
  • Optimized the performance of the mappings by various tests on sources, targets and transformations
  • Scheduling the sessions to extract, transform and load data in to warehouse database on Business requirements using Control-M scheduling tool
  • Extracted flat files and mainframe files, transformed and loaded data into the landing area and then the staging area, followed by the integration and semantic layers of the Data Warehouse (Teradata), using Informatica mappings and complex transformations (Aggregator, Joiner, Lookup, Update Strategy, Source Qualifier, Filter, Router, and Expression)
  • Designed and developed complex Router, Sequence Generator, Ranking, Aggregator, Joiner, Lookup, Update Strategy transformations to load data identified by dimensions using Informatica ETL (Power Center) tool
  • Involved in creation and usage of stored procedures and Informatica PowerMart for data loads into the data mart on a weekly basis
  • Processed mainframe COBOL EBCDIC fixed-width files
  • Worked on COBOL copybooks
  • Exposure to all the steps of Software Development Life Cycle (SDLC)
  • Also worked in development environment using Agile Methodology
  • Kicked off the Informatica workflows by running the Control-M scheduler
  • Involved in creating Teradata utility scripts: TPT, FastLoad, MultiLoad, TPump, and FastExport
  • Extensively worked with UNIX shell (Korn shell) scripting to validate and verify the data loaded into the flat files
  • Environment: Informatica PowerCenter 9.1.1, Teradata 13, Ab Initio, DB2, Control-M, UNIX shell scripts

Education

Master of Technology - Power And Energy Systems

National Institute of Technology
India
05.2011

Skills

  • Azure AKS, Azure Functions
  • Data Lake, Delta Lake, OneLake
  • ADLS Gen2, Synapse, ADF, OneLake
  • Pipelines, ADF mounting, ADB
  • Databricks
  • Redshift, AWS Glue, Aurora MySQL
  • Alation
  • Informatica Metadata Manager
  • Ab Initio
  • Teradata, Oracle, SQL Server, DB2
  • Python, Shell, Pandas, PySpark
  • Apache Airflow, ActiveBatch, Tivoli TWS
  • Control-M, ESP Scheduler, S3, EC2
  • RDS, Lambda, AWS DMS, CloudWatch
  • Amazon RDS, Amazon Aurora, Tableau
  • Tableau Prep, Tableau RMS
  • Tableau Serverless, TabJolt, Data Download
  • Matillion ETL, Fivetran, Snowflake, Firebolt
  • Amazon Redshift, Redshift Serverless, Athena
  • AWS EMR, EMR Serverless, Spectrum
  • SQL, ETL, Business Intelligence
  • Data governance
  • Big data solutions
  • Athena, S3, Lambda, Glue, EMR, Kinesis, SNS, CloudWatch
  • Python, PySpark, Pandas
  • Terraform
  • S3, EMR, Glue, Lambda, Athena, Kinesis, EC2, SNS, SQS, CloudWatch
  • Redshift, DocumentDB, DynamoDB, MongoDB

Certification

  • 🏆 AWS Data Engineer - https://www.credly.com/badges/5ee1be92-3789-4729-a0a9-1c09047397c1/public_url
  • 🏆 Azure Data Fundamentals - https://learn.microsoft.com/en-us/users/kiranch-3844/credentials/159BC3CC037B794F?ref=https%3a%2f%2fwww.linkedin.com%2f
  • 🏆 Azure Fabric Engineer - https://learn.microsoft.com/en-us/users/kiranch-3844/credentials/45a28a0a70cf2cb3?ref=https%3A%2F%2Fwww.linkedin.com%2F
  • 🏆 Dremio Verified Lakehouse Associate - https://www.credly.com/badges/d57dce25-c6cd-419b-80be-87028f08bba0/linked_in?t=stz5xq
  • 🏆 Generative AI Fundamentals - https://credentials.databricks.com/6296f931-4651-40ca-b2f3-69f46bd0aea4#acc.jQ84wkEC

Timeline

Solutions Architect/Lead Data Engineer

Motorola Solutions
03.2020 - Current

Data Engineer

Asurion
02.2018 - 03.2020

ETL Consultant

Honeywell Aerospace
03.2017 - 02.2018

Application Development Senior Analyst

Accenture Solutions Private
12.2014 - 02.2017

Software Engineer

Tata Consultancy Services
03.2012 - 12.2014

Master of Technology - Power And Energy Systems

National Institute of Technology