Sai Kalyani Rachapalli

Charlotte, NC

Summary

  • Over 7 years of IT experience in Analysis, Design and Development of various business applications as an ETL developer, with major contributions in Informatica Power Center.
  • Of those 7 years, 3 years of experience integrating Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
  • Good understanding of HDFS (Hadoop Distributed File System) and its architecture.
  • Experience in Informatica Administration – Installations, Upgrades, Setup and maintenance.
  • Day-to-day code migration across DEV/SIT/UAT/PROD environments.
  • Extensive knowledge in Documenting, Developing and Implementing projects across various Domains.
  • Experienced in Kimball Methodology – Dimensional Modeling, SCD (Slowly Changing Dimension) Types 1, 2, 3 and 6, Data Marts and Data Warehouses.
  • Clear knowledge of OLTP and OLAP Systems, developing Database schemas – Star and Snowflake schemas (Dimensions and Facts), and E-R Modeling – Logical and Physical Modeling Concepts.
  • Experience in migrating other databases to Snowflake.
  • Participated in design meetings for creation of the Data Model and provided guidance on best data architecture practices.
  • Participated in the development, improvement and maintenance of Snowflake database applications.
  • Worked with Informatica Data Quality (IDQ) toolkit for analysis, data cleansing, data matching, data conversion, exception handling, reporting and monitoring capabilities.
  • Good knowledge of Informatica Data Explorer (IDE) and Informatica Data Quality (IDQ).
  • Hands on experience with Data Profiling/Data Quality using Informatica Developer, BDM and MDM toolset.
  • Extensive usage of Reusable and Non-Reusable Transformations- Expression, Aggregator, Filter, Router, Normalizer, Sorter, Lookup- Unconnected and Connected, Update strategy, Sequence generator, Stored procedure, Joiner.
  • Experienced in developing Complex Mappings, Mapplets (Re-usable Business Logic), Worklets, Tasks – Session, Command and Event Wait, Workflows and Batch Processes.
  • Profound knowledge in implementing different types of Caches for Lookups – Dynamic, Persistent and Static. Experience in creating Parameters and Variables.
  • Implemented Pre-Session and Post-Session shell scripting to run workflows using pmcmd command.
  • Extraction of data from Sources to Targets- Staging Area, Data Marts or Data Warehouses.
  • Extensive use of the Informatica Debugger in mappings and used Session Log Files to trace errors when loading into targets.
  • Experience in performance tuning of Informatica mappings and sessions to improve performance for large-volume projects. Also implemented performance tuning techniques in Oracle databases using hints, indexes and compression concepts.
  • Implemented Performance Tuning Techniques on Mappings, Sessions and Targets.
  • Hands-on experience in writing PL/SQL – Procedures, Cursors, Triggers and Packages.
  • Advanced experience with HDFS, MapReduce, Hive, HBase, ZooKeeper, Impala, Pig, Flume and Oozie.
  • Worked closely with Architects when designing Data Marts and Data Warehouses, with Business Analysts to understand business needs, and with other team members to complete tasks as scheduled.
  • Experience in providing Support to Integration team, Application Development team and to Production team.
  • Experienced in working with Tools- TOAD, SQL Developer and SQL Plus.
  • Extensive knowledge of Shell Scripting; experienced in using SED and AWK commands.
  • Highly motivated and goal-oriented individual with a strong background in SDLC Project Management and Resource Planning using AGILE methodologies.
  • Strong Analytical skills, Communication skills with Good Listening and Interpersonal Skills.

Overview

8 years of professional experience

Work History

ETL Developer

Volkswagen Group of America
04.2018 - Current

Description:

Volkswagen Group of America is one of the largest automakers in the industry. This project involved setting up a new Salesforce CRM platform to replace the legacy system called Listen, which had been in place for the past 30 years.


Responsibilities:

  • Responsible for Business Analysis and Requirements Collection.
  • Worked on Informatica Power Center tools- Designer, Repository Manager, Workflow Manager, and Workflow Monitor.
  • Parsed high-level design specification to simple ETL coding and mapping standards.
  • Designed and customized data models for the Data Warehouse, supporting data from multiple sources in real time.
  • Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse.
  • Created mapping documents to outline data flow from sources to targets.
  • Involved in Dimensional modeling (Star Schema) of the Data warehouse.
  • Extracted data from flat files and other RDBMS databases into the staging area and populated the Data Warehouse.
  • Implemented Informatica BDM mappings for extracting data from DWH to Data Lake.
  • Developed BDE mappings using Informatica Developer and created HDFS files in Hadoop system.
  • Design & development of BDM mappings in Hive mode for large volumes of INSERT/UPDATE.
  • Implemented SCD Type 1 mappings using BDE and loaded the data into Hadoop Hive tables using pushdown mode.
  • Wrote HiveQL queries to validate HDFS files and Hive table data to make sure the data meets requirements.
  • Worked on the workflow concept in Informatica BDM.
  • Maintained stored definitions, transformation rules and target definitions using Informatica Repository Manager.
  • Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop mappings in the Informatica Designer.
  • Extensively used workflow variables, mapping parameters and mapping variables.
  • Created mapplets to use them in different mappings.
  • Developed mappings to load into staging tables and then to Dimensions and Facts.
  • Used existing ETL standards to develop these mappings.
  • Worked on different tasks in Workflows such as Session, Event Raise, Event Wait, Decision, E-mail, Command, Worklet, Assignment and Timer, and scheduling of the workflow.
  • Created sessions and configured workflows to extract data from various sources, transform the data, and load it into the data warehouse.
  • Extensively worked on Facts and Slowly Changing Dimension (SCD) tables.
  • Extensively used SQL*Loader to load data from flat files to database tables in Oracle.
  • Modified existing mappings for enhancements of new business requirements.
  • Used Debugger to test the mappings and fixed the bugs.
  • Evaluated Snowflake design considerations for any change in the application.
  • Built the Logical and Physical data model for Snowflake as per the changes required.
  • Wrote UNIX shell scripts and pmcmd commands for FTP of files from the remote server and backup of the repository and folders.
  • Involved in Performance tuning at source, target, mappings, sessions, and system levels.
  • Prepared migration document to move the mappings from development to testing and then to production repositories.
  • Migrated the code into QA (Testing) and supported the QA team and UAT (User Acceptance Testing).
  • Created detailed design document for ETLs.
  • Created detailed Unit Test Document with all possible Test cases/Scripts.
  • Conducted code reviews of mappings developed by teammates before moving the code into QA.
  • Created Tivoli jobs and the scheduling process to run the jobs in the development and test environments.


Environment: Informatica BDM (Big Data Management) 10.5.2/10.4.1/10.2.1/9.6.1, Informatica Power Center 10.4.1/10.2.1/9.6.1/9.5.1, Snowflake, SQL Server 2008, Tivoli Workload Scheduler, Toad 14.0/12.10/10.1/8.5, Informatica PowerExchange, PuTTY, WinSCP

ETL Developer

Cancer Treatment Centers of America
01.2016 - 04.2018

Description:

Cancer Treatment Centers of America is a private, for-profit operator of cancer treatment hospitals and outpatient clinics in Arizona, Georgia, Illinois, Oklahoma and Pennsylvania which provide both conventional and alternative cancer treatments. The hospital domain includes the technical and information management functions of maintaining the provider data. The data concerns patient visits and observations kept in Allscripts electronic health records. There are also systems which maintain patients’ registration and billing information.


Responsibilities:

  • Gathering process information, business model and requirements.
  • Using Informatica Developer to create mappings and applications to process the data and store it in HDFS (Hadoop Distributed File System).
  • Creating high-level and low-level design documents along with unit test cases.
  • Working with Hue, Hive and Impala to query the data for analysis and Big Data management.
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
  • Involved in analysis of legacy data and preparing low level design to load the data to staging tables.
  • Creating mappings to format fields as per business unit requirements; the formatting involves conversion of raw-format data into meaningful data through parsing and formatting of various fields.
  • Loaded the aggregate data into a relational database for reporting, dashboarding and ad-hoc analyses.
  • Configured SQL database to store Hive metadata and loaded unstructured data into Hadoop File System (HDFS).
  • Responsible for QA of the Business Unit; generated various reports regarding the data as requested by them.
  • Developed and conducted unit testing of the Informatica mappings/worklets/workflows and scheduled them. Prepared the system test plan.
  • Involved in loading data from UNIX file system to HDFS. Automated the steps to load log files into Hive.
  • Involved in creating Hive tables, loading them with data and writing Hive queries which run internally as MapReduce jobs.
  • Used HUE to save Hive queries for each required report and to download the query results as CSV or Excel files.
  • Conducted analysis of currently running processes and documented them.
  • Used batch scripts at the session level to get files from remote locations.
  • Worked on Claims data in building Data warehouse.
  • Performed Informatica code tuning on a regular basis.


Environment: Informatica Power Center 9.6.1, IDQ Informatica Developer Tool 10.1.1, Oracle 12c, Windows PowerShell ISE, Windows batch scripting, Cloudera Manager 3.7.x, Cloudera Manager 4.7.x, CDH 4.6, Cloudera Manager 4.8.2, and Search 1.2

Education

Master of Science

Eastern Michigan University
Ypsilanti, MI
05.2015

Bachelor of Science

Acharya Nagarjuna University
Guntur
04.2012

Skills

Data Warehousing

Informatica Power Center (Designer, Repository Manager, Workflow Manager, Workflow Monitor), Informatica BDM, Informatica PowerExchange, Informatica Data Quality, Snowflake data validation

BI Tools

SAS, Business Objects

Hadoop/Big Data

Hadoop 0.20.2-cdh3u3, HDFS 0.20.2, MapReduce 0.20.2, HBase 0.90.4, Pig 0.8.1, Hive 0.7.1, Impala 1.2, Sqoop 1.3.0, Flume 0.9.4, Cassandra, Oozie 2.3.2, HUE 1.2.0.0, ZooKeeper 3.3.3, YARN, Cluster Build, MySQL, Datameer, R-Analytics, Cloudera Manager 3.7.x, Cloudera Manager 4.7.x, CDH 4.6, Cloudera Manager 4.8.2, and Search 1.2

Databases

Oracle 12c/11g/10g, DB2, MS SQL Server, Sybase, Teradata

Languages/Web

SQL, PL/SQL, SQL*Loader, UNIX Shell Scripting, MS-Excel, XML

GUI Tools

TOAD, SQL Plus, SQL Navigator, PuTTY, WinSCP, FileZilla

Modelling Tool

Erwin

Environment

HP-UX, AIX 4.5, Solaris 2.x, MS Windows, UNIX, Windows NT
