George Agustine James Ruban

San Antonio, TX

Summary

  • 15+ years of IT experience designing and leading Enterprise Data Warehouse, data mart, and BI solutions for the Banking, Finance, and Insurance domains
  • Extensive work on data extraction, transformation, and loading from sources such as Oracle, SQL Server, IBM DB2, MySQL, Netezza, Mainframe, and flat files using DataStage, Informatica, and UNIX shell scripting
  • 6+ years of experience with Hadoop/Big Data technologies including Hive, HBase, Pig, Sqoop, Spark, Kafka, and YARN
  • Strong hands-on experience and functional knowledge in information management, including data warehousing and business intelligence
  • Knowledge of best practices for process sequencing, data quality lifecycles, naming conventions, and version control
  • Practical understanding of data modeling concepts such as star-schema and snowflake-schema modeling and fact and dimension tables, with extensive implementation of Slowly Changing Dimensions (Type I and II) in dimension tables
  • Involved in the end-to-end SDLC of project development: gathering business requirements, analysis and design reviews, development, code walkthroughs, production implementation, and post-implementation validation
  • Strong knowledge of the data warehouse implementation process, from business requirements through logical modeling, physical database design, data sourcing, data transformation, data loading, and performance tuning
  • 2+ years of experience migrating data from on-premises databases such as Netezza and DB2 to cloud platforms such as Snowflake
  • Experience with version migrations (DataStage 8.5 to 9.1, DataStage 9.1 to 11.5), conversion of ETL tools from Microsoft SQL Server Integration Services (SSIS) to DataStage 11.5, and database migration
  • Established methods and procedures for tracking data quality, completeness, redundancy, and improvement
  • Involved in data lineage, data profiling, and data cleansing activities using IBM IGC and IBM Information Analyzer
  • Extensive experience automating reusable components using Python and UNIX scripts
  • Experience implementing Agile methodology using the Scrum process
  • Good leadership and customer-interaction skills with a proven track record of handling customers
  • Excellent communication and presentation skills, with the ability to quickly adapt to new environments and learn new technologies
  • About 11 years of experience working in the USA and in an onsite-offshore working model
  • Experience successfully leading technical teams of up to 15 developers and multiple projects simultaneously
  • Effective in cross-functional and global environments, managing multiple tasks and assignments concurrently
  • Experience training resources and actively coordinating with onsite business users for project implementation

Overview

25 years of professional experience

Work History

Tech Lead

United Services Automobile Association (USAA)
San Antonio, Texas
10.2017 - Current
  • Schedulers: Control M 7.5/8/9.0.20, Active Batch V9
  • Source Control Management: RTC, StarTeam, Git
  • Methodology: Agile, Waterfall
  • Key Achievements (Tools/Utilities Developed):
  • Re-integration Process – Built a reusable process to check and populate missing surrogate keys for all data assets
  • Hadoop Ingestion and Run-to-Run Control Framework – The framework is designed to be reusable and easily maintainable, and it collects key job and quality-control metadata during execution
  • This has put the Bank Reporting and Analytics space on a path to fulfill its data quality mission and enforced important standards for placing new data assets on Hadoop
  • Quality Framework for Data Streaming – Built reusable streaming controls to ensure the integrity of the data consumed
  • Insync Framework – Built reusable utilities to load, validate, and purge data from Netezza to Snowflake
  • ETL Framework Migration – Implemented the infrastructure framework to standardize data quality across data movement, transformation, and storage platforms
  • Led the team in migrating all ETL jobs (~26,000) across USAA to incorporate the framework
  • Project: Bank Credit Card Analytics – This project builds a General Purpose Mart (GPM) and Subject Area Marts (SAM) for various credit card products and supports the data warehousing applications running in the Bank domain
  • The scope also extends to performing proofs of concept on new technologies
  • Tools & Technologies: DataStage 11.5, UNIX Shell scripting, Python, Netezza, Control M Scheduler, Apache Nifi, Confluent Kafka, Snowflake, data build tool (dbt), StreamSets, Dremio, GIT
  • Responsibilities:
  • Working closely with business analysts in requirements gathering, reviewing business rules and identifying data sources
  • Developed data flows in Apache NiFi to consume Kafka topics for the credit card replatform
  • Built the Quality Framework for data streaming (a minimal consumer-side sketch follows this list)
  • Provide design guidance to team
  • Conduct design reviews and provide feedback
  • Document technical risks and responses
  • Worked on Proof of Concept Projects on Dremio and Streamsets
  • Built reusable Python utilities to extract data from the Netezza database
  • Built the Insync Framework load, validate, and purge utilities using Python and shell scripting (a reconciliation sketch follows this list)
  • Built StreamSets jobs to move data from on-premises systems to cloud S3 buckets
  • Built dbt models to pull data from the foundation layer (S3 buckets) into the integrated and informative layers
  • Integrated, built, and deployed code using Git flow and CI/CD pipelines
  • Conduct data capacity planning and life-cycle, duration, usage-requirement, and feasibility studies, among other tasks
  • Create strategies and plans for data security, backup, disaster recovery, business continuity, and archiving
  • Coordinated project development with the onsite team, the nearshore team in Mexico, and the offshore development team in India
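
The streaming quality controls themselves are internal to USAA; below is a minimal consumer-side sketch of the kind of integrity check described above, assuming the confluent-kafka Python client. The topic names, required fields, and dead-letter topic are hypothetical illustrations, not the actual framework.

```python
# Minimal sketch of a consumer-side streaming quality control.
# Assumes the confluent-kafka Python client; topic names, field names,
# and the dead-letter topic are hypothetical illustrations.
import json
from confluent_kafka import Consumer, Producer

REQUIRED_FIELDS = {"account_id", "txn_amount", "txn_ts"}  # assumed schema

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "cc-quality-check",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["creditcard.events"])  # hypothetical topic

def is_valid(record: dict) -> bool:
    """Basic integrity check: required fields present and non-null."""
    return REQUIRED_FIELDS.issubset(record) and all(
        record[f] is not None for f in REQUIRED_FIELDS
    )

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        try:
            record = json.loads(msg.value())
        except json.JSONDecodeError:
            record = None
        if record is None or not is_valid(record):
            # Route bad records to a dead-letter topic for review
            producer.produce("creditcard.events.dlq", msg.value())
        # Valid records would flow to the downstream sink here
finally:
    consumer.close()
    producer.flush()
```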
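Likewise, a minimal sketch of the kind of load validation the Insync utilities perform: reconciling row counts between Netezza (source) and Snowflake (target). It assumes the pyodbc and snowflake-connector-python packages; the DSN, credentials, and table names are placeholders, and a production framework would compare checksums as well.

```python
# Sketch of source/target row-count reconciliation after a load.
# Connection parameters and table names are placeholders.
import pyodbc
import snowflake.connector

def netezza_count(table: str) -> int:
    conn = pyodbc.connect("DSN=NZSQL")  # assumed ODBC DSN for Netezza
    try:
        cur = conn.cursor()
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]
    finally:
        conn.close()

def snowflake_count(table: str) -> int:
    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="...",
        warehouse="ETL_WH", database="EDW", schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]
    finally:
        conn.close()

def validate(table: str) -> bool:
    """Return True when source and target row counts match."""
    src, tgt = netezza_count(table), snowflake_count(table)
    print(f"{table}: source={src} target={tgt}")
    return src == tgt
```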

Senior Data Engineer

Whataburger LLC
San Antonio, Texas
03.2017 - 10.2017
  • Project 1: Talent ID 1.13 – The objective of the project is to identify the talent performance of "in-store" and "above-store" employees and improve the process
  • Project 2: Big Data Implementation – The objective of this effort is to manage the Big Data environment using the IBM BigInsights platform and showcase its capabilities to Whataburger business users
  • Project 3: Sentiment Analytics on Customer Data – This proof-of-concept project showcased the advantages of IBM BigInsights Text Analytics and R programming
  • Tools & Technologies: DataStage 11.5, SQL Server 2012, Netezza, IBM BigInsights 4.2, PuTTY, Active Batch V9, RStudio 3.3.3, Aginity Workbench, MicroStrategy, Apache Ambari 2.2.0
  • Responsibilities:
  • Worked with subject matter experts and project team to identify, define, collate, document and communicate the requirements
  • Performed source system data analysis to manage source-to-target data mapping
  • Worked closely with IBM in designing, capacity arrangement, cluster set up, performance fine-tuning, monitoring, structure planning, scaling and installing IBM BigInsights 4.1
  • Attended a 2-day session with IBM to understand the value-added services provided in IBM BigInsights
  • Conducted a walkthrough session on IBM BigInsights with users from every line of business at Whataburger
  • Created an automated process to manage, analyze, and clear unwanted Hadoop log files (a sketch follows this list)
  • Created extractors in BigInsights Text Analytics using specific rules which helped the users to extract structured data from unstructured and semi-structured text
  • Configured IBM BigR, which helped business users explore, transform, analyze, and model big data hosted in the BigInsights cluster using R programs
  • Automated the process of data archiving in the Hadoop environment
  • Automated the process of pulling data into the Hadoop environment using shell scripts and file-watcher programs
  • Utilized Sentiment Analysis extractor available in BigInsights Text Analytics to find the polarity of the customer feedback for a particular incident.
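
The log-cleanup automation referenced above is not published; the following is a minimal sketch of one way such a process can work, shelling out to `hdfs dfs` to list files under a log directory and removing those older than a retention window. The directory path and 30-day retention are assumptions.

```python
# Minimal sketch of automated Hadoop log cleanup: list files under a
# log directory with `hdfs dfs -ls` and remove those older than a
# retention window. The path and retention period are assumptions.
import subprocess
from datetime import datetime, timedelta

LOG_DIR = "/var/log/hadoop-apps"   # hypothetical HDFS log directory
RETENTION = timedelta(days=30)

out = subprocess.run(
    ["hdfs", "dfs", "-ls", LOG_DIR],
    capture_output=True, text=True, check=True,
).stdout

cutoff = datetime.now() - RETENTION
for line in out.splitlines():
    if line.startswith("d"):       # skip directories
        continue
    parts = line.split()
    # `hdfs dfs -ls` rows: perms, repl, owner, group, size, date, time, path
    if len(parts) < 8:
        continue
    modified = datetime.strptime(f"{parts[5]} {parts[6]}", "%Y-%m-%d %H:%M")
    if modified < cutoff:
        subprocess.run(["hdfs", "dfs", "-rm", "-skipTrash", parts[7]],
                       check=True)
```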

IPF MARKET AND WORKFORCE SOLUTION
08.2015 - 02.2017
  • The objective of "Integrated Planning and Forecasting – Market and Workforce Solution" is to establish a single strategic and integrated platform for planning and forecasting processes
  • The scope of this project is to leverage the new Oracle planning suite to establish the forecasting process
  • It delivers an infrastructure that includes the data staging design for statistical model results to be consumed by the Oracle forecasting solution
  • Project 2: ETL Framework Migration

ETL Lead

United State Automobile Association, USAA
San Antonio, Texas
04.2011 - 02.2017

05.2015 - 06.2016
  • The objective of "ETL Framework Migration" is to identify all ETL jobs that run without a framework or with the Java-based ETL Framework and convert them to the Python-based ETL Framework.

FASG Analytics
05.2011 - 07.2015
  • The objective of "FASG Analytics" is to build an Analytical Data Store (data mart) by extracting IMCO (Mutual Funds & Brokerage) and Life data from the host system
  • The scope of this project is to implement a mart that stores mutual fund and brokerage details as well as Health, Annuity, and Life information
  • This project utilizes data from the IMCO and HAL data sources and operationalizes the data load process within the FASG project environment
  • This project also provides a high-level reconciliation process to confirm the accuracy of the data for modeling
  • Tools & Technologies: Informatica 9.6, DataStage 9.1/8.5, IBM DB2, Oracle, Control M 8, RTC, PuTTY, Hadoop, IBM Information Governance Catalog, Netezza, IBM Information Analyzer
  • Responsibilities:
  • Working closely with business analysts in requirements gathering, reviewing business rules and identifying data sources
  • Developed DataStage parallel/sequence jobs and implemented Slowly Changing Dimensions (SCD) Type 1 and Type 2 on DataStage 9.1 (an illustrative sketch follows this list)
  • Worked on migrating all the parallel and sequence jobs from DataStage 8.5 to DataStage 9.1
  • Set up the Big SQL environment for data discovery on the Hadoop layer for the business team
  • Worked on the proposal for the ETL Framework Migration
  • Coordinated with offshore to identify whether each Control M script belonged to the "no ETL framework" category or the "Java-based ETL Framework" category
  • Documented the steps to convert the jobs to the "Unix-based ETL Framework" and allocated hours for each step
  • Prepared the timeline to convert all ETL jobs to the Unix-based ETL Framework for every line of business
  • Prepared the final presentation to showcase the entire effort to the clients
  • Upon initial approval from the Framework team, prepared the statement of work, followed up with the procurement team, and obtained final approval for the fixed-price project
  • Based on the timeline, interacted with each line of business and obtained their approval for conversion
  • Coordinated with offshore, implemented the changes, and tested the cycle with the help of each line of business's SME
  • Created RFCs and migrated the jobs to production
  • Prepared a 24/7 production support plan and monitored the cycle through the warranty period.
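
The SCD work above was built with DataStage's graphical stages, so there is no source code to quote; the sketch below illustrates the equivalent Type 2 logic in Python against a generic DB-API cursor (%s paramstyle assumed): expire the current dimension row when a tracked attribute changes, then insert a new open-ended version. The table and column names are hypothetical.

```python
# Illustrative sketch of SCD Type 2 logic (the actual implementation
# used DataStage stages; table and column names here are hypothetical).
# For each changed source row: expire the current dimension row, then
# insert a new version with an open-ended effective date.
import datetime

def apply_scd2(cursor, source_row):
    today = datetime.date.today()
    # Expire the currently active version if the tracked attribute changed
    cursor.execute(
        """
        UPDATE customer_dim
           SET effective_end_dt = %s, current_flag = 'N'
         WHERE customer_id = %s
           AND current_flag = 'Y'
           AND address <> %s
        """,
        (today, source_row["customer_id"], source_row["address"]),
    )
    if cursor.rowcount:  # a version was expired, so insert the new one
        cursor.execute(
            """
            INSERT INTO customer_dim
                (customer_id, address, effective_start_dt,
                 effective_end_dt, current_flag)
            VALUES (%s, %s, %s, DATE '9999-12-31', 'Y')
            """,
            (source_row["customer_id"], source_row["address"], today),
        )
```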

Bank Sandbox Application
06.2009 - 04.2011
  • The Bank Sandbox application holds all member debt-solutions data in a database for reporting and for tactical solutions that determine member capabilities and debt ratios
  • It also holds all members' credit details; reports generated from the sandbox data identify areas of improvement and make it easier for the business to maintain member relationships.

ETL Developer

United Services Automobile Association (USAA)
Chennai, India
03.2007 - 04.2011

12.2007 - 05.2009
  • The Market Performance Metrics System (MPMS) is part of the Reporting and Analysis Program
  • As part of this project, data from various lines of business was extracted and loaded into the Staging Data Store (SDS)
  • Data from various business systems and different vendors was FTPed to the UNIX server
  • The ETL tool Informatica was used to pull the raw data, consolidate multiple products, and then prepare a metric system on top of it
  • The purpose of this project is to build a data warehouse that implements strategic reporting and analytic capabilities to support decision-making for consumers of P&C data and replaces legacy reporting systems, enabling improved time to market of best-in-class products by providing integrated, consistent, timely, and accurate data
  • This would be a single source of information for actuaries, marketing, underwriters, and other users
  • The data is moved to the staging data store by the ETL tool (Informatica)
  • SDS tables are populated from source tables, flat files, and XML by running daily and weekly workflows
  • Reports are created using Crystal Reports
  • Tools & Technologies: Informatica 8.1.1/8.6, DataStage 8.1, IBM DB2, SQL Developer, Control M 7.5, StarTeam, PuTTY
  • Responsibilities:
  • Analyzed the business requirements and was involved in the Analysis, Design, Development, UAT, and Production phases for new modules and enhancements of the application
  • Coordinated between offshore and onsite teams
  • Prepared estimates, timelines for deliverables, and the project execution plan
  • Analyzed the data sources
  • Prepared source-to-target (S2T), analysis, and design documents
  • Created Informatica mappings for transformation of data elements and scheduled scripts through the Informatica scheduler
  • Developed workflows to run the mappings
  • Prepared and validated the product architecture and design model
  • Handled configuration and defect management using StarTeam
  • Guided the project team in preparing UTPs and UTRs for functionality testing
  • Involved in project and view creation in StarTeam while moving the code to UAT and runways
  • Involved in all configuration management activities until the code moved to production
  • Involved in release management of the application and production support for fixing issues
  • Prepared the job warranty list and provided back-out procedures.

Java Developer

SANLAM SA
Chennai, India
11.2006 - 02.2007
  • Contact Center Application (CCA) – The objective of the Contact Center Application is to create a platform for member service
  • The project deals with building an application to interact with customers about their insurance policies, either through phone calls or chat
  • Tools & Technologies: Rational Software Architect (RSA), StarTeam, DB2, Wicket Framework, Control Center

Education

Bachelor of Engineering - Electrical & Electronics

Sona College of Technology, Anna University
2006

Timeline

Tech Lead

United Services Automobile Association (USAA)
10.2017 - Current

Senior Data Engineer

Whataburger LLC
03.2017 - 10.2017

IPF MARKET AND WORKFORCE SOLUTION
08.2015 - 02.2017

05.2015 - 06.2016

FASG Analytics
05.2011 - 07.2015

ETL Lead

United Services Automobile Association (USAA)
04.2011 - 02.2017

Bank Sandbox Application
06.2009 - 04.2011

12.2007 - 05.2009

ETL Developer

United Services Automobile Association (USAA)
03.2007 - 04.2011

Java Developer

SANLAM SA
11.2006 - 02.2007

Bachelor of Engineering - Electrical & Electronics

Sona College of Technology, Anna University