Resourceful Technical Solution Architect with 17 years of experience leading and executing a broad range of enterprise cloud/hybrid initiatives across Snowflake, Azure, AWS, GCP, big data analytics, and BI solution enablement. Excels at providing scalable, flexible end-to-end solutions to complex business problems, with a proven ability to lead technical initiatives from strategy inception through full life-cycle development and to engage with stakeholders as well as sales teams.
Overview
17 years of professional experience
1 Certification
Work History
Technical Solution Architect
TFS (Toyota Financial Services)
03.2023 - Current
AORDP is an enterprise-scale data warehouse environment containing different lines of business from different regions, with SFC subject areas such as Lease, Auction, Collection, Activity, GL Risk, Lease Deferral, Loan Deferral, Application Data, and Lease EW Charges. It facilitates analytical decision making and provides an operational data store with a 360-degree view of relationships across lines of business, from which the risk platform accesses data
Developed strategies for warehouse implementation, data acquisition and access, and data archiving and recovery
Designed and built relational databases for data storage and processing
Evaluated new data sources for adherence to the organization's quality standards and ease of integration
Built data models and defined the structure, attributes, and nomenclature of data elements
Participated in migrating the existing application/warehouse from AWS Redshift to the Snowflake environment
Re-aligned the existing data model using Data Vault to handle changing attributes
Migrated scripts from Redshift SQL to Snowflake SQL
Loaded historical and incremental data into Snowflake using a Python framework (illustrative sketch below)
Re-aligned new ETL pipelines.
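A minimal sketch of the kind of Python-driven load the framework performed, assuming a hypothetical external stage (AORDP_EXT_STAGE) and table names; it illustrates the historical bulk load and the incremental COPY, not the production framework itself:

import os
import snowflake.connector

# Hypothetical connection details; a real framework would read these from its config.
conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ETL_WH",
    database="AORDP_DB",
    schema="LEASE",
)

try:
    with conn.cursor() as cur:
        # One-time historical load from files previously unloaded from Redshift to the stage.
        cur.execute("""
            COPY INTO LEASE_HISTORY
            FROM @AORDP_EXT_STAGE/lease/history/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """)
        # Incremental load: COPY tracks per-file load history, so only new files are ingested.
        cur.execute("""
            COPY INTO LEASE_INCREMENTAL
            FROM @AORDP_EXT_STAGE/lease/incremental/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """)
finally:
    conn.close()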
Technical Solution Architect
Dsilo
01.2023 - 03.2023
The scope of the project was to architect and offload solutions from multiple environments onto a hybrid cloud, with an EDW implementation across on-prem and cloud stacks providing a proper handshaking mechanism to derive better insights from the data
Developed strategies for warehouse implementation, data acquisition and access, and data archiving and recovery
Designed and built relational databases for data storage and processing
Evaluated new data sources for adherence to the organization's quality standards and ease of integration
Built data models and defined the structure, attributes, and nomenclature of data elements
Designed large-scale data architectures covering data dictionaries, data repositories, data governance, big data systems, system latency, data recovery, disaster recovery, publish and subscribe, service-oriented architecture, representational state transfer (REST), file transfer, data lakes, and chain of custody of data
Migrated data from multiple sources (AWS and Azure cloud environments) to Snowflake
Achieved the goal of maintaining a single version of truth
Staged API data from Kafka via StreamSets (in JSON format) into Snowflake, flattening it for the different functional services and transforming the rest for downstream consumption (see the sketch below)
Designed and provided oversight of manual review data.
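A minimal sketch of the flattening step referenced above, assuming hypothetical table and JSON field names (a kafka_raw_events table with a VARIANT column raw, and a target svc_events table); illustrative only:

import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ETL_WH",
    database="EDW",
    schema="RAW",
)

# Flatten the nested payload object into one row per key, keyed by service.
FLATTEN_SQL = """
INSERT INTO svc_events (event_id, service, payload_key, payload_value, event_ts)
SELECT
    raw:eventId::STRING,
    raw:service::STRING,
    f.key,
    f.value::STRING,
    raw:ts::TIMESTAMP_NTZ
FROM kafka_raw_events,
     LATERAL FLATTEN(INPUT => raw:payload) f
"""

with conn.cursor() as cur:
    cur.execute(FLATTEN_SQL)
conn.close()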
Cloud Solution Architect
CVS, Dell
08.2022 - 11.2022
The scope of the project was to architect and offload solutions from multiple environments onto a hybrid cloud, with an EDW implementation across on-prem and cloud stacks providing a proper handshaking mechanism to derive better insights from the data
Architecture and design reviews and assistance (Solutioning and Offloading projects/Process)
Rapidly understood and translated client business changes and concerns into solution-oriented approaches
Solutions included cloud-agnostic designs as well as optimized pipelines across multiple environments such as GCP, Teradata, Azure, and Snowflake
Built an evolution plan for the complete discovery phase, covering the data dictionary and lineage for effective enterprise reporting
Implemented a data audit/ingestion/archival framework, automated the entire process, and reduced manual effort
Cloud Solution Architect
Biorasi, Appstek CORP
03.2022 - 08.2022
Played an active role in high performance cloud data warehouse architecture and design
Developed complex data models in Snowflake/DBT to support analytics and self-service dashboards, delivering better insight and quicker results through clearer models
Designed and implemented effective analytics solutions and models on Snowflake using the DBT framework
Built Audit/Data Quality/S2T mapping using Matillion for end-to-end ETL pipelines
Developed simple to complex ETL pipelines in and out of the data warehouse using a combination of DBT, Snowflake's SnowSQL, and Python where appropriate
Participated in a collaborative team designing custom transformations and reusable components for Snowflake and DBT
Tuned and troubleshot Snowflake for performance and optimized utilization, complementing Snowflake's own internal mechanisms for these activities
Wrote SQL queries against Snowflake for ad-hoc and business users
Modeled, lifted, and shifted custom SQL/models and transposed them into DBT with slowly changing dimension (SCD) CDC for incremental tables/views, providing better insights across subject areas
Developed batch and real-time data processing layers
Ingested structured as well as unstructured/semi-structured data using batch and stream processing
Consolidated different pipelines to accommodate varying workloads
Developed stored procedures and advanced Python scripts for CDC (see the sketch below)
Oversaw the migration of data from legacy systems, prioritized by customer pain points around data access
Guided developers in preparing functional/technical specs to define bookkeeping
Designed data archival strategy.
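A minimal sketch of the Python-driven CDC step mentioned above, assuming hypothetical staging and target tables (stg_customer_changes, dim_customer) and an op flag column; it shows a type-1 style MERGE, not the actual stored procedures:

import os
import snowflake.connector

# Merge changed rows from the staging table into the target dimension (hypothetical names).
MERGE_SQL = """
MERGE INTO dim_customer t
USING stg_customer_changes s
  ON t.customer_id = s.customer_id
WHEN MATCHED AND s.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET
  t.customer_name = s.customer_name,
  t.segment       = s.segment,
  t.updated_at    = CURRENT_TIMESTAMP()
WHEN NOT MATCHED AND s.op <> 'D' THEN INSERT
  (customer_id, customer_name, segment, updated_at)
VALUES
  (s.customer_id, s.customer_name, s.segment, CURRENT_TIMESTAMP())
"""

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="TRANSFORM_WH",
    database="ANALYTICS",
    schema="MARTS",
)
with conn.cursor() as cur:
    cur.execute(MERGE_SQL)
    print(f"rows merged: {cur.rowcount}")
conn.close()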
Cloud Solution Architect
Biorasi, Appstek CORP
12.2021 - 05.2022
Worked on SnowSQL and Snowpipe
Migrated part of a SQL Server database into the Snowflake environment
Developed the data warehouse model in Snowflake
Involved in benchmarking Snowflake to identify the best way to use the existing cloud resources
Developed ETL/ELT workflows using Matillion for different workloads
Modeled, lifted, and shifted custom SQL/models and transposed them into DBT with slowly changing dimension (SCD) CDC for incremental tables/views, providing better insights across subject areas
Used the COPY command for bulk loads
Created pipes for continuous data loading and scheduled cron-based tasks in Snowflake (see the sketch below)
Created complex pipelines to capture and store data using the DBT tool
Created views to optimize data retrieval for end users
Implemented a data warehouse model organizing 1,000 data sets into different schemas, leveraging DBT to retrieve data easily and load it into permanent tables
Evaluated design considerations and kept the data warehouse optimized whenever the application changed.
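A minimal sketch of the continuous-loading and cron-scheduling setup described above, assuming hypothetical stage, table, task, and schedule names; illustrative only:

import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="LOAD_WH",
    database="EDW",
    schema="RAW",
)

statements = [
    # Continuous loading: Snowpipe picks up new files as they arrive in the stage.
    """
    CREATE PIPE IF NOT EXISTS orders_pipe AUTO_INGEST = TRUE AS
      COPY INTO raw_orders
      FROM @orders_ext_stage
      FILE_FORMAT = (TYPE = JSON)
    """,
    # Cron-style scheduling inside Snowflake: refresh a summary table every hour.
    """
    CREATE TASK IF NOT EXISTS refresh_order_summary
      WAREHOUSE = LOAD_WH
      SCHEDULE = 'USING CRON 0 * * * * UTC'
    AS
      INSERT OVERWRITE INTO order_summary
      SELECT order_date, COUNT(*) AS order_count FROM raw_orders GROUP BY order_date
    """,
    "ALTER TASK refresh_order_summary RESUME",
]

with conn.cursor() as cur:
    for stmt in statements:
        cur.execute(stmt)
conn.close()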
Technical architect/Lead Data engineer
Appstek CORP
04.2021 - 12.2021
Migrated part of the Teradata objects/SAP BW into a hybrid cloud environment as a Phase 1 POC
Processed data loads into BigQuery from Google Cloud Storage using Google Dataproc (see the sketch below)
Designed, developed, and delivered data integration, extraction, and migration to GCP
Scheduled end-to-end workflows using the GCP Cloud Composer service
Migrated ETL jobs to Google Cloud Platform
Maintained BigQuery datasets for reporting requirements
Used services such as Google BigQuery, Google Cloud Storage, Google Dataflow, Cloud SQL, Google Cloud Dataproc, Google Pub/Sub, Sqoop, and PySpark for data loading and ingestion
Built and maintained the data catalog in GCP
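A minimal sketch of the GCS-to-BigQuery load step described above, using the BigQuery Python client with hypothetical bucket, dataset, and table names (the production loads ran through Dataproc and Cloud Composer); illustrative only:

from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # hypothetical project

# Append Parquet files from a landing bucket into a reporting table.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://my-landing-bucket/sales/dt=2021-06-01/*.parquet",
    "my-gcp-project.reporting.sales_fact",
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete

table = client.get_table("my-gcp-project.reporting.sales_fact")
print(f"loaded table now has {table.num_rows} rows")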
Phase 2: Achieved a 90% improvement in the data quality check creation process by building a rule automation tool in Python
Developed ETL pipelines over S3 Parquet files in the data lake using AWS Glue
Performed data analytics on the Nahdi data lake using PySpark and Databricks (see the sketch below)
Used AWS components including EC2, S3, EMR, RDS, Athena, and Glue
Analyzed data quality issues through exploratory data analysis (EDA) using SQL, Python, and Pandas
Created automation scripts leveraging various Python libraries to perform accuracy checks across source and target systems/databases
Implemented Python scripts to generate heatmaps for issue and root-cause analysis of data quality report failures.
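A minimal sketch of the kind of PySpark data-quality pass run over the data lake, with a hypothetical S3 path, column names, and rules; illustrative only, not the production rule automation tool:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

# Hypothetical curated-zone path in the data lake.
df = spark.read.parquet("s3://datalake-curated/orders/")

# Simple rule set: each entry counts violating rows for one check.
checks = {
    "null_order_id": df.filter(F.col("order_id").isNull()).count(),
    "negative_amount": df.filter(F.col("amount") < 0).count(),
    "duplicate_order_id": df.count() - df.dropDuplicates(["order_id"]).count(),
}

for rule, violations in checks.items():
    status = "PASS" if violations == 0 else "FAIL"
    print(f"{rule}: {status} ({violations} violations)")

spark.stop()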
Solution Architect
Saudi Ministry of Tourism, Appstek CORP
02.2021 - 03.2021
Built a Hadoop platform that provides a highly scalable and available foundation with a 360-degree view of the citizen portfolio (ETL process and data storage), handling large volumes of data using big data technology
Delivered a solution supporting a cost-effective big data ecosystem combining BI and analytics
Loaded data (mini-batch/near real-time) and integrated it with the big data platform into a single-UI ecosystem
Visualization tool (reporting & analytics): enhanced the user experience with a single tool for analysis, reporting, and dashboarding
Enabled a seamless user experience through the visualization tool for big data
Querying interface: highly efficient querying interface supporting ad-hoc and regular business queries from business users to data scientists
KPIs & models: built KPIs and AI-based models providing insights to key stakeholders
Created user interfaces leveraged by end users to access the insights generated from data analysis
Designed, implemented, and maintained all AWS infrastructure and services within a managed service environment
Designed, deployed, and maintained enterprise-class security, network, and systems management applications within the AWS environment
Implemented process and quality improvements through task automation
Instituted infrastructure as code, security automation, and automation of routine maintenance tasks
Performed data migration from on-premises environments into AWS
Supported the business development lifecycle (business development, capture, solution architecture, pricing, and proposal development)
Strong knowledge of Amazon Kinesis, AWS Lambda, Amazon Simple Queue Service (Amazon SQS), Amazon Simple Notification Service (Amazon SNS), and Amazon Simple Workflow Service (Amazon SWF)
Strong knowledge with Web Services, API Gateways and application integration development and design
Solution Architect
Imdaad
04.2020 - 12.2020
Route optimization uses smart algorithms and logic to calculate the best routes between multiple locations across different geographies
It also considers factors such as vehicle diversity, delivery timing, the material types to be collected, and historical and live traffic information to calculate travel time and service duration for each trip/client/location
The route optimization solution delivers the most profitable route strategy for today's mobile workforce while supporting the company's sustainability and environmental goals in the region
Migrated applications from the internal data center to the cloud
Led Azure technical customer engagement, including architectural design sessions and implementation of projects using big data use cases, Hadoop-based design patterns, and real-time/stream analytics
Enabled end-to-end cloud data solutioning and data stream design with Hadoop, Spark, and Python
Designed and architected scalable data processing and analytics solutions, including technical feasibility, integration, and development for big data storage, processing, and consumption on Azure: analytics, big data (Hadoop, Spark), business intelligence (Reporting Services, Power BI), NoSQL, HDInsight, Stream Analytics, Azure Data Factory, Event Hubs, and Notification Hubs
Developed Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation across multiple formats, analyzing the data to uncover insights into customer usage patterns (see the sketch below)
Migrated the complete Postgres DB to Azure
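A minimal sketch of the Spark SQL extraction/aggregation pattern used in Databricks, with hypothetical mount paths, views, and columns; illustrative only:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("usage-aggregation").getOrCreate()

# Extract from two different formats and expose them to Spark SQL.
spark.read.json("/mnt/raw/events/").createOrReplaceTempView("raw_events")
spark.read.parquet("/mnt/raw/customers/").createOrReplaceTempView("customers")

# Aggregate monthly usage per customer segment.
usage = spark.sql("""
    SELECT c.segment,
           date_trunc('month', e.event_ts) AS month,
           COUNT(*)                        AS events,
           COUNT(DISTINCT e.customer_id)   AS active_customers
    FROM raw_events e
    JOIN customers c ON e.customer_id = c.customer_id
    GROUP BY c.segment, date_trunc('month', e.event_ts)
""")

# Persist the aggregate for downstream dashboards.
usage.write.mode("overwrite").parquet("/mnt/curated/customer_usage/")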
Data Architect
Virbac, Appstek CORP
12.2019 - 03.2020
Developed an understanding of the existing database model/schema loading for reporting purposes
Used Data Vault 2.0 to track data audits
Developed complex dashboards based on customer requirement
Created a proof of concept for an optimal data integration process, using Data Vault 2.0 to handle multiple source systems and frequently changing relationships
Created Tableau scorecards and dashboards using stacked bars, bar graphs, scatter plots, geographical maps, and Gantt charts via the Show Me functionality
Worked extensively with advanced analysis features: actions, calculations, parameters, background images, maps, trend lines, statistics, and log axes
Used groups, hierarchies, and sets to create detail-level summary reports and dashboards with KPIs
Created workbooks and Dashboards using calculated metrics for different visualization requirements
Drew upon the full range of Tableau platform technologies to design and implement concept solutions and create advanced BI visualizations.
Solution Architect
Hippo, Appstek CORP
09.2019 - 12.2019
Developed an understanding of the existing database model/schema loading for reporting purposes
Delivered Proof of Concepts for new Solutions on GCP Cloud
Ingested data using Cloud Pub/Sub and created pipelines in Cloud Dataflow
Extracted data from the data warehouse using Google BigQuery/SQL queries
Built and maintained data catalog in GCP
Converted SQL queries to DBT models using the DBT template language
Developed Python scripts to further validate, aggregate, and transform data per the business logic provided for each report
Automated report delivery over email/FTP/SFTP at defined frequencies based on configuration files (see the sketch below)
Ran report UAT and sign-off with the respective stakeholders
Automated the build and configuration of IaaS based solutions in Google Cloud Platform
Provided report deployment and maintenance instructions to the DevOps team.
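A minimal sketch of the config-driven report delivery over email, assuming a hypothetical report_config.json, BigQuery source query, and SMTP relay (FTP/SFTP delivery omitted); illustrative only:

import json
import smtplib
from email.message import EmailMessage

from google.cloud import bigquery

# Hypothetical config file: {"name": ..., "sql": ..., "recipients": [...]}
with open("report_config.json") as f:
    cfg = json.load(f)

# Pull the report data from BigQuery and render it as CSV.
rows = bigquery.Client().query(cfg["sql"]).to_dataframe()
csv_bytes = rows.to_csv(index=False).encode("utf-8")

msg = EmailMessage()
msg["Subject"] = f"Scheduled report: {cfg['name']}"
msg["From"] = "reports@example.com"
msg["To"] = ", ".join(cfg["recipients"])
msg.set_content("Please find the attached report.")
msg.add_attachment(csv_bytes, maintype="text", subtype="csv",
                   filename=f"{cfg['name']}.csv")

# Hypothetical SMTP relay for delivery.
with smtplib.SMTP("smtp.example.com", 587) as smtp:
    smtp.starttls()
    smtp.send_message(msg)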
Big Data Solution Architect
DGT, Teradata Corp
01.2018 - 04.2019
Ministry of Income Tax, Indonesia, government project (DGT): worked closely with business counterparts on defining the overall big data framework strategy and proposing the right solution for data management and cloud integration with the on-premises Hadoop cluster, effectively using the ingestion and data security frameworks
Led this project using a combination of on-premises Hadoop (Cloudera) components and Amazon EC2/S3 to collect unstructured data from heterogeneous data sources, using Hadoop to transform it into structured, meaningful data exposed for search, analytics, insights, and Tableau visualization
Revamped the complete solution by developing a framework to accommodate around 20,000 jobs at the overall enterprise level within a 14-hour window, where the existing process took 48 hours
Enabled innovation through continuous development and benchmarking of NoSQL data stores (Amazon DynamoDB, AWS Glue, MongoDB, and Cassandra), and implemented big data analytics with technologies such as Hadoop, Amazon EMR, Amazon Redshift, Amazon Kinesis, Spark, and Hive across different departments and work streams to support 20,000 jobs at the enterprise level as part of Phase 1
Rajasthan Govt, RISL Advanced Analytics: end-to-end implementation of big data analytics
Defined a next-gen big data architecture for standardized transaction collection, data enrichment, and optimized data access from heterogeneous sources (unstructured data, logs, machine-generated data), with data loads ranging from terabytes to petabytes of unstructured data
SAS Connector: developed a custom SAS connector between Hadoop and SAS, saving around 4 Cr versus buying a ready-made connector
NFS Mount: provided an innovative bi-directional NFS mount to HDFS to solve a critical space issue for several departments, leading to cost savings on SAN storage
Solr Search: implemented Solr content-based search over unstructured data on various file types such as readable PDFs and Word documents
Data Lake: created a large data lake with data from different departments, i.e., Transport, Commercial Tax, Excise, PDS, Grievance, Transaction, and Welfare
Big Data Digital Library: created and deployed a big data library for storage and search of different document types for the department as well as citizens
The data types include audio, video, documents, PDFs, social media content, blogs, and images.
Telenor
Teradata Corp
The scope of the project was to offload DPI solution onto the Hadoop Platform
Initially the EDW implementation processed two billion CDRs in two days and the business received reports twice a week; during the Hadoop implementation this grew to almost twenty billion CDRs per day, which we were able to load within a day while providing reports to the business daily
In the current implementation, source files are placed on the edge node and loaded into HDFS (Hadoop Distributed File System); all transformations and aggregations are performed using Pig, Hive, and MapReduce jobs
Aggregates for a single day are moved to the EDW via TDCH (Teradata Connector for Hadoop), and reports are refreshed on the moved data
Architecture and design reviews and assistance (DPI and Offload projects)
Wrote shell scripts to handle loading (see the sketch below)
Implemented a data audit/ingestion/archival framework, automated the entire process, and reduced manual effort
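A minimal sketch of the loading flow described above, orchestrated here from Python rather than the original shell scripts, with hypothetical paths and table names; the TDCH hand-off to the EDW is not shown:

import subprocess
from datetime import date

load_date = date.today().isoformat()
local_dir = f"/data/landing/cdr/{load_date}"   # files dropped on the edge node
hdfs_dir = f"/warehouse/cdr/raw/dt={load_date}"

# Stage the day's CDR files into HDFS.
subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
subprocess.run(["hdfs", "dfs", "-put", "-f", local_dir, hdfs_dir], check=True)

# Daily aggregation in Hive; the aggregate partition is later moved to the EDW via TDCH.
hive_sql = f"""
INSERT OVERWRITE TABLE cdr_daily_agg PARTITION (dt='{load_date}')
SELECT msisdn, COUNT(*) AS call_count, SUM(duration_sec) AS total_duration
FROM cdr_raw
WHERE dt = '{load_date}'
GROUP BY msisdn;
"""
subprocess.run(["hive", "-e", hive_sql], check=True)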
Professional services Consultant
Aircel India, Teradata Corp
10.2016 - 12.2016
Architecture and design reviews and assistance
Supported the sales team in RFPs
Big Data Solution Architect
Petronas, Malaysia, Teradata Corp
04.2016 - 10.2016
As part of Petronas, I was involved in the business-critical ingestion and data analytics/visualization project; my focus was developing innovative R&D technology through a hybrid data science team at Petronas, harnessing big data analytics with state-of-the-art advanced analytics, robotics, IoT, HPC, and cloud-based platform data to drive actionable insights in oil and gas; E&P efficiency: maximize ROI and optimize operations while reducing maintenance cost and identifying severe risks
Migrated applications from internal data center to AWS
Enabled end-to-end cloud data solutioning and data stream design with tools of the trade: Hadoop, Storm, Hive, Pig, AWS (EMR, Redshift, S3, etc.), and data lake design
Reports design reviews and assistance
Designed and implemented Complex Tableau reports
Worked with stakeholders on requirements engineering, shaping requirements around unarticulated needs with an executive presence
Solution Architect
Big Data
10.2015 - 03.2016
Big Data Solution Architect
Uninor, Teradata Corp
08.2015 - 10.2015
The scope of the project was to offload solution onto the Hadoop Platform
Initially the EDW implementation processed two billion CDRs in two days and the business received reports twice a week; during the Hadoop implementation this grew to almost twenty billion CDRs per day, which we were able to load within a day while providing reports to the business daily
Provided highly accurate estimation for a large PS engagement across multiple projects with Uninor, India
Offloaded huge data volumes to the cost-effective Hadoop platform and integrated DWH data into the Hadoop system
Implemented effective reports and dashboards and decommissioned unused KPIs
Big Data Solution Architect
Globe Telecom, Teradata Corp
07.2015 - 07.2015
Worked with Amdocs to implement Hadoop optimization/capacity management for high-volume CDR data ingestion pipelines
Performed cluster log-level analysis and optimized the entire process by rewriting it on HBase
Performed HBase I/O optimization and garbage collection tuning
Scheduled compaction to off-peak hours
Big Data Technical Architect
Wells Fargo Bank
12.2014 - 07.2015
Worked with the Volcker Rule issued by federal regulators (including the FDIC and the Federal Reserve Board), covering its reporting and record-keeping requirements using the Hadoop ecosystem with an end-to-end big data implementation
Developed an ingestion framework using Spring XD modules/processors/jobs
Covered compliance program requirements and any violations, and developed a large Hadoop-integrated warehouse for further analysis and reporting to the US Federal Reserve Board to certify that the data met the regulations
Wrote Hive scripts to extract Value at Risk (VaR) and stressed VaR (stress-VaR); an illustrative sketch of the calculation appears below
Wrote Pig scripts to validate risk factor sensitivities
Developed a Spring XD module to calculate risk and position limits & usage
Developed inventory aging for trades using Spring XD/Impala
Developed a Spring XD job/module to capture the customer-facing trade ratio (trade and value counts)
Developed a data framework that captures all enumeration entries in the trade life cycle
Performed real-time analytics at ingestion time, e.g., gathering metrics and counting values at all levels of ingestion
Developed a Spring XD project that aims to provide a one-stop-shop solution for many use cases
Reports design reviews and assistance
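The production extraction ran in Hive; the following Python sketch only illustrates the historical-simulation VaR and stressed-VaR calculation on a synthetic P&L series, with hypothetical confidence level and window sizes:

import numpy as np

rng = np.random.default_rng(seed=7)
daily_pnl = rng.normal(loc=0.0, scale=1_000_000, size=500)  # synthetic daily P&L, USD

confidence = 0.99
# Historical VaR: the loss threshold exceeded on only (1 - confidence) of past days.
var_99 = -np.percentile(daily_pnl, (1 - confidence) * 100)

# Stressed VaR uses the same calculation over a historical stress window;
# here we simply take the worst 250-day slice of the synthetic series as a stand-in.
windows = [daily_pnl[i:i + 250] for i in range(len(daily_pnl) - 250)]
stress_window = min(windows, key=lambda w: w.sum())
stressed_var_99 = -np.percentile(stress_window, (1 - confidence) * 100)

print(f"1-day 99% VaR:          {var_99:,.0f} USD")
print(f"1-day 99% stressed VaR: {stressed_var_99:,.0f} USD")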
Big Data Technical Architect
Smart Analytics, CISCO
12.2012 - 12.2013
Major contribution at Cisco was the successful implementation of the collaboration platform for big data analytics (Hadoop, Sqoop, Hive, Pig, R, SAS, Tableau, Pentaho Business Analytics), playing the role of a Solution Architect
Demonstrated the value of collaboration tools/solutions and of using data for analytics and visualization as the organization moved into the next generation of metrics reporting and benchmarking
The Collaboration Change Management (CCMS) service in the Advanced Services Collaboration Architecture Group is chartered to assist external customers with the adoption of Cisco's collaboration tools and products so that customers get full business value from them and buy more
Involved in prototyping complete phases to discover bottlenecks
Developed a Pentaho suite for ingestion where data validation and consolidation take place in a single job
Contributed significantly to stabilizing the Hadoop ecosystem across all configurations
As part of data ingestion/cleansing/staging, developed multiple Pig/Hive/Pentaho jobs to extract, transform, and load data into the Hadoop warehouse
Configured the Carte server for load balancing of heavy loads in distributed mode
Wrote multiple scripts for reconciliation/error handling/audit and integrated them into the ingestion framework; migrated and rewrote SAS dataset extraction logic into Hive/Pig/R
Big Data Technical Architect
Bank of America (CDM), USA, Infosys
12.2012 - 04.2013
Project to archive the last 10 years of data, which will be consumed by analytics users
For archiving historical data, the Hadoop Distributed File System (HDFS) is used: a distributed, scalable, and portable file system
The data to be archived is received from Core/Non-Core Tables of the Target Layer through Sqoop
Sqoop is a command-line interface application for transferring data between relational databases and Hadoop
The data in Hadoop is accessed through Hive to facilitate easy data summarization, ad-hoc queries and analysis of the data
Designed, built, and measured data ingestion and integration pipelines for large volumes of temporal data from different sources, including database extraction from Teradata (see the sketch below)
Designed, built, and measured complex ETL jobs to process disparate, dirty data sources into a high-integrity, high-quality, clean data asset
Supported and scaled the Vertica analytics database
Developed a plug-and-play framework to monitor and track data quality and data flow dynamics
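A minimal sketch of the Sqoop archival pull from Teradata into HDFS, with hypothetical JDBC URL, credentials, table, and target directory; illustrative only:

import subprocess

table = "CORE_ACCOUNT"  # hypothetical target-layer table
sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:teradata://teradata-prod/DATABASE=EDW_TARGET",
    "--username", "archive_user",
    "--password-file", "hdfs:///user/archive/.td_password",
    "--table", table,
    "--target-dir", f"/archive/edw/{table.lower()}",
    "--num-mappers", "8",
    "--as-parquetfile",
]
# The archived files are then exposed to analytics users through Hive external tables.
subprocess.run(sqoop_cmd, check=True)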
Research Analyst
Customer Insight & Retail Analytics Risk Management, Australia
09.2009 - 04.2012
Worked on the Retail Analytics data model as a Research Analyst for the retail analytics group, which is responsible for delivering a range of banking products and services to retail customers, while the commercial arm serves small to medium enterprises through to smaller corporates
The division also has a dedicated wealth management business designed to meet the needs of high-net-worth individuals
Specialized businesses, Esanda and Agribusiness, ensure the division is well placed to help customers enjoy more convenient banking while supporting local businesses to grow
Gathered business requirements from client services personnel
Communicated directly with clients to understand specific needs for building the data model
Provided internal users, analytics staff, and external clients with support for proprietary solutions/services, ad-hoc statistical programming, and ad-hoc reporting using OBIEE
Eliminated 40% of manual reporting effort by driving the creation of automated reports on the data cube and their distribution via an integrated sales workspace
Led several projects on cutting-edge BI technologies, Actuate and OBIEE
DWH Programmer
Aviva Customer Value Mgmt., USA, TCS
07.2008 - 01.2009
Transformation from the current fractured marketing process, which uses fragmented, brand-led data environments & independent teams, to a more integrated capability that is truly customer-centric and competitively advantageous
Investment in this capability is critical to profitable growth in the future
Designed, built, and measured data ingestion and integration pipelines for large volumes of temporal data from different sources, including database extraction from Teradata.
DWH programmer
Allianz
11.2006 - 05.2008
SAP Finance implementation (IBM): NOD and RAPID are both input facilities and reporting mechanisms
NOD is 100% manually populated by users
RAPID is about 50% pre-populated by a data management resource on the business side
The data manager cleanses and matches data from multiple systems, including OPTA, PCPEU Facility, Customer View and EPIC, and feeds the information into RAPID systems which are in Informatica and SAS 7
Education
Master Of Computer Applications - Computer Applications & Information Technology
Madras University
2005
BCA, Bachelor of Computer Applications - Computer Applications & Information Technology
S.V University
2002
Skills
Specialized in Design Specifications, Competitive Assessment, Business Solutions Prowess, Client Requirements, Design Improvements, Delivery Management Processes, Technology Solution Design, Systems Thinking, Migration Strategies, Customer Experience Improvement, Solution Implementation, Solution Prototypes, Managing Multiple Projects, Production Work, Business Solutions, Identify Requirements, Process Evaluations, Continuous Deployment, Client Engagement, Best Practices and Methodologies, Architectural Practices, Integration Architectures, Systems Design Analysis, Data Retention, Mission Critical Applications, Process Analysis, Database Performance, System Architecture Design, Business Intelligence Data Modeling, Solution Management, Collaboration Framework, Improvement Recommendations
Technical Skills
Azure, Data Factory, Databricks
Snowflake, Snowpipe, SnowSQL, Snowpark
Google BigQuery, Google Cloud Storage, Google Dataflow, Cloud SQL, Google Cloud Dataproc, Google Pub/Sub
Sourcing Manager at Cummins India, Pune (Payroll - TFS Global System Pvt. Ltd.)