Overall 8+ years of IT experience in the analysis, design, development, testing, and implementation of business application systems
Proficient in designing, developing, validating, and deploying ETL processes for data collection, data staging, data movement, data quality, and archiving strategies
Expertise in designing ETL jobs using various Talend components, creating context variables, joblets, and routines, and utilizing them in project-level jobs
Strong experience in sourcing data from S3 buckets and loading it into Redshift in an AWS environment
Skilled in scaling and performance tuning of Talend jobs for optimal efficiency
Familiar with Talend Administration Center (TAC) for job publication and scheduling using Job Conductor & Execution Plans
Experience in error handling and debugging Talend jobs to quickly identify and resolve issues
Conduct ETL unit and development tests, monitor results, and take corrective action as needed
Prepare detailed system documentation, including requirements, design specifications, test plans, and user manuals/run books
Experience developing and leading the end-to-end implementation of Big Data projects using Talend Big Data, with comprehensive experience in the Hadoop ecosystem, including MapReduce, Hadoop Distributed File System (HDFS), and Hive
Extensive experience in ETL methodology for data profiling, data migration, extraction, transformation, and loading using Talend; designed data conversions from a wide variety of source systems including Oracle, DB2, Netezza, SQL Server, Teradata, Hive, HANA, and non-relational sources such as flat files, XML, Mainframe files, and ActiveMQ
Good experience with relational database management systems, integrating data from sources such as Oracle, MS SQL Server, MySQL, and flat files
Hands-on experience with the Hadoop technology stack (HDFS, MapReduce, Hive, HBase, and Spark)
Excellent knowledge of the deployment process from DEV to QA, UAT, and PROD using both the Deployment Group and Import/Export methods
Excellent working experience with Waterfall and Agile methodologies
Familiar with the design and implementation of the data warehouse life cycle, with excellent knowledge of entity-relationship/multidimensional modeling (star schema, snowflake schema) and Slowly Changing Dimensions (SCD Type 1, Type 2, and Type 3); an illustrative SCD Type 2 sketch follows this summary
Debugged ETL job errors and performed ETL sanity checks and production deployments in the Talend Administration Center (TAC) using SVN
Experience in troubleshooting and implementing performance tuning at various levels, such as source, target, mapping, session, and system, in the ETL process
Experience in converting stored procedure logic into ETL requirements
Good communication and interpersonal skills, the ability to learn quickly, good analytical reasoning, and adaptability to new and challenging technological environments
Experience in Big Data technologies such as Hadoop/MapReduce, HBase, Hive, Sqoop, DynamoDB, Elasticsearch, and Spark SQL
Designed tables, indexes, and constraints using TOAD and loaded data into the database using SQL*Loader
Experience with AWS S3, EC2, EMR, Lambda, RDS (MySQL), and Redshift cluster configuration
Knowledge of data warehousing ETL using Informatica 9.x/8.x/7.x PowerCenter client tools (Mapping Designer, Repository Manager, Workflow Manager/Monitor) and server tools (Informatica Server, Repository Server Manager)
Experience using SQL*Plus and TOAD as interfaces to databases to analyze, view, and alter data
Expertise in Data Warehouse/Data Mart, ODS, OLTP, and OLAP implementations, spanning project scope, analysis, requirements gathering, data modeling, effort estimation, ETL design, development, system testing, implementation, and production support
Experience in dimensional modeling using star and snowflake schemas and in identifying facts and dimensions
Utilized AWS services (S3, EC2, EMR, Amazon Redshift)
Experience implementing Azure data solutions: provisioning storage accounts, Azure Data Factory, SQL Server, SQL Databases, SQL Data Warehouse, Azure Databricks, and Azure Cosmos DB
Implemented data movements from on-premises to the cloud in Azure
Extensive experience in developing stored procedures, functions, views, triggers, and complex SQL queries using SQL Server, T-SQL, and Oracle PL/SQL
Extensive experience in writing UNIX shell scripts and automating ETL processes using UNIX shell scripting, including using Netezza utilities to load data and execute SQL scripts from UNIX
Created interactive and visually compelling dashboards and reports using Power BI, enabling stakeholders to gain actionable insights from data
Designed and optimized data models in Power BI for efficient data visualization and analysis
Implemented row-level security and data-level security in Power BI to ensure data confidentiality and compliance with regulatory requirements
Conducted data analysis and performance tuning to optimize ETL processes and Power BI reports for performance and scalability
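To illustrate the SCD Type 2 handling mentioned above, here is a minimal Python/pandas sketch; the column names (is_current, start_date, end_date) and the key/attribute layout are hypothetical placeholders, not taken from any specific project, and pandas stands in for the warehouse tables an ETL tool would maintain.

```python
# Minimal SCD Type 2 sketch: expire changed rows, insert new versions.
# All column names are hypothetical.
from datetime import datetime
import pandas as pd

def apply_scd2(dim: pd.DataFrame, incoming: pd.DataFrame,
               key: str, tracked_cols: list) -> pd.DataFrame:
    """Return the dimension with Type 2 history applied for the incoming batch."""
    now = datetime.utcnow()
    current = dim[dim["is_current"]]
    merged = current.merge(incoming, on=key, suffixes=("_old", "_new"))

    # Rows whose tracked attributes differ need a new version.
    changed_mask = pd.Series(False, index=merged.index)
    for col in tracked_cols:
        changed_mask |= merged[f"{col}_old"] != merged[f"{col}_new"]
    changed_keys = merged.loc[changed_mask, key]

    # 1) Expire the current version of changed rows.
    expire = dim[key].isin(changed_keys) & dim["is_current"]
    dim.loc[expire, ["is_current", "end_date"]] = [False, now]

    # 2) Insert new versions for changed rows and brand-new keys.
    new_keys = set(incoming[key]) - set(dim[key])
    to_insert = incoming[incoming[key].isin(changed_keys) |
                         incoming[key].isin(new_keys)].copy()
    to_insert["is_current"] = True
    to_insert["start_date"] = now
    to_insert["end_date"] = pd.NaT
    return pd.concat([dim, to_insert], ignore_index=True)
```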
Overview
10 years of professional experience
Work History
Data Engineer
Walgreens (TCS)
Deerfield, IL
11.2021 - Current
Understood business use cases and integrations, and wrote business and technical requirements documents, logic diagrams, process flow charts, and other application-related documents
Used various transformations such as Source Qualifier, Aggregator, Lookup, Filter, Sequence Generator, Router, Update Strategy, Expression, Sorter, Normalizer, Stored Procedure, and Union
Provided project development estimates to the business and, upon agreement, delivered projects accordingly
Built Teradata ELT frameworks that ingest data from different sources using Teradata legacy load utilities
Maintained and supported the Teradata architectural environment for EDW applications
Involved in full lifecycle of projects, including requirement gathering, system designing, application development, enhancement, deployment, maintenance and support
Involved in logical modeling, physical database design, data sourcing and data transformation, data loading, SQL and performance tuning
Implemented complex SQL queries and stored procedures for data manipulation and reporting purposes
Worked on advanced Informatica concepts, including implementation of Informatica Pushdown Optimization and pipeline partitioning
Worked with Reporting developers to oversee the implementation of report/universe designs
Tuned performance of Informatica mappings and sessions for improving the process and making it efficient after eliminating bottlenecks
Worked with deployments from Dev to UAT, and then to Prod
Worked with Informatica Cloud for data integration between Salesforce, RightNow, Eloqua, and web services applications
Expertise in Informatica Cloud apps: Data Synchronization (DS), Data Replication (DR), Task Flows, and Mapping Configurations
Worked on a migration project that included migrating webMethods code to Informatica Cloud
Parsed complex files using Informatica Data Transformation and loaded them into the database
Developed SQL queries and scripts to extract, transform, and manipulate data from relational databases for reporting and analysis purposes
Collaborated with cross-functional teams to define data requirements, KPIs, and metrics for reporting and analytics projects
Automated data extraction, transformation, and loading (ETL) processes using scripting languages such as Python or PowerShell to improve efficiency and accuracy (illustrative sketch below)
Presented findings and insights to stakeholders through written reports, presentations, and data visualizations.
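The following is a minimal sketch of the kind of Python-driven ETL automation described above; the connection URLs, table names, and columns are hypothetical placeholders, and pandas/SQLAlchemy are one reasonable choice rather than the specific stack used on the project.

```python
# Hypothetical daily extract-transform-load helper; all identifiers are placeholders.
import pandas as pd
from sqlalchemy import create_engine

SOURCE_URL = "postgresql://etl_user:***@source-host:5432/sales"   # placeholder
TARGET_URL = "postgresql://etl_user:***@warehouse-host:5432/edw"  # placeholder

def run_daily_load(run_date: str) -> int:
    src = create_engine(SOURCE_URL)
    tgt = create_engine(TARGET_URL)

    # Extract: pull orders (a real job would push the date filter to the database).
    orders = pd.read_sql(
        "SELECT order_id, customer_id, amount, order_ts FROM orders",
        src, parse_dates=["order_ts"],
    )
    orders = orders[orders["order_ts"].dt.date == pd.Timestamp(run_date).date()]

    # Transform: basic cleansing and a derived audit column.
    orders = orders.dropna(subset=["order_id", "customer_id"])
    orders["amount"] = orders["amount"].round(2)
    orders["load_date"] = run_date

    # Load: append into a staging table in the warehouse.
    orders.to_sql("stg_orders", tgt, if_exists="append", index=False)
    return len(orders)

if __name__ == "__main__":
    print(f"Loaded {run_daily_load('2021-12-01')} rows")
```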
Cloud Data Engineer
Indiana Department of Homeland Security (IDHS)
Indianapolis, IN
04.2021 - 10.2021
Extensively used the Spark stack to develop preprocessing jobs, using the RDD, Dataset, and DataFrame APIs to transform data for upstream consumption
Developed real-time data processing applications in Scala and Python and implemented Apache Spark Streaming from various streaming sources such as Kafka, Flume, and JMS (illustrative sketch below)
Replaced existing MapReduce programs with Spark applications written in Scala
Built on-premises data pipelines using Kafka and Spark Streaming, fed by the API streaming gateway REST service
Developed Hive UDFs to handle data quality and create filtered datasets for further processing
Experienced in writing Sqoop scripts to import data into Hive/HDFS from RDBMS
Good knowledge of the Kafka Streams API for data transformation
Developed Oozie workflows for scheduling and orchestrating the ETL process
Used Talend tool to create workflows for processing data from multiple source systems
Created sample flows in Talend and StreamSets with custom-coded JARs and analyzed the performance of StreamSets and Kafka Streams
Designed, developed, and implemented next-generation cloud infrastructure at Confidential
Hands-on experience working with AWS services such as Lambda, Athena, DynamoDB, Step Functions, SNS, SQS, S3, and IAM
Developed internationalized multi-tenant SaaS solutions with responsive UIs using React or AngularJS, with Node.js and CSS
Handled index creation, forwarder and indexer management, Splunk Field Extractor (IFX), search head clustering, indexer clustering, and Splunk upgrades
Installed and configured Splunk clustered search heads and indexers, deployment servers, and deployers
Established monitoring and logging for data pipelines and databases, enabling proactive identification and resolution of issues
Optimized data storage and processing for performance and cost-effectiveness, resulting in improved system efficiency and reduced operational costs.
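The following is a minimal PySpark Structured Streaming sketch of the Kafka-to-Spark pattern described above; the broker address, topic, schema, and S3 paths are hypothetical placeholders, and the actual applications were built in Scala as well as Python.

```python
# Hypothetical Kafka -> Spark Structured Streaming preprocessing job.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-preprocess").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder broker
       .option("subscribe", "sensor-events")                # placeholder topic
       .load())

# Kafka delivers bytes; cast the value and parse the JSON payload.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*")
          .filter(col("reading").isNotNull()))

query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://example-bucket/events/")             # placeholder
         .option("checkpointLocation", "s3a://example-bucket/chk/")  # placeholder
         .start())
query.awaitTermination()
```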
Sr ETL Developer
Veterans Affairs
Austin, TX
08.2020 - 02.2021
Worked on Talend 7.1.1; analyzed source data to assess its quality using Talend Data Quality
Involved in writing SQL queries and used joins to access data from Oracle and MySQL
Solid experience in implementing complex business rules by creating reusable transformations and robust mappings using Talend components such as tConvertType, tSortRow, tReplace, tAggregateRow, and tUnite
Developed Talend jobs to populate the claims data to data warehouse - star schema
Utilized Big Data components such as tHDFSInput, tHDFSOutput, tHiveLoad, tHiveInput, tHBaseInput, and tHBaseOutput
Developed mappings to load Fact and Dimension tables, SCD Type 1 and SCD Type 2 dimensions, and incremental loads, and unit tested the mappings
Used tStatsCatcher, tDie, and tLogRow to create generic joblets that store processing stats in a database table to record job history
Created complex mappings with shared objects/Reusable Transformations/Mapplets for staging unstructured HL7 files into Data Vault
Used Data Vault for traditional batch loads and incremental build loads
Integrated Java code inside Talend Studio using components such as tJavaRow, tJava, tJavaFlex, and routines
Experienced in using Talend's debug mode to debug a job and fix errors
Created complex mappings using tHashOutput, tHashInput, tNormalize, tDenormalize, tMap, tUniqRow, tPivotToColumnsDelimited, etc.
Used tRunJob component to run child job from a parent job and to pass parameters from parent to child job
Created Context Variables and Groups to run Talend jobs against different environments
Used the tParallelize component and the multi-thread execution option to run subjobs in parallel, which increases job performance
Worked extensively in T-SQL for various needs of the transformations while loading the data into Data vault
Implemented a third-party scheduler (Automic Scheduler) to trigger jobs on the TAC server
Used AWS components (Amazon Web Services) to download and upload data files (with ETL) to AWS using S3 Talend components
Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 Parquet/text files into AWS Redshift (illustrative sketch below)
Created SSIS packages for loading data coming from various interfaces and used multiple transformations in SSIS to collect data from sources.
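Below is a condensed AWS Glue (Python) sketch of the S3-to-Redshift pattern described above; the S3 paths, Glue catalog connection name, database, and table names are hypothetical placeholders rather than actual project values.

```python
# Hypothetical Glue job: read Parquet campaign files from S3 and load Redshift.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_ctx = GlueContext(SparkContext.getOrCreate())
job = Job(glue_ctx)
job.init(args["JOB_NAME"], args)

# Source: Parquet files under a placeholder S3 prefix.
campaigns = glue_ctx.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-bucket/campaigns/"]},
    format="parquet",
)

# Target: Redshift via a pre-defined Glue catalog connection (placeholder names).
glue_ctx.write_dynamic_frame.from_jdbc_conf(
    frame=campaigns,
    catalog_connection="redshift-connection",
    connection_options={"dbtable": "stage.campaigns", "database": "edw"},
    redshift_tmp_dir="s3://example-bucket/tmp/",
)
job.commit()
```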
ETL Consultant
Infosys/Clarios
Milwaukee, WI
12.2019 - 06.2020
Worked on Talend Data Integration/Big Data Integration (6.1/5.x) / Talend Data Quality
Created Talend jobs to copy files from one server to another and utilized Talend FTP components
Created and managed source-to-target mapping documents for all Fact and Dimension tables
Used ETL methodologies and best practices to create Talend ETL jobs
Followed and enhanced programming and naming standards
Designed and implemented ETL for data loads from heterogeneous sources to SQL Server and Oracle as target databases, including Fact and Slowly Changing Dimension (SCD Type 1 and SCD Type 2) loads
Utilized Big Data components like tHDFSInput, tHDFSOutput, tPigLoad, tPigFilterRow, tPigFilterColumn, tPigStoreResult, tHiveLoad, tHiveInput, tHbaseInput, tHbaseOutput, tSqoopImport and tSqoopExport
Worked extensively with the Talend Admin Console (TAC) and scheduled jobs in Job Conductor
Developed Talend jobs to populate the claims data to data warehouse - star schema
Troubleshot long-running jobs and fixed the issues
Extensively used the tSAPBapi component to load and read data from the SAP system
Created jobs to pass parameters from child job to parent job
Developed data warehouse model in snowflake for over 100 datasets
Exported jobs to Nexus and SVN repository
Implemented an update strategy on tables and used tJava and tJavaRow components to read data from tables and pull only newly inserted data from source tables
Observed Talend job statistics in the AMC to improve performance and identify the scenarios in which errors occur
Implemented a few Java functionalities using the tJava and tJavaFlex components
Participated in loading data into the data warehouse using Talend Big Data (Hadoop) ETL components, AWS S3 buckets, and AWS services for the Redshift database
Designed Talend Big Data jobs to pick files from AWS S3 buckets and load them into the AWS Redshift database; as part of Redshift maintenance, ran VACUUM and ANALYZE on Redshift tables (illustrative sketch below)
Designed and developed SSIS packages to move data from various sources into destination flat files and databases
Worked on SSIS packages and DTS Import/Export for transferring data from databases (Oracle and text-format data) to SQL Server.
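Below is a minimal Python sketch of the S3-to-Redshift load and routine VACUUM/ANALYZE maintenance mentioned above; the cluster endpoint, credentials, IAM role, bucket, and table names are hypothetical placeholders, and psycopg2 stands in for whichever client actually drove the maintenance.

```python
# Hypothetical Redshift load + maintenance; all identifiers are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    port=5439, dbname="edw", user="etl_user", password="***",
)
conn.autocommit = True  # VACUUM cannot run inside an explicit transaction
cur = conn.cursor()

# COPY staged Parquet files from S3 into a staging table.
cur.execute("""
    COPY stage.sales
    FROM 's3://example-bucket/sales/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-copy'
    FORMAT AS PARQUET
""")

# Routine table maintenance after the load.
cur.execute("VACUUM stage.sales")
cur.execute("ANALYZE stage.sales")

cur.close()
conn.close()
```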
ETL Talend Developer
Simon Property Group
Indianapolis, IN
04.2017 - 11.2019
Developed ETL (data transformations and data movement) using Talend integration, SQL server
Supported production releases during migration and the hypercare period
Created Java routines, reusable transformations, and joblets using Talend as an ETL tool
Developed mappings to load Fact and Dimension tables, SCD Type 1 and SCD Type 2 dimensions, and incremental loads
Developed CDC (Change Data Capture) jobs in Talend to process inserts, updates, and deletes using routines (illustrative sketch below)
Refined and performed data migrations with SQL Server and complex Excel macros; created SQL stored procedures and functions for data cleansing, standardization, and the consolidation of business rules
Performed issue analysis and migration issue resolution through daily client calls; provided client support throughout the migration
Migrated Cast Iron (DataStage) jobs to Talend
Prepared documentation for support resources on new functionality and material for analysis of client issues
Notable accomplishments:
Designed and implemented a product data vault to extract, transform, and load product data with a history of changes to product attributes
Created new SQL stored procedures to improve the accuracy and speed of existing forms
Involved in the design and development of Talend mappings
Used the tSplitRow component to split rows and the tUniqRow component to remove duplicates
Experienced with structured query language variants; examples include MS SQL, DB2, etc.
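The following is a simplified Python sketch of the change-data-capture comparison logic described above (the actual implementation used Talend routines and components); the business key and row contents are hypothetical toy data.

```python
# Hypothetical CDC diff: classify source rows as inserts, updates, or deletes
# by comparing the current extract against the previously loaded snapshot.
from typing import Dict, List, Tuple

Row = Dict[str, object]

def classify_changes(previous: Dict[str, Row],
                     current: Dict[str, Row]) -> Tuple[List[Row], List[Row], List[str]]:
    """Return (inserts, updates, deleted_keys), keyed on the business key."""
    inserts = [row for key, row in current.items() if key not in previous]
    updates = [row for key, row in current.items()
               if key in previous and row != previous[key]]
    deletes = [key for key in previous if key not in current]
    return inserts, updates, deletes

# Usage with two toy snapshots keyed by product_id (placeholder data).
previous = {"P1": {"name": "Widget", "price": 9.99},
            "P2": {"name": "Gadget", "price": 4.50}}
current = {"P1": {"name": "Widget", "price": 10.49},  # changed -> update
           "P3": {"name": "Gizmo", "price": 2.00}}    # new -> insert; P2 -> delete
print(classify_changes(previous, current))
```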
SQL Developer
Usine Technologies
Hyderabad, INDIA
04.2014 - 06.2015
Created and managed schema objects such as Tables, Views, Indexes, and referential integrity depending on user requirements
Actively involved in the complete software development life cycle for the design of a database for a new financial accounting system
Successfully implemented the physical design of the newly designed database in MS SQL Server 2008/2005
Used MS SQL Server 2008/2005 to design, implement, and manage data warehouses, OLAP cubes, and reporting solutions to improve asset management, incident management, data center services, system event support, and billing
Utilized T-SQL daily to create custom views for data and business analysis
Utilized dynamic T-SQL within functions, stored procedures, views, and tables (illustrative sketch below)
Used the SQL Server Profiler tool to monitor the performance of SQL Server particularly to analyze the performance of the stored procedures
Optimized stored procedures and functions to handle major business-critical calculations
Implemented data collection and transformation between heterogeneous sources such as flat files, Excel, and SQL Server 2008/2005 using SSIS
Migrated all DTS packages to SQL Server Integration Services (SSIS) and modified the packages to use the advanced features of SSIS
Defined Check constraints, rules, indexes and views based on the business requirements
Extensively used SQL Server Reporting Services and Report Builder models to generate custom reports
Designed and deployed reports with drop-down menu options and linked reports
Created subscriptions to deliver daily reports, and managed and troubleshot report-server-related issues.
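To keep the examples in this document in one language, here is a small Python (pyodbc) sketch of the kind of parameterized T-SQL access and stored-procedure calls described in this role; the DSN, view, procedure, and parameter values are hypothetical placeholders, not objects from the actual system.

```python
# Hypothetical SQL Server access; connection string and object names are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=example-sql-host;DATABASE=FinanceDW;Trusted_Connection=yes;"
)
cur = conn.cursor()

# Parameterized query against a (placeholder) reporting view.
cur.execute(
    "SELECT AccountId, Balance FROM dbo.vw_AccountSummary WHERE AsOfDate = ?",
    "2014-12-31",
)
for account_id, balance in cur.fetchall():
    print(account_id, balance)

# Call a (placeholder) stored procedure with a parameter, using ODBC call syntax.
cur.execute("{CALL dbo.usp_RefreshBilling (?)}", "2014-12")
conn.commit()
conn.close()
```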
Data Scientist and Visualization (Tableau) Expert at Walgreens (TCS Project)