
Pragathi Macha

Herndon, USA

Summary

Experienced IT professional with 14 years of expertise in technology consulting, data warehousing, and cloud-based solutions across domains including FinTech, mortgage, oil & gas, marketing, and manufacturing. Proficient in AWS, GCP, and ETL tools, with a strong focus on architecture, implementation, and automation. AWS Certified Developer – Associate, with additional training in the AWS data specialty and GCP professional tracks. Skilled in project leadership, technical documentation, and stakeholder collaboration, bringing a results-driven, self-motivated approach to delivering effective solutions in fast-paced environments. Strong analytical, problem-solving, and communication skills, with a proven ability to manage multiple projects successfully.

Overview

14 years of professional experience
1 certification

Work History

Data Engineer / ETL Developer

PayPal Inc.
Herndon, USA
08.2022 - Current
  • Launched the global Buy Now Pay Later product for credit bureau reporting in 8 countries, supporting 5M+ users globally and delivering seamless credit integration for enhanced customer experiences
  • Engineered data integration and transformation processes to support PayPal Credit functionalities
  • Collaborated with PayPal Investors to launch Project Danube in 5 European countries, streamlining operations and delivering data-driven insights
  • Designed and implemented ETL pipelines processing 10 TB of data daily, enabling efficient business analytics
  • Optimized data pipelines, reducing runtime by 25% for faster analytics
  • Developed and maintained complex Extract, Transform, Load (ETL) processes to ensure data is accurately integrated from various sources into target systems, using tools such as Informatica, AWS/GCP services, and SQL across PostgreSQL, Oracle, and BigQuery
  • Tuned PySpark applications for optimal performance, identifying and resolving bottlenecks to enhance processing speed and resource utilization
  • Developed Python scripts for AWS S3 integration, achieving 99.9% data reliability
  • Combined data from different sources, ensuring consistency and quality throughout the data lifecycle
  • Wrote clean, efficient, and maintainable code and participated in code reviews to uphold coding standards and best practices
  • Identified and resolved issues in data processing workflows, ensuring reliable and accurate data outputs
  • Identified and communicated potential impacts of changes or issues on other areas
  • Created comprehensive documentation for developed solutions, including design specifications, code comments, and user guides
  • Optimized query and session performance, delivering correct objects to the client on time
  • Used Control-M and the DALM tool to create, schedule, and monitor jobs, with alerts sent on process failures
  • Automated workflows with Control M and GCP DALM, reducing manual intervention by 30%
  • Used tools such as Insomnia and Postman for efficient API development and regression testing
  • Delivered key Systems Development Life Cycle (SDLC) stages on time, in full, and to quality standards across all allocated work
  • Knowledgeable about the key features of major cloud service providers
  • Used continuous integration and delivery (CI/CD) pipelines to deploy applications
  • Created Python scripts to upload data to AWS S3 and GCP Cloud Storage (GCS) buckets using AWS Glue and PySpark
  • Wrote PySpark scripts to perform complex ETL processing such as filtering and transformation (see the sketch after this list)
  • Responsible for communication with Client Manager and Server Hosting/Operational Support Vendors in case of Production load failures
  • Understanding of application lifecycle management
  • Worked in Agile Methodology
  • Worked with product owners and other development team members (data engineers/ data scientists) to determine new features and user stories needed in new/revised applications or large/complex development projects
  • Participated in all team ceremonies including planning, grooming, product demonstration and team retrospectives
  • Advanced proficiency in testing
  • Able to explain production/technical concepts and analysis implications clearly to a wide audience, and to translate business objectives into actionable analyses
  • Superior business judgment: able to flex between big-picture thinking, understanding and distilling complex ideas, and analyzing data to drive strategic objectives
  • Supported data migration to the cloud (AWS/GCP), enabling scalability and cost efficiency
  • Participated in code reviews with peers and managers to ensure each increment adhered to the original vision described in the user story and to all standard resource libraries and architecture patterns as appropriate
  • Responded to trouble/support calls for production applications, making quick repairs to keep them running
  • Set up and configured a continuous integration environment
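
The following is a minimal, illustrative sketch of the kind of PySpark filter/transform job referenced in the bullets above. The bucket names, paths, and column layout (txn_id, amount, country) are hypothetical assumptions for illustration, not the actual PayPal pipeline.

```python
# Minimal PySpark ETL sketch: filter and transform records, then write to S3.
# Bucket names, paths, and the schema (txn_id, amount, country) are
# illustrative assumptions, not the actual production pipeline.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

raw = spark.read.json("s3://example-raw-bucket/transactions/")  # hypothetical source

cleaned = (
    raw.filter(F.col("amount") > 0)                      # drop invalid amounts
       .withColumn("amount_usd", F.col("amount") / 100)  # cents -> dollars
       .select("txn_id", "amount_usd", "country")
)

# Write partitioned Parquet back to a curated bucket for analytics.
cleaned.write.mode("overwrite").partitionBy("country").parquet(
    "s3://example-curated-bucket/transactions/"
)
```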

Data Engineer / ETL Developer

Hexaware Technologies Inc
Herndon, USA
01.2016 - 08.2022
  • Company Overview: Client – Fannie Mae
  • Implemented the Enterprise Data Integration and Information Governance programs, along with enhancement activities for the Credit Enhancement and Target State (CETS) module, by loading data from various sources into the Target Base State Layer
  • Migrated legacy systems to AWS Cloud, cutting costs by 35% and downtime by 50%
  • Developed PySpark scripts for real-time transformations, processing 100K+ transactions daily
  • Enhanced data quality with automated checks, increasing accuracy by 20%
  • Led a team of 5 engineers to build a cloud-based warehouse, supporting 10+ departments
  • Optimized ETL workflows, improving query execution by 40%
  • Collaborated with business customers and analysts to analyze business processes, procedures, and user requirements to establish system requirements and create data warehouse technology solutions aligned with business needs
  • Designed enterprise systems and processes for capturing, integrating, and distributing information effectively
  • Developed solutions and detailed specifications to transform agreed-upon requirements into functional systems
  • Identified and communicated potential impacts of changes or issues on other system areas and processes
  • Led system and application design efforts, including creating system and program documentation and performing ongoing maintenance
  • Created logical and physical data flow models for ETL applications to streamline data processing and integration
  • Developed ETL mappings, workflows, and sessions to extract data from various sources, apply necessary transformations, and load it into OLAP databases based on business requirements
  • Wrote and optimized complex SQL queries to ensure accurate and efficient data loading into databases
  • Hands-on experience writing SQL against databases including MySQL, Oracle, Netezza, and Redshift for effective data management and storage
  • Used Informatica shortcuts to reuse repository objects without creating duplicates, automatically inheriting changes made to the source objects
  • Created mappings to capture audit and reject statistics for the support team
  • Delivered assigned work within specified timelines and with high quality
  • Planned and conducted Informatica ETL unit and development testing, ensuring quality met agreed specifications and requirements
  • Optimized query and session performance, delivering correct objects to the client on time
  • Used Autosys to create, schedule, and monitor jobs, with alerts sent on process failures
  • Delivered key Systems Development Life Cycle (SDLC) stages on time, in full, and to quality standards across all allocated work
  • Knowledgeable about the key features of major cloud service providers
  • Developed PySpark API scripts to bring in required data from multiple sources and upload it to AWS data stores
  • Created Python scripts to upload data to AWS S3 buckets for data analysis on EMR clusters (see the sketch after this list)
  • Wrote and scheduled PySpark scripts to filter and transform unstructured data into structured formats for ETL
  • Experienced with AWS services such as Lambda, EMR, SNS, SQS, S3, and Redshift
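
As a rough illustration of the S3 staging scripts mentioned above, here is a minimal boto3 sketch; the bucket name, prefix, and local directory are assumptions for illustration only.

```python
# Minimal sketch of a Python upload script for staging local files in S3
# ahead of EMR analysis. Bucket and path names are illustrative assumptions.
import boto3
from pathlib import Path

s3 = boto3.client("s3")

def upload_directory(local_dir: str, bucket: str, prefix: str) -> None:
    """Upload every file under local_dir to s3://bucket/prefix/..."""
    for path in Path(local_dir).rglob("*"):
        if path.is_file():
            key = f"{prefix}/{path.relative_to(local_dir)}"
            s3.upload_file(str(path), bucket, key)
            print(f"uploaded s3://{bucket}/{key}")

if __name__ == "__main__":
    upload_directory("./extracts", "example-staging-bucket", "daily-loads")
```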

Senior ETL Developer

Capgemini
06.2014 - 11.2015
  • Company Overview: Client – Schlumberger
  • Supported around 20 applications, communicating with the client manager and all third parties involved in production issues, and worked on major and minor enhancements and support tickets
  • Estimated and analyzed the work process and designed business requirement and technical design documents per client specifications
  • Designed the logical and physical models along with mapping sheets for the ETL layer
  • Designed and implemented the ETL layers for master entities and their information schema
  • Developed ETL mappings, workflows, and sessions that read data from various sources, applying various transformations to load data into the OLAP database per business requirements
  • Developed complex SQL queries
  • Used shell scripting to create bash scripts
  • Created Unix scripts to load source files, including reading/copying data from the SFTP server, creating file lists, archiving the source files, and merging files for publishing to downstream applications (see the sketch after this list)
  • Responsible for executing test cases
  • Performed unit, integration, and end-to-end testing
  • Resolution of Informatica loading errors in production
  • Documented operational manuals, technical design specifications, ETL mapping sheets, analysis sheets, and code review documents
  • Involved in code migration of modules into SIT and pre-production environments, along with phased releases into production for go-live
  • Worked on the resolution of incidents as a part of maintenance activity
  • Resolved production incidents (through diagnosis, testing & applying fix) for assigned application
  • Prepared test cases for Major releases
  • Responsible for communication with Client Manager and Server Hosting/Operational Support Vendors in case of Production load failure
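
The Unix source-file handling described above (building file lists, merging files for downstream publishing, archiving sources) could be sketched as follows. This Python stand-in, using only the standard library, is illustrative rather than the original shell implementation, and all directory names are hypothetical.

```python
# Illustrative Python stand-in for the Unix source-file handling described
# above: list source files, merge them for downstream publishing, then
# archive the originals. All directory names are hypothetical.
import shutil
from pathlib import Path

SRC = Path("/data/incoming")      # files already copied from the SFTP server
PUB = Path("/data/publish")
ARCHIVE = Path("/data/archive")

def merge_and_archive(pattern: str = "*.dat") -> None:
    files = sorted(SRC.glob(pattern))          # build the file list
    PUB.mkdir(parents=True, exist_ok=True)
    ARCHIVE.mkdir(parents=True, exist_ok=True)

    # Merge all source files into one publish file for downstream apps.
    with open(PUB / "merged.dat", "wb") as out:
        for f in files:
            out.write(f.read_bytes())

    # Archive the originals after a successful merge.
    for f in files:
        shutil.move(str(f), ARCHIVE / f.name)

if __name__ == "__main__":
    merge_and_archive()
```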

ETL Developer

Deloitte
12.2013 - 06.2014
  • Conducted end-to-end testing of ETL workflows, ensuring 100% compliance with business requirements
  • Automated test case execution, reducing testing time by 30% (see the sketch after this list)
  • Estimated and analyzed the work process and designed business requirement and design documents per client specifications
  • Analyzed the technical design document and created the SQL; designed KPIs based on the metrics used in Cognos reports
  • Thoroughly tested KPIs and raised defects
  • Tested code developed in Informatica
  • Performed different kinds of testing, including sanity, functional, integration, re-testing, and regression testing
  • Executed test cases, logged defects, and verified the fixed defects
  • Assisted the team in understanding Informatica development code
  • Documented Informatica objects and job schedule information to support quality testing
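
Below is a minimal sketch of the kind of automated ETL validation check behind the testing-time reduction noted above. The table and column names are illustrative assumptions, and sqlite3 stands in for the project's actual databases.

```python
# Minimal sketch of automated ETL validation checks. Table and column names
# (stg_orders, dw_orders, order_id) are illustrative assumptions; sqlite3 is
# a stand-in for the project's real databases.
import sqlite3

def check_row_counts(conn: sqlite3.Connection) -> None:
    src = conn.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
    tgt = conn.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0]
    assert src == tgt, f"row count mismatch: staging={src}, warehouse={tgt}"

def check_no_null_keys(conn: sqlite3.Connection) -> None:
    nulls = conn.execute(
        "SELECT COUNT(*) FROM dw_orders WHERE order_id IS NULL"
    ).fetchone()[0]
    assert nulls == 0, f"{nulls} warehouse rows have a NULL order_id"

if __name__ == "__main__":
    # Tiny in-memory example so the checks can run end to end.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stg_orders (order_id INTEGER)")
    conn.execute("CREATE TABLE dw_orders (order_id INTEGER)")
    conn.executemany("INSERT INTO stg_orders VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO dw_orders VALUES (?)", [(1,), (2,)])
    check_row_counts(conn)
    check_no_null_keys(conn)
    print("all ETL checks passed")
```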

ETL Developer

KPIT Cummins
11.2010 - 11.2013
  • Data warehousing implementation for BI Cognos reporting purposes
  • Designed ETL workflows, processing 1M+ records daily for BI reporting
  • Enhanced query performance by 25%, enabling faster business decisions
  • Involved in analysis of different lines of business
  • Analyzed client requirements to understand their business and data workflows
  • Developed complex SQL queries and designed Informatica mappings to load data into the warehouse
  • Developed staging tables from flat-file, RDBMS, and SAP R/3 sources, loaded data from source to staging, and implemented incremental load logic
  • Implemented historical data maintenance logic (SCD Type 1 and SCD Type 2), improving reporting accuracy by 20% (see the sketch after this list)
  • Used transformations such as Lookup (connected and unconnected), Union, Joiner, BAPI, Normalizer, Router, Expression, Aggregator, Sequence Generator, Update Strategy, and Stored Procedure to implement the transformation logic in mappings
  • Created sessions and scheduled mappings using PowerCenter Workflow Monitor for the daily load
  • Designed the Error Table for easier validations
  • Created new database objects such as procedures, partitions, indexes, and views in the ODS and DWH
  • Developed scripts for various ETL & integration needs
  • Performed tuning on partitioned tables for optimal query performance
  • Created batch files, parameter files, and parameter variables to implement different logic
  • Assisted BI developers in building reports using Cognos
  • Provided modifications to existing functions initiated by Problem Report/Change Request (PRCR) and maintaining product compatibility
  • Identified functional and non-functional requirements
  • Created unit test cases and executed them on or before release
  • Served as a single point of contact/responsibility for building and executing the technical implementation or changes in the Production Server
  • Created design documents and ETL specifications
  • Provided production support to users
  • Trained new joiners on the Informatica tool
  • Extensively used OBIEE Administration Tool for customization and modification of the physical, business model, and presentation layer of the repository
  • Created dimension and fact tables, logical columns, hierarchies, and level-based measures in the business model and mapping layer
  • Used OBIEE Answers to create reports as per the client requirements
  • Met reporting requirements by developing interactive dashboards and reports with different views (drill-down, guided navigation, pivot table, chart, column selector, and dashboard and page prompts) using Oracle Presentation Services
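
The SCD Type 2 logic noted above can be sketched roughly as follows. This pandas version, with made-up column names (cust_id, addr, eff_date, end_date, is_current), is illustrative only; the original work implemented this logic in Informatica mappings, not pandas.

```python
# Rough pandas sketch of SCD Type 2 maintenance: expire changed rows and
# append new current versions. Column names are made up for illustration.
import pandas as pd

HIGH_DATE = pd.Timestamp("9999-12-31")

def scd2_merge(dim: pd.DataFrame, incoming: pd.DataFrame,
               today: pd.Timestamp) -> pd.DataFrame:
    current = dim[dim["is_current"]]

    # Find keys whose tracked attribute changed in the incoming feed.
    merged = current.merge(incoming, on="cust_id", suffixes=("_old", "_new"))
    changed = merged.loc[merged["addr_old"] != merged["addr_new"], "cust_id"]

    # Expire the old version of each changed row (SCD 2 keeps history).
    expire = dim["is_current"] & dim["cust_id"].isin(changed)
    dim.loc[expire, ["end_date", "is_current"]] = [today, False]

    # Append new current versions for changed keys and brand-new keys.
    new_keys = set(changed) | (set(incoming["cust_id"]) - set(current["cust_id"]))
    new_rows = incoming[incoming["cust_id"].isin(new_keys)].assign(
        eff_date=today, end_date=HIGH_DATE, is_current=True
    )
    return pd.concat([dim, new_rows], ignore_index=True)
```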

Education

Bachelor of Engineering - Biomedical

University of Mumbai

Skills

Operating Systems

UNIX, Windows, Linux

ETL Development Tools

Informatica PowerCenter, Ab Initio 40

Reporting Tools

OBIEE 10g, BI Publisher

Languages/Technology

SQL, PL/SQL, UNIX Shell Scripting, Python, Spark SQL

Databases

Oracle, Oracle RDS, Netezza, Redshift, PostgreSQL, MySQL, BigQuery

Project tracking tools

Jira, Rally

Other Tools

Toad, Oracle SQL Developer, WinSCP, PuTTY, Autosys, GitHub, Bitbucket, Jenkins

Cloud Services (AWS/ GCP)

AWS: EC2, ECS, EBS, S3, EFS, IAM, SQS, RDS, Lambda, CloudWatch, Auto Scaling, EMR, AWS Glue, SNS; GCP: Dataproc, GCS, BigQuery

Certification

  • AWS Certified Developer – Associate, Amazon Web Services (AWS)
  • AWS Data Specialty, Udemy
  • GCP Associate, Udemy

Training

  • Big Data & Hadoop Training: Gained expertise in processing and managing large datasets using Hadoop's distributed computing framework.
  • Google Cloud Platform (GCP) Training: Developed proficiency in deploying and managing applications on GCP, enhancing cloud solution capabilities.
  • Amazon Web Services (AWS) Training: Acquired skills in utilizing AWS services for scalable and secure cloud application development.
  • Apache Spark Training: Learned to perform large-scale data processing and analytics using Spark's in-memory computing capabilities.
  • Python Basics Training: Established a solid foundation in Python programming for scripting and application development.
  • Talend Training: Gained experience in data integration and transformation using Talend's ETL tools.
  • Informatica IDQ and MDM Training: Learned the fundamentals of Informatica's Data Quality (IDQ) and Master Data Management (MDM) for ensuring data accuracy and consistency.
  • Tableau Training: Developed skills in data visualization and business intelligence using Tableau to create insightful dashboards.
  • Reporting Tools: Trained on OBIEE 10g.

Languages

English: Professional
Hindi: Professional
Marathi: Professional
Telugu: Professional
Spanish: Elementary

Accomplishments

• Received several recognition emails from clients and stakeholders at Fannie Mae and PayPal

• Awarded by Capgemini for outstanding contribution

• Received KPIT's WoW award for consistent delivery and the smooth, successful implementation of the project

• Appreciated by senior managers and directors for consistent performance, strong commitment to work, and being a team asset

References

References available upon request.
