Professional Big Data Engineer with 6 years of industry experience, including around 2.5 years in Big Data
technologies. Expertise in Hadoop/Spark development, automation tools, and the end-to-end software design
life cycle. Outstanding communication skills; dedicated to maintaining up-to-date IT skills and industry knowledge.
Overview
7 years of professional experience
Work History
Data Engineer
Amazon
Austin, TX
12.2022 - Current
Migrated existing data from Mainframes/Teradata/SQL Server to Hadoop and performed ETL operations on it.
Designed and implemented Sqoop incremental imports and delta imports on tables without primary keys or date columns from Teradata, appending directly into the Hive warehouse.
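The reconciliation step behind a delta import on a key-less table can be sketched in plain Python: newly arrived rows are merged with existing warehouse rows, keeping the latest version per composite business key. The column names (`acct`, `region`, `updated_at`) and the key choice below are hypothetical, not from the actual project.

```python
# Sketch of the dedup/merge logic behind a Sqoop-style delta import on a
# table without a primary key: a composite business key plus a last-modified
# timestamp decides which version of each row survives. All field names here
# are hypothetical illustrations.

def merge_delta(existing, delta, key_cols=("acct", "region"), ts_col="updated_at"):
    """Return rows deduplicated on key_cols, keeping the newest ts_col value."""
    latest = {}
    for row in list(existing) + list(delta):
        key = tuple(row[c] for c in key_cols)
        if key not in latest or row[ts_col] > latest[key][ts_col]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: tuple(r[c] for c in key_cols))

warehouse = [
    {"acct": 1, "region": "US", "updated_at": "2022-01-01", "bal": 100},
    {"acct": 2, "region": "US", "updated_at": "2022-01-01", "bal": 50},
]
delta = [
    {"acct": 1, "region": "US", "updated_at": "2022-02-01", "bal": 120},  # newer version
    {"acct": 3, "region": "EU", "updated_at": "2022-02-01", "bal": 75},   # brand-new row
]
merged = merge_delta(warehouse, delta)
```

In a real pipeline this merge would run as a Hive or Spark job over the appended delta files rather than in-memory Python; the sketch only shows the keep-latest-per-key rule.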
Converted Pig scripts/components of the ETL process (transformations) to Spark API.
Worked with Avro and Parquet file formats and used various compression techniques to leverage the storage in HDFS.
Contributed to internal activities for overall process improvements, efficiencies and innovation
Communicated new or updated data requirements to global team
Designed and developed ETL workflow using Oozie and automated them using Autosys.
Handled on-call production issues: data scrubbing, resolving Hive query issues, and providing workarounds for defects within SLA.
Explained data results and discussed how best to use data to support project objectives
Designed and developed analytical data structures
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
Generated detailed studies on potential third-party data handling solutions, verifying compliance with internal needs and stakeholder requirements
Big Data Engineer
Wipro Technologies
Hyderabad, TN
09.2019 - 11.2020
Developed custom UDFs for Pig scripts and Hive queries to implement business logic and complex analysis on the data.
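One common way to attach custom logic like this is a streaming script invoked through Hive's `TRANSFORM ... USING` clause; a minimal Python sketch follows. The field layout and the masking rule are hypothetical, not taken from the actual queries.

```python
import sys

# Minimal sketch of a streaming-style UDF as Hive's TRANSFORM clause would
# invoke it: read tab-separated rows on stdin, rewrite one field, emit
# tab-separated rows on stdout. The assumption that field 2 holds a number
# to be masked is purely illustrative.

def transform_line(line):
    """Mask all but the last four characters of the second tab-separated field."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2 and len(fields[1]) > 4:
        fields[1] = "*" * (len(fields[1]) - 4) + fields[1][-4:]
    return "\t".join(fields)

def main(stream=sys.stdin):
    # In production this script would be wired in via
    # SELECT TRANSFORM (cols) USING 'python udf.py' AS (cols) FROM tbl;
    for line in stream:
        print(transform_line(line))
```

Pig supports the same idea through its STREAM operator, so one script can serve both engines.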
Reduced the latency of Spark jobs by tuning Spark configurations and applying other performance and optimization techniques.
Loaded unstructured and semi-structured data into Hadoop, creating static and dynamic partitions.
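The dynamic-partitioning scheme behind that loading can be sketched in plain Python: each record's partition-column values determine the Hive-style warehouse subdirectory it lands in, so new partitions appear automatically as data arrives. The base path and column names below are hypothetical.

```python
# Sketch of Hive-style dynamic partitioning: partition-column values map each
# record to a directory like base/country=US/load_dt=2022-01-01. The base path
# and partition columns are illustrative assumptions.

def partition_path(base, record, partition_cols=("country", "load_dt")):
    """Build a Hive-style partition directory for one record."""
    parts = [f"{col}={record[col]}" for col in partition_cols]
    return "/".join([base.rstrip("/")] + parts)

records = [
    {"country": "US", "load_dt": "2022-01-01", "payload": "a"},
    {"country": "IN", "load_dt": "2022-01-02", "payload": "b"},
]
paths = [partition_path("/warehouse/events", r) for r in records]
```

A static partition fixes these values up front in the load statement; the dynamic case, as above, derives them per record.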
Used Sqoop to import/export data from various RDBMS (Teradata, Netezza, Oracle) to Hadoop cluster and vice versa.
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
Prepared documentation and analytic reports, delivering summarized results, analysis and conclusions to stakeholders
Communicated new or updated data requirements to global team
Applied GDP to validation protocols, test cases, and change control documents.
Built databases and table structures for web applications
Designed and implemented effective database solutions and models to store and retrieve data
Developed, implemented and maintained data analytics protocols, standards, and documentation
Designed and developed analytical data structures
Software Engineer
Cognizant Technologies
Hyderabad, TN
08.2016 - 08.2019
Project environment: DataStage 8.5, Serena, Unix, TOAD, Oracle 10g (Sep '10 to Nov '12).
Translated business requirements into DataStage jobs.
Tuned long-running jobs by improving query performance, reducing DataStage job run times.
Worked with the XML transformer stage and other complex transformations.
Wrote shell scripts to trigger DataStage jobs.
Scheduled DataStage jobs using the ESP scheduler.
Pfizer project: Cognos 10, TOAD, Unix (Nov '12 to Jul '13).
Developed and validated reports and dashboards.
Scheduled reports and granted report access only to appropriate users.