
Anirudh Jaiswal

Beaverton

Summary

Data Engineering Professional with over 8 years of experience in Hadoop Ecosystem components, including MapReduce, HDFS, Spark, and Kafka. Expertise in NoSQL databases like Cassandra and MongoDB, with a strong grasp of data modeling, ETL processes, and migration strategies from RDBMS to Hadoop. Proficient in developing complex SQL queries, utilizing Agile methodologies, and employing tools such as Jenkins and Git for project management. Demonstrated ability in creating scalable data solutions and preparing analytical dashboards using Tableau.

Overview

13 years of professional experience

Work History

Senior Data Engineer

NIKE
10.2019 - Current
  • Develop Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation across multiple file formats, uncovering insights into customer usage patterns (a minimal sketch follows this role's environment list).
  • Developed a comprehensive data quality framework using Databricks SQL and Great Expectations, improving data reliability by 85% and accelerating data-driven decision-making processes by 30%.
  • Customize existing enterprise software to improve data quality and data management, evaluating configurations in the development environment and leading teammates in assessing the credibility of big data.
  • Design, test, and configure technology applications on the big data platform.
  • Reduced production AWS EMR processing costs by 25% and decreased downtime by 37% through effective optimization techniques, resource management, and configuration adjustments.
  • Worked closely with the Kafka Admin team to set up QA and Prod environments.
  • Contribute to the Data Architecture roadmap and Data Governance for AWS, Azure, and other cloud implementations.
  • Responsible for data loads, monitoring and fixing issues across various Airflow DAGs.
  • Build ETL pipelines in and out of the data warehouse using Snowflake, validating results against SQL Server.
  • Anticipate customer needs and provide input on data issues and access for program solutions.
  • Develop code using Hive, Sqoop, Apache Spark, and the Hadoop Distributed File System (HDFS) to derive master data for Product, Sales, and other dimensions across different geographies.
  • Work with Business Systems Analysts to analyze application requirements and provide technical knowledge to stakeholders for querying the data.
  • Develop workflows, mappings, and sessions using ETL tools, scheduling them through tools like CA Workload Automation.
  • Manage the data flow from different teams and environments to reporting applications and tables, and maintain the security groups and portals for all business applications and servers.
  • Troubleshoot post-deployment issues and deliver the solutions required to drive production implementation and delivery.
  • Coordinate with business users to understand business functionalities and their requirements.
  • Help define team best practices and advance their implementation.
  • Lead on-site and offshore teams, managing their work and improving their skills.
  • Project: Enterprise Data & Analytics (EDA) at NIKE, Inc., an American multinational corporation engaged in the design, development, manufacturing, and worldwide marketing and sales of footwear, apparel, equipment, accessories, and services, headquartered in Beaverton, Oregon, in the Portland metropolitan area.
  • Environment: Databricks, Spark 2.2.1, Spark SQL, Scala 2.11.8, MapR 2.7, YARN, Teradata, HDFS, MapReduce, Data Lake, Hive, Sqoop, ETL, Hadoop 2.7, NoSQL, Talend, flat files, UNIX, shell scripting, RDBMS, Linux, Agile, TWS.
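
Illustrative sketch of the Spark SQL pattern named above: reading multiple file formats in Databricks and aggregating them into a daily usage rollup. All paths, column names, and the output table are hypothetical placeholders, not NIKE assets.

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions._

  object UsageAggregation {
    def main(args: Array[String]): Unit = {
      // On Databricks a SparkSession is provided as `spark`; built here so the sketch is self-contained.
      val spark = SparkSession.builder().appName("UsageAggregation").getOrCreate()

      // Hypothetical sources: the same pipeline can ingest Parquet and CSV inputs.
      val parquetDf = spark.read.parquet("/mnt/raw/usage_parquet")
      val csvDf = spark.read.option("header", "true").option("inferSchema", "true")
        .csv("/mnt/raw/usage_csv")

      // Align both sources on a shared column order, then combine (positional union, Spark 2.2-safe).
      val cols = Seq("customer_id", "event_ts", "event_count")
      val usage = parquetDf.selectExpr(cols: _*).union(csvDf.selectExpr(cols: _*))

      // Aggregate events per customer per day to surface usage patterns.
      val daily = usage
        .groupBy(col("customer_id"), to_date(col("event_ts")).as("event_date"))
        .agg(sum("event_count").as("events"))

      // Persist for downstream analysis and dashboarding (e.g., Tableau).
      daily.write.mode("overwrite").saveAsTable("analytics.customer_daily_usage")
    }
  }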

Software Developer/Big Data with ETL

Information Resources Inc.
03.2019 - 09.2019
  • Enhanced system performance by developing scalable software solutions in Java and Python.
  • Delivered high-quality applications by leading cross-functional agile development teams.
  • Accelerated data processing through implementing Spark with Scala and Spark-SQL.
  • Created DataFrames for aggregation using Spark code in Scala and PySpark (see the sketch after this role).
  • Configured and built Maven projects within Eclipse IDE for streamlined development.
  • Facilitated bi-weekly sprint meetings to drive testing efforts and solutions.
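
Illustrative sketch of creating a DataFrame and running an aggregation in Scala, as in the bullet above. The Sale record type and sample rows are invented for the example.

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions._

  // Hypothetical record type used only for this sketch.
  case class Sale(region: String, product: String, amount: Double)

  object SalesRollup {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder().appName("SalesRollup").master("local[*]").getOrCreate()
      import spark.implicits._

      // In practice the rows would come from source files; inline data keeps the sketch runnable.
      val sales = Seq(
        Sale("WEST", "shoes", 120.0),
        Sale("WEST", "apparel", 80.0),
        Sale("EAST", "shoes", 95.0)
      ).toDF()

      // Roll up revenue and order counts per region, the kind of aggregate fed to reporting.
      val byRegion = sales.groupBy($"region")
        .agg(sum($"amount").as("revenue"), count(lit(1)).as("orders"))

      byRegion.show()
    }
  }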

Big Data and ETL Developer

UnitedHealth Group, Minneapolis, Minnesota.
08.2018 - 03.2019
  • Increased development efficiency by configuring MapR application development tenant.
  • Performed analytics using Hive and Spark SQL with Hadoop YARN integration (see the sketch after this role).
  • Streamlined data extraction and Spark job automation with scripts and TWS.
  • Facilitated data transfers between Linux file systems and HDFS without issues.
  • Optimized ETL processes to improve overall data integration performance.
  • Collaborated across teams to gather requirements for data warehousing initiatives.
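
Illustrative sketch of Hive-backed analytics through Spark SQL, the combination named above; on a YARN cluster this would be launched via spark-submit --master yarn. The warehouse.claims table and its columns are hypothetical.

  import org.apache.spark.sql.SparkSession

  object ClaimsAnalytics {
    def main(args: Array[String]): Unit = {
      // enableHiveSupport lets Spark SQL query tables registered in the Hive metastore.
      val spark = SparkSession.builder()
        .appName("ClaimsAnalytics")
        .enableHiveSupport()
        .getOrCreate()

      // Monthly claim counts per plan, a typical analytical rollup.
      val monthly = spark.sql(
        """
          |SELECT plan_id, substr(service_date, 1, 7) AS month, COUNT(*) AS claims
          |FROM warehouse.claims
          |GROUP BY plan_id, substr(service_date, 1, 7)
        """.stripMargin)

      monthly.write.mode("overwrite").saveAsTable("warehouse.claims_monthly")
    }
  }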

Engineering Consultant

ECIL, Hyderabad, India
05.2013 - 04.2016
  • Enhanced data testing efficiency by streamlining Hadoop application processes.
  • Developed Hive tables for efficient NoSQL data loading and relational database ingestion.
  • Implemented scalable data pipelines to optimize cross-platform data flow and integration.
  • Directed cross-functional teams to establish effective data architecture and governance frameworks.

Education

Master’s in Computer Technology

Eastern Illinois University
Charleston, USA
01.2017

Bachelor’s in Electronics and Communication Engineering

Jawaharlal Nehru Technological University
Hyderabad, India
01.2013

Skills

  • Big Data Ecosystems: Hadoop, Apache Spark, Scala, Sqoop, MapReduce and MapReduce 2, Kafka, HDFS, ZooKeeper, Hive, Impala, Pig, Oozie, Flume, Apache Phoenix, Solr, Amazon S3, Kerberos, and Sentry
  • Hadoop Distributions: Cloudera, Hortonworks, MapR
  • Languages: Scala, Python, Java, C
  • Scripting/Markup languages: XML, HTML, CSS, Bash, Shell
  • Databases: Oracle, SQL Server, HBase, Teradata, PostgreSQL, Snowflake
  • Unit testing tools: Nose tests, Scala unit testing using Eclipse IDE
  • ETL & BI Tools: Cognos, Tableau, Qlik Sense
  • SDLC Methodologies: Agile, SCRUM, Waterfall
  • Tools: Eclipse, PyCharm, Putty, MobaXterm, IntelliJ
  • ETL development
  • Performance tuning

Declaration

I hereby declare that the information furnished above is true to the best of my knowledge, and I bear responsibility for the correctness of the details mentioned above.

Timeline

Senior Data Engineer

NIKE
10.2019 - Current

Software Developer/Big Data with ETL

Information Resources Inc.
03.2019 - 09.2019

Big Data and ETL Developer

UnitedHealth Group, Minneapolis, Minnesota.
08.2018 - 03.2019

Engineering Consultant

ECIL, Hyderabad, India
05.2013 - 04.2016

Bachelor’s in Electronics and Communication Engineering

Jawaharlal Nehru Technological University

Master’s in Computer Technology

Eastern Illinois University