Over 6 years of professional experience in data modeling, design, and development of Big Data technologies, with an in-depth understanding of the Hadoop distributed architecture and its components such as Node Manager, Resource Manager, NameNode, DataNode, HiveServer2, HBase Master, and Region Server.
Strong proficiency in developing, deploying, and debugging cloud-based applications on AWS.
Strong experience creating real-time data streaming solutions using Spark Streaming and Kafka.
Understanding of cloud-native application development using AWS security features such as IAM roles.
Strong knowledge of AWS data migration using DMS, Kinesis, Lambda, EMR, and Athena.
Experience with PySpark on Azure Databricks for data cleaning, manipulation, and optimization driven by business needs.
Expertise in writing end-to-end data processing jobs to analyze data using MapReduce, Spark, and Hive.
Experience with Kafka for collecting, aggregating, and moving large volumes of data from sources such as web servers and Telnet feeds.
Experience in the insurance and banking domains, including integrating tools such as Zeppelin and Docker.
Developed Sqoop scripts for transferring large datasets between Hadoop and RDBMSs.
Experience writing applications using AWS service APIs, the AWS CLI, and SDKs.
Strong experience working in UNIX/Linux environments and writing shell scripts.
Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
Extensive experience working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
Working knowledge of Azure Data Factory and data processing solutions built on Azure Databricks.
Good working experience in design and application development using IDEs such as IntelliJ and Eclipse.
Understanding of core AWS services, their uses, and basic AWS architecture best practices.
Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies, including Waterfall and Agile.
Ability to blend cloud service expertise with strong AWS skills to create and configure EC2 instances, connect them to Amazon Redshift for faster query execution, and use Amazon S3 for storage; also very comfortable with AWS RDS, Athena, Glue, IAM roles for security, Lambda, and Step Functions.
Python and SQL
Data Engineering
ETL Processes
Performance Tuning
Linux Environment
Big Data Processing
Hadoop Ecosystem Knowledge
Streaming Data Processing
Databricks Platform
Agile Methodologies