12+ years of overall IT experience with a strong emphasis on design, implementation, development, testing, and deployment of software applications using GCP, AWS, Azure, Hadoop, HDFS, Python, Spark, Scala, Kafka, MongoDB, Hive, Impala, HBase, RDBMS, and other Hadoop ecosystem tools.
- Certified Cloudera Hadoop Developer with working experience designing and implementing complete end-to-end Hadoop infrastructure using GCP, AWS, Azure, Python, Spark, Scala, MongoDB, HBase, Hive, and Impala.
- Built a data extraction utility to serve Policy, Quotes, Claims, and Location data to various consumers for data analytics.
- Migrated HDFS data storage to Amazon Web Services (AWS).
- Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, and Azure SQL Data Warehouse.
- Able to work across GCP, AWS, and Azure clouds in parallel.
- Hands-on experience with GCP services: BigQuery, GCS buckets, Cloud Functions, Cloud Dataflow, Cloud Shell, the gsutil and bq command-line utilities, and Dataproc.
- Hands-on experience writing Python and Bash scripts.
- Expertise in implementing Spark and Scala programs for faster data processing.
- Good experience with Spark SQL, DataFrames, RDDs, and Spark on YARN.
- Experience using Sqoop to import and export data between RDBMS and HDFS/Hive.
- Applied SQL concepts, Hive SQL, Python, and PySpark to handle increasing data volumes.
- Designed and implemented Jenkins pipelines for CI/CD processes.
- Worked on NoSQL databases such as HBase and MongoDB, with strong knowledge of Cassandra.
- Experienced with static and dynamic partitioning and bucketing in Hive; designed both managed and external Hive tables to optimize performance.
- Worked in software methodologies including SDLC and Agile Scrum.
- Created design and process documents; reviewed the team's code and merged it on GitHub.