• GCP Data Engineer with 9+ years of experience and expertise in GCP services including Cloud Storage, BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Run, and IAM, along with other tools such as Cloud Functions, Cloud Shell, and the gsutil and bq command-line utilities.
• Competent in building new data pipelines on GCP using Dataflow and migrating existing on-premises data pipelines to Google Cloud Platform using Dataproc and Dataflow.
• Good understanding of the Hadoop Distributed File System (HDFS) and the Hadoop ecosystem (Pig, Hive, HBase, Sqoop, Spark, ZooKeeper).
• Well versed in configuring Hadoop clusters using major Hadoop distributions such as Cloudera, MapR, and Hortonworks.
• Hands-on experience with data management strategy formulation, architectural blueprinting, and effort estimation.
• Experience analyzing large volumes of data by writing Pig Latin scripts and Hive Query Language (HiveQL) queries.
• Successfully imported and exported data between RDBMS and HDFS using Sqoop.
• Used Flume to channel data from different sources into HDFS.
• Experience with Avro and Parquet file formats.
• Experience implementing application logic that interacts with HBase.
• Very good understanding of SQL, ETL, and data warehousing technologies.
• Worked on real-time data integration using Kafka, Spark Streaming, and HBase.
• Experience with Hive partitioning and bucketing, and with performing different types of joins on Hive tables.
• Developed Spark jobs in Python in a test environment for faster data processing, and used Spark SQL for querying.
• Experience with Spark and good knowledge of Spark SQL, RDDs, lazy transformations, and actions.
• Good working knowledge of NoSQL databases such as HBase and MongoDB.
• Experience using the Talend ETL tool.
• Involved in all phases of the Software Development Life Cycle (analysis, system design, development, testing, and maintenance) using Waterfall and Agile methodologies.
• Hands-on experience with build tools such as Maven, and with Tekton and Jenkins for continuous integration.
• Highly adept at quickly and thoroughly mastering new technologies, with a keen awareness of new industry developments and the evolution of next-generation programming solutions.