- 8 years of IT experience in Big Data Hadoop and Spark development.
- Experience with Hadoop ecosystem components such as HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka, Oozie and HBase, with programming in Scala and Java.
- Working knowledge of distributed-systems architecture and parallel-processing frameworks.
- In-depth understanding of the Spark execution model and the internals of the MapReduce framework.
- Good working experience developing production-ready Spark applications using the Spark Core, DataFrame, Spark SQL and Spark Streaming APIs.
- Experience with Hadoop distributions from Cloudera (CDH4 and CDH5).
- Worked extensively on tuning resources for long-running Spark applications to improve parallelism and allocate executor memory for more effective caching.
- Good experience with both batch and real-time processing using Spark frameworks.
- Proficient in Apache Spark and Scala for analyzing large datasets and processing real-time data.
- Good working knowledge of developing Pig Latin scripts and writing Hive Query Language (HQL) queries.
- Good working experience performance-tuning Hive queries and troubleshooting issues such as problematic joins and memory exceptions in Hive.
- Good understanding of partitioning and bucketing concepts in Hive; designed both internal (managed) and external Hive tables to optimize performance.
- Good experience with file formats such as Avro, RCFile, ORC and Parquet.
- Good working experience optimizing MapReduce jobs using combiners and custom partitioners.
- Experience with NoSQL databases such as HBase, Cassandra and MongoDB, and their integration with Hadoop clusters.
- Experience with shell scripting languages such as Bash.
- Experience collecting, aggregating and moving data from various sources using Apache Flume and Kafka.
- In-depth understanding of Hadoop architecture and its components, including HDFS, NameNode, DataNode, Secondary NameNode and the MapReduce programming paradigm.
- Worked with Sqoop to import and export data between relational databases and Hadoop.
- Well versed in Agile/Scrum working environments using JIRA and version-control tools such as Git.
- Flexible, enthusiastic and project-oriented team player with excellent communication skills.