9+ years of experience across all phases of the Software Development Life Cycle (SDLC), with skills in data analysis, design, development, testing, and deployment of software systems.
Strong experience with Apache Hadoop ecosystem components such as HDFS, Hive, Sqoop, and PySpark/Spark, and with Amazon Web Services such as IAM, EC2, VPC, AMI, SNS, SQS, EMR, Lambda, Glue, Athena, Redshift, CloudWatch, Auto Scaling, and S3.
Good experience working in cloud environments such as Amazon Web Services (AWS) EMR, EC2, and S3.
Experience using analytic data warehouses such as Snowflake.
Experience using Databricks for analytical processes from ETL through data modeling, using familiar tools, languages, and skills via interactive notebooks or APIs.
Experience using Apache Airflow to author workflows as directed acyclic graphs (DAGs), visualize batch and real-time data pipelines running in production, monitor progress, and troubleshoot issues when needed.
Experience installing, configuring, supporting, and managing Hadoop clusters using Cloudera (CDH 5.x) distributions on AWS.
Experience with AWS services such as EMR, EC2, S3, and Redshift, which provide fast and efficient processing of big data.
Imported data from sources such as AWS S3 and the local file system into Spark RDDs.
Experience developing and maintaining applications written for Amazon Simple Storage Service (S3), Amazon EMR (Elastic MapReduce), and AWS CloudFormation.
Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Developed a Spark application to filter JSON source data in an AWS S3 location and store it in HDFS with partitions, and used Spark to extract the schema of the JSON files.
Strong experience with and knowledge of real-time data analytics using Spark Streaming, Kafka, and Flume.
Experience designing and developing Spark applications in Scala to compare the performance of Spark with Hive and SQL/Oracle.
In-depth understanding of Hadoop architecture and its components, including HDFS, MapReduce, Hadoop Gen2 Federation, High Availability, and YARN, with a good understanding of workload management, scalability, and distributed platform architectures.
Experienced in moving data from different sources using Kafka producers and consumers and in preprocessing data.
Experience importing and exporting data using stream processing platforms such as Flume and Kafka.
Good knowledge of scripting, including Linux/Unix shell scripting and Python.
Continuous integration and automated deployment and management using Jenkins.
Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL.
Proficient in data warehousing, data mining concepts, and ETL transformations from source to target systems.
Diverse experience working with a variety of databases such as Oracle, MySQL, and SQL Server.
Experience with the Python libraries NumPy, Matplotlib, pandas, Seaborn, and Cufflinks.
Experience with Python web frameworks such as Flask and Django.
Worked on large datasets using PySpark, NumPy, and pandas.
Good experience in Agile engineering practices, Scrum, Test-Driven Development, and Waterfall methodologies.
Good knowledge of Core Java and J2EE technologies such as JDBC, EJB, Servlets, JSP, JavaScript, Struts, and Spring.
Experienced in using IDEs and tools such as Eclipse, NetBeans, GitHub, Jenkins, and Maven.
Strong team player able to work both independently and in a team, adapt to a rapidly changing environment, and keep learning; excellent communication, project management, documentation, and interpersonal skills.
Practical database engineer with in-depth knowledge of data manipulation techniques and computer programming, paired with expertise in integrating and implementing new software packages and products into systems. Offers several years of experience managing the development, design, and delivery of database solutions. Tech-savvy, independent professional with outstanding communication and organizational abilities.