Experienced data engineer with over 5 years of expertise in building data-intensive applications across the Hadoop ecosystem, Big Data, cloud data engineering, data warehousing, and data visualization.

- End-to-end solution implementation on Big Data platforms including Cloudera and Hortonworks.
- Proficient in Hadoop ecosystem tools such as MapReduce, Pig, Spark, Hive, Sqoop, Flume, HBase, Kafka, Zookeeper, and Oozie for ETL and Big Data analysis, as well as NoSQL stores including Cassandra and MongoDB.
- Strong in Scala and Apache Spark, with data analysis using SQL, Hive, and Spark SQL; extensive use of Python libraries including NumPy, Pandas, PySpark, Matplotlib, and Scikit-learn.
- Hands-on experience developing data pipelines with AWS services and IICS Data Integration; skilled in real-time data streaming and pipeline creation with Kafka and Spark; working knowledge of Google Cloud Platform.
- Expertise in SQL, database design, and migration to Azure data services; experience with Impala views, Erwin data modeling, and Delta Lake.
- Proficient in shell scripting, MapReduce jobs, Hive analytics, and Tableau for data visualization; skilled in Python scripting for data manipulation and statistical analysis.
- Familiar with Kubernetes and Docker for containerization and CI/CD; experienced with ETL tools such as Informatica and DataStage, and with the Snowflake cloud data warehouse.
- Well-versed in database normalization and denormalization techniques for optimal performance.