- Over 9 years of software development experience, including 6 years working with Big Data technologies in the Apache Spark and Hadoop ecosystem.
- Proficient in building ETL and machine learning data pipelines using Azure Databricks and Azure Data Factory; skilled in the MLflow model registry, Delta Lakehouse, and AutoML features in Azure Databricks.
- Extensive experience delivering end-to-end machine learning projects, from design through delivery.
- Expertise in cloud-based data warehouses such as Snowflake and the Delta Lakehouse; built automated data pipelines from landing zones to refined tables in the Snowflake data warehouse.
- Hands-on experience in Big Data analytics, covering data extraction, transformation, loading, and analysis on Databricks, Cloudera, and Hortonworks platforms.
- Proficient in Java, Hadoop MapReduce, HDFS, Pig, Hive, Oozie, Sqoop, HBase, Scala, Python, Kafka, and NoSQL databases.
- Worked on on-premises and cloud computing environments, including Azure, AWS, and Google Cloud; expertise in cloud Big Data tools on Azure and AWS.
- Familiar with Azure data engineering services such as Azure Cosmos DB (SQL, Table, Cassandra, MongoDB, and Gremlin/Graph APIs), Azure Synapse Analytics, Azure Stream Analytics, Azure Databricks, Azure Data Lake Storage Gen2, and Azure Storage accounts.
- Experience with AWS Big Data services including Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, SQS, S3, EMR, DynamoDB, Redshift, Aurora, Glue, and QuickSight.
- In-depth understanding of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Extensive experience working with structured data in Hive, writing custom UDFs, and optimizing Hive queries.
- Proficient in importing/exporting data with Sqoop between HDFS and relational databases such as Oracle, Teradata, Netezza, and SQL Server.
- Strong knowledge of NoSQL databases including MongoDB, HBase, Cassandra, AWS DynamoDB, and Azure Cosmos DB, with hands-on experience writing applications on HBase and supporting real-time processing through REST APIs.
- Well-versed in data warehousing concepts, fact and dimension tables, and diverse data formats (CSV, text, Avro, ORC, JSON, Parquet, and Delta).
- Managed and monitored Apache Hadoop clusters using Ambari.
- Proficient with data ingestion tools such as Azure Data Factory, Apache Sqoop, and AWS Kinesis.
- Experience with Spark and Spark Streaming in both Scala and Python, including hands-on work in Azure Databricks.
- Hands-on experience with data mining techniques and machine learning algorithms using Python libraries: scikit-learn, Seaborn, Matplotlib, NumPy, and pandas.
- Experience using build and CI/CD tools such as Maven, Ant, Jenkins, Bamboo, GitLab, and Azure DevOps to deploy automated builds across environments.