Experienced Data Engineer focused on designing, developing and maintaining highly scalable, secure and reliable data structures. Accustomed to working closely with system architects, software architects and design analysts to translate business or industry requirements into comprehensive data models. Proficient in developing database architecture strategies at the modeling, design and implementation stages. Uses advanced SQL and Python skills to create and maintain robust data architectures. Track record of implementing scalable solutions that enhance data integrity and support informed decision-making.
Tools Used: Hadoop, Scala, Spark, Hive, Sqoop, ADF, Databricks, HBase, Kafka, YAML, Flume, Ambari, MS SQL, MySQL, Snowflake, MongoDB, Cassandra, Git, Data Storage Explorer, SAS, Java, Python, GCP, GCS, GKE, Teradata, Apache Drill, HDFS, ETL, Flink
Tools Used: AWS (EC2, S3, EMR, RDS, Glue, Athena, CLI, Lambda, Kinesis, Redshift, CloudFormation, CloudWatch), Ansible, Flink, Ant, Maven, Jenkins CI/CD, Spark, Scala, Hive, Sqoop, HDFS, MongoDB, OLAP, Power BI, Kafka, Hadoop, Supnik, Bitbucket, Git, JIRA, Java, Python, SSH, Shell Scripting, Snowflake, Informatica, Talend, Docker, JSON, PySpark, Kubernetes, Linux, Kibana
Tools Used: Cloudera CDH4.3, Hadoop, AWS, Java, R, Pig, Hive, Informatica, HBase, Kafka, Tableau, Azure Data Storage, MapReduce, HDFS, Python, SQL, Sqoop, Spark, DataMart, Git, Teradata, DataStage
Tools Used: Python, Pandas, Shell, Hadoop, Sqoop, MapReduce, SQL, Teradata, Snowflake, Hive, Pig, Azure, Databricks, Kafka, Azure Data Factory, Glue, HBase, Apache, Eclipse, Airflow, Informatica
Tools Used: Python, Pandas, Matplotlib, Scikit-learn, SciPy, Machine Learning, K-Means, Tableau, Hadoop, ETL, SQL, Oracle, Agile