
Experienced Data Engineer with 11 years in the field, currently pursuing an MS in Business Analytics. Proven expertise in data warehousing, ETL processes, SQL, Netezza, PySpark, Hive, Impala, Airflow, UNIX, Oracle, Teradata, business analysis, data analysis, MS Office, Splunk, AWS Glue, Amazon Redshift, Hadoop, Docker, Kubernetes, and other cloud and big data technologies. Adept at managing and optimizing data systems for efficiency and accuracy. Eager to combine technical skills with the business insight gained from ongoing graduate study. Seeking a Senior Data Engineer role at AWS or Meta to apply this skill set to cutting-edge data solutions.
Programming Languages:
Python
Big Data Technologies:
Apache Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, HBase)
Apache Spark
Cloud Services:
AWS (Amazon Web Services):
Amazon EMR
Amazon S3
Amazon Redshift
AWS Glue
Database Systems:
SQL
Relational databases and SQL engines - Netezza, Oracle, Teradata, DB2, Hive, and Impala
NoSQL databases - HBase
ETL (Extract, Transform, Load):
AWS Glue
Informatica
DataStage
Containerization and Orchestration:
Docker
Kubernetes
Streaming Data Processing:
Apache Kafka
Spark Streaming
Data Warehousing:
Amazon Redshift
Data Modeling:
Designing and implementing data models
Security and Compliance:
Data security best practices
Encryption
Compliance requirements
Data Mining:
Knowledge of data mining techniques and tools
Data Analysis:
Proficient in data analysis methods
Tools such as Pandas, NumPy, Seaborn, and Matplotlib
Collaboration and Communication:
Strong communication skills
Problem-Solving and Troubleshooting:
Identifying and solving complex data engineering challenges
Version Control:
Git
Automation and Scripting:
Bash
Python
Continuous Learning:
Staying updated on emerging technologies and industry trends
Agile Methodology:
Experience working in an Agile/Scrum development environment