
Big Data Tech Stack
Hadoop, HDFS, Apache Spark, Hive, Impala, HBase, Sqoop, Kafka
File Formats
Parquet, Fixed-Width, ASCII, JSON, EBCDIC, CSV, Avro
Programming/Scripting Languages/ML Stack
Python, Java, SQL, Unix/Linux Shell Scripting, Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn, spaCy, XGBoost
Containerization Platforms/Engines
OpenShift, Docker
Job Orchestration Tools
Airflow, Control-M
Databases/Data Warehouses
MySQL, PostgreSQL, SQL Server, Netezza, HBase, MongoDB, Snowflake, Hive