#------------------------------------------------------------------------------
# $File: apache,v 1.3 2025/05/30 13:25:13 christos Exp $
# apache: file(1) magic for Apache Big Data formats

# Avro files
0	string		Obj\001		Apache Avro, version 1

# ORC files
# Important information is in file footer, which we can't index to :(
0	string		ORC		Apache ORC

# Apache arrow file format
# MIME: https://www.iana.org/assignments/media-types/application/vnd.apache.arrow.stream
# Description: https://arrow.apache.org/docs/format/Columnar.html
0	string		ARROW1		Apache Arrow columnar file
!:mime	application/vnd.apache.arrow.file
!:ext	arrow/feather

# Apache parquet file format
# MIME: https://www.iana.org/assignments/media-types/application/vnd.apache.parquet
# Description: https://parquet.apache.org/docs/file-format/
0	string		PAR1		Apache Parquet file
!:mime	application/vnd.apache.parquet
!:ext	parquet

# Hive RC files
0	string		RCF		Apache Hive RC file
>3	byte		x		version %d

# Sequence files (and the careless first version of RC file)

0	string		SEQ
>3	byte		<6		Apache Hadoop Sequence file version %d
>3	byte		>6		Apache Hadoop Sequence file version %d
>3	byte		=6
>>5	string		org.apache.hadoop.hive.ql.io.RCFile$KeyBuffer	Apache Hive RC file version 0
>>3	default		x		Apache Hadoop Sequence file version 6