Notes on Filesystem Layout
--------------------------

These notes describe what mkcramfs generates.  Kernel requirements are
a bit looser, e.g. it doesn't care if the <file_data> items are
swapped around (though it does care that directory entries (inodes) in
a given directory are contiguous, as this is used by readdir).

All data is currently in host-endian format; neither mkcramfs nor the
kernel ever do swabbing.  (See section `Block Size' below.)

<filesystem>:
	<superblock>
	<directory_structure>
	<data>

<superblock>: struct cramfs_super (see cramfs_fs.h).

<directory_structure>:
	For each file:
		struct cramfs_inode (see cramfs_fs.h).
		Filename.  Not generally null-terminated, but it is
		 null-padded to a multiple of 4 bytes.
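
As a rough illustration of the filename padding rule above, the on-disk
length of a name can be computed by rounding up to a multiple of 4.  The
function name is made up for this sketch and is not taken from mkcramfs
or the kernel.

```c
#include <stddef.h>

/* Round a filename length up to the next multiple of 4 bytes, which is
 * the space the null-padded name occupies after its inode.
 * (Hypothetical helper, for illustration only.) */
size_t cramfs_padded_namelen(size_t len)
{
	return (len + 3) & ~(size_t)3;
}
```

For example, a 5-byte name such as "hello" would occupy 8 bytes, the
last 3 of them NUL.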

The order of inode traversal is described as "width-first" (not to be
confused with breadth-first); i.e. like depth-first but listing all of
a directory's entries before recursing down its subdirectories: the
same order as `ls -AUR' (but without the /^\..*:$/ directory header
lines); put another way, the same order as `find -type d -exec
ls -AU1 {} \;'.

Beginning in 2.4.7, directory entries are sorted.  This optimization
allows cramfs_lookup to return more quickly when a filename does not
exist, speeds up user-space directory sorts, etc.

<data>:
	One <file_data> for each file that's either a symlink or a
	 regular file of non-zero st_size.

<file_data>:
	nblocks * <block_pointer>
	 (where nblocks = (st_size - 1) / blksize + 1)
	nblocks * <block>
	padding to multiple of 4 bytes
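
The nblocks formula above is a round-up division.  A minimal sketch in
C follows; the function name is hypothetical, and the zero-size case is
handled separately only because such files have no <file_data> at all.

```c
#include <stdint.h>

/* Number of <block_pointer>s (and <block>s) for a file of st_size
 * bytes, i.e. st_size divided by blksize, rounded up.
 * (Hypothetical helper, for illustration only.) */
uint32_t cramfs_nblocks(uint32_t st_size, uint32_t blksize)
{
	if (st_size == 0)
		return 0;	/* empty files get no <file_data> */
	return (st_size - 1) / blksize + 1;
}
```

With blksize = 4096, a 4096-byte file needs one block and a 4097-byte
file needs two.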

The i'th <block_pointer> for a file stores the byte offset of the
*end* of the i'th <block> (i.e. one past the last byte, which is the
same as the start of the (i+1)'th <block> if there is one).  The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.
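
Because only end offsets are stored, the start and compressed length of
a block have to be derived.  The sketch below shows one way to do that.
It assumes that offsets are counted from the start of the filesystem
image (the text above does not say so explicitly) and that the pointers
have already been read in host-endian form; the struct and function
names are invented for the example.

```c
#include <stdint.h>

/* Start offset and compressed length of one <block>. */
struct block_extent {
	uint32_t start;
	uint32_t len;
};

/* data_offset: byte offset of the file's <file_data>, i.e. of its
 *              first <block_pointer> (assumed image-relative).
 * ptrs:        the file's nblocks <block_pointer>s.
 * i:           index of the wanted block, 0 <= i < nblocks. */
struct block_extent cramfs_block_extent(uint32_t data_offset,
					const uint32_t *ptrs,
					uint32_t nblocks, uint32_t i)
{
	struct block_extent e;

	/* Block 0 follows the pointer array (4 bytes per pointer);
	 * every later block starts where the previous one ends. */
	e.start = (i == 0) ? data_offset + nblocks * 4 : ptrs[i - 1];
	e.len = ptrs[i] - e.start;
	return e;
}
```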

The order of <file_data>'s is a depth-first descent of the directory
tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
-print'.


<block>: The i'th <block> is the output of zlib's compress function
applied to the i'th blksize-sized chunk of the input data.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size.  (See <block_pointer> above.)
<block>s are merely byte-aligned, not generally u32-aligned.


Holes
-----

This kernel supports cramfs holes (i.e. [efficient representation of]
blocks in uncompressed data consisting entirely of NUL bytes), but by
default mkcramfs doesn't test for & create holes, since cramfs in
kernels up to at least 2.3.39 didn't support holes.  Run mkcramfs
with -z if you want it to create files that can have holes in them.
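
The test for a hole amounts to checking whether an uncompressed block
is all NUL bytes.  The following is only a sketch of such a check, not
the code mkcramfs actually uses; the function name is hypothetical.

```c
#include <stddef.h>

/* Return 1 if the uncompressed block consists entirely of NUL bytes
 * (and so could be represented as a hole), 0 otherwise.
 * (Hypothetical helper, for illustration only.) */
int block_is_hole(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```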


Tools
-----

The cramfs user-space tools, including mkcramfs and cramfsck, are
located at <http://sourceforge.net/projects/cramfs/>.


Future Development
==================

Block Size
----------

(Block size in cramfs refers to the size of input data that is
compressed at a time.  It's intended to be somewhere around
PAGE_SIZE for cramfs_readpage's convenience.)

The superblock ought to indicate the block size that the fs was
written for, since comments in <linux/pagemap.h> indicate that
PAGE_SIZE may grow in future (if I interpret the comment
correctly).

Currently, mkcramfs #define's PAGE_SIZE as 4096 and uses that
for blksize, whereas Linux-2.3.39 uses the kernel's own PAGE_SIZE
(which can be as large as 32KB on arm).
This discrepancy is a bug, though it's not clear which should be
changed.

One option is to change mkcramfs to take its PAGE_SIZE from
<asm/page.h>.  Personally I don't like this option, but it does
require the least amount of change: just change `#define
PAGE_SIZE (4096)' to `#include <asm/page.h>'.  The disadvantage
is that the generated cramfs cannot always be shared between different
kernels, not even necessarily kernels of the same architecture if
PAGE_SIZE is subject to change between kernel versions
(currently possible with arm and ia64).

The remaining options try to make cramfs more sharable.

One part of that is addressing endianness.  The two options here are
`always use little-endian' (like ext2fs) or `writer chooses
endianness; kernel adapts at runtime'.  Little-endian wins because of
code simplicity and little CPU overhead even on big-endian machines.

The cost of swabbing is changing the code to use the le32_to_cpu
etc. macros as used by ext2fs.  We don't need to swab the compressed
data, only the superblock, inodes and block pointers.
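
To give an idea of how small that cost is, here is a user-space sketch
of what le32_to_cpu achieves for one 32-bit field such as a block
pointer.  It is not the kernel macro; read_le32 is a name made up for
this example.  Assembling the value byte by byte gives the same result
on little-endian and big-endian hosts.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value from 4 raw bytes, regardless
 * of host byte order.  (Hypothetical helper, for illustration only.) */
uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```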


The other part of making cramfs more sharable is choosing a block
size.  The options are:

  1. Always 4096 bytes.

  2. Writer chooses blocksize; kernel adapts but rejects blocksize >
     PAGE_SIZE.

  3. Writer chooses blocksize; kernel adapts even to blocksize >
     PAGE_SIZE.

It's easy enough to change the kernel to use a smaller value than
PAGE_SIZE: just make cramfs_readpage read multiple blocks.
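
The arithmetic for "read multiple blocks" might look like the sketch
below, which works out which blocks back a given page.  It assumes that
blksize divides the page size evenly (both being powers of two); the
names are invented for the example and nothing here is taken from
cramfs_readpage.

```c
#include <stdint.h>

/* A run of consecutive blocks: index of the first one and how many. */
struct block_range {
	uint32_t first;
	uint32_t count;
};

/* Blocks needed to fill page number page_index of a file that has
 * nblocks blocks, when blksize <= page_size.
 * Assumes page_size is a multiple of blksize. */
struct block_range blocks_for_page(uint32_t page_index, uint32_t page_size,
				   uint32_t blksize, uint32_t nblocks)
{
	uint32_t per_page = page_size / blksize;
	struct block_range r;

	r.first = page_index * per_page;
	if (r.first >= nblocks) {
		/* Page lies entirely past the end of the file. */
		r.first = nblocks;
		r.count = 0;
	} else if (nblocks - r.first < per_page) {
		/* Last, partially filled page. */
		r.count = nblocks - r.first;
	} else {
		r.count = per_page;
	}
	return r;
}
```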

The cost of option 1 is that kernels with a larger PAGE_SIZE
value don't get as good compression as they can.

The cost of option 2 relative to option 1 is that the code uses
variables instead of #define'd constants.  The gain is that people
with kernels having larger PAGE_SIZE can make use of that if
they don't mind their cramfs being inaccessible to kernels with
smaller PAGE_SIZE values.

Option 3 is easy to implement if we don't mind being CPU-inefficient:
e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
must be no larger than 32KB) and discard what it doesn't need.
Getting readpage to read into all the covered pages is harder.
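
A sketch of the "decompress and discard" approach: once a whole block
has been decompressed into the MAX_BLKSIZE buffer, only the slice that
backs the requested page is copied out.  This assumes blksize is a
multiple of the page size; the function name and parameters are
hypothetical, and the decompression step itself is left out.

```c
#include <stdint.h>
#include <string.h>

/* Copy the part of an already-decompressed block that backs one page.
 * block_buf/block_len: decompressed block contents (block_len may be
 *                      less than blksize for the last block of a file).
 * Returns the number of data bytes copied; the rest of the page is
 * zero-filled. */
size_t copy_page_from_block(unsigned char *page,
			    const unsigned char *block_buf, size_t block_len,
			    uint64_t page_index, size_t page_size,
			    size_t blksize)
{
	/* Offset of this page within the block that contains it. */
	size_t off = (size_t)((page_index * page_size) % blksize);
	size_t n;

	if (off >= block_len)
		n = 0;
	else if (block_len - off < page_size)
		n = block_len - off;
	else
		n = page_size;

	memcpy(page, block_buf + off, n);
	memset(page + n, 0, page_size - n);
	return n;
}
```

The inefficiency mentioned above is visible here: every page in the
same block triggers a full decompression of that block, and all but
page_size bytes of the result are thrown away each time.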

The main advantage of option 3 over 1, 2, is better compression.  The
cost is greater complexity.  Probably not worth it, but I hope someone
will disagree.  (If it is implemented, then I'll re-use that code in
e2compr.)


Another cost of 2 and 3 over 1 is making mkcramfs use a different
block size, but that just means adding and parsing a -b option.


Inode Size
----------

Given that cramfs will probably be used for CDs etc. as well as just
silicon ROMs, it might make sense to expand the inode a little from
its current 12 bytes.  Inodes other than the root inode are followed
by filename, so the expansion doesn't even have to be a multiple of 4
bytes.
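
For reference, the current 12 bytes break down as three 32-bit words of
bitfields.  The sketch below reproduces the field widths as I recall
them from cramfs_fs.h (check the header before relying on them); the
struct is renamed to make clear it is not the real definition, and
bitfield layout is compiler-dependent, so the size check holds for
common compilers such as gcc and clang rather than by guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the inode layout; widths assumed from
 * cramfs_fs.h, not authoritative. */
struct cramfs_inode_sketch {
	uint32_t mode:16, uid:16;
	uint32_t size:24, gid:8;
	/* namelen is the padded filename length divided by 4;
	 * offset is a byte offset divided by 4. */
	uint32_t namelen:6, offset:26;
};

/* Bytes occupied by one directory entry: the inode followed by its
 * null-padded filename.  (Hypothetical helper.) */
size_t cramfs_dirent_size(const struct cramfs_inode_sketch *ino)
{
	return sizeof(*ino) + ((size_t)ino->namelen << 2);
}
```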