From 971f9eb7201640a76423648d1cec64c31081ddf2 Mon Sep 17 00:00:00 2001
From: Gao Xiang
Date: Thu, 8 Aug 2024 13:08:18 +0800
Subject: [PATCH] erofs-utils: update README for the upcoming 1.8

Add descriptions to multi-threaded compression and reproducible builds.

Signed-off-by: Gao Xiang
Link: https://lore.kernel.org/r/20240808050818.1822583-1-hsiangkao@linux.alibaba.com
---
 README | 94 +++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 67 insertions(+), 27 deletions(-)

diff --git a/README b/README
index e224b23..077b62b 100644
--- a/README
+++ b/README
@@ -54,51 +54,91 @@ mkfs.erofs
 
 Two main kinds of EROFS images can be generated: (un)compressed images.
 
- - For uncompressed images, there will be none of compresssed files in
-   these images. However, it can decide whether the tail block of a
-   file should be inlined or not properly [1].
+ - For uncompressed images, there will be no compressed files in these
+   images. However, an EROFS image can contain files which consist of
+   various aligned data blocks and then a tail that is stored inline in
+   order to compact images [1].
 
- - For compressed images, it'll try to use the given algorithms first
+ - For compressed images, it will try to use the given algorithms first
    for each regular file and see if storage space can be saved with
-   compression. If not, fallback to an uncompressed file.
+   compression. If not, it will fall back to an uncompressed file.
 
-How to generate EROFS images (LZ4 for Linux 5.3+, LZMA for Linux 5.16+)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note that EROFS supports per-file compression configuration, proper
+configuration options need to be enabled to parse compressed files by
+the Linux kernel.
 
-Currently lz4(hc) and lzma are available for compression, e.g.
+How to generate EROFS images
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Compression algorithms could be specified with the command-line option
+`-z` to build a compressed EROFS image from a local directory:
  $ mkfs.erofs -zlz4hc foo.erofs.img foo/
 
-Or leave all files uncompressed as an option:
+Supported algorithms by the Linux kernel:
+ - LZ4 (Linux 5.3+);
+ - LZMA (Linux 5.16+);
+ - DEFLATE (Linux 6.6+);
+ - Zstandard (Linux 6.10+).
+
+Alternatively, generate an uncompressed EROFS from a local directory:
  $ mkfs.erofs foo.erofs.img foo/
 
-In addition, you could specify a higher compression level to get a
-(slightly) better compression ratio than the default level, e.g.
+Additionally, you can specify a higher compression level to get a
+(slightly) smaller image than the default level:
  $ mkfs.erofs -zlz4hc,12 foo.erofs.img foo/
 
-Note that all compressors are still single-threaded for now, thus it
-could take more time on the multiprocessor platform. Multi-threaded
-approach is already in our TODO list.
+Multi-threaded support can be explicitly enabled with the ./configure
+option `--enable-multithreading`; otherwise, single-threaded compression
+will be used for now. It may take more time on multiprocessor platforms
+if multi-threaded support is not enabled.
+
+Currently, both `-Efragments` (not `-Eall-fragments`) and `-Ededupe`
+don't support multi-threading due to time limitations.
+
+Reproducible builds
+~~~~~~~~~~~~~~~~~~~
+
+Reproducible builds are typically used for verification and security,
+ensuring the same binaries/distributions to be reproduced in a
+deterministic way.
+
+Images generated by the same version of `mkfs.erofs` will be identical
+to previous runs if the same input is specified, and the same options
+are used.
+
+Specifically, variable timestamps and filesystem UUIDs can result in
+unreproducible EROFS images. `-T` and `-U` can be used to fix them.
 
 How to generate EROFS big pcluster images (Linux 5.13+)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-In order to get much better compression ratios (thus better sequential
-read performance for common storage devices), big pluster feature has
-been introduced since linux-5.13, which is not forward-compatible with
-old kernels.
-
-In details, -C is used to specify the maximum size of each big pcluster
-in bytes, e.g.
+By default, EROFS formatter compresses data into separate one-block
+(e.g. 4KiB) filesystem physical clusters for outstanding random read
+performance. In other words, each EROFS filesystem block can be
+independently decompressed. However, other similar filesystems
+typically compress data into "blocks" of 128KiB or more for much smaller
+images. Users may prefer smaller images for archiving purposes, even if
+random performance is compromised with those configurations, and even
+worse when using 4KiB blocks.
+
+In order to fulfill users' needs, big plusters has been introduced
+since Linux 5.13, in which each physical clusters will be more than one
+blocks.
+
+Specifically, `-C` is used to specify the maximum size of each pcluster
+in bytes:
  $ mkfs.erofs -zlz4hc -C65536 foo.erofs.img foo/
 
-So in that case, pcluster size can be 64KiB at most.
+Thus, in this case, pcluster sizes can be up to 64KiB.
 
-Note that large pcluster size can cause bad random performance, so
-please evaluate carefully in advance. Or make your own per-(sub)file
-compression strategies according to file access patterns if needed.
+Note that large pcluster size can degrade random performance (though it
+may improve sequential read performance for typical storage devices), so
+please evaluate carefully in advance. Alternatively, you can make
+per-(sub)file compression strategies according to file access patterns
+if needed.
 
-How to generate EROFS images with multiple algorithms (Linux 5.16+)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+How to generate EROFS images with multiple algorithms
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 It's possible to generate an EROFS image with files in different
 algorithms due to various purposes. For example, LZMA for archival
-- 
2.34.1