platform/upstream/erofs-utils.git
10 months ago erofs-utils: lib: use filesystem UUID if the device name is not specified
Gao Xiang [Wed, 12 Jun 2024 16:18:23 +0000 (00:18 +0800)]
erofs-utils: lib: use filesystem UUID if the device name is not specified

The device name is not always valid.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-2-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: lib: get rid of erofs_prepare_dir_layout()
Gao Xiang [Wed, 12 Jun 2024 16:18:22 +0000 (00:18 +0800)]
erofs-utils: lib: get rid of erofs_prepare_dir_layout()

Just open-code the previous erofs_prepare_dir_file() and rename
`erofs_prepare_dir_layout()` to `erofs_prepare_dir_file()`.

No logic changes.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: fix incorrect i_nlink in the unified rebuild logic
Gao Xiang [Thu, 13 Jun 2024 02:37:06 +0000 (10:37 +0800)]
erofs-utils: fix incorrect i_nlink in the unified rebuild logic

Fixes: 203c847cc7d1 ("erofs-utils: unify the tree traversal for the rebuild mode")
Closes: https://github.com/erofs/erofsnightly/actions/runs/9492427961/job/26159566596
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240613023706.1269816-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: add I/O control for tarerofs stream via `erofs_vfile`
Hongzhen Luo [Wed, 12 Jun 2024 03:11:24 +0000 (11:11 +0800)]
erofs-utils: add I/O control for tarerofs stream via `erofs_vfile`

This adds I/O control for tarerofs stream.

Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612031124.1227558-1-hongzhen@linux.alibaba.com
[ Gao Xiang: code styling fixups. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
10 months ago erofs-utils: fix the current rebuild mode
Gao Xiang [Wed, 12 Jun 2024 02:16:17 +0000 (10:16 +0800)]
erofs-utils: fix the current rebuild mode

`inode->with_diskbuf` can be false in the rebuild mode since
inode data has been mapped before.

Fixes: 203c847cc7d1 ("erofs-utils: unify the tree traversal for the rebuild mode")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612021617.4025762-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: lib: split erofs_iflush()
Gao Xiang [Fri, 7 Jun 2024 09:53:19 +0000 (17:53 +0800)]
erofs-utils: lib: split erofs_iflush()

So that external programs can directly use it.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240607095319.2169172-2-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: move erofs_writesb() into lib/
Gao Xiang [Fri, 7 Jun 2024 09:53:18 +0000 (17:53 +0800)]
erofs-utils: move erofs_writesb() into lib/

So that external programs can directly use it.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240607095319.2169172-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: lib: support virtual files
Hongzhen Luo [Thu, 6 Jun 2024 11:18:33 +0000 (19:18 +0800)]
erofs-utils: lib: support virtual files

The current erofs-utils I/O implementation is through file descriptors.
The new `erofs_vfile` provides a more flexible way to perform I/Os.

Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240606111833.2389455-1-hongzhen@linux.alibaba.com
[ Gao Xiang: minor styling fixes. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
10 months ago erofs-utils: build: support building static library liberofsfuse
ComixHe [Thu, 6 Jun 2024 07:39:48 +0000 (15:39 +0800)]
erofs-utils: build: support building static library liberofsfuse

Add a new option '--enable-static-fuse' so that erofsfuse can be
imported as a static library directly into other projects.

Signed-off-by: ComixHe <heyuming@deepin.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/3399126AB01D5AB6+2bad5767fc035a7a2234408b0fffa53b3a07aa51.1717659178.git.heyuming@deepin.org
[ Gao Xiang: the target static library shouldn't have dependencies. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
10 months ago erofs-utils: support Intel Query Processing Library
Gao Xiang [Wed, 5 Jun 2024 12:32:33 +0000 (20:32 +0800)]
erofs-utils: support Intel Query Processing Library

This adds preliminary Intel QPL [1] support to enable the built-in
In-Memory Analytics Accelerator (IAA) [2], available starting from
Sapphire Rapids.

For simplicity, it only leverages the synchronous APIs for now, so
performance for small compressed clusters can still be improved in
the future if needed.

[ QPL 1.5.0+ is strictly needed for pkg-config detection and
  it can be explicitly enabled by `--with-qpl`. ]

Here are some performance numbers for reference:

Processors: Intel(R) Xeon(R) Platinum 8475B (192 cores)
Memory:     512 GiB
Dataset:    enwik9 (1000000000) [3]

Single-threaded decompression:
 ______________________________________________________________
|                 |_ Cluster size _|_ Image size _|_ Time (s) _|
| LZ4             |     65536      |  391581696   |   0.364    |
| LZ4             |    1048576     |  373309440   |   0.376    |
| Intel QPL (IAA) |    1048576     |  374816768   |   0.386    |
| Intel QPL (IAA) |     65536      |  376057856   |   0.396    |
| Intel QPL (IAA) |      4096      |  399650816   |   0.675    |
| libdeflate (4k) |    1048576     |  374816768   |   1.862    |
| libdeflate (4k) |     65536      |  376057856   |   1.859    |
| libdeflate (4k) |      4096      |  399749120   |   2.203    |
| libdeflate      |    1048576     |  323457024   |   1.318    |
| libdeflate      |     65536      |  328712192   |   1.358    |
| libdeflate      |      4096      |  389943296   |   2.103    |
| Zstd            |      N/A       |  312548986   |   1.047    |
| Zstd (fast)     |      N/A       |  453096980   |   0.740    |
|_________________|________________|______________|____________|

LZ4 1.9.4: [ mkfs.erofs -zlz4hc,12 -C65536 ]
           [ mkfs.erofs -zlz4hc,12 -C1048576 ]
    time fsck/fsck.erofs --extract

QPL 1.5.0 (IAA) / libdeflate 1.20 (4k):
           [ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C1048576 ]
           [ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C65536 ]
           [ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C4096 ]
    time fsck/fsck.erofs --extract

libdeflate 1.20:
           [ mkfs.erofs -zdeflate,level=9 -C1048576 ]
           [ mkfs.erofs -zdeflate,level=9 -C65536 ]
           [ mkfs.erofs -zdeflate,level=9 -C4096 ]
    time fsck/fsck.erofs --extract

Zstd 1.5.6: [ zstd -k ] [ zstd -k --fast ]
    time zstd -d -k -f -c --no-progress > /dev/null

[1] https://github.com/intel/qpl
[2] https://www.intel.com/content/www/us/en/products/docs/accelerator-engines/in-memory-analytics-accelerator.html
[3] https://www.mattmahoney.net/dc/textdata.html

Cc: "Feghali, Wajdi K" <wajdi.k.feghali@intel.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240605123233.3833332-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: introduce z_erofs_parse_cfgs()
Gao Xiang [Wed, 5 Jun 2024 12:14:47 +0000 (20:14 +0800)]
erofs-utils: introduce z_erofs_parse_cfgs()

This userspace implementation will be mainly used for the upcoming
Intel In-Memory Analytics Accelerator integration.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240605121448.3816160-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: record sb_size instead of sb_extslots
Gao Xiang [Tue, 4 Jun 2024 08:40:15 +0000 (16:40 +0800)]
erofs-utils: record sb_size instead of sb_extslots

Just follow the kernel implementation.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240604084015.2291157-2-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: lib: wrap up zeropadding calculation
Gao Xiang [Tue, 4 Jun 2024 08:40:14 +0000 (16:40 +0800)]
erofs-utils: lib: wrap up zeropadding calculation

Use a simple helper instead of open-coding.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240604084015.2291157-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: lib: fix incorrect xattr sharing
Gao Xiang [Fri, 31 May 2024 07:13:05 +0000 (15:13 +0800)]
erofs-utils: lib: fix incorrect xattr sharing

There are off-by-one issues after refactoring, and the size of kvbuf
should be calculated by EROFS_XATTR_KVSIZE instead.

Fixes: 5df285cf405d ("erofs-utils: lib: refactor extended attribute name prefixes")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240531071305.1183728-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: fix false-positive errors on gcc 4.8.5
Gao Xiang [Tue, 28 May 2024 06:43:13 +0000 (14:43 +0800)]
erofs-utils: fix false-positive errors on gcc 4.8.5

Just old compiler bugs.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240528064313.1352565-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()
Sandeep Dhavale [Thu, 23 May 2024 21:01:31 +0000 (14:01 -0700)]
erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()

Depending on the size of the filesystem being built, there can be a huge
number of elements in the hashmap. Currently we call hashmap_iter_first()
in a while loop to iterate over and free the elements. Although
technically correct, this is inefficient in two respects:

- As we are iterating over the elements for removal, we do not need the
overhead of rehashing.
- The bigger performance cost comes from hashmap_iter_first() itself,
since it starts scanning from index 0, throwing away the previous
successful scan. For a sparsely populated hashmap this becomes O(n^2)
in the worst case.

Let's fix this by disabling hashmap shrinking, which avoids rehashing,
and by using hashmap_iter_next(), which is now guaranteed to iterate over
all the elements during removal while avoiding the performance pitfalls
of hashmap_iter_first().
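
The cost difference can be illustrated with a toy model. This is only a
sketch: the bucket table below is a plain Python list, not the erofs-utils
hashmap, and the function names are hypothetical. It counts how many
buckets each strategy has to probe while emptying the table, with the
table size held fixed (i.e. shrinking disabled).

```python
def make_sparse_table(nbuckets, step):
    """Build a toy bucket table with one entry in every `step`-th bucket."""
    return [[i] if i % step == 0 else [] for i in range(nbuckets)]


def free_with_iter_first(buckets):
    """Restart the scan at bucket 0 after every removal; return probe count."""
    probes = 0
    while True:
        found = False
        for chain in buckets:
            probes += 1
            if chain:
                chain.pop()       # remove one element, then start over
                found = True
                break
        if not found:
            return probes


def free_with_iter_next(buckets):
    """Walk the table once, emptying each bucket as it is visited."""
    probes = 0
    for chain in buckets:
        probes += 1
        while chain:
            chain.pop()
    return probes
```

With a sparsely populated table, the restart-from-zero variant probes far
more buckets than the single pass, which is the quadratic behaviour the
message describes.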

Tests with random data show the following performance improvement:

fs_size  Before  After
1G       23s     7s
2G       81s     15s
4G       272s    31s
8G       1252s   61s

Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523210131.3126753-3-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
10 months ago erofs-utils: lib: provide helper to disable hashmap shrinking
Sandeep Dhavale [Thu, 23 May 2024 21:01:30 +0000 (14:01 -0700)]
erofs-utils: lib: provide helper to disable hashmap shrinking

This helper sets hashmap.shrink_at to 0. This is helpful for iterating
over a hashmap with hashmap_iter_next() and calling hashmap_remove()
efficiently in a single pass.

Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523210131.3126753-2-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
10 months ago erofs-utils: lib: fix uncompressed packed inode
Gao Xiang [Thu, 23 May 2024 02:55:50 +0000 (10:55 +0800)]
erofs-utils: lib: fix uncompressed packed inode

Currently, the packed inode can also be used in an unencoded way,
e.g. for xattr prefixes.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523025550.2447091-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: unify the tree traversal for the rebuild mode
Gao Xiang [Mon, 20 May 2024 06:03:01 +0000 (14:03 +0800)]
erofs-utils: unify the tree traversal for the rebuild mode

Let's drop the legacy approach and `tarerofs` will be applied too.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240520060301.2642650-1-hsiangkao@linux.alibaba.com
10 months ago erofs-utils: mkfs: add `--zfeature-bits` option
Gao Xiang [Fri, 17 May 2024 09:00:48 +0000 (17:00 +0800)]
erofs-utils: mkfs: add `--zfeature-bits` option

Thus, we could traverse all compression features with continuous
numbers easily in the testcases.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240517090048.3039594-1-hsiangkao@linux.alibaba.com
11 months ago erofs-utils: add preliminary zstd support [x]
Gao Xiang [Wed, 15 May 2024 05:16:41 +0000 (13:16 +0800)]
erofs-utils: add preliminary zstd support [x]

This patch just adds preliminary Zstandard support to erofs-utils,
since Zstandard currently doesn't officially support fixed-sized output
compression.  Mkfs could take more time to finish but it works at least.

The built-in zstd compressor for erofs-utils is still a slow-moving WIP
and will apparently take more effort.

[ TODO: Later I tend to add another way to generate fixed-sized input
        pclusters temporarily for relatively large pcluster sizes as
        an option since it will have minor impacts to the results. ]

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515051641.3929058-1-hsiangkao@linux.alibaba.com
11 months ago erofs-utils: pretty root directory progressinfo
Gao Xiang [Wed, 15 May 2024 17:23:13 +0000 (01:23 +0800)]
erofs-utils: pretty root directory progressinfo

Avoid `Processing  ...` or `file  dumped (mode 40755)` messages.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515172313.661530-1-hsiangkao@linux.alibaba.com
11 months ago erofs-utils: correct the default number of workers in the usage
Gao Xiang [Wed, 15 May 2024 17:22:34 +0000 (01:22 +0800)]
erofs-utils: correct the default number of workers in the usage

Fixes: 59c36e7a4008 ("erofs-utils: mkfs: use all available processors by default")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515172236.661035-1-hsiangkao@linux.alibaba.com
11 months ago erofs-utils: optimize pthread_cond_signal calling
Noboru Asai [Wed, 1 May 2024 02:24:20 +0000 (11:24 +0900)]
erofs-utils: optimize pthread_cond_signal calling

Call pthread_cond_signal once per file.

Signed-off-by: Noboru Asai <asai@sijam.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240501022420.1881305-1-asai@sijam.com
[ Gao Xiang: add potential overflow detection. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
11 months ago erofs-utils: lib: adjust MicroLZMA default dictionary size
Gao Xiang [Tue, 30 Apr 2024 06:37:30 +0000 (14:37 +0800)]
erofs-utils: lib: adjust MicroLZMA default dictionary size

If dict_size is not given, it will be set as max(32k, pclustersize * 8)
but no more than Z_EROFS_LZMA_MAX_DICT_SIZE.
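
As a worked form of the rule above, here is a minimal sketch. The 8 MiB
cap used as the default below is an assumption for illustration; the
authoritative value of Z_EROFS_LZMA_MAX_DICT_SIZE is whatever the EROFS
headers define, and the function name is hypothetical.

```python
# Assumed cap for illustration only; check the EROFS headers for the
# real Z_EROFS_LZMA_MAX_DICT_SIZE.
ASSUMED_MAX_DICT_SIZE = 8 * 1024 * 1024


def default_lzma_dict_size(pclustersize, max_dict_size=ASSUMED_MAX_DICT_SIZE):
    """Return max(32 KiB, pclustersize * 8), capped at max_dict_size."""
    return min(max(32 * 1024, pclustersize * 8), max_dict_size)
```

For a 4 KiB pcluster the 32 KiB floor applies; for large pclusters the
cap takes over.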

Also kill an obsolete warning since multi-threaded support has landed.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240430063730.599937-2-hsiangkao@linux.alibaba.com
11 months ago erofs-utils: record pclustersize in bytes instead of blocks
Gao Xiang [Wed, 1 May 2024 04:54:10 +0000 (12:54 +0800)]
erofs-utils: record pclustersize in bytes instead of blocks

So that we don't need to handle blocksizes everywhere.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240501045410.1808086-1-hsiangkao@linux.alibaba.com
11 months ago erofs-utils: mkfs: use all available processors by default
Gao Xiang [Sat, 27 Apr 2024 06:25:52 +0000 (14:25 +0800)]
erofs-utils: mkfs: use all available processors by default

Fulfill the needs of most users.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240427062552.744810-1-hsiangkao@linux.alibaba.com
12 months ago erofs-utils: mkfs: enable inter-file multi-threaded compression
Gao Xiang [Mon, 22 Apr 2024 00:34:50 +0000 (08:34 +0800)]
erofs-utils: mkfs: enable inter-file multi-threaded compression

Dispatch deferred ops in another per-sb worker thread.  Note that
deferred ops are strictly FIFOed.
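
The FIFO dispatch idea can be sketched as follows. This is not the
erofs-utils code: `run_deferred` is a hypothetical name, and a Python
queue plus one worker thread stands in for the per-sb worker described
above.

```python
import queue
import threading


def run_deferred(jobs):
    """Run callables on a single worker thread in submission (FIFO) order."""
    q = queue.Queue()
    results = []

    def worker():
        while True:
            job = q.get()
            if job is None:          # sentinel: no more deferred work
                break
            results.append(job())

    t = threading.Thread(target=worker)
    t.start()
    for job in jobs:
        q.put(job)                   # the producer only enqueues
    q.put(None)
    t.join()
    return results
```

Because there is exactly one consumer and the queue is FIFO, results come
back in the order the jobs were queued.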

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-8-xiang@kernel.org
12 months ago erofs-utils: lib: introduce non-directory jobitem context
Gao Xiang [Mon, 22 Apr 2024 00:34:49 +0000 (08:34 +0800)]
erofs-utils: lib: introduce non-directory jobitem context

It will describe EROFS_MKFS_JOB_NDIR defer work.  Also, start
compression before queueing EROFS_MKFS_JOB_NDIR.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-7-xiang@kernel.org
12 months ago erofs-utils: mkfs: prepare inter-file multi-threaded compression
Yifan Zhao [Mon, 22 Apr 2024 00:34:48 +0000 (08:34 +0800)]
erofs-utils: mkfs: prepare inter-file multi-threaded compression

This patch separates the compression process into two parts.

Specifically, erofs_begin_compressed_file() will trigger compression.
erofs_write_compressed_file() will wait for the compression to finish
and write compressed (meta)data.

Note that it's possible that erofs_begin_compressed_file() and
erofs_write_compressed_file() run in different threads even if the
global inode context is used, thus add another synchronization point.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Co-authored-by: Tong Xin <xin_tong@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-6-xiang@kernel.org
12 months ago erofs-utils: lib: split up z_erofs_mt_compress()
Gao Xiang [Mon, 22 Apr 2024 00:34:47 +0000 (08:34 +0800)]
erofs-utils: lib: split up z_erofs_mt_compress()

The on-disk compressed data write will be moved into a new function
erofs_mt_write_compressed_file().

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-5-xiang@kernel.org
12 months ago erofs-utils: rearrange several fields for multi-threaded mkfs
Gao Xiang [Mon, 22 Apr 2024 00:34:46 +0000 (08:34 +0800)]
erofs-utils: rearrange several fields for multi-threaded mkfs

They should be located in `struct z_erofs_compress_ictx`.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-4-xiang@kernel.org
12 months ago erofs-utils: lib: split out erofs_commit_compressed_file()
Gao Xiang [Mon, 22 Apr 2024 00:34:45 +0000 (08:34 +0800)]
erofs-utils: lib: split out erofs_commit_compressed_file()

Just split out on-disk compressed metadata commit logic.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-3-xiang@kernel.org
12 months ago erofs-utils: lib: prepare for later deferred work
Gao Xiang [Mon, 22 Apr 2024 00:34:44 +0000 (08:34 +0800)]
erofs-utils: lib: prepare for later deferred work

Split out ordered metadata operations and add the following helpers:

 - erofs_mkfs_jobfn()

 - erofs_mkfs_go()

to handle these mkfs job items for multi-threading support.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-2-xiang@kernel.org
12 months ago erofs-utils: use erofs_atomic_t for inode->i_count
Gao Xiang [Mon, 22 Apr 2024 00:34:43 +0000 (08:34 +0800)]
erofs-utils: use erofs_atomic_t for inode->i_count

Since `inode->i_count` can be touched by more than one thread if
multi-threading is enabled.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-1-xiang@kernel.org
12 months ago erofs-utils: fsck: extract chunk-based file with hole correctly
Yifan Zhao [Mon, 22 Apr 2024 11:31:32 +0000 (19:31 +0800)]
erofs-utils: fsck: extract chunk-based file with hole correctly

Currently fsck skips file extraction if it finds that EROFS_MAP_MAPPED
is unset, which is not the case for chunk-based files with holes.

This patch handles the corner case correctly.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: add missing block counting
Noboru Asai [Wed, 24 Apr 2024 05:59:23 +0000 (14:59 +0900)]
erofs-utils: add missing block counting

Add missing block counting when the data to be inlined is not inlined.

Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/ZijhA4IJFSO7FYUy@debian
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: lib: refine on-disk meta arrangement again
Gao Xiang [Sat, 6 Apr 2024 05:37:17 +0000 (13:37 +0800)]
erofs-utils: lib: refine on-disk meta arrangement again

Use DFS instead of BFS since most workloads like `ls -R` and `tar -c`
traverse in depth-first mode.  However, it still arranges sub-directory
inodes closely so that it isn't a simple reversion compared to pre-BFS
old versions.
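
To make the difference in visiting order concrete, here is a small sketch
over a toy directory tree. It only contrasts plain BFS and DFS orders; it
does not reproduce the actual on-disk arrangement, which additionally
keeps sub-directory inodes close together. The tree and function names
are hypothetical.

```python
from collections import deque


def bfs_order(tree, root):
    """Visit level by level: all siblings before any grandchildren."""
    order, todo = [], deque([root])
    while todo:
        node = todo.popleft()
        order.append(node)
        todo.extend(tree.get(node, []))
    return order


def dfs_order(tree, root):
    """Visit a directory's whole subtree before moving to its next sibling."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(tree.get(node, [])))
    return order
```

A recursive walker such as `ls -R` touches entries in the DFS order, so
metadata laid out that way tends to be read more sequentially.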

Also, the build time for the linux-6.1.53 source tree is greatly
reduced by 91.2% (33.040s -> 2.861s), and the new image size is
decreased by 0.0094% (120 KiB), which is minor though.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240406053717.565119-2-hsiangkao@linux.alibaba.com
12 months ago erofs-utils: lib: split out several helpers in inode.c
Yifan Zhao [Sat, 6 Apr 2024 05:37:16 +0000 (13:37 +0800)]
erofs-utils: lib: split out several helpers in inode.c

The following new helpers are added to prepare for the upcoming
multi-threaded inter-file compression:
 - erofs_mkfs_handle_{non,}directory;
 - erofs_write_unencoded_file.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240406053717.565119-1-hsiangkao@linux.alibaba.com
12 months ago erofs-utils: mkfs: skip the redundant write for ztailpacking block
Yifan Zhao [Thu, 18 Apr 2024 12:23:12 +0000 (20:23 +0800)]
erofs-utils: mkfs: skip the redundant write for ztailpacking block

z_erofs_merge_segment() doesn't consider the ztailpacking block in the
extent list and unnecessarily writes it back to the disk. This patch
fixes this issue by introducing a new `inlined` field in the struct
`z_erofs_inmem_extent`.

Fixes: 830b27bc2334 ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
[ Gao Xiang: simplify a bit. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240418122312.99282-1-xiang@kernel.org
12 months ago erofs-utils: lib: treat data blocks filled with 0s as a hole
Sandeep Dhavale [Wed, 17 Apr 2024 23:48:44 +0000 (16:48 -0700)]
erofs-utils: lib: treat data blocks filled with 0s as a hole

Add an optimization to treat data blocks filled with 0s as a hole.
Even though disk space savings are comparable to chunk-based or dedupe,
having no block assigned saves us redundant disk I/Os during read.

To detect blocks filled with zeros during chunking, we insert a block
filled with zeros (zerochunk) into the hashmap. If we detect a possible
dedupe, we map it to the hole so there is no physical block assigned.
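
The zerochunk trick can be sketched as below. This is an illustration
under assumptions, not the erofs-utils implementation: a Python dict keyed
by SHA-256 stands in for the hashmap, `HOLE` is a hypothetical marker, and
only full-sized all-zero chunks are recognised.

```python
import hashlib

HOLE = None  # hypothetical marker meaning "no physical block assigned"


def map_chunks(data, chunksize):
    """Map each chunk to a physical block index, or HOLE if it is all zeros."""
    # Pre-seed the dedupe table with the zero chunk so that any all-zero
    # chunk "dedupes" against it and ends up mapped to a hole.
    table = {hashlib.sha256(bytes(chunksize)).digest(): HOLE}
    mapping, next_blk = [], 0
    for off in range(0, len(data), chunksize):
        key = hashlib.sha256(data[off:off + chunksize]).digest()
        if key not in table:
            table[key] = next_blk    # first time seen: allocate a block
            next_blk += 1
        mapping.append(table[key])
    return mapping, next_blk
```

The second return value is the number of physical blocks actually
allocated; zero-filled chunks never contribute to it.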

Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240417234845.2758882-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: dump: print filesystem blocksize
Sandeep Dhavale [Thu, 18 Apr 2024 00:00:54 +0000 (17:00 -0700)]
erofs-utils: dump: print filesystem blocksize

mkfs.erofs supports creating filesystem images with different
blocksizes. Add the filesystem blocksize to the super block dump so
it's easier to inspect the filesystem.

The field is added after FS magic, so the output now looks like:

Filesystem magic number:                      0xE0F5E1E2
Filesystem blocksize:                         65536
Filesystem blocks:                            21
Filesystem inode metadata start block:        0
Filesystem shared xattr metadata start block: 0
Filesystem root nid:                          36
Filesystem lz4_max_distance:                  65535
Filesystem sb_extslots:                       0
Filesystem inode count:                       10
Filesystem created:                           Wed Apr 17 16:53:10 2024
Filesystem features:                          sb_csum mtime 0padding
Filesystem UUID:                              e66f6dd1-6882-48c3-9770-fee7c4841a93

Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240418000054.2769023-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: change ztailpacking temporary buffer to non-static
Noboru Asai [Mon, 8 Apr 2024 09:16:26 +0000 (18:16 +0900)]
erofs-utils: change ztailpacking temporary buffer to non-static

In multi-threaded mode, each thread must use a different buffer in
tryrecompress_trailing(), so change this buffer to non-static.

Signed-off-by: Noboru Asai <asai@sijam.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240408091627.336554-1-asai@sijam.com
[ Gao Xiang: slightly refine the subject line. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: lib: fix tarerofs 32-bit overflows
Gao Xiang [Thu, 11 Apr 2024 10:00:39 +0000 (18:00 +0800)]
erofs-utils: lib: fix tarerofs 32-bit overflows

Otherwise, large files won't be imported properly.

Fixes: e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams for tarerofs")
Fixes: 95d315fd7958 ("erofs-utils: introduce tarerofs")
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20240411100039.197417-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: lib: Fix calculation of minextblks when working with sparse files
Sandeep Dhavale [Wed, 3 Apr 2024 07:07:00 +0000 (00:07 -0700)]
erofs-utils: lib: Fix calculation of minextblks when working with sparse files

When we work with sparse files (files with holes), we need to consider
when the contiguous data block starts after each hole to correctly
calculate minextblks so we can merge consecutive chunks later.
Now that we need to recalculate minextblks in multiple places, put the
logic in a helper function to avoid repetition and ease reading.

Fixes: 7b46f7a0160a ("erofs-utils: lib: merge consecutive chunks if possible")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240403070700.1716252-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: set opaque flag for directories in tarerofs mode
Gao Xiang [Tue, 2 Apr 2024 02:58:58 +0000 (10:58 +0800)]
erofs-utils: set opaque flag for directories in tarerofs mode

Opaque dir flag is needed if the tar tree is used immediately for
the upcoming append mode.

Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240402025858.1729161-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs: fix compression fallback in tarerofs mode
Gao Xiang [Wed, 27 Mar 2024 00:46:14 +0000 (08:46 +0800)]
erofs: fix compression fallback in tarerofs mode

The return value of `lseek(fd, fpos, SEEK_SET)` can overflow the `int`
type.  Fix this.
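
The failure mode can be shown with a small sketch, assuming a 32-bit
signed `int`. `as_c_int` is a hypothetical helper that uses ctypes to
mimic what storing a large file offset into such an `int` would yield;
it is not part of erofs-utils.

```python
import ctypes


def as_c_int(value):
    """Return the value a 32-bit signed C int holds after the assignment."""
    # ctypes integer types wrap silently, like a narrowing conversion in C
    # on common platforms.
    return ctypes.c_int32(value).value
```

Any offset of 2 GiB or more no longer survives the round trip, which is
why the result of lseek() belongs in an `off_t`-sized variable.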

Fixes: 376fb2dbe66d ("erofs-utils: lib: introduce diskbuf")
Link: https://lore.kernel.org/r/20240327004614.1465889-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: tar: all regular inodes should be zeroed in headerball mode
Gao Xiang [Fri, 22 Mar 2024 08:50:07 +0000 (16:50 +0800)]
erofs-utils: tar: all regular inodes should be zeroed in headerball mode

.. Instead of reporting IO errors which implies a corrupted image.

Fixes: 6894ca9623e7 ("erofs-utils: mkfs: Support tar source without data")
Link: https://lore.kernel.org/r/20240322085007.2592729-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: move pclustersize to `struct z_erofs_compress_sctx`
Noboru Asai [Fri, 22 Mar 2024 12:24:31 +0000 (20:24 +0800)]
erofs-utils: move pclustersize to `struct z_erofs_compress_sctx`

With -E(all-)fragments, pclustersize has a different value per segment,
so move it to `struct z_erofs_compress_sctx`.

Signed-off-by: Noboru Asai <asai@sijam.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240321070236.2396573-1-asai@sijam.com
12 months ago erofs-utils: lib: fix multi-threaded compression in tarerofs mode
Gao Xiang [Tue, 19 Mar 2024 08:24:55 +0000 (16:24 +0800)]
erofs-utils: lib: fix multi-threaded compression in tarerofs mode

Since pread() can be used during multi-threaded compression, it's
necessary to pass `fpos` in to indicate the absolute offset.
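
A minimal sketch of why the absolute offset matters, using Python's
os.pread (available on Unix-like systems). `read_segment` and its
parameter names are hypothetical; `fpos` plays the role of the file's
starting position inside the tar stream.

```python
import os


def read_segment(fd, fpos, seg_off, length):
    """Read `length` bytes of a segment located at fpos + seg_off.

    pread() takes an explicit offset and leaves the shared file position
    untouched, so several worker threads can read from the same
    descriptor concurrently without seeking.
    """
    return os.pread(fd, length, fpos + seg_off)
```

Without `fpos`, each worker would read relative to offset 0 of the
descriptor rather than relative to where the file's data begins.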

Fixes: aec8487dce4c ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Link: https://lore.kernel.org/r/20240319082455.4115493-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 months ago erofs-utils: mkfs: introduce inner-file multi-threaded compression
Yifan Zhao [Fri, 15 Mar 2024 01:10:19 +0000 (09:10 +0800)]
erofs-utils: mkfs: introduce inner-file multi-threaded compression

Currently, the creation of EROFS compressed images is single-threaded,
which suffers from performance issues. This patch attempts to address
it by compressing large files in parallel.

Specifically, each input file larger than 16MiB is split into
segments, and each worker thread compresses a segment as if it were
a separate file.  Finally, the main thread merges all the compressed
segments.
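
The split/compress/merge flow can be sketched as follows. This is an
illustration only: zlib stands in for the EROFS compressors, a thread
pool stands in for the workqueue, and the function names are
hypothetical. Only the 16MiB segment size comes from the text above.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

SEGMENT_SIZE = 16 * 1024 * 1024  # segment size stated in the message


def split_segments(size, segsize=SEGMENT_SIZE):
    """Return (offset, length) pairs covering `size` bytes."""
    return [(off, min(segsize, size - off)) for off in range(0, size, segsize)]


def compress_parallel(data, workers=4, segsize=SEGMENT_SIZE):
    """Compress each segment independently; return results in file order."""
    segs = split_segments(len(data), segsize)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the caller can merge sequentially.
        return list(pool.map(
            lambda s: zlib.compress(data[s[0]:s[0] + s[1]]), segs))
```

Each segment is compressed as if it were a separate input, so the merged
output is simply the per-segment results taken in order.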

Multi-threaded compression is not compatible with -Ededupe,
-E(all-)fragments for now.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Co-authored-by: Tong Xin <xin_tong@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-5-hsiangkao@linux.alibaba.com
Link: https://lore.kernel.org/r/ZfaW3oLe8Q2621DV@debian
13 months ago erofs-utils: lib: introduce atomic operations
Gao Xiang [Fri, 15 Mar 2024 01:10:18 +0000 (09:10 +0800)]
erofs-utils: lib: introduce atomic operations

Add some helpers (relaxed semantics) in order to prepare for the
upcoming multi-threaded support.

For example, a compressor may be initialized more than once in
different worker threads, resulting in noisy warnings.

This patch makes sure that each message will be printed only once by
adding `__warnonce` atomic booleans to each erofs_compressor_init().
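
The warn-once pattern can be sketched like this. Python has no direct
equivalent of a relaxed atomic boolean, so a lock-protected flag stands
in for it here; `WarnOnce` and `init_compressor` are hypothetical names,
not the erofs-utils helpers.

```python
import threading


class WarnOnce:
    """A test-and-set flag: only the first caller sees True."""

    def __init__(self):
        self._lock = threading.Lock()
        self._warned = False

    def test_and_set(self):
        with self._lock:
            first = not self._warned
            self._warned = True
            return first


def init_compressor(flag, log):
    """Pretend initializer that emits its warning at most once overall."""
    if flag.test_and_set():
        log.append("warning: experimental compressor")
```

However many worker threads run the initializer, only one of them wins
the test-and-set and records the message.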

Cc: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-4-hsiangkao@linux.alibaba.com
13 months ago erofs-utils: mkfs: add `--workers=#` parameter
Yifan Zhao [Fri, 15 Mar 2024 01:10:17 +0000 (09:10 +0800)]
erofs-utils: mkfs: add `--workers=#` parameter

This patch introduces the `--workers=#` parameter for the upcoming
multi-threaded compression support.

It also introduces a concept called `segment size` to split large
inodes for multi-threaded compression, which has the fixed value
16MiB and cannot be modified for now.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-3-hsiangkao@linux.alibaba.com
13 months ago erofs-utils: add a helper to get available processors
Gao Xiang [Fri, 15 Mar 2024 01:10:16 +0000 (09:10 +0800)]
erofs-utils: add a helper to get available processors

In order to prepare for multi-threaded decompression.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-2-hsiangkao@linux.alibaba.com
13 months ago erofs-utils: introduce multi-threading framework
Yifan Zhao [Fri, 15 Mar 2024 01:10:15 +0000 (09:10 +0800)]
erofs-utils: introduce multi-threading framework

Add a workqueue implementation for multi-threading support inspired by
xfsprogs.
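
For readers unfamiliar with the pattern, a workqueue can be sketched with pthreads as below. This is a generic illustration under its own assumptions, not the xfsprogs or erofs-utils API: all names (`struct workqueue`, `wq_init()`, `wq_add()`, `wq_shutdown()`, `count_work()`) are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct work {
	void (*fn)(void *arg);
	void *arg;
	struct work *next;
};

struct workqueue {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct work *head, *tail;	/* FIFO list of pending items */
	int shutdown;
	unsigned int nworkers;
	pthread_t *workers;
};

/* Worker loop: pop one item at a time until the queue is drained and closed. */
static void *worker_main(void *p)
{
	struct workqueue *wq = p;

	for (;;) {
		struct work *w;

		pthread_mutex_lock(&wq->lock);
		while (!wq->head && !wq->shutdown)
			pthread_cond_wait(&wq->cond, &wq->lock);
		w = wq->head;
		if (!w) {	/* empty queue and shutdown requested */
			pthread_mutex_unlock(&wq->lock);
			return NULL;
		}
		wq->head = w->next;
		if (!wq->head)
			wq->tail = NULL;
		pthread_mutex_unlock(&wq->lock);

		w->fn(w->arg);	/* run the item outside the lock */
		free(w);
	}
}

int wq_init(struct workqueue *wq, unsigned int nworkers)
{
	wq->head = wq->tail = NULL;
	wq->shutdown = 0;
	wq->nworkers = 0;
	wq->workers = calloc(nworkers ? nworkers : 1, sizeof(*wq->workers));
	if (!wq->workers)
		return -1;
	pthread_mutex_init(&wq->lock, NULL);
	pthread_cond_init(&wq->cond, NULL);
	while (wq->nworkers < nworkers &&
	       !pthread_create(&wq->workers[wq->nworkers], NULL, worker_main, wq))
		wq->nworkers++;
	if (wq->nworkers)
		return 0;
	pthread_cond_destroy(&wq->cond);
	pthread_mutex_destroy(&wq->lock);
	free(wq->workers);
	return -1;
}

int wq_add(struct workqueue *wq, void (*fn)(void *), void *arg)
{
	struct work *w = malloc(sizeof(*w));

	if (!w)
		return -1;
	w->fn = fn;
	w->arg = arg;
	w->next = NULL;
	pthread_mutex_lock(&wq->lock);
	if (wq->tail)
		wq->tail->next = w;
	else
		wq->head = w;
	wq->tail = w;
	pthread_cond_signal(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
	return 0;
}

/* Lets queued items finish, then joins the workers and frees resources. */
void wq_shutdown(struct workqueue *wq)
{
	unsigned int i;

	pthread_mutex_lock(&wq->lock);
	wq->shutdown = 1;
	pthread_cond_broadcast(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
	for (i = 0; i < wq->nworkers; i++)
		pthread_join(wq->workers[i], NULL);
	free(wq->workers);
	pthread_cond_destroy(&wq->cond);
	pthread_mutex_destroy(&wq->lock);
}

/* Example work item: atomically bumps the counter passed as `arg`. */
void count_work(void *arg)
{
	atomic_fetch_add((atomic_int *)arg, 1);
}
```

A caller would run `wq_init()`, queue items with `wq_add()`, and call `wq_shutdown()` once everything has been submitted; older toolchains may need `-pthread` when linking.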

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-1-hsiangkao@linux.alibaba.com
13 months agoerofs-utils: support xz/lzma/lzip streams for tarerofs
Gao Xiang [Sun, 3 Mar 2024 14:35:30 +0000 (22:35 +0800)]
erofs-utils: support xz/lzma/lzip streams for tarerofs

Similar to commit e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams
for tarerofs"), let's add xz/lzma/lzip support by using liblzma.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240303143530.4077607-1-hsiangkao@linux.alibaba.com
13 months agoerofs-utils: mkfs: Support tar source without data
Mike Baynton [Tue, 27 Feb 2024 08:42:21 +0000 (16:42 +0800)]
erofs-utils: mkfs: Support tar source without data

This improves performance of meta-only image creation in cases where the
source is a tarball stream that is not seekable. The writer may now use
`--tar=headerball` and omit the file data. Previously, the stream writer
was forced to send the file's size worth of null bytes or any data after
each tar header which was simply discarded by mkfs.erofs.

Signed-off-by: Mike Baynton <mike@mbaynton.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240227084221.342635-1-hsiangkao@linux.alibaba.com
13 months agoerofs-utils: update my outdated misleading email address
Gao Xiang [Fri, 8 Mar 2024 07:24:29 +0000 (15:24 +0800)]
erofs-utils: update my outdated misleading email address

The @kernel.org one is always preferred though.

Signed-off-by: Gao Xiang <xiang@kernel.org>
13 months agoerofs-utils: support dumping raw tar streams together
Gao Xiang [Thu, 22 Feb 2024 09:01:45 +0000 (17:01 +0800)]
erofs-utils: support dumping raw tar streams together

Since commit e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams for
tarerofs"), tgz streams can be converted to EROFS directly.

However, many use cases also require raw tar streams.  Let's add
support for dumping raw streams with `--ungzip=FILE` option.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240222090145.709808-1-hsiangkao@linux.alibaba.com
13 months agoerofs-utils: support liblzma auto-detection
Gao Xiang [Wed, 21 Feb 2024 05:51:44 +0000 (13:51 +0800)]
erofs-utils: support liblzma auto-detection

The new XZ Utils 5.4 is now available in most Linux distributions.

Let's enable liblzma auto-detection as well as get rid of MicroLZMA
EXPERIMENTAL warning.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240221055144.4054806-1-hsiangkao@linux.alibaba.com
14 months agoerofs-utils: support zlib auto-detection
Gao Xiang [Tue, 20 Feb 2024 17:16:59 +0000 (01:16 +0800)]
erofs-utils: support zlib auto-detection

Fix explicit `--with-zlib` so that it errors out when zlib
is unavailable.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240220171700.3693176-1-hsiangkao@linux.alibaba.com
14 months agoerofs-utils: lib: fix incorrect usage of `erofs_strerror`
Tianyi Liu [Thu, 8 Feb 2024 13:59:09 +0000 (21:59 +0800)]
erofs-utils: lib: fix incorrect usage of `erofs_strerror`

`erofs_strerror` accepts a negative argument,
so `errno` should be inverted before passing to it.
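
The sign convention can be illustrated as follows. `neg_strerror()` is a hypothetical stand-in that, like `erofs_strerror`, expects a negative error code; it is not the library function itself.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in: expects a negative code such as -ENOENT. */
const char *neg_strerror(int err)
{
	return strerror(-err);
}

/* `errno` values are positive, so negate before calling the helper. */
const char *describe_errno(int saved_errno)
{
	return neg_strerror(-saved_errno);
}
```

Passing a positive `errno` straight to such a helper would negate it into an invalid error number and yield an "unknown error" style message.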

Signed-off-by: Tianyi Liu <i.pear@outlook.com>
Link: https://lore.kernel.org/r/SY6P282MB3193657433D35C3A7799CA5F9D442@SY6P282MB3193.AUSP282.PROD.OUTLOOK.COM
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
15 months agoerofs-utils: lib: reset HC to avoid 32-bit overflow of kite-deflate
Gao Xiang [Wed, 24 Jan 2024 09:16:21 +0000 (17:16 +0800)]
erofs-utils: lib: reset HC to avoid 32-bit overflow of kite-deflate

Yifan reported a "segmentation fault (core dumped)" error days ago
with a large dataset (enwik9 x 5).   Let's fix it.

Reported-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Fixes: 861037f4fc15 ("erofs-utils: add a built-in DEFLATE compressor")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Tested-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240124091621.2413606-1-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: mkfs: reorganize logic in erofs_compressor_init()
Yifan Zhao [Sun, 21 Jan 2024 12:29:02 +0000 (20:29 +0800)]
erofs-utils: mkfs: reorganize logic in erofs_compressor_init()

Currently, the initialization of compressors follows an unusual order:
`.init()` is called first, followed by `.setlevel()`, and then
`.setdictsize()`.  However, the actual initialization occurs within the
last-called `.setdictsize()`, for the MicroLZMA, DEFLATE, and libdeflate
compressors.

This patch reorders these functions, with `.init()` now being invoked
last, allowing it to use the compression level and dictsize already set
so that the behavior of the functions matches their names.
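
A simplified model of the new ordering, with hypothetical names and a made-up `window` buffer standing in for whatever state a real compressor allocates:

```c
#include <stdlib.h>

/* Hypothetical compressor state; field names are illustrative only. */
struct compressor {
	int level;
	unsigned int dictsize;
	unsigned char *window;	/* allocated by comp_init() */
};

int comp_setlevel(struct compressor *c, int level)
{
	if (level < 0 || level > 9)	/* assumed level range */
		return -1;
	c->level = level;
	return 0;
}

int comp_setdictsize(struct compressor *c, unsigned int dictsize)
{
	if (!dictsize)
		return -1;
	c->dictsize = dictsize;
	return 0;
}

/* Runs last, so both parameters are final when memory is allocated. */
int comp_init(struct compressor *c)
{
	c->window = malloc(c->dictsize);
	return c->window ? 0 : -1;
}

/* Setters first, init() last: the order described in the message above. */
int compressor_setup(struct compressor *c, int level, unsigned int dictsize)
{
	if (comp_setlevel(c, level) || comp_setdictsize(c, dictsize))
		return -1;
	return comp_init(c);
}
```

With this order the setters only record values, and the actual setup work happens in the function whose name says so.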

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240121122902.207756-1-zhaoyifan@sjtu.edu.cn
[ Gao Xiang: refine the commit message. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
15 months agoerofs-utils: mkfs: allow to specify dictionary size for compression algorithms
Yifan Zhao [Sat, 20 Jan 2024 11:53:19 +0000 (19:53 +0800)]
erofs-utils: mkfs: allow to specify dictionary size for compression algorithms

Currently, the dictionary size for compression algorithms is fixed. This
patch allows specifying a different one with the new -zX,dictsize=<dictsize>
option.

This patch also changes the way to specify compression levels. Now, the
compression level is specified with -zX,level=<level> options and could
be specified together with dictsize. The old -zX,<level> form is still
supported for compatibility.
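
A parser for this kind of option string might look like the sketch below. It handles the text that follows `-z`, e.g. `lzma,level=9,dictsize=65536` or the legacy `lz4hc,12`. The struct, the function names and the decision to accept only plain decimal numbers are assumptions made for illustration; mkfs.erofs may parse the option differently.

```c
#include <stdlib.h>
#include <string.h>

struct alg_opts {
	char name[16];
	int level;	/* -1 if unspecified */
	long dictsize;	/* 0 if unspecified */
};

/* Parses a non-empty decimal string; returns 0 on success. */
static int parse_num(const char *s, long *out)
{
	char *end;

	if (!*s)
		return -1;
	*out = strtol(s, &end, 10);
	return *end ? -1 : 0;
}

/* Parses "NAME[,LEVEL]" or "NAME[,level=N][,dictsize=N]"; 0 on success. */
int parse_alg_opts(const char *s, struct alg_opts *o)
{
	char buf[64], *tok, *next;
	long v;

	if (strlen(s) >= sizeof(buf))
		return -1;
	strcpy(buf, s);
	o->level = -1;
	o->dictsize = 0;

	next = strchr(buf, ',');
	if (next)
		*next++ = '\0';
	if (!buf[0] || strlen(buf) >= sizeof(o->name))
		return -1;
	strcpy(o->name, buf);

	for (tok = next; tok; tok = next) {
		next = strchr(tok, ',');
		if (next)
			*next++ = '\0';
		if (!strncmp(tok, "level=", 6)) {
			if (parse_num(tok + 6, &v))
				return -1;
			o->level = (int)v;
		} else if (!strncmp(tok, "dictsize=", 9)) {
			if (parse_num(tok + 9, &v))
				return -1;
			o->dictsize = v;
		} else {
			/* legacy form: a bare number is the level */
			if (parse_num(tok, &v))
				return -1;
			o->level = (int)v;
		}
	}
	return 0;
}
```

Keeping the bare-number branch is what preserves compatibility with the old `-zX,<level>` form.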

Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240120115319.152366-1-zhaoyifan@sjtu.edu.cn
[ Gao Xiang: minor update. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
15 months agoerofs-utils: mkfs: merge erofs_compressor_setlevel() into erofs_compressor_init()
Yifan Zhao [Sat, 20 Jan 2024 11:53:14 +0000 (19:53 +0800)]
erofs-utils: mkfs: merge erofs_compressor_setlevel() into erofs_compressor_init()

Currently erofs_compressor_setlevel() is only called once just after
erofs_compressor_init() while initializing compressors. Let's just hide
this interface and set the compression level in erofs_compressor_init().

Besides, we do not need to assign the {default,best}_level for an
algorithm which does not support the compression level in its
erofs_compressor struct.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240120115314.152285-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
15 months agoerofs-utils: avoid noisy prints if stdout is not a tty
Gao Xiang [Tue, 16 Jan 2024 04:26:42 +0000 (12:26 +0800)]
erofs-utils: avoid noisy prints if stdout is not a tty

As Daan reported, "mkfs.erofs is super verbose as \r doesn't go to
the beginning of the line".  Don't print messages like this if
stdout is not a tty.
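
The usual way to implement such a check is `isatty()`. A minimal sketch, with a hypothetical `print_progress()` helper:

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical progress printer: uses "\r" redraws only on a terminal. */
int print_progress(FILE *out, const char *msg)
{
	if (!isatty(fileno(out)))
		return 0;	/* stay quiet when piped or redirected */
	fprintf(out, "\r%s", msg);
	fflush(out);
	return 1;
}
```

When output goes to a log file or a pipe, the carriage return would not overwrite the previous line, so skipping the message avoids the noise.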

Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Closes: https://lore.kernel.org/r/CAO8sHcmnq+foWo7AZYbkxJXHfSeZkd73Dq+1dQSZYBE6QxL8JQ@mail.gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240116042642.3124559-1-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: lib: use dummy_pivot to dedupe the beginnings of files
Gao Xiang [Mon, 15 Jan 2024 15:05:50 +0000 (23:05 +0800)]
erofs-utils: lib: use dummy_pivot to dedupe the beginnings of files

The beginnings of files are incorrectly skipped for deduplication, which
causes unexpected image size regression. Fix it.

Reported-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Fixes: 8ead5f8bd38c ("erofs-utils: lib: generate compression indexes in memory first")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240115150550.1961455-1-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: lib: generate compression indexes in memory first
Yifan Zhao [Mon, 18 Dec 2023 14:57:10 +0000 (22:57 +0800)]
erofs-utils: lib: generate compression indexes in memory first

Currently, mkfs generates the on-disk indexes of each compressed extent
on the fly during compressing, which is inflexible if we'd like to merge
sub-indexes of a file later for the multi-threaded scenarios.

Let's generate on-disk indexes after the compression is completed.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231218145710.132164-3-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: lib: split vle_compress_one()
Gao Xiang [Mon, 18 Dec 2023 14:57:09 +0000 (22:57 +0800)]
erofs-utils: lib: split vle_compress_one()

Split compression for each extent into a new helper for later reworking.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231218145710.132164-2-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: lib: add z_erofs_need_refill()
Gao Xiang [Mon, 18 Dec 2023 14:57:08 +0000 (22:57 +0800)]
erofs-utils: lib: add z_erofs_need_refill()

Let's remove redundant logic.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231218145710.132164-1-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: fuse: support FUSE 2/3 multi-threading
Li Yiyan [Tue, 12 Dec 2023 11:54:27 +0000 (19:54 +0800)]
erofs-utils: fuse: support FUSE 2/3 multi-threading

Support multi-threading for erofsfuse and adjust the configure.ac to
allow users of FUSE 3 (> 3.2) to use API version 32, while maintaining
compatibility with API version 30 for FUSE 3 (3.0/3.1) and API version
26 for FUSE 2.

Signed-off-by: Li Yiyan <lyy0627@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20231212115427.2779792-1-lyy0627@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
15 months agoerofs-utils: mkfs: fix a misspelling
Yifan Zhao [Sun, 14 Jan 2024 04:42:29 +0000 (12:42 +0800)]
erofs-utils: mkfs: fix a misspelling

Fix a misspelling in the version() function of mkfs.erofs.

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240114044229.626815-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
15 months agoerofs-utils: mkfs: support compact indexes for smaller block sizes
Gao Xiang [Thu, 23 Nov 2023 05:22:45 +0000 (13:22 +0800)]
erofs-utils: mkfs: support compact indexes for smaller block sizes

This commit also adds mkfs support of compact indexes for smaller
block sizes (less than 4096).

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231123052245.868698-2-hsiangkao@linux.alibaba.com
15 months agoerofs-utils: lib: fix up compact indexes for block size < 4096
Gao Xiang [Thu, 23 Nov 2023 05:22:44 +0000 (13:22 +0800)]
erofs-utils: lib: fix up compact indexes for block size < 4096

Let's keep in sync with kernel commit 8d2517aaeea3 ("erofs: fix up
compacted indexes for block size < 4096").

Original kernel commit:

Previously, the block size was always equal to PAGE_SIZE, therefore
`lclusterbits` couldn't be less than 12.

Since sub-page compressed blocks are now considered, `lobits` for
a lcluster in each pack cannot always be `lclusterbits` as before.
Otherwise, there is not enough room for the special value
`Z_EROFS_LI_D0_CBLKCNT`.

To support smaller block sizes, `lobits` for each compacted lcluster is
now calculated as:
   lobits = max(lclusterbits, ilog2(Z_EROFS_LI_D0_CBLKCNT) + 1)
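
The formula can be checked with a few lines of C. The value `1 << 11` for `Z_EROFS_LI_D0_CBLKCNT` is assumed here from the kernel's on-disk format header; the helper names are hypothetical.

```c
/* Assumed value of the special CBLKCNT marker bit. */
#define Z_EROFS_LI_D0_CBLKCNT (1U << 11)

/* floor(log2(v)) for v > 0 */
unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* lobits = max(lclusterbits, ilog2(Z_EROFS_LI_D0_CBLKCNT) + 1) */
unsigned int compact_lobits(unsigned int lclusterbits)
{
	unsigned int min_bits = ilog2_u32(Z_EROFS_LI_D0_CBLKCNT) + 1;

	return lclusterbits > min_bits ? lclusterbits : min_bits;
}
```

Under that assumption the minimum is 12 bits, so a 512-byte block size (`lclusterbits` = 9) still gets 12 `lobits`, while 4096-byte blocks are unchanged.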

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231123052245.868698-1-hsiangkao@linux.alibaba.com
17 months agoerofs-utils: lib: `fragment_size` should be 64 bits
Gao Xiang [Wed, 15 Nov 2023 02:49:52 +0000 (10:49 +0800)]
erofs-utils: lib: `fragment_size` should be 64 bits

`-Eall-fragments` will be broken if i_size is more than 32 bits.
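
The failure mode is plain integer truncation, which a short sketch makes concrete (both helpers are hypothetical and only illustrate the arithmetic):

```c
#include <stdint.h>

/* What a 32-bit field keeps of a 64-bit size. */
uint32_t truncate_to_u32(uint64_t i_size)
{
	return (uint32_t)i_size;
}

/* True if `i_size` cannot be represented in a 32-bit field. */
int needs_64bit(uint64_t i_size)
{
	return i_size > UINT32_MAX;
}
```

For a 5GiB file, for instance, a 32-bit field would silently keep only the low 1GiB of the size.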

Fixes: fcaa988a6ef6 ("erofs-utils: add `-Eall-fragments` option")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Link: https://lore.kernel.org/r/20231115024952.1256243-1-hsiangkao@linux.alibaba.com
17 months agoerofs-utils: mkfs,fsck,dump: support `--offset` option
Gao Xiang [Tue, 7 Nov 2023 07:55:55 +0000 (15:55 +0800)]
erofs-utils: mkfs,fsck,dump: support `--offset` option

Add `--offset` option to allow users to specify an offset in the file
where the filesystem will begin.
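
Conceptually, an image offset just shifts every device access by a fixed base. A minimal sketch of that idea using `pwrite()`/`pread()`; the wrapper names are hypothetical and do not reflect how the tools implement the option internally:

```c
#include <sys/types.h>
#include <unistd.h>

/* Writes `len` bytes at filesystem offset `off`, shifted by `base`. */
ssize_t img_pwrite(int fd, off_t base, const void *buf, size_t len, off_t off)
{
	return pwrite(fd, buf, len, base + off);
}

/* Reads `len` bytes from filesystem offset `off`, shifted by `base`. */
ssize_t img_pread(int fd, off_t base, void *buf, size_t len, off_t off)
{
	return pread(fd, buf, len, base + off);
}
```

With a base of 4096, for example, filesystem offset 1024 lands at byte 5120 of the containing file.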

Suggested-by: Pavel Otchertsov <pavel.otchertsov@gmail.com>
Closes: https://lore.kernel.org/r/CAAxnTOGTD2NkKnBphZ+vEr7NVnWvT0u02E+c8pN8ZVFcXp5uhg@mail.gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231107075555.2554444-1-hsiangkao@linux.alibaba.com
17 months agoerofs-utils: fsck: Add -a, -A, and -y flags
Luke T. Shumaker [Thu, 2 Nov 2023 19:31:22 +0000 (13:31 -0600)]
erofs-utils: fsck: Add -a, -A, and -y flags

Other fsck.${filesystem} commands generally take -a or -p and
sometimes -A to automatically repair a filesystem, and -y to either
repair or agree to all prompts about repairing.

For example:

 - fsck.ext{2,3,4} takes -a or -p to repair, and -y to agree
 - fsck.xfs takes -y to repair; and -a, -A, or -p to silence a warning
   about repairing
 - fsck.btrfs takes -a, -A, -p, or -y to silence a warning about repairing

So, like fsck.btrfs, we should accept these flags as no-ops, for
compatibility with programs that expect to be able to pass these to
fsck.  In particular, Arch Linux's mkinitcpio (when fsck is enabled)
unconditionally passes -a to `fsck`.

Naturally, I'd have liked to include '-p' in the list, but it already
does something different for fsck.erofs.  I'd like to call out the
fsck.ext4 manual, which says:

       -a   This option does the same thing as the -p option. It is
            provided for backwards compatibility only; it is
            suggested that people use -p option whenever possible.

Signed-off-by: Luke T. Shumaker <lukeshu@umorpha.io>
Link: https://lore.kernel.org/r/20231102193122.140921-4-lukeshu@lukeshu.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
17 months agoerofs-utils: improve the usage and version text of non-fuse commands
Luke T. Shumaker [Thu, 2 Nov 2023 19:31:21 +0000 (13:31 -0600)]
erofs-utils: improve the usage and version text of non-fuse commands

For each command:

 - Change the format of --help to be closer to the usual GNU format
 - Have the --version text mention that it is part of erofs-utils
 - Include compile-time feature flags in -V
 - Have --help and --version print on stdout not stderr
 - Exit with 0 from --help and --version
 - Have flag errors print a message saying to use --help instead of
   printing the full help text

For fsck.erofs:

 - Consolidate the descriptions of --[no-]preserve[-<owner|perms>]
 - Clarify the range that -d accepts

For mkfs.erofs:

 - Print supported algorithms and their level ranges+defaults
 - Clarify the range that -d accepts

For mkfs.erofs to have access to the algorithms' level ranges and
defaults, it is necessary to modify
z_erofs_list_available_compressors() to return the full `struct
erofs_algorithm` instead of just the `->name`.

Signed-off-by: Luke T. Shumaker <lukeshu@umorpha.io>
Link: https://lore.kernel.org/r/20231102193122.140921-3-lukeshu@lukeshu.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
17 months agoerofs-utils: have each non-fuse command take -h, --help, -V, and --version
Luke T. Shumaker [Thu, 2 Nov 2023 19:31:20 +0000 (13:31 -0600)]
erofs-utils: have each non-fuse command take -h, --help, -V, and --version

Consistency is nice.

erofsfuse isn't included here because adjusting its flag handling is
more involved because of the interaction with libfuse; I anticipate
similar changes to erofsfuse in a future patchset.

Signed-off-by: Luke T. Shumaker <lukeshu@umorpha.io>
Link: https://lore.kernel.org/r/20231102193122.140921-2-lukeshu@lukeshu.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
17 months agoerofs-utils: mkfs: fix potential memory leak
Yifan Zhao [Sat, 4 Nov 2023 06:50:41 +0000 (14:50 +0800)]
erofs-utils: mkfs: fix potential memory leak

Valgrind reports 2 potential memory leaks in mkfs:

    Command: mkfs.erofs -zlz4 test.img testdir/

    4 bytes in 1 blocks are still reachable in loss record 1 of 2
       at 0x4841848: malloc (vg_replace_malloc.c:431)
       by 0x49633DE: strdup (strdup.c:42)
       by 0x10C483: mkfs_parse_compress_algs (main.c:287)
       by 0x10C483: mkfs_parse_options_cfg (main.c:316)
       by 0x10C483: main (main.c:936)

    34 bytes in 1 blocks are still reachable in loss record 2 of 2
       at 0x4841848: malloc (vg_replace_malloc.c:431)
       by 0x49633DE: strdup (strdup.c:42)
       by 0x48FFE2B: realpath_stk (canonicalize.c:409)
       by 0x48FFE2B: realpath@@GLIBC_2.3 (canonicalize.c:431)
       by 0x10B7EB: mkfs_parse_options_cfg (main.c:587)
       by 0x10B7EB: main (main.c:936)

Fix it by freeing the memory allocated by strdup() and realpath().

Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231104065041.129680-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
17 months agoerofs-utils: lib: tidy up erofs_compress_destsize()
Gao Xiang [Fri, 27 Oct 2023 07:06:06 +0000 (15:06 +0800)]
erofs-utils: lib: tidy up erofs_compress_destsize()

Drop the old workaround logic to prepare for the following development.

(I've checked the Linux 6.1.53 source code and an AOSP system image
 "system.raven.87e115a1" without any image size change or strange
 behavior.)

Link: https://lore.kernel.org/r/20231027070606.1558363-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
18 months agoerofs-utils: fuse: switch to FUSE 2/3 lowlevel APIs
Li Yiyan [Mon, 18 Sep 2023 09:03:06 +0000 (17:03 +0800)]
erofs-utils: fuse: switch to FUSE 2/3 lowlevel APIs

Support FUSE low-level APIs for erofsfuse. Lowlevel APIs offer improved
performance compared to the previous high-level APIs, while maintaining
compatibility with libfuse version 2 (>=2.6) and 3 (>=3.0).

Dataset: linux 5.15
Compression algorithm: lz4hc,12
Additional options: -T0 -C16384
Test options: --warmup 3 -p "echo 3 > /proc/sys/vm/drop_caches; sleep 1"

Evaluation result (highlevel->lowlevel avg time):
- Sequence metadata: 777.3 ms->180.9 ms
- Sequence data: 3.282 s->818.1 ms
- Random metadata: 1.571 s->928.3 ms
- Random data: 2.461 s->597.8 ms

Signed-off-by: Li Yiyan <lyy0627@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20230918090306.2524624-1-lyy0627@sjtu.edu.cn
[ Gao Xiang: minor code style adjustments and MacOS compilation fix. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
18 months agoerofs-utils: lib: propagate return code for erofs_bflush()
Gao Xiang [Wed, 25 Oct 2023 05:05:31 +0000 (13:05 +0800)]
erofs-utils: lib: propagate return code for erofs_bflush()

Instead of just using a boolean.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231025050531.1507163-1-hsiangkao@linux.alibaba.com
18 months agoerofs-utils: get rid of .preflush()
Gao Xiang [Mon, 23 Oct 2023 08:12:40 +0000 (16:12 +0800)]
erofs-utils: get rid of .preflush()

It's actually never used.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231023081241.1946579-2-hsiangkao@linux.alibaba.com
18 months agoerofs-utils: lib: use BLK_ROUND_UP() for __erofs_battach()
Gao Xiang [Mon, 23 Oct 2023 08:12:39 +0000 (16:12 +0800)]
erofs-utils: lib: use BLK_ROUND_UP() for __erofs_battach()

Also avoid division in BLK_ROUND_UP().
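
When the block size is a power of two, rounding up to whole blocks needs only an add and a shift. A sketch of the idea, with `blk_round_up()` as a hypothetical function rather than the actual macro:

```c
#include <stdint.h>

/* Rounds `len` bytes up to whole blocks of size (1 << blkbits). */
uint64_t blk_round_up(uint64_t len, unsigned int blkbits)
{
	return (len + (1ULL << blkbits) - 1) >> blkbits;
}
```

This is equivalent to `(len + blksz - 1) / blksz` for power-of-two block sizes, without the division.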

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231023081241.1946579-1-hsiangkao@linux.alibaba.com
18 months agoerofs-utils: lib: switch dedupe_{sub,}tree[] to a hash table
Gao Xiang [Fri, 22 Sep 2023 18:30:54 +0000 (02:30 +0800)]
erofs-utils: lib: switch dedupe_{sub,}tree[] to a hash table

This rb-tree implementation is too slow and brings no benefit.

As a result, it could decrease time by 81.1% (5m7.755s -> 0m58.255s)
with the same dataset, sigh.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230922183055.1583756-2-hsiangkao@linux.alibaba.com
18 months agoerofs-utils: lib: use xxh64() for faster filtering first for dedupe
Gao Xiang [Fri, 22 Sep 2023 18:30:53 +0000 (02:30 +0800)]
erofs-utils: lib: use xxh64() for faster filtering first for dedupe

Let's first check whether the xxh64 hashes are equal when rolling back
on global compressed deduplication.

As a result, it could decrease time by 26.4% (6m57.990s -> 5m7.755s)
on a dataset with "-Ededupe -C8192".
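
The general pattern is to compare a cheap 64-bit hash before falling back to a full byte comparison. The sketch below uses FNV-1a purely as a self-contained stand-in for xxh64, and `struct chunk` is a hypothetical type; it shows the filtering idea, not the dedupe code itself.

```c
#include <stdint.h>
#include <string.h>

/* FNV-1a, used here only as a stand-in for xxh64. */
uint64_t fnv1a64(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t h = 0xcbf29ce484222325ULL;

	while (len--) {
		h ^= *p++;
		h *= 0x100000001b3ULL;
	}
	return h;
}

struct chunk {
	const void *data;
	size_t len;
	uint64_t hash;	/* precomputed once per chunk */
};

/* Rejects most mismatches by hash; memcmp() confirms real matches. */
int chunk_equal(const struct chunk *a, const struct chunk *b)
{
	if (a->len != b->len || a->hash != b->hash)
		return 0;
	return memcmp(a->data, b->data, a->len) == 0;
}
```

Because the hash is computed once and compared many times, most non-matching candidates are rejected without touching their data.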

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230922183055.1583756-1-hsiangkao@linux.alibaba.com
18 months agoerofs-utils: release 1.7.1
Gao Xiang [Thu, 19 Oct 2023 22:45:01 +0000 (06:45 +0800)]
erofs-utils: release 1.7.1

Signed-off-by: Gao Xiang <xiang@kernel.org>
18 months agoerofs-utils: fix reference leak in erofs_mkfs_build_tree_from_path()
Gao Xiang [Thu, 19 Oct 2023 22:43:28 +0000 (06:43 +0800)]
erofs-utils: fix reference leak in erofs_mkfs_build_tree_from_path()

commit 8cbc205185a1 ("erofs-utils: mkfs: fix corrupted directories
with hardlinks") introduced a reference leak although it has no real
impact on users.  Fix it now.

Signed-off-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20231019224328.26015-1-xiang@kernel.org
18 months agoerofs-utils: mkfs: fix corrupted directories with hardlinks
Gao Xiang [Tue, 17 Oct 2023 14:44:20 +0000 (22:44 +0800)]
erofs-utils: mkfs: fix corrupted directories with hardlinks

An inode with hard links may belong to several directories. It's
invalid to update `subdirs_queued` for hard-link inodes since it
only records one of the parent directories.

References: https://github.com/NixOS/nixpkgs/issues/261394
Fixes: 21d84349e79a ("erofs-utils: rearrange on-disk metadata")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231017144420.289469-1-hsiangkao@linux.alibaba.com
18 months agoerofs-utils: errno shouldn't set to a negative value in lib/tar.c
Erik Sjölund [Mon, 2 Oct 2023 17:36:08 +0000 (19:36 +0200)]
erofs-utils: errno shouldn't set to a negative value in lib/tar.c

`errno` should be set to a non-negative value here.

Link: https://lore.kernel.org/r/CAB+1q0Q3+7s1Lt8uW6DWZ7vfjhEKhG7O7MAQhCuH-C10cr9F4g@mail.gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
18 months agoerofs-utils: Fix cross compile with autoconf
Sandeep Dhavale [Thu, 5 Oct 2023 22:40:08 +0000 (15:40 -0700)]
erofs-utils: Fix cross compile with autoconf

AC_RUN_IFELSE expects the action if cross compiling. If not provided,
cross compilation fails with error "configure: error: cannot run test
program while cross compiling".
Use 4096 as the best guess PAGESIZE if cross compiling as it is still
the most common page size.

Reported-in: https://lore.kernel.org/all/0101018aec71b531-0a354b1a-0b70-47a1-8efc-fea8c439304c-000000@us-west-2.amazonses.com/
Fixes: 8ee2e591dfd6 ("erofs-utils: support detecting maximum block size")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20231005224008.817830-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
19 months agoerofs-utils: release 1.7
Gao Xiang [Thu, 21 Sep 2023 03:30:00 +0000 (11:30 +0800)]
erofs-utils: release 1.7

Signed-off-by: Gao Xiang <xiang@kernel.org>
19 months agoerofs-utils: fix the previous pcluster CBLKCNT missing for big pcluster dedupe
Gao Xiang [Thu, 21 Sep 2023 03:24:17 +0000 (11:24 +0800)]
erofs-utils: fix the previous pcluster CBLKCNT missing for big pcluster dedupe

Similar to 876bec09e48a ("erofs-utils: lib: fix missing CBLKCNT for
big pcluster dedupe"), the previous CBLKCNT cannot be dropped due to
the extent shortening process.

It may cause data corruption on specific data patterns only if both
big pcluster and dedupe features are enabled.

Link: https://lore.kernel.org/r/20230921032417.82739-1-hsiangkao@linux.alibaba.com
Fixes: f3f9a2ce3137 ("erofs-utils: mkfs: introduce global compressed data deduplication")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
19 months agoerofs-utils: fix build error when `-Waddress-of-temporary` is on
Gao Xiang [Wed, 20 Sep 2023 20:03:13 +0000 (04:03 +0800)]
erofs-utils: fix build error when `-Waddress-of-temporary` is on

Actually, it's a false positive and only used for build assertion.

Reported-by: Kelvin Zhang <zhangkelvin@google.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20230920200314.9193-1-hsiangkao@aol.com
19 months agoerofs-utils: mkfs: limit total shared xattrs of a single inode
Gao Xiang [Wed, 20 Sep 2023 19:02:20 +0000 (03:02 +0800)]
erofs-utils: mkfs: limit total shared xattrs of a single inode

Don't output more than 255 shared xattrs for a single inode due to the
EROFS on-disk format limitation.

Fixes: 116ac0a254fc ("erofs-utils: introduce shared xattr support")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920190220.1837650-1-hsiangkao@linux.alibaba.com
19 months agoerofs-utils: manpages: update new options of mkfs.erofs
Gao Xiang [Wed, 20 Sep 2023 16:41:28 +0000 (00:41 +0800)]
erofs-utils: manpages: update new options of mkfs.erofs

 -b block-size
 -E ^xattr-name-filter
 --gzip
 --tar=[fi]
 --xattr-prefix=X

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920164128.1637554-1-hsiangkao@linux.alibaba.com
19 months agoerofs-utils: lib: fix --force-{g,u}id support for tarerofs
Gao Xiang [Wed, 20 Sep 2023 05:12:23 +0000 (13:12 +0800)]
erofs-utils: lib: fix --force-{g,u}id support for tarerofs

Temporarily move the common part into __erofs_fill_inode() for tarerofs.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920051223.657008-1-hsiangkao@linux.alibaba.com
19 months agoerofs-utils: mkfs: support exporting GNU tar archive labels
Gao Xiang [Wed, 20 Sep 2023 03:51:41 +0000 (11:51 +0800)]
erofs-utils: mkfs: support exporting GNU tar archive labels

GNU tar volume labels (by using `-V`) will be applied to EROFS.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920035141.533474-1-hsiangkao@linux.alibaba.com