Sandeep Dhavale [Thu, 18 Jul 2024 20:22:04 +0000 (13:22 -0700)]
erofs-utils: misc: Fix potential memory leak in realloc failure path
As realloc returns NULL on failure, the original value will be
overwritten if it is used as lvalue. Fix this by using a temporary
variable to hold the return value and exit with -ENOMEM in case of
failure. This patch fixes 2 of the realloc blocks with similar fix.
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20240718202204.1224620-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
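The realloc pattern described in the commit above can be sketched roughly as follows. This is only an illustration, not the actual patch: `struct grow_buf` and `grow_buf_reserve()` are hypothetical names that do not come from erofs-utils.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical growable buffer; not a structure from erofs-utils. */
struct grow_buf {
	char *data;
	size_t capacity;
};

/*
 * Grow buf to at least new_cap bytes. On failure the old allocation is
 * left untouched (and is still owned by buf) and -ENOMEM is returned.
 */
static int grow_buf_reserve(struct grow_buf *buf, size_t new_cap)
{
	char *tmp;

	if (new_cap <= buf->capacity)
		return 0;

	/* Keep the result in a temporary so buf->data is not lost on NULL. */
	tmp = realloc(buf->data, new_cap);
	if (!tmp)
		return -ENOMEM;

	buf->data = tmp;
	buf->capacity = new_cap;
	return 0;
}
```

Writing `buf->data = realloc(buf->data, new_cap)` instead would drop the only pointer to the old block whenever realloc fails, which is the leak the commit refers to.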
Gao Xiang [Mon, 15 Jul 2024 03:38:29 +0000 (11:38 +0800)]
erofs-utils: manpage: add more description for --extract option
Especially, extract files to a specific directory.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240715033829.2338056-1-hsiangkao@linux.alibaba.com
Gao Xiang [Sun, 14 Jul 2024 04:41:19 +0000 (12:41 +0800)]
erofs-utils: lib: tar: fix garbage ns timestamps
Some "#if" directives were used incorrectly.
Fixes: 95d315fd7958 ("erofs-utils: introduce tarerofs")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240714044119.1119717-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 12 Jul 2024 09:38:08 +0000 (17:38 +0800)]
erofs-utils: mkfs: fix -U option
`-U <UUID>` option cannot work properly now.
Fixes: 7550a30c332c ("erofs-utils: enable incremental builds")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240712093808.2986196-2-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 12 Jul 2024 09:38:07 +0000 (17:38 +0800)]
erofs-utils: fix reproducible builds for multi-threaded libdeflate
`last_uncompressed_size` should be reset on the basis of segments.
Fixes: 830b27bc2334 ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240712093808.2986196-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Wed, 10 Jul 2024 08:29:06 +0000 (16:29 +0800)]
erofs-utils: add per-sbi buffer support
It updates all relevant function definitions and callers to get rid of
the global g_sbi, which will be used for external use.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240710082906.203180-2-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Hongzhen Luo [Wed, 10 Jul 2024 08:29:05 +0000 (16:29 +0800)]
erofs-utils: lib/cache.c: replace &g_sbi with sbi
Prepare for the upcoming per-sbi buffers.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240710082906.203180-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Sat, 13 Jul 2024 06:40:28 +0000 (14:40 +0800)]
erofs-utils: tar: support ddtaridx format informally
`ddtaridx` is a customized tar meta-only format implemented in
Alibaba's OverlayBD project [1].
Please don't use it externally if you have no idea of this except for
the OverlayBD project. It will be removed if some better way exists.
[1] https://github.com/containerd/overlaybd
Cc: Yifan Yuan <tuji.yyf@alibaba-inc.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240713064028.4134602-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Tue, 9 Jul 2024 07:38:19 +0000 (15:38 +0800)]
erofs-utils: fix bitops fls_long()
`__builtin_clz` is for unsigned int, while it is now applied to
unsigned long. This fixes it by using `__builtin_clzl`.
`roundup_pow_of_two()` in `erofs_init_devices()` could give wrong
results although the current compiler optimization level "-O2"
covers it up.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240709073819.3061805-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
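A minimal sketch of the fls_long() idea above, assuming a GCC or Clang compiler that provides `__builtin_clzl`. The `sketch_` names are made up for illustration, and the real helpers in erofs-utils may be written differently.

```c
#include <limits.h>

/*
 * Find the last (most significant) set bit of an unsigned long, 1-based;
 * returns 0 when x == 0. __builtin_clzl() takes an unsigned long, whereas
 * __builtin_clz() would convert x to unsigned int first, dropping the
 * upper bits on LP64 targets.
 */
static inline int sketch_fls_long(unsigned long x)
{
	if (!x)
		return 0;	/* __builtin_clzl(0) is undefined */
	return (int)(sizeof(x) * CHAR_BIT) - __builtin_clzl(x);
}

/* Round n up to a power of two (n >= 1; the result must not overflow). */
static inline unsigned long sketch_roundup_pow_of_two(unsigned long n)
{
	return 1UL << sketch_fls_long(n - 1);
}
```

The bit width is taken from `sizeof(x)` rather than hard-coded, so the same expression holds for 32-bit and 64-bit `unsigned long`.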
Hongzhen Luo [Thu, 4 Jul 2024 05:02:59 +0000 (13:02 +0800)]
erofs-utils: rename the global sbi to g_sbi
Rename the global `sbi` to `g_sbi` to prepare for
the upcoming per-sbi buffer management.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240704050259.520618-2-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Hongzhen Luo [Thu, 4 Jul 2024 05:02:58 +0000 (13:02 +0800)]
erofs-utils: lib: get rid of global sbi in lib/inode.c
Get rid of the global sbi when EROFS_MT_ENABLED is defined.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240704050259.520618-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Hongzhen Luo [Wed, 3 Jul 2024 03:03:27 +0000 (11:03 +0800)]
erofs-utils: lib: change function definition of erofs_blocklist_open()
Modify erofs_blocklist_open() to accept a file pointer instead of
a file path, making it suitable for external use in liberofs.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240703030327.3280503-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 2 Jul 2024 08:31:44 +0000 (16:31 +0800)]
erofs-utils: rebuild: only update dev/i_ino[1] pairs for directories
Since the underlying dev/i_ino[1] pairs are only useful for merged
sub-directories, don't bother with other types of inodes.
Otherwise, the original i_ino[1] could be overwritten unexpectedly,
which impacts resvsp mode at least.
Fixes: f64d9d02576b ("erofs-utils: introduce incremental builds")
Reported-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240702083144.2120808-1-hsiangkao@linux.alibaba.com
Gao Xiang [Thu, 27 Jun 2024 03:13:43 +0000 (11:13 +0800)]
erofs-utils: fix "non-trivial designated initializers not supported"
This partially reverts commit 79f6e168d94c ("erofs-utils: improve
compatibility and reduce header conflicts") since some C++ compilers
will complain:
include/erofs_fs.h: In function 'void erofs_check_ondisk_layout_definitions()':
include/erofs_fs.h:460:2: sorry, unimplemented: non-trivial designated initializers not supported
Let's just bypass this compile-time check for the C++ language since
only external programs may be written in C++.
Fixes: 79f6e168d94c ("erofs-utils: improve compatibility and reduce header conflicts")
Cc: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240627031343.3424030-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Hongzhen Luo [Thu, 27 Jun 2024 02:27:41 +0000 (10:27 +0800)]
erofs-utils: lib: add erofs_get_configure()
This adds `erofs_get_configure()` to get the global configuration
`cfg`. It allows external entities to change the global configuration
through this helper, thereby controlling the EROFS mkfs process.
It is just a temporary helper for liberofs and it will be deprecated
in the future. Don't rely on it too much.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240627022741.3912785-1-hongzhen@linux.alibaba.com
[ Gao Xiang: minor commit message update. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Hongzhen Luo [Tue, 25 Jun 2024 06:24:58 +0000 (14:24 +0800)]
erofs-utils: lib: add erofs_{rebuild_make_root,enable_sb_chksum}
Move erofs_sb_csum_set() and erofs_mkfs_alloc_root() into liberofs
for external use.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240625062458.1514209-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Hongzhen Luo [Mon, 24 Jun 2024 06:32:17 +0000 (14:32 +0800)]
erofs-utils: improve compatibility and reduce header conflicts
Adjust initializers of union in erofs-utils to ensure compatibility
with various compilers. The original C99 designated initializer style
was not supported in other compilers (e.g., C++11), leading to build
failures. Additionally, change the codebase to minimize potential
conflicts with headers from other projects.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240624063217.170251-1-hongzhen@linux.alibaba.com
[ Gao Xiang: minor commit message update. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Mon, 24 Jun 2024 11:59:23 +0000 (19:59 +0800)]
erofs-utils: introduce `payload` field in `struct erofs_vfile`
Allow customized `vfile` with non-NULL `ops` utilizing `payload`
for additional information.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240624115923.4090196-2-hsiangkao@linux.alibaba.com
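To illustrate how a customized `vfile` with non-NULL `ops` might use `payload`, here is a small self-contained sketch with an in-memory backend. All `sketch_`/`membuf` names, the struct layout and the pread signature are assumptions for illustration; the real `struct erofs_vfile` and its operation table may look different.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct sketch_vfile;

/* Hypothetical operation table; the real layout may differ. */
struct sketch_vfops {
	/* Returns the number of bytes read, 0 at EOF, or a negative errno. */
	long (*pread)(struct sketch_vfile *vf, void *buf, size_t len,
		      unsigned long long offset);
};

struct sketch_vfile {
	const struct sketch_vfops *ops;	/* NULL would mean a plain fd */
	int fd;
	void *payload;			/* private data for custom ops */
};

/* Payload type for an in-memory backend. */
struct membuf {
	const char *base;
	size_t size;
};

static long membuf_pread(struct sketch_vfile *vf, void *buf, size_t len,
			 unsigned long long offset)
{
	const struct membuf *mb = vf->payload;

	if (offset >= mb->size)
		return 0;				/* past the end: EOF */
	if (len > mb->size - offset)
		len = (size_t)(mb->size - offset);	/* short read */
	memcpy(buf, mb->base + offset, len);
	return (long)len;
}

static const struct sketch_vfops membuf_ops = { .pread = membuf_pread };

/* Dispatch helper; the fd-backed path is left out of this sketch. */
static long sketch_vfile_pread(struct sketch_vfile *vf, void *buf,
			       size_t len, unsigned long long offset)
{
	if (vf->ops)
		return vf->ops->pread(vf, buf, len, offset);
	return -ENOSYS;
}
```

The caller only sees `sketch_vfile_pread()`; the backend recovers its own state through `vf->payload`, so no global variable is needed per backend.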
Gao Xiang [Mon, 24 Jun 2024 11:59:22 +0000 (19:59 +0800)]
erofs-utils: derive i_srcpath for erofs_rebuild_mkdir()
Also add missing erofs_iput() on errors.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240624115923.4090196-1-hsiangkao@linux.alibaba.com
Gao Xiang [Sun, 23 Jun 2024 11:59:32 +0000 (19:59 +0800)]
erofs-utils: optimize write_uncompressed_file_from_fd()
Utilize copy offloading to speed up copying data from the source
filesystem to the target EROFS filesystem.
This method improves build speed by approximately 9% (tested with
Linux 5.4.140 source code dataset).
Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Closes: https://lore.kernel.org/r/CAO8sHcmZZORnrJXA=QzmGkYNkNWn7M+amAK_DZ19-WL4kLUvpw@mail.gmail.com
Link: https://lore.kernel.org/r/20240623115932.2696312-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 18 Jun 2024 11:17:00 +0000 (19:17 +0800)]
erofs-utils: skip all unidentified xattrs from local paths
Just warn out but continue. Don't over-complicate for now.
Reported-by: Gael Donval <gael.donval@manchester.ac.uk>
Closes: https://lore.kernel.org/r/4abed942399fb29933f0fa85cc55d3d795ae8bcd.camel@manchester.ac.uk
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618111700.267702-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:14 +0000 (16:24 +0800)]
erofs-utils: enable mapfile for `--tar=f`
The data offsets in the tar streams can always be looked up now:
mkfs.erofs --tar=f,MAPFILE IMAGE TARBALL
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-9-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:13 +0000 (16:24 +0800)]
erofs-utils: enable incremental builds
`--incremental=<data|rvsp>` are now supported for tarerofs but
only `--incremental=rvsp` works for the rebuild mode.
For example:
$ mkfs.erofs --tar=f --gzip --aufs --clean=data foo.erofs f0.tgz
$ mkfs.erofs --tar=f --gzip --aufs --incremental=data foo.erofs f1.tgz
...
$ mkfs.erofs --tar=f --gzip --aufs --incremental=data foo.erofs fn-1.tgz
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-8-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:12 +0000 (16:24 +0800)]
erofs-utils: fix incremental builds for tarerofs index mode
The blob data area should be considered in the total block number to
prevent overlap during incremental builds.
Fixes: b6749839e710 ("erofs-utils: generate preallocated extents for tarerofs")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-7-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:11 +0000 (16:24 +0800)]
erofs-utils: support building image with reserved space
A new mode is prepared in order to preallocate/reserve data blocks only
since some applications tend to fill data after EROFS images are built.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-6-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:10 +0000 (16:24 +0800)]
erofs-utils: fix up unchanged directory pNIDs for incremental builds
For incremental builds, it's unnecessary to dump all unchanged
directories, yet the pNIDs of those directories need to be fixed to
the new parent on-disk inodes.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-5-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:09 +0000 (16:24 +0800)]
erofs-utils: introduce incremental builds
This introduces incremental build support for mkfs, where new on-disk
(meta)data will be appended in a log-structured manner, except for the
root inode (due to current on-disk limitations), as illustrated below:
___________________________________________
| base | delta 0 | delta 1 | .. | delta n-1 |
|______|_________|_________|____|___________|
---> image/data growth direction
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-4-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:08 +0000 (16:24 +0800)]
erofs-utils: fix up root inode for incremental builds
Move the new root inode to the original location if it cannot
be accessed by the super block.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-3-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:07 +0000 (16:24 +0800)]
erofs-utils: mkfs: minor cleanup & rearrangement
Introduce erofs_flush_packed_inode() and more for exporting liberofs
APIs later.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-2-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 18 Jun 2024 08:24:06 +0000 (16:24 +0800)]
erofs-utils: simplify erofs_insert_ihash
Get rid of unnecessary arguments for simplicity.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240618082414.47876-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Wed, 19 Jun 2024 02:50:24 +0000 (10:50 +0800)]
erofs-utils: fix erofs_io_p{read,write} and erofs_dev_close
erofs_io_p{read,write} should return the number of bytes
successfully {read,write}.
This also fixes `erofs_dev_close` which could close random
fds if `vf->ops` is NULL.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240619025024.1109782-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 12 Jun 2024 16:18:26 +0000 (00:18 +0800)]
erofs-utils: lib: get rid of global sbi in lib/inode.c
In order to prepare for incremental builds.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-5-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 12 Jun 2024 16:18:25 +0000 (00:18 +0800)]
erofs-utils: mkfs: assign root NID in the main thread
Thus it can be customized (skipped), especially for incremental builds.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-4-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 12 Jun 2024 16:18:24 +0000 (00:18 +0800)]
erofs-utils: wrap up superblock reservation for incremental builds
Refactor `erofs_buffer_init()` to wrap up necessary operations for full
builds.
Introduce another `erofs_buffer_init()` to specify start block address
for the upcoming incremental builds.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-3-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 12 Jun 2024 16:18:23 +0000 (00:18 +0800)]
erofs-utils: lib: use filesystem UUID if the device name is not specified
The device name is not always valid.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-2-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 12 Jun 2024 16:18:22 +0000 (00:18 +0800)]
erofs-utils: lib: get rid of erofs_prepare_dir_layout()
Just open-code the previous erofs_prepare_dir_file() and rename
`erofs_prepare_dir_layout()` to `erofs_prepare_dir_file()`.
No logic changes.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612161826.711279-1-hsiangkao@linux.alibaba.com
Gao Xiang [Thu, 13 Jun 2024 02:37:06 +0000 (10:37 +0800)]
erofs-utils: fix incorrect i_nlink in the unified rebuild logic
Fixes: 203c847cc7d1 ("erofs-utils: unify the tree traversal for the rebuild mode")
Closes: https://github.com/erofs/erofsnightly/actions/runs/9492427961/job/26159566596
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240613023706.1269816-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Wed, 12 Jun 2024 03:11:24 +0000 (11:11 +0800)]
erofs-utils: add I/O control for tarerofs stream via `erofs_vfile`
This adds I/O control for tarerofs stream.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612031124.1227558-1-hongzhen@linux.alibaba.com
[ Gao Xiang: code styling fixups. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 12 Jun 2024 02:16:17 +0000 (10:16 +0800)]
erofs-utils: fix the current rebuild mode
`inode->with_diskbuf` can be false in the rebuild mode since
inode data has been mapped before.
Fixes: 203c847cc7d1 ("erofs-utils: unify the tree traversal for the rebuild mode")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612021617.4025762-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 7 Jun 2024 09:53:19 +0000 (17:53 +0800)]
erofs-utils: lib: split erofs_iflush()
So that external programs can directly use it.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240607095319.2169172-2-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 7 Jun 2024 09:53:18 +0000 (17:53 +0800)]
erofs-utils: move erofs_writesb() into lib/
So that external programs can directly use it.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240607095319.2169172-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Thu, 6 Jun 2024 11:18:33 +0000 (19:18 +0800)]
erofs-utils: lib: support virtual files
The current erofs-utils I/O implementation is through file descriptors.
The new `erofs_vfile` provides a more flexible way to perform I/Os.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240606111833.2389455-1-hongzhen@linux.alibaba.com
[ Gao Xiang: minor styling fixes. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
ComixHe [Thu, 6 Jun 2024 07:39:48 +0000 (15:39 +0800)]
erofs-utils: build: support building static library liberofsfuse
Add a new option '--enable-static-fuse' so that erofsfuse can be
imported as a static library directly into other projects.
Signed-off-by: ComixHe <heyuming@deepin.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/3399126AB01D5AB6+2bad5767fc035a7a2234408b0fffa53b3a07aa51.1717659178.git.heyuming@deepin.org
[ Gao Xiang: the target static library shouldn't have dependencies. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 5 Jun 2024 12:32:33 +0000 (20:32 +0800)]
erofs-utils: support Intel Query Processing Library
This adds the preliminary Intel QPL [1] support to enable built-in
In-Memory Analytics Accelerator [2] started from Sapphire Rapids.
It just leverages the synchronous APIs for the sake of simplicity
for now, thus performance for small compressed clusters can still
be improved in the future if needed anyway.
[ QPL 1.5.0+ is strictly needed for pkg-config detection and
it can be explicitly enabled by `--with-qpl`. ]
Here are some performance numbers for reference:
Processors: Intel(R) Xeon(R) Platinum 8475B (192 cores)
Memory: 512 GiB
Dataset: enwik9 (1000000000) [3]
Single-threaded decompression:
 ______________________________________________________________
|                 |_ Cluster size _|_ Image size _|_ Time (s) _|
| LZ4             |          65536 |    391581696 |      0.364 |
| LZ4             |        1048576 |    373309440 |      0.376 |
| Intel QPL (IAA) |        1048576 |    374816768 |      0.386 |
| Intel QPL (IAA) |          65536 |    376057856 |      0.396 |
| Intel QPL (IAA) |           4096 |    399650816 |      0.675 |
| libdeflate (4k) |        1048576 |    374816768 |      1.862 |
| libdeflate (4k) |          65536 |    376057856 |      1.859 |
| libdeflate (4k) |           4096 |    399749120 |      2.203 |
| libdeflate      |        1048576 |    323457024 |      1.318 |
| libdeflate      |          65536 |    328712192 |      1.358 |
| libdeflate      |           4096 |    389943296 |      2.103 |
| Zstd            |            N/A |    312548986 |      1.047 |
| Zstd (fast)     |            N/A |    453096980 |      0.740 |
|_________________|________________|______________|____________|
LZ4 1.9.4: [ mkfs.erofs -zlz4hc,12 -C65536 ]
[ mkfs.erofs -zlz4hc,12 -C1048576 ]
time fsck/fsck.erofs --extract
QPL 1.5.0 (IAA) / libdeflate 1.20 (4k):
[ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C1048576 ]
[ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C65536 ]
[ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C4096 ]
time fsck/fsck.erofs --extract
libdeflate 1.20:
[ mkfs.erofs -zdeflate,level=9 -C1048576 ]
[ mkfs.erofs -zdeflate,level=9 -C65536 ]
[ mkfs.erofs -zdeflate,level=9 -C4096 ]
time fsck/fsck.erofs --extract
Zstd 1.5.6: [ zstd -k ] [ zstd -k --fast ]
time zstd -d -k -f -c --no-progress > /dev/null
[1] https://github.com/intel/qpl
[2] https://www.intel.com/content/www/us/en/products/docs/accelerator-engines/in-memory-analytics-accelerator.html
[3] https://www.mattmahoney.net/dc/textdata.html
Cc: "Feghali, Wajdi K" <wajdi.k.feghali@intel.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240605123233.3833332-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 5 Jun 2024 12:14:47 +0000 (20:14 +0800)]
erofs-utils: introduce z_erofs_parse_cfgs()
This userspace implementation will be mainly used for the upcoming
Intel In-Memory Analytics Accelerator integration.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240605121448.3816160-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 4 Jun 2024 08:40:15 +0000 (16:40 +0800)]
erofs-utils: record sb_size instead of sb_extslots
Just follow the kernel implementation.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240604084015.2291157-2-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 4 Jun 2024 08:40:14 +0000 (16:40 +0800)]
erofs-utils: lib: wrap up zeropadding calculation
Use a simple helper instead of open-coding.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240604084015.2291157-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 31 May 2024 07:13:05 +0000 (15:13 +0800)]
erofs-utils: lib: fix incorrect xattr sharing
There are off-by-one issues after refactoring, and the size of kvbuf
should be calculated by EROFS_XATTR_KVSIZE instead.
Fixes: 5df285cf405d ("erofs-utils: lib: refactor extended attribute name prefixes")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240531071305.1183728-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 28 May 2024 06:43:13 +0000 (14:43 +0800)]
erofs-utils: fix false-positive errors on gcc 4.8.5
Just old compiler bugs.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240528064313.1352565-1-hsiangkao@linux.alibaba.com
Sandeep Dhavale [Thu, 23 May 2024 21:01:31 +0000 (14:01 -0700)]
erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()
Depending on the size of the filesystem being built, there can be a huge
number of elements in the hashmap. Currently we call hashmap_iter_first()
in a while loop to iterate over and free the elements. Although technically
correct, this is inefficient in two respects:
- As we are iterating over the elements for removal, we do not need the
overhead of rehashing.
- The second part, which contributes hugely to the cost, is using
hashmap_iter_first(), as it starts scanning from index 0 and throws away
the previous successful scan. For a sparsely populated hashmap this becomes
O(n^2) in the worst case.
Let's fix this by disabling hashmap shrinking, which avoids rehashing,
and by using hashmap_iter_next(), which is now guaranteed to iterate over
all the elements during removal while avoiding the performance pitfalls
of hashmap_iter_first().
Test with random data shows performance improvement as:
fs_size    Before    After
1G         23s       7s
2G         81s       15s
4G         272s      31s
8G         1252s     61s
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523210131.3126753-3-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
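The cost difference between restarting the scan for every removal and keeping a cursor can be shown with a toy table. This is only a sketch of the access pattern described above: `struct toy_table` and both drain functions are hypothetical, and the table never rehashes, which stands in for the effect of disabling shrinking in the real hashmap.

```c
#include <stddef.h>

#define NSLOTS 1024

/* Toy table: a slot is occupied when non-zero. Not the erofs-utils hashmap. */
struct toy_table {
	int slots[NSLOTS];
};

/* Drain by rescanning from slot 0 for every removal (iter_first style). */
static unsigned long drain_restart(struct toy_table *t)
{
	unsigned long probes = 0;
	size_t i;

	for (;;) {
		for (i = 0; i < NSLOTS; i++) {
			probes++;
			if (t->slots[i])
				break;
		}
		if (i == NSLOTS)
			return probes;	/* nothing left */
		t->slots[i] = 0;	/* "remove" the entry */
	}
}

/* Drain with a cursor that never moves backwards (iter_next style). */
static unsigned long drain_cursor(struct toy_table *t)
{
	unsigned long probes = 0;
	size_t i;

	for (i = 0; i < NSLOTS; i++) {
		probes++;
		t->slots[i] = 0;
	}
	return probes;
}
```

Both functions return the number of slots they had to inspect, so the quadratic versus linear behaviour can be compared directly on equally filled tables.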
Sandeep Dhavale [Thu, 23 May 2024 21:01:30 +0000 (14:01 -0700)]
erofs-utils: lib: provide helper to disable hashmap shrinking
This helper sets hashmap.shrink_at to 0. This is helpful to iterate over
a hashmap using hashmap_iter_next() and use hashmap_remove() in a single
pass efficiently.
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523210131.3126753-2-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Thu, 23 May 2024 02:55:50 +0000 (10:55 +0800)]
erofs-utils: lib: fix uncompressed packed inode
Currently, packed inode can also be used in the unencoded way
such as xattr prefixes.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523025550.2447091-1-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 20 May 2024 06:03:01 +0000 (14:03 +0800)]
erofs-utils: unify the tree traversal for the rebuild mode
Let's drop the legacy approach and `tarerofs` will be applied too.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240520060301.2642650-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 17 May 2024 09:00:48 +0000 (17:00 +0800)]
erofs-utils: mkfs: add `--zfeature-bits` option
Thus, we could traverse all compression features with continuous
numbers easily in the testcases.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240517090048.3039594-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 15 May 2024 05:16:41 +0000 (13:16 +0800)]
erofs-utils: add preliminary zstd support [x]
This patch just adds preliminary Zstandard support to erofs-utils,
since Zstandard currently doesn't officially support fixed-sized output
compression. Mkfs could take more time to finish but it works at least.
The built-in zstd compressor for erofs-utils is slowly WIP, so it will
apparently take more effort.
[ TODO: Later I tend to add another way to generate fixed-sized input
pclusters temporarily for relatively large pcluster sizes as
an option since it will have minor impacts to the results. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515051641.3929058-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 15 May 2024 17:23:13 +0000 (01:23 +0800)]
erofs-utils: pretty root directory progressinfo
Avoid `Processing ...` or `file dumped (mode 40755)` messages.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515172313.661530-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 15 May 2024 17:22:34 +0000 (01:22 +0800)]
erofs-utils: correct the default number of workers in the usage
Fixes: 59c36e7a4008 ("erofs-utils: mkfs: use all available processors by default")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515172236.661035-1-hsiangkao@linux.alibaba.com
Noboru Asai [Wed, 1 May 2024 02:24:20 +0000 (11:24 +0900)]
erofs-utils: optimize pthread_cond_signal calling
Call pthread_cond_signal once per file.
Signed-off-by: Noboru Asai <asai@sijam.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240501022420.1881305-1-asai@sijam.com
[ Gao Xiang: add potential overflow detection. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 30 Apr 2024 06:37:30 +0000 (14:37 +0800)]
erofs-utils: lib: adjust MicroLZMA default dictionary size
If dict_size is not given, it will be set as max(32k, pclustersize * 8)
but no more than Z_EROFS_LZMA_MAX_DICT_SIZE.
Also kill an obsolete warning since multi-threaded support has landed.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240430063730.599937-2-hsiangkao@linux.alibaba.com
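The default dictionary size rule above amounts to a clamp. A rough sketch, where the function name is hypothetical and `max_dict_size` stands in for Z_EROFS_LZMA_MAX_DICT_SIZE (its value is passed in by the caller here rather than assumed):

```c
/*
 * Default MicroLZMA dictionary size as described in the commit message:
 * max(32 KiB, pclustersize * 8), capped at max_dict_size.
 * Hypothetical helper, not the erofs-utils implementation.
 */
static unsigned long long
sketch_lzma_default_dict_size(unsigned long long pclustersize,
			      unsigned long long max_dict_size)
{
	unsigned long long dict_size = pclustersize * 8;

	if (dict_size < 32 * 1024)
		dict_size = 32 * 1024;		/* lower bound: 32 KiB */
	if (dict_size > max_dict_size)
		dict_size = max_dict_size;	/* upper bound */
	return dict_size;
}
```

With a 4 KiB pcluster the 32 KiB floor applies, while larger pclusters scale the dictionary by eight until the cap is reached.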
Gao Xiang [Wed, 1 May 2024 04:54:10 +0000 (12:54 +0800)]
erofs-utils: record pclustersize in bytes instead of blocks
So that we don't need to handle blocksizes everywhere.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240501045410.1808086-1-hsiangkao@linux.alibaba.com
Gao Xiang [Sat, 27 Apr 2024 06:25:52 +0000 (14:25 +0800)]
erofs-utils: mkfs: use all available processors by default
Fulfill the needs of most users.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240427062552.744810-1-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 22 Apr 2024 00:34:50 +0000 (08:34 +0800)]
erofs-utils: mkfs: enable inter-file multi-threaded compression
Dispatch deferred ops in another per-sb worker thread. Note that
deferred ops are strictly FIFOed.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-8-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:49 +0000 (08:34 +0800)]
erofs-utils: lib: introduce non-directory jobitem context
It will describe EROFS_MKFS_JOB_NDIR defer work. Also, start
compression before queueing EROFS_MKFS_JOB_NDIR.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-7-xiang@kernel.org
Yifan Zhao [Mon, 22 Apr 2024 00:34:48 +0000 (08:34 +0800)]
erofs-utils: mkfs: prepare inter-file multi-threaded compression
This patch separates the compression process into two parts.
Specifically, erofs_begin_compressed_file() will trigger compression.
erofs_write_compressed_file() will wait for the compression to finish and
write compressed (meta)data.
Note that it's possible that erofs_begin_compressed_file() and
erofs_write_compressed_file() run in different threads even if the
global inode context is used, thus add another synchronization point.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Co-authored-by: Tong Xin <xin_tong@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-6-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:47 +0000 (08:34 +0800)]
erofs-utils: lib: split up z_erofs_mt_compress()
The on-disk compressed data write will be moved into a new function
erofs_mt_write_compressed_file().
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-5-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:46 +0000 (08:34 +0800)]
erofs-utils: rearrange several fields for multi-threaded mkfs
They should be located in `struct z_erofs_compress_ictx`.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-4-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:45 +0000 (08:34 +0800)]
erofs-utils: lib: split out erofs_commit_compressed_file()
Just split out on-disk compressed metadata commit logic.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-3-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:44 +0000 (08:34 +0800)]
erofs-utils: lib: prepare for later deferred work
Split out ordered metadata operations and add the following helpers:
- erofs_mkfs_jobfn()
- erofs_mkfs_go()
to handle these mkfs job items for multi-threading support.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-2-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:43 +0000 (08:34 +0800)]
erofs-utils: use erofs_atomic_t for inode->i_count
`inode->i_count` can be touched by more than one thread if
multi-threading is enabled.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-1-xiang@kernel.org
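A minimal sketch of an atomically updated reference count, using C11 `<stdatomic.h>` as a stand-in. How `erofs_atomic_t` is actually defined is not stated in the commit message, so the type and the `sketch_` helpers below are assumptions for illustration only.

```c
#include <stdatomic.h>

/* Hypothetical inode with a reference count shared across threads. */
struct sketch_inode {
	atomic_uint i_count;
};

/* Take a reference. */
static struct sketch_inode *sketch_igrab(struct sketch_inode *inode)
{
	atomic_fetch_add(&inode->i_count, 1);
	return inode;
}

/* Drop a reference; returns the number of references left. */
static unsigned int sketch_iput(struct sketch_inode *inode)
{
	/* atomic_fetch_sub() returns the previous value, hence the "- 1". */
	return atomic_fetch_sub(&inode->i_count, 1) - 1;
}
```

A plain `inode->i_count++` is a read-modify-write sequence that two threads can interleave, whereas the atomic variants make each update indivisible.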
Yifan Zhao [Mon, 22 Apr 2024 11:31:32 +0000 (19:31 +0800)]
erofs-utils: fsck: extract chunk-based file with hole correctly
Currently fsck skips file extraction if it finds that EROFS_MAP_MAPPED
is unset, which is not the case for chunk-based files with holes.
This patch handles the corner case correctly.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Noboru Asai [Wed, 24 Apr 2024 05:59:23 +0000 (14:59 +0900)]
erofs-utils: add missing block counting
Add missing block counting when the data to be inlined is not inlined.
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/ZijhA4IJFSO7FYUy@debian
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Sat, 6 Apr 2024 05:37:17 +0000 (13:37 +0800)]
erofs-utils: lib: refine on-disk meta arrangement again
Use DFS instead of BFS since most workloads like `ls -R` and `tar -c`
traverse in depth-first mode. However, it still arranges sub-directory
inodes closely so that it isn't a simple reversion compared to pre-BFS
old versions.
Also, the build performance on the linux-6.1.53 source code is greatly
improved by 91.2% (33.040s -> 2.861s), and the new image size is
decreased by 0.0094% (120 KiB), which is minor though.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240406053717.565119-2-hsiangkao@linux.alibaba.com
Yifan Zhao [Sat, 6 Apr 2024 05:37:16 +0000 (13:37 +0800)]
erofs-utils: lib: split out several helpers in inode.c
The following new helpers are added to prepare for the upcoming
multi-threaded inter-file compression:
- erofs_mkfs_handle_{non,}directory;
- erofs_write_unencoded_file.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240406053717.565119-1-hsiangkao@linux.alibaba.com
Yifan Zhao [Thu, 18 Apr 2024 12:23:12 +0000 (20:23 +0800)]
erofs-utils: mkfs: skip the redundant write for ztailpacking block
z_erofs_merge_segment() doesn't consider the ztailpacking block in the
extent list and unnecessarily writes it back to the disk. This patch
fixes this issue by introducing a new `inlined` field in the struct
`z_erofs_inmem_extent`.
Fixes: 830b27bc2334 ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
[ Gao Xiang: simplify a bit. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240418122312.99282-1-xiang@kernel.org
Sandeep Dhavale [Wed, 17 Apr 2024 23:48:44 +0000 (16:48 -0700)]
erofs-utils: lib: treat data blocks filled with 0s as a hole
Add an optimization to treat data blocks filled with 0s as a hole.
Even though disk space savings are comparable to chunk-based or dedupe,
having no block assigned saves us redundant disk I/Os during read.
To detect blocks filled with zeros during chunking, we insert a block
filled with zeros (zerochunk) in the hashmap. If we detect a possible
dedupe, we map it to the hole so there is no physical block assigned.
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240417234845.2758882-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
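For the zero-block change above, a rough sketch of mapping all-zero blocks to a hole so that no physical block is assigned. Note that this uses a direct byte comparison instead of the hashmap/zerochunk lookup the commit describes; `DEMO_HOLE`, `demo_block_is_zeroed` and `demo_map_blocks` are hypothetical names, not erofs-utils APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HOLE ((unsigned int)-1)	/* hypothetical "no block" marker */

/* True if the blksz-byte block contains only zero bytes. */
static bool demo_block_is_zeroed(const unsigned char *blk, size_t blksz)
{
	assert(blksz > 0);
	/* First byte is zero and every byte equals its successor. */
	return blk[0] == 0 && memcmp(blk, blk + 1, blksz - 1) == 0;
}

/*
 * Assign physical block addresses to nblocks logical blocks, mapping
 * all-zero blocks to DEMO_HOLE.  Returns the number of physical blocks used.
 */
static unsigned int demo_map_blocks(const unsigned char *data, size_t nblocks,
				    size_t blksz, unsigned int *blkaddr)
{
	unsigned int next = 0;
	size_t i;

	for (i = 0; i < nblocks; i++) {
		if (demo_block_is_zeroed(data + i * blksz, blksz))
			blkaddr[i] = DEMO_HOLE;
		else
			blkaddr[i] = next++;
	}
	return next;
}
```

A reader that sees `DEMO_HOLE` can return zeros without issuing any disk read, which is the saving the commit message points at.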
Sandeep Dhavale [Thu, 18 Apr 2024 00:00:54 +0000 (17:00 -0700)]
erofs-utils: dump: print filesystem blocksize
mkfs.erofs supports creating filesystem images with different
blocksizes. Add filesystem blocksize in super block dump so
it's easier to inspect the filesystem.
The field is added after FS magic, so the output now looks like:
Filesystem magic number: 0xE0F5E1E2
Filesystem blocksize: 65536
Filesystem blocks: 21
Filesystem inode metadata start block: 0
Filesystem shared xattr metadata start block: 0
Filesystem root nid: 36
Filesystem lz4_max_distance: 65535
Filesystem sb_extslots: 0
Filesystem inode count: 10
Filesystem created: Wed Apr 17 16:53:10 2024
Filesystem features: sb_csum mtime 0padding
Filesystem UUID: e66f6dd1-6882-48c3-9770-fee7c4841a93
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240418000054.2769023-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Noboru Asai [Mon, 8 Apr 2024 09:16:26 +0000 (18:16 +0900)]
erofs-utils: change ztailpacking temporary buffer to non-static
In multi-threaded mode, each thread must use a different buffer in
tryrecompress_trailing(), so change this buffer to non-static.
Signed-off-by: Noboru Asai <asai@sijam.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240408091627.336554-1-asai@sijam.com
[ Gao Xiang: slightly refine the subject line. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Thu, 11 Apr 2024 10:00:39 +0000 (18:00 +0800)]
erofs-utils: lib: fix tarerofs 32-bit overflows
Otherwise, large files won't be imported properly.
Fixes: e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams for tarerofs")
Fixes: 95d315fd7958 ("erofs-utils: introduce tarerofs")
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20240411100039.197417-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Sandeep Dhavale [Wed, 3 Apr 2024 07:07:00 +0000 (00:07 -0700)]
erofs-utils: lib: Fix calculation of minextblks when working with sparse files
When we work with sparse files (files with holes), we need to consider
when the contiguous data block starts after each hole to correctly calculate
minextblks so we can merge consecutive chunks later.
Now that we need to recalculate minextblks in multiple places, put the
logic in a helper function to avoid repetition and ease reading.
Fixes: 7b46f7a0160a ("erofs-utils: lib: merge consecutive chunks if possible")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240403070700.1716252-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 2 Apr 2024 02:58:58 +0000 (10:58 +0800)]
erofs-utils: set opaque flag for directories in tarerofs mode
Opaque dir flag is needed if the tar tree is used immediately for
the upcoming append mode.
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240402025858.1729161-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 27 Mar 2024 00:46:14 +0000 (08:46 +0800)]
erofs: fix compression fallback in tarerofs mode
The return value of `lseek(fd, fpos, SEEK_SET)` can overflow the `int`
type. Fix this.
Fixes: 376fb2dbe66d ("erofs-utils: lib: introduce diskbuf")
Link: https://lore.kernel.org/r/20240327004614.1465889-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
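For the `lseek()` fix above, a small sketch of the pitfall: `lseek()` returns an `off_t`, so storing its result in an `int` can truncate offsets beyond 2 GiB. The helper name `demo_seek_to` is hypothetical, and this is not the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek to an absolute position; 0 on success, negative errno on failure. */
static int demo_seek_to(int fd, off_t fpos)
{
	off_t ret;

	assert(fpos >= 0);
	ret = lseek(fd, fpos, SEEK_SET);	/* keep the full off_t width */
	if (ret < 0)
		return -errno;
	return ret == fpos ? 0 : -EIO;
}
```

Only the small status code is narrowed to `int`; the offset itself never passes through a 32-bit variable.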
Gao Xiang [Fri, 22 Mar 2024 08:50:07 +0000 (16:50 +0800)]
erofs-utils: tar: all regular inodes should be zeroed in headerball mode
... instead of reporting I/O errors, which would imply a corrupted image.
Fixes: 6894ca9623e7 ("erofs-utils: mkfs: Support tar source without data")
Link: https://lore.kernel.org/r/20240322085007.2592729-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Noboru Asai [Fri, 22 Mar 2024 12:24:31 +0000 (20:24 +0800)]
erofs-utils: move pclustersize to `struct z_erofs_compress_sctx`
With -E(all-)fragments, pclustersize has a different value per segment,
so move it to `struct z_erofs_compress_sctx`.
Signed-off-by: Noboru Asai <asai@sijam.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240321070236.2396573-1-asai@sijam.com
Gao Xiang [Tue, 19 Mar 2024 08:24:55 +0000 (16:24 +0800)]
erofs-utils: lib: fix multi-threaded compression in tarerofs mode
Since pread() can be used during multi-threaded compression, it's
necessary to pass `fpos` in to indicate the absolute offset.
Fixes: aec8487dce4c ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Link: https://lore.kernel.org/r/20240319082455.4115493-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Yifan Zhao [Fri, 15 Mar 2024 01:10:19 +0000 (09:10 +0800)]
erofs-utils: mkfs: introduce inner-file multi-threaded compression
Currently, the creation of EROFS compressed images is
single-threaded, which suffers from performance issues. This patch
attempts to address it by compressing large files in parallel.
Specifically, each input file larger than 16MiB is split into
segments, and each worker thread compresses a segment as if it were
a separate file. Finally, the main thread merges all the compressed
segments.
Multi-threaded compression is not compatible with -Ededupe,
-E(all-)fragments for now.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Co-authored-by: Tong Xin <xin_tong@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-5-hsiangkao@linux.alibaba.com
Link: https://lore.kernel.org/r/ZfaW3oLe8Q2621DV@debian
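For the segment splitting described above, a minimal sketch of the arithmetic, assuming fixed 16 MiB segments with a shorter final segment. The names `DEMO_SEGMENT_SIZE`, `demo_nr_segments` and `demo_segment_range` are hypothetical; how mkfs.erofs actually sizes the last segment is not stated in this log.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SEGMENT_SIZE (16ULL << 20)	/* 16 MiB, per the commit message */

/* Number of segments a file of `size` bytes is split into (at least 1). */
static uint64_t demo_nr_segments(uint64_t size)
{
	if (size == 0)
		return 1;
	return (size + DEMO_SEGMENT_SIZE - 1) / DEMO_SEGMENT_SIZE;
}

/* Byte range [*start, *start + *len) covered by segment `idx`. */
static void demo_segment_range(uint64_t size, uint64_t idx,
			       uint64_t *start, uint64_t *len)
{
	uint64_t remaining;

	assert(idx < demo_nr_segments(size));
	*start = idx * DEMO_SEGMENT_SIZE;
	remaining = size - *start;
	*len = remaining < DEMO_SEGMENT_SIZE ? remaining : DEMO_SEGMENT_SIZE;
}
```

Each worker thread would then compress one such range independently, and the main thread merges the results in index order.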
Gao Xiang [Fri, 15 Mar 2024 01:10:18 +0000 (09:10 +0800)]
erofs-utils: lib: introduce atomic operations
Add some helpers (relaxed semantics) in order to prepare for the
upcoming multi-threaded support.
For example, compressor may be initialized more than once in different
worker threads, resulting in noisy warnings.
This patch makes sure that each message will be printed only once by
adding `__warnonce` atomic booleans to each erofs_compressor_init().
Cc: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-4-hsiangkao@linux.alibaba.com
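For the warn-once behaviour above, a minimal sketch using a C11 atomic boolean with relaxed ordering. `demo_warn_once` is a hypothetical helper; the actual `__warnonce` variables and erofs atomic helpers may be shaped differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Print `msg` only for the first caller that flips the flag.
 * Returns true if this call printed the message.
 */
static bool demo_warn_once(atomic_bool *warned, const char *msg)
{
	assert(warned && msg);
	/* atomic_exchange returns the previous value of the flag. */
	if (atomic_exchange_explicit(warned, true, memory_order_relaxed))
		return false;
	fprintf(stderr, "warning: %s\n", msg);
	return true;
}
```

Relaxed ordering is enough here because the flag guards nothing but the message itself; no other data is published through it.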
Yifan Zhao [Fri, 15 Mar 2024 01:10:17 +0000 (09:10 +0800)]
erofs-utils: mkfs: add `--workers=#` parameter
This patch introduces `--workers=#` parameter for the incoming
multi-threaded compression support.
It also introduces a concept called `segment size` to split large
inodes for multi-threaded compression, which has the fixed value
16MiB and cannot be modified for now.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-3-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 15 Mar 2024 01:10:16 +0000 (09:10 +0800)]
erofs-utils: add a helper to get available processors
In order to prepare for multi-threaded decompression.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-2-hsiangkao@linux.alibaba.com
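For the helper mentioned above, one common way to query the number of online processors is `sysconf(_SC_NPROCESSORS_ONLN)`. This is only a sketch: `demo_available_processors` is a hypothetical name, `_SC_NPROCESSORS_ONLN` is a widely available extension rather than a strict POSIX requirement, and the real helper may use another mechanism.

```c
#include <assert.h>
#include <unistd.h>

/* Number of online processors, falling back to 1 if it cannot be queried. */
static unsigned int demo_available_processors(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n > 0 ? (unsigned int)n : 1;
}
```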
Yifan Zhao [Fri, 15 Mar 2024 01:10:15 +0000 (09:10 +0800)]
erofs-utils: introduce multi-threading framework
Add a workqueue implementation for multi-threading support inspired by
xfsprogs.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-1-hsiangkao@linux.alibaba.com
Gao Xiang [Sun, 3 Mar 2024 14:35:30 +0000 (22:35 +0800)]
erofs-utils: support xz/lzma/lzip streams for tarerofs
Similar to commit e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams
for tarerofs"), let's add xz/lzma/lzip support by using liblzma.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240303143530.4077607-1-hsiangkao@linux.alibaba.com
Mike Baynton [Tue, 27 Feb 2024 08:42:21 +0000 (16:42 +0800)]
erofs-utils: mkfs: Support tar source without data
This improves performance of meta-only image creation in cases where the
source is a tarball stream that is not seekable. The writer may now use
`--tar=headerball` and omit the file data. Previously, the stream writer
was forced to send the file's size worth of null bytes or any data after
each tar header which was simply discarded by mkfs.erofs.
Signed-off-by: Mike Baynton <mike@mbaynton.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240227084221.342635-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 8 Mar 2024 07:24:29 +0000 (15:24 +0800)]
erofs-utils: update my outdated misleading email address
The @kernel.org one is always preferred though.
Signed-off-by: Gao Xiang <xiang@kernel.org>
Gao Xiang [Thu, 22 Feb 2024 09:01:45 +0000 (17:01 +0800)]
erofs-utils: support dumping raw tar streams together
Since commit e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams for
tarerofs"), tgz streams can be converted to EROFS directly.
However, many use cases also require raw tar streams. Let's add
support for dumping raw streams with `--ungzip=FILE` option.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240222090145.709808-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 21 Feb 2024 05:51:44 +0000 (13:51 +0800)]
erofs-utils: support liblzma auto-detection
The new XZ Utils 5.4 is now available in most Linux distributions.
Let's enable liblzma auto-detection as well as get rid of MicroLZMA
EXPERIMENTAL warning.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240221055144.4054806-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 20 Feb 2024 17:16:59 +0000 (01:16 +0800)]
erofs-utils: support zlib auto-detection
Fix explicit `--with-zlib` so that it errors out when zlib
is unavailable.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240220171700.3693176-1-hsiangkao@linux.alibaba.com
Tianyi Liu [Thu, 8 Feb 2024 13:59:09 +0000 (21:59 +0800)]
erofs-utils: lib: fix incorrect usage of `erofs_strerror`
`erofs_strerror` accepts a negative argument,
so `errno` should be inverted before passing to it.
Signed-off-by: Tianyi Liu <i.pear@outlook.com>
Link: https://lore.kernel.org/r/SY6P282MB3193657433D35C3A7799CA5F9D442@SY6P282MB3193.AUSP282.PROD.OUTLOOK.COM
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
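For the `erofs_strerror` fix above, a sketch of the "negative errno" convention. `demo_strerror` is a hypothetical stand-in that simply wraps `strerror()`; the real `erofs_strerror` implementation is not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper that expects a negative (or zero) error code. */
static const char *demo_strerror(int err)
{
	assert(err <= 0);	/* passing a positive errno is the bug */
	return strerror(-err);
}
```

A caller that has just seen a libc call fail would therefore pass `-errno`, for example `demo_strerror(-errno)`, not `errno` itself.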
Gao Xiang [Wed, 24 Jan 2024 09:16:21 +0000 (17:16 +0800)]
erofs-utils: lib: reset HC to avoid 32-bit overflow of kite-deflate
Yifan reported a "segmentation fault (core dumped)" error days ago
with a large dataset (enwik9 x 5). Let's fix it.
Reported-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Fixes: 861037f4fc15 ("erofs-utils: add a built-in DEFLATE compressor")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Tested-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240124091621.2413606-1-hsiangkao@linux.alibaba.com
Yifan Zhao [Sun, 21 Jan 2024 12:29:02 +0000 (20:29 +0800)]
erofs-utils: mkfs: reorganize logic in erofs_compressor_init()
Currently, the initialization of compressors follows an unusual order:
`.init()` is called first, followed by `.setlevel()`, and then
`.setdictsize()`. However, the actual initialization occurs within the
last-called `.setdictsize()`, for the MicroLZMA, DEFLATE, and libdeflate
compressors.
This patch reorders these functions, with `.init()` now being invoked
last, allowing it to use the compression level and dictsize already set
so that the behavior of the functions matches their names.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240121122902.207756-1-zhaoyifan@sjtu.edu.cn
[ Gao Xiang: refine the commit message. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Yifan Zhao [Sat, 20 Jan 2024 11:53:19 +0000 (19:53 +0800)]
erofs-utils: mkfs: allow to specify dictionary size for compression algorithms
Currently, the dictionary size for compression algorithms is fixed. This
patch allows specifying different ones with the new
-zX,dictsize=<dictsize> option.
This patch also changes the way to specify compression levels. Now, the
compression level is specified with -zX,level=<level> options and could
be specified together with dictsize. The old -zX,<level> form is still
supported for compatibility.
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240120115319.152366-1-zhaoyifan@sjtu.edu.cn
[ Gao Xiang: minor update. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
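For the option forms above, a rough sketch of parsing `X,level=<level>,dictsize=<dictsize>` together with the legacy `X,<level>` form. This is not the mkfs.erofs parser: `demo_zopt` and `demo_parse_zopt` are hypothetical, and the comma-separated combination and plain decimal values are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_zopt {
	char alg[16];		/* algorithm name, e.g. "lzma" */
	int level;		/* -1 if unset */
	unsigned long dictsize;	/* 0 if unset */
};

/* Parse "ALG[,level=N][,dictsize=N]" or legacy "ALG,N"; 0 on success. */
static int demo_parse_zopt(const char *arg, struct demo_zopt *out)
{
	const char *p = strchr(arg, ',');
	size_t alglen = p ? (size_t)(p - arg) : strlen(arg);

	if (alglen == 0 || alglen >= sizeof(out->alg))
		return -1;
	memcpy(out->alg, arg, alglen);
	out->alg[alglen] = '\0';
	out->level = -1;
	out->dictsize = 0;

	while (p) {
		const char *val = ++p;	/* skip the comma */
		int is_dict = 0;
		unsigned long n;
		char *end;

		if (!strncmp(p, "level=", 6)) {
			val = p + 6;
		} else if (!strncmp(p, "dictsize=", 9)) {
			val = p + 9;
			is_dict = 1;
		}
		/* Otherwise: legacy bare number, treated as the level. */
		n = strtoul(val, &end, 10);
		if (end == val || (*end != ',' && *end != '\0'))
			return -1;
		if (is_dict)
			out->dictsize = n;
		else
			out->level = (int)n;
		p = (*end == ',') ? end : NULL;
	}
	return 0;
}
```

Treating a bare number as the level keeps the old `-zX,<level>` spelling working next to the new keyed form.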
Yifan Zhao [Sat, 20 Jan 2024 11:53:14 +0000 (19:53 +0800)]
erofs-utils: mkfs: merge erofs_compressor_setlevel() into erofs_compressor_init()
Currently erofs_compressor_setlevel() is only called once just after
erofs_compressor_init() while initializing compressors. Let's just hide
this interface and set the compression level in erofs_compressor_init().
Besides, we do not need to assign the {default,best}_level for an
algorithm which does not support the compression level in its
erofs_compressor struct.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240120115314.152285-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>