Gao Xiang [Thu, 13 Jun 2024 02:37:06 +0000 (10:37 +0800)]
erofs-utils: fix incorrect i_nlink in the unified rebuild logic
Fixes: 203c847cc7d1 ("erofs-utils: unify the tree traversal for the rebuild mode")
Closes: https://github.com/erofs/erofsnightly/actions/runs/9492427961/job/26159566596
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240613023706.1269816-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Wed, 12 Jun 2024 03:11:24 +0000 (11:11 +0800)]
erofs-utils: add I/O control for tarerofs stream via `erofs_vfile`
This adds I/O control for tarerofs stream.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612031124.1227558-1-hongzhen@linux.alibaba.com
[ Gao Xiang: code styling fixups. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 12 Jun 2024 02:16:17 +0000 (10:16 +0800)]
erofs-utils: fix the current rebuild mode
`inode->with_diskbuf` can be false in the rebuild mode since
inode data has been mapped before.
Fixes: 203c847cc7d1 ("erofs-utils: unify the tree traversal for the rebuild mode")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240612021617.4025762-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 7 Jun 2024 09:53:19 +0000 (17:53 +0800)]
erofs-utils: lib: split erofs_iflush()
So that external programs can directly use it.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240607095319.2169172-2-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 7 Jun 2024 09:53:18 +0000 (17:53 +0800)]
erofs-utils: move erofs_writesb() into lib/
So that external programs can directly use it.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240607095319.2169172-1-hsiangkao@linux.alibaba.com
Hongzhen Luo [Thu, 6 Jun 2024 11:18:33 +0000 (19:18 +0800)]
erofs-utils: lib: support virtual files
The current erofs-utils I/O implementation is through file descriptors.
The new `erofs_vfile` provides a more flexible way to perform I/Os.
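A rough sketch of the idea, assuming a function-pointer table as the indirection: the caller reads through an ops table rather than a raw file descriptor, so the backend can be swapped. Every `sketch_*` name here is hypothetical and this is not the actual `erofs_vfile` layout.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct sketch_vfile;

/* Hypothetical operations table; NOT the real erofs_vfile definition. */
struct sketch_vfops {
	ssize_t (*pread)(struct sketch_vfile *vf, void *buf, size_t len,
			 off_t off);
};

struct sketch_vfile {
	const struct sketch_vfops *ops;
	const char *mem;	/* backing store of the in-memory backend */
	size_t size;
};

/* In-memory backend: copy at most `len` bytes starting at `off`. */
static ssize_t sketch_mem_pread(struct sketch_vfile *vf, void *buf,
				size_t len, off_t off)
{
	if (off < 0 || (size_t)off >= vf->size)
		return 0;	/* reads at or past the end return nothing */
	if (len > vf->size - (size_t)off)
		len = vf->size - (size_t)off;
	memcpy(buf, vf->mem + off, len);
	return (ssize_t)len;
}

static const struct sketch_vfops sketch_mem_ops = {
	.pread = sketch_mem_pread,
};

/* Callers only see this entry point, never the backend itself. */
static ssize_t sketch_io_pread(struct sketch_vfile *vf, void *buf,
			       size_t len, off_t off)
{
	return vf->ops->pread(vf, buf, len, off);
}
```

A file-descriptor backend would plug in the same way, with its `pread` hook forwarding to pread(2).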
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240606111833.2389455-1-hongzhen@linux.alibaba.com
[ Gao Xiang: minor styling fixes. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
ComixHe [Thu, 6 Jun 2024 07:39:48 +0000 (15:39 +0800)]
erofs-utils: build: support building static library liberofsfuse
Add a new option '--enable-static-fuse' so that erofsfuse can be
imported as a static library directly into other projects.
Signed-off-by: ComixHe <heyuming@deepin.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/3399126AB01D5AB6+2bad5767fc035a7a2234408b0fffa53b3a07aa51.1717659178.git.heyuming@deepin.org
[ Gao Xiang: the target static library shouldn't have dependencies. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 5 Jun 2024 12:32:33 +0000 (20:32 +0800)]
erofs-utils: support Intel Query Processing Library
This adds preliminary Intel QPL [1] support to enable the built-in
In-Memory Analytics Accelerator (IAA) [2], available starting from
Sapphire Rapids.
It just leverages the synchronous APIs for the sake of simplicity
for now, so performance for small compressed clusters can still
be improved in the future if needed.
[ QPL 1.5.0+ is strictly needed for pkg-config detection and
it can be explicitly enabled by `--with-qpl`. ]
Here are some performance numbers for reference:
Processors: Intel(R) Xeon(R) Platinum 8475B (192 cores)
Memory: 512 GiB
Dataset: enwik9 (1000000000) [3]
Single-threaded decompression:
 ______________________________________________________________
|                 |_ Cluster size _|_ Image size _|_ Time (s) _|
| LZ4             |          65536 |    391581696 |      0.364 |
| LZ4             |        1048576 |    373309440 |      0.376 |
| Intel QPL (IAA) |        1048576 |    374816768 |      0.386 |
| Intel QPL (IAA) |          65536 |    376057856 |      0.396 |
| Intel QPL (IAA) |           4096 |    399650816 |      0.675 |
| libdeflate (4k) |        1048576 |    374816768 |      1.862 |
| libdeflate (4k) |          65536 |    376057856 |      1.859 |
| libdeflate (4k) |           4096 |    399749120 |      2.203 |
| libdeflate      |        1048576 |    323457024 |      1.318 |
| libdeflate      |          65536 |    328712192 |      1.358 |
| libdeflate      |           4096 |    389943296 |      2.103 |
| Zstd            |            N/A |    312548986 |      1.047 |
| Zstd (fast)     |            N/A |    453096980 |      0.740 |
|_________________|________________|______________|____________|
LZ4 1.9.4:
[ mkfs.erofs -zlz4hc,12 -C65536 ]
[ mkfs.erofs -zlz4hc,12 -C1048576 ]
time fsck/fsck.erofs --extract
QPL 1.5.0 (IAA) / libdeflate 1.20 (4k):
[ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C1048576 ]
[ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C65536 ]
[ mkfs.erofs -zdeflate,level=9,dictsize=4096 -C4096 ]
time fsck/fsck.erofs --extract
libdeflate 1.20:
[ mkfs.erofs -zdeflate,level=9 -C1048576 ]
[ mkfs.erofs -zdeflate,level=9 -C65536 ]
[ mkfs.erofs -zdeflate,level=9 -C4096 ]
time fsck/fsck.erofs --extract
Zstd 1.5.6:
[ zstd -k ] [ zstd -k --fast ]
time zstd -d -k -f -c --no-progress > /dev/null
[1] https://github.com/intel/qpl
[2] https://www.intel.com/content/www/us/en/products/docs/accelerator-engines/in-memory-analytics-accelerator.html
[3] https://www.mattmahoney.net/dc/textdata.html
Cc: "Feghali, Wajdi K" <wajdi.k.feghali@intel.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240605123233.3833332-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 5 Jun 2024 12:14:47 +0000 (20:14 +0800)]
erofs-utils: introduce z_erofs_parse_cfgs()
This userspace implementation will be mainly used for the upcoming
Intel In-Memory Analytics Accelerator integration.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240605121448.3816160-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 4 Jun 2024 08:40:15 +0000 (16:40 +0800)]
erofs-utils: record sb_size instead of sb_extslots
Just follow the kernel implementation.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240604084015.2291157-2-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 4 Jun 2024 08:40:14 +0000 (16:40 +0800)]
erofs-utils: lib: wrap up zeropadding calculation
Use a simple helper instead of open-coding.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240604084015.2291157-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 31 May 2024 07:13:05 +0000 (15:13 +0800)]
erofs-utils: lib: fix incorrect xattr sharing
There are off-by-one issues after refactoring, and the size of kvbuf
should be calculated by EROFS_XATTR_KVSIZE instead.
Fixes: 5df285cf405d ("erofs-utils: lib: refactor extended attribute name prefixes")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240531071305.1183728-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 28 May 2024 06:43:13 +0000 (14:43 +0800)]
erofs-utils: fix false-positive errors on gcc 4.8.5
Just old compiler bugs.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240528064313.1352565-1-hsiangkao@linux.alibaba.com
Sandeep Dhavale [Thu, 23 May 2024 21:01:31 +0000 (14:01 -0700)]
erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()
Depending on the size of the filesystem being built, there can be a huge
number of elements in the hashmap. Currently we call hashmap_iter_first()
in a while loop to iterate over and free the elements. Although technically
correct, this is inefficient in two respects:
- As we are iterating over the elements for removal, we do not need the
overhead of rehashing.
- The second part, which contributes hugely to the cost, is using
hashmap_iter_first(), as it starts scanning from index 0, throwing away
the previous successful scan. For a sparsely populated hashmap this becomes
O(n^2) in the worst case.
Let's fix this by disabling hashmap shrinking, which avoids rehashing,
and by using hashmap_iter_next(), which is then guaranteed to iterate over
all the elements while they are removed, avoiding the performance pitfalls
of hashmap_iter_first().
Test with random data shows performance improvement as:
fs_size   Before   After
1G        23s      7s
2G        81s      15s
4G        272s     31s
8G        1252s    61s
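To illustrate the cost difference described in this commit, here is a minimal sketch on a toy chained hash table. It is not the erofs-utils hashmap, and all `toy_*` names are hypothetical. It counts how many bucket slots are inspected when draining restarts from bucket 0 for every element, versus a single forward pass over a table that is never resized.

```c
#include <stdlib.h>

/* Toy chained hash table; NOT the erofs-utils hashmap. */
struct toy_entry {
	struct toy_entry *next;
	unsigned int key;
};

struct toy_map {
	struct toy_entry **buckets;
	size_t nbuckets;
};

static int toy_init(struct toy_map *m, size_t nbuckets)
{
	m->buckets = calloc(nbuckets, sizeof(*m->buckets));
	m->nbuckets = nbuckets;
	return m->buckets ? 0 : -1;
}

static int toy_add(struct toy_map *m, unsigned int key)
{
	struct toy_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;
	e->next = m->buckets[key % m->nbuckets];
	m->buckets[key % m->nbuckets] = e;
	return 0;
}

/*
 * Free everything in one pass: the bucket cursor only moves forward and
 * the table is never resized. Returns the number of bucket slots inspected.
 */
static size_t toy_drain_single_pass(struct toy_map *m, size_t *freed)
{
	size_t i, visited = 0;

	*freed = 0;
	for (i = 0; i < m->nbuckets; i++) {
		struct toy_entry *e = m->buckets[i];

		visited++;
		while (e) {
			struct toy_entry *next = e->next;

			free(e);
			(*freed)++;
			e = next;
		}
		m->buckets[i] = NULL;
	}
	return visited;
}

/*
 * Free everything by rescanning from bucket 0 for each element, which
 * mimics looking up the "first" element in a loop.
 */
static size_t toy_drain_restart(struct toy_map *m, size_t *freed)
{
	size_t visited = 0;

	*freed = 0;
	for (;;) {
		struct toy_entry *e = NULL;
		size_t i;

		for (i = 0; i < m->nbuckets; i++) {
			visited++;
			if (m->buckets[i]) {
				e = m->buckets[i];
				m->buckets[i] = e->next;
				break;
			}
		}
		if (!e)
			break;
		free(e);
		(*freed)++;
	}
	return visited;
}
```

With one entry in each of n buckets, the single pass inspects n slots, while the restart variant inspects on the order of n^2 / 2.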
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523210131.3126753-3-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Sandeep Dhavale [Thu, 23 May 2024 21:01:30 +0000 (14:01 -0700)]
erofs-utils: lib: provide helper to disable hashmap shrinking
This helper sets hashmap.shrink_at to 0. This is helpful for iterating over
a hashmap with hashmap_iter_next() and calling hashmap_remove() efficiently
in a single pass.
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523210131.3126753-2-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Thu, 23 May 2024 02:55:50 +0000 (10:55 +0800)]
erofs-utils: lib: fix uncompressed packed inode
Currently, the packed inode can also be used in an unencoded way,
e.g. for xattr prefixes.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240523025550.2447091-1-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 20 May 2024 06:03:01 +0000 (14:03 +0800)]
erofs-utils: unify the tree traversal for the rebuild mode
Let's drop the legacy approach and `tarerofs` will be applied too.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240520060301.2642650-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 17 May 2024 09:00:48 +0000 (17:00 +0800)]
erofs-utils: mkfs: add `--zfeature-bits` option
Thus, we could traverse all compression features with continuous
numbers easily in the testcases.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240517090048.3039594-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 15 May 2024 05:16:41 +0000 (13:16 +0800)]
erofs-utils: add preliminary zstd support
This patch just adds preliminary Zstandard support to erofs-utils,
since Zstandard currently doesn't support fixed-sized output compression
officially. Mkfs could take more time to finish, but at least it works.
The built-in zstd compressor for erofs-utils is still a slow work in
progress and will take more effort.
[ TODO: Later I tend to add another way to generate fixed-sized input
pclusters temporarily for relatively large pcluster sizes as
an option since it will have minor impacts to the results. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515051641.3929058-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 15 May 2024 17:23:13 +0000 (01:23 +0800)]
erofs-utils: pretty root directory progressinfo
Avoid `Processing ...` or `file dumped (mode 40755)` messages.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515172313.661530-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 15 May 2024 17:22:34 +0000 (01:22 +0800)]
erofs-utils: correct the default number of workers in the usage
Fixes: 59c36e7a4008 ("erofs-utils: mkfs: use all available processors by default")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240515172236.661035-1-hsiangkao@linux.alibaba.com
Noboru Asai [Wed, 1 May 2024 02:24:20 +0000 (11:24 +0900)]
erofs-utils: optimize pthread_cond_signal calling
Call pthread_cond_signal once per file.
Signed-off-by: Noboru Asai <asai@sijam.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240501022420.1881305-1-asai@sijam.com
[ Gao Xiang: add potential overflow detection. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 30 Apr 2024 06:37:30 +0000 (14:37 +0800)]
erofs-utils: lib: adjust MicroLZMA default dictionary size
If dict_size is not given, it will be set as max(32k, pclustersize * 8)
but no more than Z_EROFS_LZMA_MAX_DICT_SIZE.
Also kill an obsolete warning since multi-threaded support is landed.
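The default described above can be written as a small clamp. This is only a sketch: the 8 MiB cap is an assumed value for `Z_EROFS_LZMA_MAX_DICT_SIZE` (check the erofs headers for the real one), and the `sketch_*` / `SKETCH_*` names are hypothetical.

```c
#include <stdint.h>

/* Assumed values for illustration; the real limits live in the erofs headers. */
#define SKETCH_LZMA_MIN_DICT_SIZE	(32U * 1024)
#define SKETCH_LZMA_MAX_DICT_SIZE	(8U * 1024 * 1024)

/* Default dictionary size: max(32k, pclustersize * 8), capped at the maximum. */
static uint32_t sketch_default_dict_size(uint32_t pclustersize)
{
	/* 64-bit intermediate so that pclustersize * 8 cannot wrap around */
	uint64_t dict = (uint64_t)pclustersize * 8;

	if (dict < SKETCH_LZMA_MIN_DICT_SIZE)
		dict = SKETCH_LZMA_MIN_DICT_SIZE;
	if (dict > SKETCH_LZMA_MAX_DICT_SIZE)
		dict = SKETCH_LZMA_MAX_DICT_SIZE;
	return (uint32_t)dict;
}
```

Under these assumptions, a 4 KiB pcluster gets a 32 KiB dictionary and a 64 KiB pcluster gets 512 KiB.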
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240430063730.599937-2-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 1 May 2024 04:54:10 +0000 (12:54 +0800)]
erofs-utils: record pclustersize in bytes instead of blocks
So that we don't need to handle blocksizes everywhere.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240501045410.1808086-1-hsiangkao@linux.alibaba.com
Gao Xiang [Sat, 27 Apr 2024 06:25:52 +0000 (14:25 +0800)]
erofs-utils: mkfs: use all available processors by default
Fulfill the needs of most users.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240427062552.744810-1-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 22 Apr 2024 00:34:50 +0000 (08:34 +0800)]
erofs-utils: mkfs: enable inter-file multi-threaded compression
Dispatch deferred ops in another per-sb worker thread. Note that
deferred ops are strictly FIFOed.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-8-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:49 +0000 (08:34 +0800)]
erofs-utils: lib: introduce non-directory jobitem context
It will describe EROFS_MKFS_JOB_NDIR defer work. Also, start
compression before queueing EROFS_MKFS_JOB_NDIR.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-7-xiang@kernel.org
Yifan Zhao [Mon, 22 Apr 2024 00:34:48 +0000 (08:34 +0800)]
erofs-utils: mkfs: prepare inter-file multi-threaded compression
This patch separates the compression process into two parts.
Specifically, erofs_begin_compressed_file() will trigger compression.
erofs_write_compressed_file() will wait for the compression to finish and
write compressed (meta)data.
Note that it's possible that erofs_begin_compressed_file() and
erofs_write_compressed_file() run in different threads even if the
global inode context is used, thus add another synchronization point.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Co-authored-by: Tong Xin <xin_tong@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-6-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:47 +0000 (08:34 +0800)]
erofs-utils: lib: split up z_erofs_mt_compress()
The on-disk compressed data write will be moved into a new function
erofs_mt_write_compressed_file().
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-5-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:46 +0000 (08:34 +0800)]
erofs-utils: rearrange several fields for multi-threaded mkfs
They should be located in `struct z_erofs_compress_ictx`.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-4-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:45 +0000 (08:34 +0800)]
erofs-utils: lib: split out erofs_commit_compressed_file()
Just split out on-disk compressed metadata commit logic.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-3-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:44 +0000 (08:34 +0800)]
erofs-utils: lib: prepare for later deferred work
Split out ordered metadata operations and add the following helpers:
- erofs_mkfs_jobfn()
- erofs_mkfs_go()
to handle these mkfs job items for multi-threading support.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-2-xiang@kernel.org
Gao Xiang [Mon, 22 Apr 2024 00:34:43 +0000 (08:34 +0800)]
erofs-utils: use erofs_atomic_t for inode->i_count
Since `inode->i_count` can be touched by more than one thread if
multi-threading is enabled.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240422003450.19132-1-xiang@kernel.org
Yifan Zhao [Mon, 22 Apr 2024 11:31:32 +0000 (19:31 +0800)]
erofs-utils: fsck: extract chunk-based file with hole correctly
Currently fsck skips file extraction if it finds that EROFS_MAP_MAPPED
is unset, which is not the case for chunk-based files with holes.
This patch handles the corner case correctly.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Noboru Asai [Wed, 24 Apr 2024 05:59:23 +0000 (14:59 +0900)]
erofs-utils: add missing block counting
Add missing block counting when the data to be inlined is not inlined.
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/ZijhA4IJFSO7FYUy@debian
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Sat, 6 Apr 2024 05:37:17 +0000 (13:37 +0800)]
erofs-utils: lib: refine on-disk meta arrangement again
Use DFS instead of BFS since most workloads like `ls -R` and `tar -c`
traverse in depth-first mode. However, it still arranges sub-directory
inodes closely so that it isn't a simple reversion compared to pre-BFS
old versions.
Also, the build time for the linux-6.1.53 source tree is greatly
improved, by 91.2% (33.040s -> 2.861s), and the new image size is
decreased by 0.0094% (120 KiB), which is minor though.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240406053717.565119-2-hsiangkao@linux.alibaba.com
Yifan Zhao [Sat, 6 Apr 2024 05:37:16 +0000 (13:37 +0800)]
erofs-utils: lib: split out several helpers in inode.c
The following new helpers are added to prepare for the upcoming
multi-threaded inter-file compression:
- erofs_mkfs_handle_{non,}directory;
- erofs_write_unencoded_file.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240406053717.565119-1-hsiangkao@linux.alibaba.com
Yifan Zhao [Thu, 18 Apr 2024 12:23:12 +0000 (20:23 +0800)]
erofs-utils: mkfs: skip the redundant write for ztailpacking block
z_erofs_merge_segment() doesn't consider the ztailpacking block in the
extent list and unnecessarily writes it back to the disk. This patch
fixes this issue by introducing a new `inlined` field in the struct
`z_erofs_inmem_extent`.
Fixes: 830b27bc2334 ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
[ Gao Xiang: simplify a bit. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240418122312.99282-1-xiang@kernel.org
Sandeep Dhavale [Wed, 17 Apr 2024 23:48:44 +0000 (16:48 -0700)]
erofs-utils: lib: treat data blocks filled with 0s as a hole
Add optimization to treat data blocks filled with 0s as a hole.
Even though diskspace savings are comparable to chunk based or dedupe,
having no block assigned saves us redundant disk IOs during read.
To detect blocks filled with zeros during chunking, we insert block
filled with zeros (zerochunk) in the hashmap. If we detect a possible
dedupe, we map it to the hole so there is no physical block assigned.
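The commit detects zero-filled blocks through the dedupe hashmap (a zerochunk entry). The sketch below shows a different, more direct technique for the same question, namely checking whether a block is all zeros with a self-overlapping memcmp(). The function name is hypothetical and this is not the erofs-utils code path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the `len`-byte block at `buf` contains only zero bytes. */
static bool sketch_block_is_zero(const unsigned char *buf, size_t len)
{
	if (len == 0)
		return true;
	/*
	 * If the first byte is 0 and every byte equals its successor,
	 * then all bytes are 0. memcmp() only reads, so the overlapping
	 * ranges are fine.
	 */
	return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}
```

A block for which this returns true could be recorded as a hole instead of being written out.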
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240417234845.2758882-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Sandeep Dhavale [Thu, 18 Apr 2024 00:00:54 +0000 (17:00 -0700)]
erofs-utils: dump: print filesystem blocksize
mkfs.erofs supports creating filesystem images with different
blocksizes. Add the filesystem blocksize to the super block dump so
it's easier to inspect the filesystem.
The field is added after FS magic, so the output now looks like:
Filesystem magic number: 0xE0F5E1E2
Filesystem blocksize: 65536
Filesystem blocks: 21
Filesystem inode metadata start block: 0
Filesystem shared xattr metadata start block: 0
Filesystem root nid: 36
Filesystem lz4_max_distance: 65535
Filesystem sb_extslots: 0
Filesystem inode count: 10
Filesystem created: Wed Apr 17 16:53:10 2024
Filesystem features: sb_csum mtime 0padding
Filesystem UUID: e66f6dd1-6882-48c3-9770-fee7c4841a93
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240418000054.2769023-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Noboru Asai [Mon, 8 Apr 2024 09:16:26 +0000 (18:16 +0900)]
erofs-utils: change ztailpacking temporary buffer to non-static
In multi-threaded mode, each thread must use a different buffer in
tryrecompress_trailing(), so change this buffer to non-static.
Signed-off-by: Noboru Asai <asai@sijam.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240408091627.336554-1-asai@sijam.com
[ Gao Xiang: slightly refine the subject line. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Thu, 11 Apr 2024 10:00:39 +0000 (18:00 +0800)]
erofs-utils: lib: fix tarerofs 32-bit overflows
Otherwise, large files won't be imported properly.
Fixes: e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams for tarerofs")
Fixes: 95d315fd7958 ("erofs-utils: introduce tarerofs")
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20240411100039.197417-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Sandeep Dhavale [Wed, 3 Apr 2024 07:07:00 +0000 (00:07 -0700)]
erofs-utils: lib: Fix calculation of minextblks when working with sparse files
When we work with sparse files (files with holes), we need to consider
where the contiguous data block starts after each hole to correctly calculate
minextblks so we can merge consecutive chunks later.
Now that we need to recalculate minextblks in multiple places, put the logic
in a helper function to avoid repetition and make it easier to read.
Fixes: 7b46f7a0160a ("erofs-utils: lib: merge consecutive chunks if possible")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240403070700.1716252-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 2 Apr 2024 02:58:58 +0000 (10:58 +0800)]
erofs-utils: set opaque flag for directories in tarerofs mode
Opaque dir flag is needed if the tar tree is used immediately for
the upcoming append mode.
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240402025858.1729161-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 27 Mar 2024 00:46:14 +0000 (08:46 +0800)]
erofs: fix compression fallback in tarerofs mode
The return value of `lseek(fd, fpos, SEEK_SET)` can overflow the `int`
type. Fix this.
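A minimal sketch of the pitfall, assuming a plain POSIX environment: lseek() returns an off_t, so the result has to be kept in an off_t rather than an int, otherwise positions of 2 GiB and above are truncated. The helper name is hypothetical.

```c
#define _FILE_OFFSET_BITS 64	/* ask for a 64-bit off_t on 32-bit platforms */
#include <sys/types.h>
#include <unistd.h>

/*
 * Seek to an absolute position. The result of lseek() is stored in an
 * off_t; an int would not be able to hold large offsets.
 * Returns 0 on success, -1 on failure.
 */
static int sketch_seek_abs(int fd, off_t fpos)
{
	off_t ret = lseek(fd, fpos, SEEK_SET);

	if (ret == (off_t)-1)
		return -1;
	return ret == fpos ? 0 : -1;
}
```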
Fixes: 376fb2dbe66d ("erofs-utils: lib: introduce diskbuf")
Link: https://lore.kernel.org/r/20240327004614.1465889-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Fri, 22 Mar 2024 08:50:07 +0000 (16:50 +0800)]
erofs-utils: tar: all regular inodes should be zeroed in headerball mode
.. Instead of reporting I/O errors, which imply a corrupted image.
Fixes: 6894ca9623e7 ("erofs-utils: mkfs: Support tar source without data")
Link: https://lore.kernel.org/r/20240322085007.2592729-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Noboru Asai [Fri, 22 Mar 2024 12:24:31 +0000 (20:24 +0800)]
erofs-utils: move pclustersize to `struct z_erofs_compress_sctx`
With -E(all-)fragments, pclustersize has a different value per segment,
so move it to `struct z_erofs_compress_sctx`.
Signed-off-by: Noboru Asai <asai@sijam.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240321070236.2396573-1-asai@sijam.com
Gao Xiang [Tue, 19 Mar 2024 08:24:55 +0000 (16:24 +0800)]
erofs-utils: lib: fix multi-threaded compression in tarerofs mode
Since pread() can be used during multi-threaded compression, it's
necessary to pass `fpos` in to indicate the absolute offset.
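As a sketch of why the absolute offset matters: pread() ignores the file position shared by all users of the descriptor, so each worker must compute where its data lives itself. Here `fpos` is taken to be the absolute start of the file's data inside the stream and `off` the offset within that file; the helper name is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Read `len` bytes located `off` bytes into a file whose data starts at
 * absolute offset `fpos` within `fd`. pread() does not move the shared
 * file position, so several workers can call this concurrently.
 * Returns 0 on success, -1 on error or if the file ends early.
 */
static int sketch_read_at(int fd, off_t fpos, off_t off, void *buf, size_t len)
{
	char *p = buf;

	while (len) {
		ssize_t n = pread(fd, p, len, fpos + off);

		if (n <= 0)
			return -1;
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}
```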
Fixes: aec8487dce4c ("erofs-utils: mkfs: introduce inner-file multi-threaded compression")
Link: https://lore.kernel.org/r/20240319082455.4115493-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Yifan Zhao [Fri, 15 Mar 2024 01:10:19 +0000 (09:10 +0800)]
erofs-utils: mkfs: introduce inner-file multi-threaded compression
Currently, the creation of EROFS compressed images is
single-threaded, which suffers from performance issues. This patch
attempts to address it by compressing large files in parallel.
Specifically, each input file larger than 16MiB is split into
segments, and each worker thread compresses a segment as if it were
a separate file. Finally, the main thread merges all the compressed
segments.
Multi-threaded compression is not compatible with -Ededupe,
-E(all-)fragments for now.
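The segment arithmetic can be sketched as follows, assuming the fixed 16 MiB segment size mentioned in this series. The helper names are hypothetical and this is not the mkfs implementation.

```c
#include <stdint.h>

#define SKETCH_SEGMENT_SIZE	(16ULL * 1024 * 1024)	/* 16 MiB */

/* Number of segments a file of `size` bytes is split into (at least one). */
static uint64_t sketch_nr_segments(uint64_t size)
{
	if (size <= SKETCH_SEGMENT_SIZE)
		return 1;
	return (size + SKETCH_SEGMENT_SIZE - 1) / SKETCH_SEGMENT_SIZE;
}

/*
 * Byte range [*start, *start + *len) covered by segment `idx`.
 * The caller must pass idx < sketch_nr_segments(size).
 */
static void sketch_segment_range(uint64_t size, uint64_t idx,
				 uint64_t *start, uint64_t *len)
{
	uint64_t remaining;

	*start = idx * SKETCH_SEGMENT_SIZE;
	remaining = size - *start;
	*len = remaining < SKETCH_SEGMENT_SIZE ? remaining : SKETCH_SEGMENT_SIZE;
}
```

Each range would be handed to one worker, and the results merged in segment order.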
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Co-authored-by: Tong Xin <xin_tong@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-5-hsiangkao@linux.alibaba.com
Link: https://lore.kernel.org/r/ZfaW3oLe8Q2621DV@debian
Gao Xiang [Fri, 15 Mar 2024 01:10:18 +0000 (09:10 +0800)]
erofs-utils: lib: introduce atomic operations
Add some helpers (relaxed semantics) in order to prepare for the
upcoming multi-threaded support.
For example, compressor may be initialized more than once in different
worker threads, resulting in noisy warnings.
This patch makes sure that each message will be printed only once by
adding `__warnonce` atomic booleans to each erofs_compressor_init().
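A warn-once flag with relaxed semantics could look like the sketch below, written with C11 `<stdatomic.h>` rather than the erofs-utils helpers, whose exact names are not shown here. The `sketch_*` names and the warning text are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Returns true only for the first caller, even with concurrent callers. */
static bool sketch_first_time(atomic_bool *flag)
{
	/* Relaxed ordering is enough: only the flag itself is shared. */
	return !atomic_exchange_explicit(flag, true, memory_order_relaxed);
}

static atomic_bool sketch_warned;	/* zero-initialized: not warned yet */

/* Called from every worker's init path, but prints at most once. */
static void sketch_compressor_init_warn(void)
{
	if (sketch_first_time(&sketch_warned))
		fprintf(stderr, "warning: experimental compressor in use\n");
}
```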
Cc: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-4-hsiangkao@linux.alibaba.com
Yifan Zhao [Fri, 15 Mar 2024 01:10:17 +0000 (09:10 +0800)]
erofs-utils: mkfs: add `--workers=#` parameter
This patch introduces `--workers=#` parameter for the incoming
multi-threaded compression support.
It also introduces a concept called `segment size` to split large
inodes for multi-threaded compression, which has the fixed value
16MiB and cannot be modified for now.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-3-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 15 Mar 2024 01:10:16 +0000 (09:10 +0800)]
erofs-utils: add a helper to get available processors
In order to prepare for multi-threaded decompression.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-2-hsiangkao@linux.alibaba.com
Yifan Zhao [Fri, 15 Mar 2024 01:10:15 +0000 (09:10 +0800)]
erofs-utils: introduce multi-threading framework
Add a workqueue implementation for multi-threading support inspired by
xfsprogs.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240315011019.610442-1-hsiangkao@linux.alibaba.com
Gao Xiang [Sun, 3 Mar 2024 14:35:30 +0000 (22:35 +0800)]
erofs-utils: support xz/lzma/lzip streams for tarerofs
Similar to commit e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams
for tarerofs"), let's add xz/lzma/lzip support by using liblzma.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240303143530.4077607-1-hsiangkao@linux.alibaba.com
Mike Baynton [Tue, 27 Feb 2024 08:42:21 +0000 (16:42 +0800)]
erofs-utils: mkfs: Support tar source without data
This improves performance of meta-only image creation in cases where the
source is a tarball stream that is not seekable. The writer may now use
`--tar=headerball` and omit the file data. Previously, the stream writer
was forced to send the file's size worth of null bytes or any data after
each tar header which was simply discarded by mkfs.erofs.
Signed-off-by: Mike Baynton <mike@mbaynton.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240227084221.342635-1-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 8 Mar 2024 07:24:29 +0000 (15:24 +0800)]
erofs-utils: update my outdated misleading email address
The @kernel.org one is always preferred though.
Signed-off-by: Gao Xiang <xiang@kernel.org>
Gao Xiang [Thu, 22 Feb 2024 09:01:45 +0000 (17:01 +0800)]
erofs-utils: support dumping raw tar streams together
Since commit e3dfe4b8db26 ("erofs-utils: mkfs: support tgz streams for
tarerofs"), tgz streams can be converted to EROFS directly.
However, many use cases also require raw tar streams. Let's add
support for dumping raw streams with `--ungzip=FILE` option.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240222090145.709808-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 21 Feb 2024 05:51:44 +0000 (13:51 +0800)]
erofs-utils: support liblzma auto-detection
The new XZ Utils 5.4 is now available in most Linux distributions.
Let's enable liblzma auto-detection as well as get rid of MicroLZMA
EXPERIMENTAL warning.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240221055144.4054806-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 20 Feb 2024 17:16:59 +0000 (01:16 +0800)]
erofs-utils: support zlib auto-detection
Fix explicit `--with-zlib` so that it errors out when zlib
is unavailable.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240220171700.3693176-1-hsiangkao@linux.alibaba.com
Tianyi Liu [Thu, 8 Feb 2024 13:59:09 +0000 (21:59 +0800)]
erofs-utils: lib: fix incorrect usage of `erofs_strerror`
`erofs_strerror` accepts a negative argument,
so `errno` should be negated before being passed to it.
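The calling convention can be illustrated with a hypothetical stand-in. `sketch_strerror()` below is not `erofs_strerror()`; it only mimics the "negative error code in" contract, and its handling of non-negative input is invented for the example.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper that, like the real one, expects a NEGATIVE code. */
static const char *sketch_strerror(int err)
{
	if (err >= 0)
		return "not an error code";
	return strerror(-err);
}

/* Correct usage after a failed libc call: negate errno first. */
static const char *sketch_last_error(void)
{
	return sketch_strerror(-errno);
}
```

Passing `errno` unchanged would hand the helper a positive value, which is the bug this commit fixes.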
Signed-off-by: Tianyi Liu <i.pear@outlook.com>
Link: https://lore.kernel.org/r/SY6P282MB3193657433D35C3A7799CA5F9D442@SY6P282MB3193.AUSP282.PROD.OUTLOOK.COM
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 24 Jan 2024 09:16:21 +0000 (17:16 +0800)]
erofs-utils: lib: reset HC to avoid 32-bit overflow of kite-deflate
Yifan reported a "segmentation fault (core dumped)" error days ago
with a large dataset (enwik9 x 5). Let's fix it.
Reported-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Fixes: 861037f4fc15 ("erofs-utils: add a built-in DEFLATE compressor")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Tested-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240124091621.2413606-1-hsiangkao@linux.alibaba.com
Yifan Zhao [Sun, 21 Jan 2024 12:29:02 +0000 (20:29 +0800)]
erofs-utils: mkfs: reorganize logic in erofs_compressor_init()
Currently, the initialization of compressors follows an unusual order:
`.init()` is called first, followed by `.setlevel()`, and then
`.setdictsize()`. However, the actual initialization occurs within the
last-called `.setdictsize()`, for the MicroLZMA, DEFLATE, and libdeflate
compressors.
This patch reorders these functions, with `.init()` now being invoked
last, allowing it to use the compression level and dictsize already set
so that the behavior of the functions matches their names.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240121122902.207756-1-zhaoyifan@sjtu.edu.cn
[ Gao Xiang: refine the commit message. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Yifan Zhao [Sat, 20 Jan 2024 11:53:19 +0000 (19:53 +0800)]
erofs-utils: mkfs: allow to specify dictionary size for compression algorithms
Currently, the dictionary size for compression algorithms is fixed. This
patch allows specifying different ones with new -zX,dictsize=<dictsize>
options.
This patch also changes the way to specify compression levels. Now, the
compression level is specified with -zX,level=<level> options and could
be specified together with dictsize. The old -zX,<level> form is still
supported for compatibility.
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240120115319.152366-1-zhaoyifan@sjtu.edu.cn
[ Gao Xiang: minor update. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
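The option syntax described above can be illustrated with a small parser. This is only a sketch under stated assumptions, not the mkfs.erofs implementation: `struct zopt`, `parse_num()` and `parse_zopt()` are hypothetical names, and only plain decimal values are accepted here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one parsed -z argument (not an erofs-utils type). */
struct zopt {
	char alg[32];
	long level;	/* -1: not specified */
	long dictsize;	/* 0: not specified */
};

/* Parse a decimal number that must span the whole string. */
static int parse_num(const char *s, long *out)
{
	char *end;

	*out = strtol(s, &end, 10);
	return (end == s || *end != '\0') ? -1 : 0;
}

/*
 * Parse "alg[,<level>][,level=<n>][,dictsize=<n>]".
 * Returns 0 on success, -1 on malformed input.
 */
int parse_zopt(const char *arg, struct zopt *out)
{
	char buf[128], *p, *comma;

	assert(arg != NULL && out != NULL);
	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	/* The first comma-separated field is the algorithm name. */
	comma = strchr(buf, ',');
	if (comma)
		*comma = '\0';
	if (buf[0] == '\0' || strlen(buf) >= sizeof(out->alg))
		return -1;
	strcpy(out->alg, buf);
	out->level = -1;
	out->dictsize = 0;

	for (p = comma ? comma + 1 : NULL; p; p = comma ? comma + 1 : NULL) {
		int err;

		comma = strchr(p, ',');
		if (comma)
			*comma = '\0';
		if (!strncmp(p, "level=", 6))
			err = parse_num(p + 6, &out->level);
		else if (!strncmp(p, "dictsize=", 9))
			err = parse_num(p + 9, &out->dictsize);
		else	/* legacy "-zX,<level>" form */
			err = parse_num(p, &out->level);
		if (err)
			return -1;
	}
	return 0;
}
```

With this sketch, both "lzma,level=6,dictsize=65536" and the legacy "lz4hc,12" parse successfully, while an empty value such as "lzma,level=" is rejected.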
Yifan Zhao [Sat, 20 Jan 2024 11:53:14 +0000 (19:53 +0800)]
erofs-utils: mkfs: merge erofs_compressor_setlevel() into erofs_compressor_init()
Currently erofs_compressor_setlevel() is only called once just after
erofs_compressor_init() while initializing compressors. Let's just hide
this interface and set the compression level in erofs_compressor_init().
Besides, we do not need to assign the {default,best}_level for an
algorithm which does not support the compression level in its
erofs_compressor struct.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240120115314.152285-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Tue, 16 Jan 2024 04:26:42 +0000 (12:26 +0800)]
erofs-utils: avoid noisy prints if stdout is not a tty
As Daan reported, "mkfs.erofs is super verbose as \r doesn't go to
the beginning of the line". Don't print messages like this if
stdout is not a tty.
Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Closes: https://lore.kernel.org/r/CAO8sHcmnq+foWo7AZYbkxJXHfSeZkd73Dq+1dQSZYBE6QxL8JQ@mail.gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240116042642.3124559-1-hsiangkao@linux.alibaba.com
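A minimal sketch of the idea, assuming a POSIX environment: "\r"-style progress lines are emitted only when the stream is a terminal. `progress_enabled()` and `show_progress()` are hypothetical helpers, not erofs-utils functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if fd refers to a terminal, 0 otherwise (pipe, file, ...). */
int progress_enabled(int fd)
{
	return isatty(fd) == 1;
}

/* Rewrite the current line with msg, but only on a terminal. */
void show_progress(FILE *f, const char *msg)
{
	assert(f != NULL);
	if (!progress_enabled(fileno(f)))
		return;	/* stay quiet when output is redirected */
	fprintf(f, "\r%s", msg);
	fflush(f);
}
```

When the output is piped into a log file or another program, `show_progress()` prints nothing, which avoids the flood of carriage-return lines mentioned in the report.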
Gao Xiang [Mon, 15 Jan 2024 15:05:50 +0000 (23:05 +0800)]
erofs-utils: lib: use dummy_pivot to dedupe the beginnings of files
The beginnings of files are incorrectly skipped for deduplication, which
causes an unexpected image size regression. Fix it.
Reported-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Fixes: 8ead5f8bd38c ("erofs-utils: lib: generate compression indexes in memory first")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240115150550.1961455-1-hsiangkao@linux.alibaba.com
Yifan Zhao [Mon, 18 Dec 2023 14:57:10 +0000 (22:57 +0800)]
erofs-utils: lib: generate compression indexes in memory first
Currently, mkfs generates the on-disk indexes of each compressed extent
on the fly during compressing, which is inflexible if we'd like to merge
sub-indexes of a file later for the multi-threaded scenarios.
Let's generate on-disk indexes after the compression is completed.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231218145710.132164-3-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 18 Dec 2023 14:57:09 +0000 (22:57 +0800)]
erofs-utils: lib: split vle_compress_one()
Split compression for each extent into a new helper for later reworking.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231218145710.132164-2-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 18 Dec 2023 14:57:08 +0000 (22:57 +0800)]
erofs-utils: lib: add z_erofs_need_refill()
Let's remove redundant logic.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231218145710.132164-1-hsiangkao@linux.alibaba.com
Li Yiyan [Tue, 12 Dec 2023 11:54:27 +0000 (19:54 +0800)]
erofs-utils: fuse: support FUSE 2/3 multi-threading
Support multi-threading for erofsfuse and adjust the configure.ac to
allow users of FUSE 3 (> 3.2) to use API version 32, while maintaining
compatibility with API version 30 for FUSE 3 (3.0/3.1) and API version
26 for FUSE 2.
Signed-off-by: Li Yiyan <lyy0627@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20231212115427.2779792-1-lyy0627@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Yifan Zhao [Sun, 14 Jan 2024 04:42:29 +0000 (12:42 +0800)]
erofs-utils: mkfs: fix a misspelling
Fix a misspelling in the version() function of mkfs.erofs.
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20240114044229.626815-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Thu, 23 Nov 2023 05:22:45 +0000 (13:22 +0800)]
erofs-utils: mkfs: support compact indexes for smaller block sizes
This commit also adds mkfs support of compact indexes for smaller
block sizes (less than 4096).
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231123052245.868698-2-hsiangkao@linux.alibaba.com
Gao Xiang [Thu, 23 Nov 2023 05:22:44 +0000 (13:22 +0800)]
erofs-utils: lib: fix up compact indexes for block size < 4096
Let's keep in sync with kernel commit 8d2517aaeea3 ("erofs: fix up
compacted indexes for block size < 4096").
Original kernel commit:
Previously, the block size was always equal to PAGE_SIZE, therefore
`lclusterbits` couldn't be less than 12.
Since sub-page compressed blocks are now considered, `lobits` for
a lcluster in each pack cannot always be `lclusterbits` as before.
Otherwise, there is not enough room for the special value
`Z_EROFS_LI_D0_CBLKCNT`.
To support smaller block sizes, `lobits` for each compacted lcluster is
now calculated as:
lobits = max(lclusterbits, ilog2(Z_EROFS_LI_D0_CBLKCNT) + 1)
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231123052245.868698-1-hsiangkao@linux.alibaba.com
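The `lobits` formula above can be worked through in a few lines. This is a sketch only: the value of `Z_EROFS_LI_D0_CBLKCNT` is assumed here to be `1 << 11` and should be checked against the on-disk format header, and `ilog2_u32()` and `compact_lobits()` are hypothetical helper names.

```c
#include <assert.h>

/* Assumed value; verify against the EROFS on-disk format definitions. */
#define Z_EROFS_LI_D0_CBLKCNT	(1U << 11)

/* Integer log2 for v > 0: the index of the highest set bit. */
unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	assert(v != 0);
	while (v >>= 1)
		r++;
	return r;
}

/* lobits = max(lclusterbits, ilog2(Z_EROFS_LI_D0_CBLKCNT) + 1) */
unsigned int compact_lobits(unsigned int lclusterbits)
{
	unsigned int min_bits = ilog2_u32(Z_EROFS_LI_D0_CBLKCNT) + 1;

	return lclusterbits > min_bits ? lclusterbits : min_bits;
}
```

Under that assumption, a 512-byte block size (`lclusterbits` = 9) still gets 12 bits per compacted lcluster, so the special value keeps its own bit, while 4096-byte blocks are unchanged at 12.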
Gao Xiang [Wed, 15 Nov 2023 02:49:52 +0000 (10:49 +0800)]
erofs-utils: lib: `fragment_size` should be 64 bits
`-Eall-fragments` will be broken if i_size is more than 32 bits.
Fixes: fcaa988a6ef6 ("erofs-utils: add `-Eall-fragments` option")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Link: https://lore.kernel.org/r/20231115024952.1256243-1-hsiangkao@linux.alibaba.com
Gao Xiang [Tue, 7 Nov 2023 07:55:55 +0000 (15:55 +0800)]
erofs-utils: mkfs,fsck,dump: support `--offset` option
Add the `--offset` option to allow users to specify an offset in the file
where the filesystem will begin.
Suggested-by: Pavel Otchertsov <pavel.otchertsov@gmail.com>
Closes: https://lore.kernel.org/r/CAAxnTOGTD2NkKnBphZ+vEr7NVnWvT0u02E+c8pN8ZVFcXp5uhg@mail.gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231107075555.2554444-1-hsiangkao@linux.alibaba.com
Luke T. Shumaker [Thu, 2 Nov 2023 19:31:22 +0000 (13:31 -0600)]
erofs-utils: fsck: Add -a, -A, and -y flags
Other fsck.${filesystem} commands generally take -a or -p and
sometimes -A to automatically repair a filesystem, and -y to either
repair or agree to all prompts about repairing.
For example:
- fsck.ext{2,3,4} takes -a or -p to repair, and -y to agree
- fsck.xfs takes -y to repair; and -a, -A, or -p to silence a warning
about repairing
- fsck.btrfs takes -a, -A, -p, or -y to silence a warning about repairing
So, like fsck.btrfs, we should accept these flags as no-ops, for
compatibility with programs that expect to be able to pass these to
fsck. In particular, Arch Linux's mkinitcpio (when fsck is enabled)
unconditionally passes -a to `fsck`.
Naturally, I'd have liked to include '-p' in the list, but it already
does something different for fsck.erofs. I'd like to call out the
fsck.ext4 manual, which says:
-a This option does the same thing as the -p option. It is
provided for backwards compatibility only; it is
suggested that people use -p option whenever possible.
Signed-off-by: Luke T. Shumaker <lukeshu@umorpha.io>
Link: https://lore.kernel.org/r/20231102193122.140921-4-lukeshu@lukeshu.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
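Accepting flags as no-ops can be sketched with plain getopt(3). This is an illustration under assumptions, not the fsck.erofs option parser: `parse_fsck_compat_flags()` is a hypothetical name and only the three compatibility flags are handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/*
 * Returns 0 if every flag was accepted, -1 on an unknown flag.
 * -a, -A and -y are recognized but deliberately have no effect.
 */
int parse_fsck_compat_flags(int argc, char *const argv[])
{
	int opt;

	assert(argc >= 1 && argv != NULL);
	optind = 1;	/* restart scanning at argv[1] */
	opterr = 0;	/* no diagnostics from getopt() itself */
	while ((opt = getopt(argc, argv, "aAy")) != -1) {
		switch (opt) {
		case 'a':
		case 'A':
		case 'y':
			break;	/* accepted for compatibility only */
		default:
			return -1;
		}
	}
	return 0;
}
```

A caller such as an initramfs script that unconditionally passes -a therefore no longer trips over an "unknown option" error.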
Luke T. Shumaker [Thu, 2 Nov 2023 19:31:21 +0000 (13:31 -0600)]
erofs-utils: improve the usage and version text of non-fuse commands
For each command:
- Change the format of --help to be closer to the usual GNU format
- Have the --version text mention that it is part of erofs-utils
- Include compile-time feature flags in -V
- Have --help and --version print on stdout not stderr
- Exit with 0 from --help and --version
- Have flag errors print a message saying to use --help instead of
printing the full help text
For fsck.erofs:
- Consolidate the descriptions of --[no-]preserve[-<owner|perms>]
- Clarify the range that -d accepts
For mkfs.erofs:
- Print supported algorithms and their level ranges+defaults
- Clarify the range that -d accepts
For mkfs.erofs to have access to the algorithms' level ranges and
defaults, it is necessary to modify
z_erofs_list_available_compressors() to return the full `struct
erofs_algorithm` instead of just the `->name`.
Signed-off-by: Luke T. Shumaker <lukeshu@umorpha.io>
Link: https://lore.kernel.org/r/20231102193122.140921-3-lukeshu@lukeshu.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Luke T. Shumaker [Thu, 2 Nov 2023 19:31:20 +0000 (13:31 -0600)]
erofs-utils: have each non-fuse command take -h, --help, -V, and --version
Consistency is nice.
erofsfuse isn't included here because adjusting its flag handling is
more involved because of the interaction with libfuse; I anticipate
similar changes to erofsfuse in a future patchset.
Signed-off-by: Luke T. Shumaker <lukeshu@umorpha.io>
Link: https://lore.kernel.org/r/20231102193122.140921-2-lukeshu@lukeshu.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Yifan Zhao [Sat, 4 Nov 2023 06:50:41 +0000 (14:50 +0800)]
erofs-utils: mkfs: fix potential memory leak
Valgrind reports 2 potential memory leaks in mkfs:
Command: mkfs.erofs -zlz4 test.img testdir/
4 bytes in 1 blocks are still reachable in loss record 1 of 2
at 0x4841848: malloc (vg_replace_malloc.c:431)
by 0x49633DE: strdup (strdup.c:42)
by 0x10C483: mkfs_parse_compress_algs (main.c:287)
by 0x10C483: mkfs_parse_options_cfg (main.c:316)
by 0x10C483: main (main.c:936)
34 bytes in 1 blocks are still reachable in loss record 2 of 2
at 0x4841848: malloc (vg_replace_malloc.c:431)
by 0x49633DE: strdup (strdup.c:42)
by 0x48FFE2B: realpath_stk (canonicalize.c:409)
by 0x48FFE2B: realpath@@GLIBC_2.3 (canonicalize.c:431)
by 0x10B7EB: mkfs_parse_options_cfg (main.c:587)
by 0x10B7EB: main (main.c:936)
Fix it by freeing the memory allocated by strdup() and realpath().
Signed-off-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231104065041.129680-1-zhaoyifan@sjtu.edu.cn
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
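The pattern behind this kind of fix is simply pairing every libc allocation with a free(). A minimal sketch, assuming a POSIX system where realpath() with a NULL buffer returns malloc()ed memory; `canonical_path_len()` is a hypothetical helper, not code from mkfs:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of the canonical form of path, or -1 on error. */
long canonical_path_len(const char *path)
{
	char *resolved;
	long len;

	assert(path != NULL);
	resolved = realpath(path, NULL);	/* allocated by libc */
	if (!resolved)
		return -1;
	len = (long)strlen(resolved);
	free(resolved);	/* release it once it is no longer needed */
	return len;
}
```

The same rule applies to strdup(): the caller owns the returned buffer and has to free it on every exit path.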
Gao Xiang [Fri, 27 Oct 2023 07:06:06 +0000 (15:06 +0800)]
erofs-utils: lib: tidy up erofs_compress_destsize()
Drop the old workaround logic to prepare for the following development.
(I've checked the Linux 6.1.53 source code and an AOSP system image
"system.raven.87e115a1" without any image size change or strange
behavior.)
Link: https://lore.kernel.org/r/20231027070606.1558363-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Li Yiyan [Mon, 18 Sep 2023 09:03:06 +0000 (17:03 +0800)]
erofs-utils: fuse: switch to FUSE 2/3 lowlevel APIs
Support FUSE low-level APIs for erofsfuse. Lowlevel APIs offer improved
performance compared to the previous high-level APIs, while maintaining
compatibility with libfuse version 2 (>=2.6) and 3 (>=3.0).
Dataset: linux 5.15
Compression algorithm: lz4hc,12
Additional options: -T0 -C16384
Test options: --warmup 3 -p "echo 3 > /proc/sys/vm/drop_caches; sleep 1"
Evaluation result (highlevel->lowlevel avg time):
- Sequence metadata: 777.3 ms->180.9 ms
- Sequence data: 3.282 s->818.1 ms
- Random metadata: 1.571 s->928.3 ms
- Random data: 2.461 s->597.8 ms
Signed-off-by: Li Yiyan <lyy0627@sjtu.edu.cn>
Link: https://lore.kernel.org/r/20230918090306.2524624-1-lyy0627@sjtu.edu.cn
[ Gao Xiang: minor code style adjustments and MacOS compilation fix. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 25 Oct 2023 05:05:31 +0000 (13:05 +0800)]
erofs-utils: lib: propagate return code for erofs_bflush()
Instead of just using a boolean.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231025050531.1507163-1-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 23 Oct 2023 08:12:40 +0000 (16:12 +0800)]
erofs-utils: get rid of .preflush()
It's actually never used.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231023081241.1946579-2-hsiangkao@linux.alibaba.com
Gao Xiang [Mon, 23 Oct 2023 08:12:39 +0000 (16:12 +0800)]
erofs-utils: lib: use BLK_ROUND_UP() for __erofs_battach()
Also avoid division in BLK_ROUND_UP().
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231023081241.1946579-1-hsiangkao@linux.alibaba.com
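Rounding up to a block count without a division can be sketched as follows when the block size is a power of two. `blk_round_up()` is a hypothetical stand-in that takes the block size bits directly; it is not the erofs-utils macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of blocks needed to hold addr bytes, using a shift instead of
 * a division. Assumes addr + blksz - 1 does not overflow 64 bits.
 */
uint64_t blk_round_up(uint64_t addr, unsigned int blkszbits)
{
	uint64_t blksz;

	assert(blkszbits < 64);
	blksz = (uint64_t)1 << blkszbits;
	return (addr + blksz - 1) >> blkszbits;
}
```

For example, with 4096-byte blocks (`blkszbits` = 12), 4096 bytes occupy one block and 4097 bytes occupy two.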
Gao Xiang [Fri, 22 Sep 2023 18:30:54 +0000 (02:30 +0800)]
erofs-utils: lib: switch dedupe_{sub,}tree[] to a hash table
This rb-tree implementation is too slow and offers no benefit.
Switching to a hash table decreases the time by 81.1%
(5m7.755s -> 0m58.255s) with the same dataset, sigh.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230922183055.1583756-2-hsiangkao@linux.alibaba.com
Gao Xiang [Fri, 22 Sep 2023 18:30:53 +0000 (02:30 +0800)]
erofs-utils: lib: use xxh64() for faster filtering first for dedupe
Let's first check whether the xxh64 hashes match when rolling back on
global compressed deduplication.
As a result, it could decrease time by 26.4% (6m57.990s -> 5m7.755s)
on a dataset with "-Ededupe -C8192".
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230922183055.1583756-1-hsiangkao@linux.alibaba.com
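The general idea of filtering with a cheap hash before a byte-by-byte comparison can be sketched as below. Note the assumptions: FNV-1a is used purely as a self-contained stand-in for xxh64(), and `struct chunk` and `chunk_equal()` are hypothetical names that do not reflect the dedupe code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit FNV-1a, a stand-in for xxh64() in this sketch. */
uint64_t fnv1a64(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Hypothetical record: a data range plus its precomputed hash. */
struct chunk {
	const void *data;
	size_t len;
	uint64_t hash;
};

/* Returns 1 if both chunks hold identical bytes, 0 otherwise. */
int chunk_equal(const struct chunk *a, const struct chunk *b)
{
	assert(a != NULL && b != NULL);
	if (a->len != b->len || a->hash != b->hash)
		return 0;	/* cheap reject without touching the data */
	return memcmp(a->data, b->data, a->len) == 0;
}
```

Most mismatching candidates are rejected by the integer comparison alone, so the expensive memcmp() only runs for likely matches.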
Gao Xiang [Thu, 19 Oct 2023 22:45:01 +0000 (06:45 +0800)]
erofs-utils: release 1.7.1
Signed-off-by: Gao Xiang <xiang@kernel.org>
Gao Xiang [Thu, 19 Oct 2023 22:43:28 +0000 (06:43 +0800)]
erofs-utils: fix reference leak in erofs_mkfs_build_tree_from_path()
Commit 8cbc205185a1 ("erofs-utils: mkfs: fix corrupted directories
with hardlinks") introduced a reference leak, although it has no real
impact on users. Fix it now.
Signed-off-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20231019224328.26015-1-xiang@kernel.org
Gao Xiang [Tue, 17 Oct 2023 14:44:20 +0000 (22:44 +0800)]
erofs-utils: mkfs: fix corrupted directories with hardlinks
An inode with hard links may belong to several directories. It's
invalid to update `subdirs_queued` for hard-link inodes since it
only records one of the parent directories.
References: https://github.com/NixOS/nixpkgs/issues/261394
Fixes: 21d84349e79a ("erofs-utils: rearrange on-disk metadata")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231017144420.289469-1-hsiangkao@linux.alibaba.com
Erik Sjölund [Mon, 2 Oct 2023 17:36:08 +0000 (19:36 +0200)]
erofs-utils: errno shouldn't set to a negative value in lib/tar.c
`errno` should be set to a non-negative value here.
Link: https://lore.kernel.org/r/CAB+1q0Q3+7s1Lt8uW6DWZ7vfjhEKhG7O7MAQhCuH-C10cr9F4g@mail.gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
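The convention at stake: kernel-style helpers return negative codes such as -EINVAL, whereas `errno` itself must hold the positive value. A minimal sketch of keeping the two apart, using a hypothetical helper that is not taken from lib/tar.c:

```c
#include <assert.h>
#include <errno.h>

/*
 * Accept either a negative (kernel-style) or a positive error code,
 * store the positive value in errno and return -1.
 */
int set_errno_and_fail(int err)
{
	assert(err != 0);
	errno = err < 0 ? -err : err;
	return -1;
}
```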
Sandeep Dhavale [Thu, 5 Oct 2023 22:40:08 +0000 (15:40 -0700)]
erofs-utils: Fix cross compile with autoconf
AC_RUN_IFELSE expects the action if cross compiling. If not provided,
cross compilation fails with error "configure: error: cannot run test
program while cross compiling".
Use 4096 as the best guess PAGESIZE if cross compiling as it is still
the most common page size.
Reported-in: https://lore.kernel.org/all/0101018aec71b531-0a354b1a-0b70-47a1-8efc-fea8c439304c-000000@us-west-2.amazonses.com/
Fixes: 8ee2e591dfd6 ("erofs-utils: support detecting maximum block size")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20231005224008.817830-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Thu, 21 Sep 2023 03:30:00 +0000 (11:30 +0800)]
erofs-utils: release 1.7
Signed-off-by: Gao Xiang <xiang@kernel.org>
Gao Xiang [Thu, 21 Sep 2023 03:24:17 +0000 (11:24 +0800)]
erofs-utils: fix the previous pcluster CBLKCNT missing for big pcluster dedupe
Similar to 876bec09e48a ("erofs-utils: lib: fix missing CBLKCNT for
big pcluster dedupe"), the previous CBLKCNT cannot be dropped due to
the extent shortening process.
It may cause data corruption on specific data patterns only if both
big pcluster and dedupe features are enabled.
Link: https://lore.kernel.org/r/20230921032417.82739-1-hsiangkao@linux.alibaba.com
Fixes: f3f9a2ce3137 ("erofs-utils: mkfs: introduce global compressed data deduplication")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Gao Xiang [Wed, 20 Sep 2023 20:03:13 +0000 (04:03 +0800)]
erofs-utils: fix build error when `-Waddress-of-temporary` is on
Actually, it's a false positive and only used for build assertion.
Reported-by: Kelvin Zhang <zhangkelvin@google.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20230920200314.9193-1-hsiangkao@aol.com
Gao Xiang [Wed, 20 Sep 2023 19:02:20 +0000 (03:02 +0800)]
erofs-utils: mkfs: limit total shared xattrs of a single inode
Don't output more than 255 shared xattrs for a single inode due to the
EROFS on-disk format limitation.
Fixes: 116ac0a254fc ("erofs-utils: introduce shared xattr support")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920190220.1837650-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 20 Sep 2023 16:41:28 +0000 (00:41 +0800)]
erofs-utils: manpages: update new options of mkfs.erofs
-b block-size
-E ^xattr-name-filter
--gzip
--tar=[fi]
--xattr-prefix=X
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920164128.1637554-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 20 Sep 2023 05:12:23 +0000 (13:12 +0800)]
erofs-utils: lib: fix --force-{g,u}id support for tarerofs
Temporarily move the common part into __erofs_fill_inode() for tarerofs.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920051223.657008-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 20 Sep 2023 03:51:41 +0000 (11:51 +0800)]
erofs-utils: mkfs: support exporting GNU tar archive labels
GNU tar volume labels (by using `-V`) will be applied to EROFS.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230920035141.533474-1-hsiangkao@linux.alibaba.com
Sandeep Dhavale [Tue, 19 Sep 2023 21:02:20 +0000 (14:02 -0700)]
erofs-utils: lib: Restore memory address before free()
We advance the `idx` pointer as we iterate through the for loop based on
`count`. If we error out from the loop, use the original pointer to the
allocated memory when calling free().
Fixes: 39147b48b76d ("erofs-utils: lib: add erofs_rebuild_load_tree() helper")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20230919210220.3657736-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
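The bug class can be shown in isolation: a cursor that walks through an allocation must not be the pointer handed to free(). In this sketch `fill_indexes()` and its `fail_at` parameter are hypothetical and only simulate an error in the middle of the loop.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Fill count slots; bail out with -1 when reaching slot fail_at
 * (pass fail_at >= count for the success path).
 */
int fill_indexes(unsigned int count, unsigned int fail_at)
{
	int *base, *idx;
	unsigned int i;
	int ret = 0;

	assert(count > 0);
	base = malloc(count * sizeof(*base));
	if (!base)
		return -1;

	for (i = 0, idx = base; i < count; i++, idx++) {
		if (i == fail_at) {
			ret = -1;	/* simulated failure */
			break;
		}
		*idx = (int)i;
	}
	free(base);	/* not free(idx): idx may already have advanced */
	return ret;
}
```

Keeping the original address in a separate variable makes both the error path and the success path safe to share a single free() call.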
Gao Xiang [Tue, 19 Sep 2023 18:59:47 +0000 (02:59 +0800)]
erofs-utils: mkfs: support tgz streams for tarerofs
Introduce iostream to wrap up the input tarball stream for tarerofs.
Besides, add builtin tgz support if zlib is linked to mkfs.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230919185947.3996843-1-hsiangkao@linux.alibaba.com