Yu Watanabe [Thu, 5 Oct 2017 01:21:50 +0000 (10:21 +0900)]
sysusers: do not create unneeded users
Yu Watanabe [Fri, 6 Oct 2017 07:06:21 +0000 (16:06 +0900)]
unit: enable DynamicUser= for journal-upload
Yu Watanabe [Fri, 6 Oct 2017 07:05:20 +0000 (16:05 +0900)]
timesyncd: enable DynamicUser=
Yu Watanabe [Fri, 6 Oct 2017 07:03:33 +0000 (16:03 +0900)]
mkdir: introduce follow_symlink flag to mkdir_safe{,_label}()
Frederic Crozat [Thu, 5 Oct 2017 23:28:19 +0000 (01:28 +0200)]
tmpfiles: remove old ICE and X11 sockets at boot (#6979)
When not using tmpfs based /tmp, leftover sockets
might prevent X startup. Ensure directory is clean at boot time.
g0tar [Thu, 5 Oct 2017 20:17:51 +0000 (22:17 +0200)]
pass currently completed word to systemctl list-unit-files/list-units (#6927)
This change noticeably increases completion performance at the expense
of preventing possible _correct, _approximate or any matcher-list rules.
Still, such a huge increase in responsiveness seems to be worth the price.
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 20:06:39 +0000 (22:06 +0200)]
Merge pull request #6999 from poettering/seccomp-newgroups
add three new syscall groups, and port @privileged to make use of more existing ones
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 19:51:13 +0000 (21:51 +0200)]
Merge pull request #7008 from poettering/sorevision235
bump so revision for 235 and mailmap updates
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 19:24:36 +0000 (21:24 +0200)]
Merge pull request #6949 from poettering/restart-servers
Automatically forget learnt DNS server information when network config changes
Lennart Poettering [Thu, 5 Oct 2017 16:26:02 +0000 (18:26 +0200)]
Merge pull request #6909 from sourcejedi/units
Unit dependency fixes (and cleanups)
Lennart Poettering [Thu, 5 Oct 2017 15:38:40 +0000 (17:38 +0200)]
update .mailmap a bit more
Lennart Poettering [Thu, 5 Oct 2017 15:23:17 +0000 (17:23 +0200)]
NEWS: one more addition
Lennart Poettering [Thu, 5 Oct 2017 15:14:04 +0000 (17:14 +0200)]
build-sys: bump so revisions in preparation for 235
Lennart Poettering [Thu, 5 Oct 2017 14:53:32 +0000 (16:53 +0200)]
resolved: include DNS server feature level info in SIGUSR1 status dump
let's make the status dump more useful for tracking down server issues.
Lennart Poettering [Fri, 29 Sep 2017 19:19:54 +0000 (21:19 +0200)]
resolved: add support for explicitly forgetting everything we learnt about DNS server feature levels
This adds "systemd-resolve --reset-server-features" for explicitly
forgetting what we learnt. This might be useful for debugging
purposes, and to force systemd-resolved to restart its learning logic
for all DNS servers.
Lennart Poettering [Fri, 29 Sep 2017 19:18:29 +0000 (21:18 +0200)]
resolved: automatically forget all learnt DNS server information when the network configuration changes
When the network configuration changes we should relearn everything
there is to know about the configured DNS servers, because we might talk
to the same addresses, but there might be different servers behind them.
Lennart Poettering [Mon, 2 Oct 2017 07:16:50 +0000 (09:16 +0200)]
seccomp: port @privileged to use @reboot + @swap
Let's reuse two groups we already defined to make @privileged a bit
shorter.
Lennart Poettering [Wed, 4 Oct 2017 19:09:52 +0000 (21:09 +0200)]
seccomp: there is no "kexec" syscall
it's called "kexec_load".
Lennart Poettering [Sat, 30 Sep 2017 12:34:50 +0000 (14:34 +0200)]
seccomp: add three more seccomp groups
@aio → asynchronous IO calls
@sync → msync/fsync/... and friends
@chown → changing file ownership
(Also, change @privileged to reference @chown now, instead of the
individual syscalls it contains)
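As a rough illustration of how a group such as @privileged can reference other groups instead of listing their syscalls individually, here is a minimal sketch. The group memberships below are abbreviated and illustrative only (they are not the full lists systemd ships), and `SYSCALL_GROUPS` and `expand()` are hypothetical names, not systemd's implementation.

```python
# Illustrative model of nested syscall groups. Memberships are abbreviated
# and are NOT the complete sets defined by systemd.
SYSCALL_GROUPS = {
    "@chown": ["chown", "fchown", "fchownat", "lchown"],
    "@reboot": ["reboot", "kexec_load"],
    "@swap": ["swapon", "swapoff"],
    # @privileged references other groups instead of repeating their members.
    "@privileged": ["@chown", "@reboot", "@swap", "chroot", "acct"],
}


def expand(name, groups=SYSCALL_GROUPS, _seen=None):
    """Return the set of plain syscall names that `name` stands for."""
    seen = set() if _seen is None else _seen
    if not name.startswith("@"):
        return {name}          # a plain syscall name expands to itself
    if name in seen:
        return set()           # guard against accidental reference cycles
    seen.add(name)
    result = set()
    for member in groups[name]:
        result |= expand(member, groups, seen)
    return result
```

Expanding "@privileged" with this table yields only plain syscall names, including the members of @chown, @reboot and @swap.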
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 13:41:33 +0000 (15:41 +0200)]
Update mailmap and contributor list (#7006)
Also fix typo (by using a word that is a better fit anyway.)
Lennart Poettering [Thu, 5 Oct 2017 13:05:02 +0000 (15:05 +0200)]
units: restore User=systemd-journal-gateway in systemd-journal-gatewayd.service (#7005)
After the discussions around #7003 I think we should restore the
User=systemd-journal-gateway line for systemd-journal-gatewayd.service,
too, so that we continue to use the static user if it exists, and create
it as dynamic user only when it does not.
Note that this undoes part of a change made after 234, i.e. a never released
change.
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 13:04:19 +0000 (15:04 +0200)]
core: make gc_marker unsigned (#7004)
This matches the definition in unit.h.
Djalal Harouni [Thu, 5 Oct 2017 12:46:41 +0000 (14:46 +0200)]
seccomp: remove 'gettid' syscall from '@process' syscall set (#6989)
The gettid syscall is one of the most basic syscalls, it never fails and
it operates on the current thread. Most applications are not supposed to use
it, however even if it is used there is not much justification for blocking
it. This patch removes it from '@process' set so if users blacklist this
set to block setns or clone syscalls, the gettid syscall will still be
available. Of course they can always block gettid explicitly.
Note that the gettid is already in the '@default' set.
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 12:42:29 +0000 (14:42 +0200)]
Merge pull request #6931 from poettering/job-timeout-sec
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 11:16:31 +0000 (13:16 +0200)]
NEWS: some nitpicking and bike-shedding
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 11:22:03 +0000 (13:22 +0200)]
Merge pull request #6952 from poettering/seccomp-getrlimit
a seccomp fix regarding ugetrlimit/prlimit64
Lennart Poettering [Wed, 27 Sep 2017 15:33:09 +0000 (17:33 +0200)]
generator: when we insert a '\n', actually place a proper newline, too
Lennart Poettering [Wed, 27 Sep 2017 15:30:50 +0000 (17:30 +0200)]
unit: when JobTimeoutSec= is turned off, implicitly turn off JobRunningTimeoutSec= too
We added JobRunningTimeoutSec= late, and Dracut configured only
JobTimeoutSec= to turn off root device timeouts before. With this change
we'll propagate a reset of JobTimeoutSec= into JobRunningTimeoutSec=,
but only if the latter wasn't set explicitly.
This should restore compatibility with older systemd versions.
Fixes: #6402
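The propagation rule described above can be sketched roughly as follows. This is only a model of the described behaviour, not systemd's code: the class, its method names and the 90 second default are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

INFINITY = float("inf")   # stands in for "timeout turned off"


@dataclass
class JobTimeouts:
    job_timeout: float = INFINITY
    job_running_timeout: float = 90.0   # hypothetical default, illustration only
    running_timeout_set: bool = False   # True once set explicitly

    def set_job_running_timeout(self, value):
        self.job_running_timeout = value
        self.running_timeout_set = True

    def set_job_timeout(self, value):
        self.job_timeout = value
        # Propagate "off" to the running timeout, but only if the latter
        # was never configured explicitly.
        if value == INFINITY and not self.running_timeout_set:
            self.job_running_timeout = INFINITY
```

In this model a configuration that only turns off JobTimeoutSec= (as older Dracut setups did) ends up with both timeouts off, while an explicit JobRunningTimeoutSec= is left alone.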
Andrew Jeddeloh [Thu, 5 Oct 2017 10:58:02 +0000 (03:58 -0700)]
Revert "networkd: change UseMTU default to true. (#6837)" (#6950)
This reverts commit 22043e4317ecd2bc7834b48a6d364de76bb26d91.
UseMTU is broken on real hardware and should not be enabled by default.
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 10:35:24 +0000 (12:35 +0200)]
Merge pull request #6988 from poettering/dns-stub-truncate
rework how resolved's dns stub deals with truncation
Lennart Poettering [Wed, 4 Oct 2017 10:35:48 +0000 (12:35 +0200)]
resolved: rework how we handle truncation in the stub resolver
When a reply message gets longer than the client supports we need to
truncate the response and set the TC bit, and we already do that.
However, we are not supposed to send incomplete RRs in that case, but
instead truncate right at a record boundary. Do that.
This fixes the "Message parser reports malformed message packet."
warning the venerable "host" tool outputs when a very large response is
requested.
See: #6520
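To illustrate the idea of truncating at a record boundary, here is a minimal sketch that works on already-encoded byte strings. It is not resolved's implementation; the function name and the (payload, truncated) return shape are assumptions made for the example, with `truncated` playing the role of the TC bit.

```python
def truncate_at_record_boundary(header, records, max_size):
    """Fit only whole records into max_size bytes.

    `header` and every entry of `records` are already-encoded byte strings.
    Returns (payload, truncated); `truncated` is True if any record had to
    be left out, which is when the TC bit would be set.
    """
    payload = bytearray(header)
    for rr in records:
        if len(payload) + len(rr) > max_size:
            # Stop before this record rather than emitting part of it.
            return bytes(payload), True
        payload += rr
    return bytes(payload), False
```

The point of the sketch is that the cut happens between records, so a client never receives a partial RR.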
Lennart Poettering [Wed, 4 Oct 2017 09:57:10 +0000 (11:57 +0200)]
resolved: take benefit of log_xyz_errno() returning the negative error code
Just some modernizations.
Lennart Poettering [Thu, 5 Oct 2017 09:26:09 +0000 (11:26 +0200)]
seccomp: ignore (and debug log) errors by all invocations of seccomp_rule_add_exact()
System calls might exist on some archs but not on others, or might be
multiplexed but not on others. Ignore such errors when putting together
a filter at this location like we already do it on all others.
Lennart Poettering [Thu, 5 Oct 2017 09:24:51 +0000 (11:24 +0200)]
seccomp: always handle seccomp_load() failing the same way
Unfortunately libseccomp doesn't return (nor document) clean error
codes, hence until then only check for specific error codes that we
propagate, but ignore (but debug log) all others. Do this at one more
place, we are already doing that at all others.
Lennart Poettering [Thu, 5 Oct 2017 09:23:07 +0000 (11:23 +0200)]
seccomp: react gracefully if we can't translate a syscall name
When a libseccomp implementation doesn't know a syscall yet, that's no
reason for us to fail completely. Instead, debug log, and proceed.
This hopefully fixes the preadv2/pwritev2 issues pointed out here:
https://github.com/systemd/systemd/pull/6952#issuecomment-334302923
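The "debug log and proceed" behaviour for unknown syscall names can be sketched like this. The name table below is a hypothetical stand-in for whatever the seccomp library knows, and `resolve_syscalls()` is an illustrative helper, not a libseccomp or systemd function.

```python
import logging

log = logging.getLogger("seccomp-sketch")

# Hypothetical name table standing in for what the seccomp library knows.
KNOWN_SYSCALLS = {"read": 0, "write": 1, "open": 2}


def resolve_syscalls(names, table=KNOWN_SYSCALLS):
    """Translate names to numbers, skipping (and debug logging) unknown ones."""
    resolved = {}
    for name in names:
        number = table.get(name)
        if number is None:
            # An unknown name is not fatal: log at debug level and carry on.
            log.debug("Syscall %s is not known, ignoring.", name)
            continue
        resolved[name] = number
    return resolved
```

With this approach a filter listing a syscall the library does not know yet (for example pwritev2 on an older library) still gets built from the names that do resolve.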
Lennart Poettering [Sat, 30 Sep 2017 12:08:26 +0000 (14:08 +0200)]
seccomp: include prlimit64 and ugetrlimit in @default
Also, move prlimit64() out of @resources.
prlimit64() may be used both for getting and setting resource limits, and
is implicitly called by glibc at various places, on some archs, the same
way as getrlimit(). Similarly, ugetrlimit() is an arch-specific
replacement for getrlimit(), and hence should be whitelisted at the same
place as getrlimit() and prlimit64().
Also see: https://lists.freedesktop.org/archives/systemd-devel/2017-September/039543.html
Zbigniew Jędrzejewski-Szmek [Thu, 5 Oct 2017 09:26:44 +0000 (11:26 +0200)]
Merge pull request #6944 from poettering/suspend-fix
systemctl reboot/suspend tweaks
Hans de Goede [Wed, 4 Oct 2017 23:06:55 +0000 (01:06 +0200)]
hwdb: Add accelerometer orientation entry for Chuwi Hi8 Pro tablet (#6998)
Add an accelerometer orientation entry for the Chuwi Hi8 Pro tablet.
Lennart Poettering [Wed, 4 Oct 2017 19:44:29 +0000 (21:44 +0200)]
tmpfiles: change btmp mode 0600 → 0660 (#6997)
As discussed in #6994.
Fixes: #6994
Lennart Poettering [Wed, 4 Oct 2017 19:40:01 +0000 (21:40 +0200)]
dynamic-user: don't use a UID that currently owns IPC objects (#6962)
This fixes a mostly theoretical potential security hole: if for some
reason we failed to remove IPC objects created for a dynamic user (maybe
because MAC/SELinux erroneously prohibited it), then we should not hand
out the same UID again until they are successfully removed.
With this commit we'll enumerate the IPC objects currently existing, and
step away from using a UID for the dynamic UID logic if there are any
matching it.
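A rough sketch of such a check follows, assuming the IPC objects are listed in whitespace-separated tables with a header row naming "uid" and "cuid" columns, in the style of Linux's /proc/sysvipc files. Which columns are consulted, the helper names and the sample data are all assumptions for illustration, not taken from the commit.

```python
def uid_owns_ipc_objects(table_text, uid):
    """Return True if any row of a /proc/sysvipc-style table belongs to uid."""
    lines = table_text.strip().splitlines()
    if not lines:
        return False
    header = lines[0].split()
    # Assumption: ownership is recorded in the "uid" and "cuid" columns.
    columns = [header.index(c) for c in ("uid", "cuid") if c in header]
    for line in lines[1:]:
        fields = line.split()
        if any(int(fields[i]) == uid for i in columns):
            return True
    return False


def usable_candidates(candidates, table_text):
    """Drop candidate UIDs that still own IPC objects."""
    return [u for u in candidates if not uid_owns_ipc_objects(table_text, u)]


# Made-up sample data in the assumed layout.
SAMPLE = """\
key shmid perms size cpid lpid nattch uid gid cuid cgid
0 3 600 4096 100 101 1 61185 61185 61185 61185
"""
```

With the sample table, UID 61185 would be skipped and the allocation logic would move on to another candidate.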
Zbigniew Jędrzejewski-Szmek [Wed, 4 Oct 2017 19:33:52 +0000 (21:33 +0200)]
Merge pull request #6975 from sourcejedi/logind_pid_0_v2
Selectively revert "tree-wide: use pid_is_valid() at more places"
Lennart Poettering [Mon, 2 Oct 2017 14:30:01 +0000 (16:30 +0200)]
NEWS: add comment about change sync/async behaviour for shutdown commands
Lennart Poettering [Fri, 29 Sep 2017 14:10:27 +0000 (16:10 +0200)]
man: document which special "systemctl" commands are synchronous and which asynchronous.
This documents the status quo, clarifying when we are synchronous and
when asynchronous by default and when --no-block is supported to force
asynchronous operation.
See: #6479
Lennart Poettering [Mon, 2 Oct 2017 14:09:24 +0000 (16:09 +0200)]
logind: don't change dry-run boolean before we actually enqueue the operation
Let's not affect change before the PK check.
Lennart Poettering [Mon, 2 Oct 2017 14:08:49 +0000 (16:08 +0200)]
logind: reorder things a bit
Let's keep the three sleep method implementations close to each other.
Lennart Poettering [Fri, 29 Sep 2017 14:07:11 +0000 (16:07 +0200)]
systemctl: make sure "reboot", "suspend" and friends are always asynchronous
Currently, "systemctl reboot" behaves differently in setups with and
without logind. If logind is used (which is probably the more common
case) the operation is asynchronous, and otherwise synchronous (though
subject to --no-block in this case). Let's clean this up, and always
expose the same behaviour, regardless if logind is used or not: let's
always make it asynchronous.
It might make sense to add a "--block" mode in a future PR that makes
these operations synchronous, but this requires non-trivial work in
logind, and is outside of the scope of this change.
See: #6479
Lennart Poettering [Mon, 2 Oct 2017 14:03:55 +0000 (16:03 +0200)]
logind: add Halt() and CanHalt() APIs
This adds new method calls Halt() and CanHalt() to the logind bus APIs.
They aren't overly useful (as the whole concept of halting isn't really
too useful), however they clean up one major asymmetry: currently, using
the "shutdown" legacy commands it is possible to enqueue a "halt"
operation through logind, while logind officially doesn't actually
support this. Moreover, the path through "shutdown" currently ultimately
fails, since the referenced "halt" action isn't actually defined in
PolicyKit.
Finally, the current logic results in an unexpected asymmetry in
systemctl: "systemctl poweroff", "systemctl reboot" are currently
asynchronous (due to the logind involvement) while "systemctl halt"
isn't. Let's clean this up, and make all three APIs implemented by
logind natively, and all three hence asynchronous in "systemctl".
Moreover, let's add the missing PK action.
Fixes: #6957
Lennart Poettering [Wed, 4 Oct 2017 18:00:14 +0000 (20:00 +0200)]
Merge pull request #6992 from keszybz/fix-test-copy
test-copy: fix operation when test-copy is too small
Zbigniew Jędrzejewski-Szmek [Wed, 4 Oct 2017 17:32:59 +0000 (19:32 +0200)]
hwdb: switch meson to use ids_parser.py (#6964)
Also drop the now-unused perl implementation (that doesn't do sorting),
so it's incompatible anyway.
Zbigniew Jędrzejewski-Szmek [Wed, 4 Oct 2017 17:32:12 +0000 (19:32 +0200)]
udevadm,basic: replace nulstr_contains with STR_IN_SET (#6965)
STR_IN_SET is a newer approach which is easier to write and read, and which
seems to result in space savings too:
before:
4949848 build/src/shared/libsystemd-shared-234.so
350704 build/systemctl
4967184 build/systemd
826216 build/udevadm
after:
4949848 build/src/shared/libsystemd-shared-234.so
350704 build/systemctl
4966888 build/systemd
826168 build/udevadm
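The size figures above can be compared directly; this small snippet only restates the numbers from the listing and computes the per-binary difference.

```python
# Sizes in bytes, copied from the before/after listing above.
before = {
    "build/src/shared/libsystemd-shared-234.so": 4949848,
    "build/systemctl": 350704,
    "build/systemd": 4967184,
    "build/udevadm": 826216,
}
after = {
    "build/src/shared/libsystemd-shared-234.so": 4949848,
    "build/systemctl": 350704,
    "build/systemd": 4966888,
    "build/udevadm": 826168,
}

# Positive values mean the binary got smaller.
savings = {name: before[name] - after[name] for name in before}
total_saved = sum(savings.values())
```

So the shared library and systemctl are unchanged, systemd shrinks by 296 bytes and udevadm by 48 bytes, 344 bytes in total.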
Yu Watanabe [Wed, 4 Oct 2017 17:29:36 +0000 (02:29 +0900)]
nss-systemd: if cannot open bus, then try to read user info directly (#6971)
If sd_bus_open_system() fails, then try to read information about
dynamic users from /run/systemd/dynamic-uid.
This allows services to successfully call getpwuid() and friends
even if dbus.service is not started yet.
Fixes #6967.
Lennart Poettering [Wed, 4 Oct 2017 17:25:30 +0000 (19:25 +0200)]
Merge pull request #6974 from keszybz/clean-up-defines
Clean up define definitions
Lennart Poettering [Wed, 4 Oct 2017 15:54:35 +0000 (17:54 +0200)]
Merge pull request #6985 from yuwata/empty
load-fragment: do not create empty array
Alan Jenkins [Tue, 3 Oct 2017 11:26:02 +0000 (12:26 +0100)]
logind: use pid_is_valid() where appropriate
These two sites _do_ match the definition of pid_is_valid(); they don't
provide any special handling for the invalid PID value 0. (They're used
by dbus methods, so the PID value 0 is handled with reference to the dbus
client creds, outside of these functions).
Alan Jenkins [Tue, 3 Oct 2017 11:13:06 +0000 (12:13 +0100)]
systemctl: use pid_is_valid() where appropriate
This was the one valid site in commit ee043777be58251e7441b4f04594e9e3792d7fb2.
The second part of this hunk, avoiding using `%m`
when we didn't actually have `errno` set, seems
like a nice enough cleanup to be worthwhile on
its own.
Also use PID_FMT to improve the error message we print
(pid_t is signed).
Yu Watanabe [Wed, 4 Oct 2017 14:01:32 +0000 (23:01 +0900)]
tree-wide: use IN_SET macro (#6977)
Zbigniew Jędrzejewski-Szmek [Wed, 4 Oct 2017 13:22:07 +0000 (15:22 +0200)]
test-sizeof: add pid_t and gid_t
Cf. #6975.
Zbigniew Jędrzejewski-Szmek [Wed, 4 Oct 2017 13:17:09 +0000 (15:17 +0200)]
test-copy: fix operation when test-copy is too small
Fixes #6981.
Djalal Harouni [Wed, 4 Oct 2017 13:01:21 +0000 (15:01 +0200)]
Merge pull request #6986 from OpenDZ/tixxdz/seccomp-more-default-syscalls-v1
seccomp: add sched_yield syscall to the @default syscall set
Yu Watanabe [Wed, 4 Oct 2017 12:43:00 +0000 (21:43 +0900)]
man: fix that the same option is listed twice (#6991)
Lennart Poettering [Wed, 4 Oct 2017 12:16:28 +0000 (14:16 +0200)]
units: prohibit all IP traffic on all our long-running services (#6921)
Let's lock things down further.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 11:15:27 +0000 (13:15 +0200)]
meson: generate ENABLE_* names automatically
After previous changes, the naming of configuration options and internal
defines is consistent.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 11:12:29 +0000 (13:12 +0200)]
build-sys: s/ENABLE_RESOLVED/ENABLE_RESOLVE/
The configuration option was called -Dresolve, but the internal define
was …RESOLVED. This option governs more than just resolved itself, so
let's settle on the version without "d".
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:23:55 +0000 (12:23 +0200)]
build-sys: s/HAVE_MYHOSTNAME/ENABLE_MYHOSTNAME/
Same justification as for HAVE_UTMP. HAVE_MYHOSTNAME was used before myhostname
was merged into systemd.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:22:40 +0000 (12:22 +0200)]
build-sys: s/HAVE_SMACK/ENABLE_SMACK/
Same justification as for HAVE_UTMP.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:20:49 +0000 (12:20 +0200)]
build-sys: s/HAVE_IMA/ENABLE_IMA/
Same justification as for HAVE_UTMP.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:19:30 +0000 (12:19 +0200)]
build-sys: s/HAVE_UTMP/ENABLE_UTMP/
"Have" should be about the external environment and dependencies. Anything
which is a pure yes/no choice should be "enable".
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:11:49 +0000 (12:11 +0200)]
build-sys: require all defines under #if to be present
This should help to catch any errors with typos and HAVE/ENABLE mismatches.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:10:57 +0000 (12:10 +0200)]
test-nss: fix names of two defines
That's another bug fixed (sys/auxv.h was the first).
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 08:41:51 +0000 (10:41 +0200)]
build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that if the name is misspelled, cpp will warn us.
$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build
squash! build-sys: use #if Y instead of #ifdef Y everywhere
v2:
- fix incorrect setting of HAVE_LIBIDN2
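For readers who want to see what the second sed command above does to a source line, here is a small Python equivalent using the same two substitutions. It is only an illustration of the rewrite, not part of the commit; `convert_conditionals()` is a hypothetical helper name.

```python
import re


def convert_conditionals(source):
    """Rewrite #ifdef/#ifndef on HAVE_/ENABLE_ names to #if / #if !.

    Mirrors the two sed expressions:
      s/#ifdef (HAVE|ENABLE)/#if \\1/
      s/#ifndef (HAVE|ENABLE)/#if ! \\1/
    """
    source = re.sub(r"#ifdef (HAVE|ENABLE)", r"#if \1", source)
    source = re.sub(r"#ifndef (HAVE|ENABLE)", r"#if ! \1", source)
    return source
```

Since `#if NAME` tests the value rather than mere definedness, the names need to be defined to 0 or 1, which is what the switch from conf.set to conf.set10 in the first command is for.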
Djalal Harouni [Wed, 4 Oct 2017 09:41:42 +0000 (10:41 +0100)]
seccomp: add sched_yield syscall to the @default syscall set
Zbigniew Jędrzejewski-Szmek [Wed, 4 Oct 2017 09:33:30 +0000 (11:33 +0200)]
core: use strv_isempty to check if supplementary_groups is empty
With the previous commit, we know that it will be NULL if empty, but
it's safe to always use strv_isempty() in case the code changes
in the future.
Yu Watanabe [Wed, 4 Oct 2017 06:21:12 +0000 (15:21 +0900)]
Yu Watanabe [Wed, 4 Oct 2017 09:09:32 +0000 (18:09 +0900)]
man: empty string resets the list of NTP servers (#6984)
Alan Jenkins [Tue, 3 Oct 2017 11:05:24 +0000 (12:05 +0100)]
Revert "tree-wide: use pid_is_valid() at more places"
This reverts commit ee043777be58251e7441b4f04594e9e3792d7fb2.
It broke almost everywhere it touched. The places that
hadn't been converted were mostly followed by special
handling for the invalid PID `0`. That explains why they
tested for `pid < 0` instead of `pid <= 0`.
I think that one was the first commit I reviewed, heh.
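The difference between the two checks can be sketched as below. This assumes pid_is_valid() means "strictly positive", which is what the message implies; `resolve_pid()` is a hypothetical caller showing why a site that treats 0 specially must test `pid < 0` instead.

```python
def pid_is_valid(pid):
    """Assumed definition: only strictly positive PIDs are valid."""
    return pid > 0


def resolve_pid(pid, caller_pid):
    """Hypothetical call site where 0 means "use the caller's PID"."""
    if pid < 0:
        raise ValueError("invalid PID")
    if pid == 0:
        # Special handling that a pid_is_valid() check would have cut off.
        return caller_pid
    return pid
```

Replacing the `pid < 0` test in `resolve_pid()` with `not pid_is_valid(pid)` would reject 0 before the special case is reached, which is the kind of breakage the revert describes.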
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 10:09:40 +0000 (12:09 +0200)]
meson: check for sys/auxv.h
This check was present in configure.ac, but was never added under meson.
The code under HAVE_SYS_AUX_H has been dead ever since :(.
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 08:32:34 +0000 (10:32 +0200)]
build-sys: change all HAVE_DECL_ macros to HAVE_
This is a legacy of autotools, where one detection routine used a different
prefix than the others.
$ git grep -e HAVE_DECL_ -l|xargs sed -i s/HAVE_DECL_/HAVE_/g
Zbigniew Jędrzejewski-Szmek [Tue, 3 Oct 2017 08:26:53 +0000 (10:26 +0200)]
Merge pull request #6946 from poettering/synthesize-dns
Some DNS RR synthesizing fixes
Djalal Harouni [Tue, 3 Oct 2017 05:20:05 +0000 (07:20 +0200)]
seccomp: remove '@credentials' syscall set (#6958)
This removes the '@credentials' syscall set that was added in commit
v234-468-gcd0ddf6f75.
Most of these syscalls are so simple that we do not want to filter them.
They work on the current calling process, doing only read operations,
they do not have a deep kernel path.
The problem may only be in 'capget' syscall since it can query arbitrary
processes, and can be used to discover processes, however sending signal 0 to
arbitrary processes can be used to discover if a process exists or not.
It is unfortunate that Linux allows to query processes of different
users. Let's put it now in '@process' syscall set, and later we may add
it to a new '@basic-process' set that allows most basic process
operations.
Yu Watanabe [Tue, 3 Oct 2017 04:28:48 +0000 (13:28 +0900)]
Merge pull request #6940 from poettering/magic-dirs
make sure StateDirectory= and friends play nicely with DynamicUser= and RootImage=/RootDirectory=
Zbigniew Jędrzejewski-Szmek [Mon, 2 Oct 2017 18:39:08 +0000 (20:39 +0200)]
Merge pull request #6943 from poettering/dissect-ro
Automatically recognize that "squashfs" and "iso9660" are always read-only
Lennart Poettering [Thu, 28 Sep 2017 17:35:32 +0000 (19:35 +0200)]
update TODO
Lennart Poettering [Mon, 2 Oct 2017 09:27:03 +0000 (11:27 +0200)]
core: fix special directories for user services
The system paths were listed where the user paths should have been
listed. Correct that.
Lennart Poettering [Mon, 2 Oct 2017 08:51:19 +0000 (10:51 +0200)]
path-util: some updates to path_make_relative()
Don't miscount number of "../" to generate, if "." is included in an
input path.
Also, refuse if we encounter "../" since we can't possibly follow that
up properly, without file system access.
Some other modernizations.
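The two rules described above (ignore "." components when counting, refuse "..") can be sketched as follows. This is a Python model of the described behaviour under the assumption that both inputs are absolute paths and the first one is a directory; it is not the C implementation.

```python
def path_make_relative(from_dir, to_path):
    """Return to_path expressed relative to the directory from_dir."""
    if not from_dir.startswith("/") or not to_path.startswith("/"):
        raise ValueError("both paths must be absolute")

    def components(path):
        # Empty and "." components do not change the location, so drop them.
        parts = [p for p in path.split("/") if p not in ("", ".")]
        if ".." in parts:
            # Cannot be resolved correctly without file system access.
            raise ValueError('refusing path containing ".."')
        return parts

    src, dst = components(from_dir), components(to_path)
    common = 0
    while common < min(len(src), len(dst)) and src[common] == dst[common]:
        common += 1
    # One "../" per remaining source component, then the rest of the target.
    result = [".."] * (len(src) - common) + dst[common:]
    return "/".join(result) if result else "."
```

Because "." is dropped before counting, `/a/./b` contributes two components rather than three, so only one "../" is generated when the target is `/a/c`.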
Lennart Poettering [Mon, 2 Oct 2017 08:50:07 +0000 (10:50 +0200)]
core: fix StateDirectory= (and friends) safety checks when decoding transient unit properties
Let's make sure relative directories such as "foo/bar" are accepted, by
using the same validation checks as in unit file parsing.
Lennart Poettering [Thu, 28 Sep 2017 21:41:06 +0000 (23:41 +0200)]
test: add test for DynamicUser= + StateDirectory=
Also, tests for DynamicUser= should really run for system mode, as we
allocate from a system resource.
(This also increases the test timeout to 2min. If one of our tests
really hangs then waiting for 2min longer doesn't hurt either. The old
2s is really short, given that we run in potentially slow VM
environments for this test. This becomes noticeable when the slow "find"
command this adds is triggered)
Lennart Poettering [Thu, 28 Sep 2017 18:33:38 +0000 (20:33 +0200)]
core: pass the correct error to the caller
Lennart Poettering [Thu, 28 Sep 2017 18:28:09 +0000 (20:28 +0200)]
core: when looking for a UID to use for a dynamic UID start with the current owner of the StateDirectory= and friends
Let's optimize dynamic UID allocation a bit: if a StateDirectory= (or
suchlike) is configured, we start our allocation loop from that UID and
use it if it currently isn't used otherwise. This is beneficial as it
saves us from having to expensively recursively chown() these
directories in the typical case (which StateDirectory= does when it
notices that the owner of the directory doesn't match the UID picked).
With this in place we now have a three-phase logic for allocating a
dynamic UID:
a) first, we try to use the owning UID of StateDirectory=,
CacheDirectory=, LogDirectory= if that exists and is currently
otherwise unused.
b) if that didn't work out, we hash the UID from the service name
c) if that didn't yield an unused UID either, randomly pick new ones
until we find a free one.
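The three phases above can be sketched like this. Everything here is illustrative: the UID range constants, the use of SHA-256 as the hash, the retry limit and the function names are assumptions for the example and do not reflect the actual implementation.

```python
import hashlib
import random

# Assumed range for dynamic UIDs, used here only for illustration.
UID_MIN, UID_MAX = 61184, 65519


def hash_uid(name):
    """Derive a candidate UID from the service name (SHA-256 as a stand-in)."""
    digest = hashlib.sha256(name.encode()).digest()
    return UID_MIN + int.from_bytes(digest[:8], "little") % (UID_MAX - UID_MIN + 1)


def pick_dynamic_uid(name, in_use, state_dir_owner=None, rng=random, attempts=100):
    def usable(uid):
        return uid is not None and UID_MIN <= uid <= UID_MAX and uid not in in_use

    # a) prefer the current owner of the state/cache/log directory
    if usable(state_dir_owner):
        return state_dir_owner
    # b) otherwise hash the UID from the service name
    candidate = hash_uid(name)
    if usable(candidate):
        return candidate
    # c) otherwise pick random UIDs until a free one turns up
    for _ in range(attempts):
        candidate = rng.randint(UID_MIN, UID_MAX)
        if usable(candidate):
            return candidate
    raise RuntimeError("no free dynamic UID found")
```

Phase a) is what avoids the recursive chown(): when the directory's owner is reused, ownership already matches and nothing has to be changed.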
Lennart Poettering [Thu, 28 Sep 2017 17:14:10 +0000 (19:14 +0200)]
man: document the new logic
Lennart Poettering [Thu, 28 Sep 2017 16:55:45 +0000 (18:55 +0200)]
execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage=
Let's clean up the interaction of StateDirectory= (and friends) to
DynamicUser=1: instead of creating these directories directly below
/var/lib, place them in /var/lib/private instead if DynamicUser=1 is
set, making that directory 0700 and owned by root:root. This way, if a
dynamic UID is later reused, access to the old run's state directory is
prohibited for that user. Then, use file system namespacing inside the
service to make /var/lib/private a readable tmpfs, hiding all state
directories that are not listed in StateDirectory=, and making access to
the actual state directory possible. Mount all directories listed in
StateDirectory= to the same places inside the service (which means
they'll now be mounted into the tmpfs instance). Finally, add a symlink
from the state directory name in /var/lib/ to the one in
/var/lib/private, so that both the host and the service can access the
path under the same location.
Here's an example: let's say a service runs with StateDirectory=foo.
When DynamicUser=0 is set, it will get the following setup, and no
difference between what the unit and what the host sees:
/var/lib/foo (created as directory)
Now, if DynamicUser=1 is set, we'll instead get this on the host:
/var/lib/private (created as directory with mode 0700, root:root)
/var/lib/private/foo (created as directory)
/var/lib/foo → private/foo (created as symlink)
And from inside the unit:
/var/lib/private (a tmpfs mount with mode 0755, root:root)
/var/lib/private/foo (bind mounted from the host)
/var/lib/foo → private/foo (the same symlink as above)
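The host-side layout from the example can be reproduced in a scratch directory with a short sketch. It only models the directory and symlink creation (no mount namespace, tmpfs or ownership handling), and `create_state_directory()` and `demo()` are hypothetical helpers, not systemd functions.

```python
import os
import tempfile


def create_state_directory(var_lib, name, dynamic_user):
    """Create the host-side layout for StateDirectory=name below var_lib."""
    if not dynamic_user:
        # DynamicUser=0: a plain directory, same view for host and unit.
        os.makedirs(os.path.join(var_lib, name), exist_ok=True)
        return
    # DynamicUser=1: the real directory lives below an inaccessible "private".
    private = os.path.join(var_lib, "private")
    os.makedirs(private, exist_ok=True)
    os.chmod(private, 0o700)
    os.makedirs(os.path.join(private, name), exist_ok=True)
    link = os.path.join(var_lib, name)
    if not os.path.islink(link):
        # Relative symlink, so it resolves the same way inside the unit.
        os.symlink(os.path.join("private", name), link)


def demo():
    """Build the layout in a temporary directory and return the link target."""
    with tempfile.TemporaryDirectory() as var_lib:
        create_state_directory(var_lib, "foo", dynamic_user=True)
        return os.readlink(os.path.join(var_lib, "foo"))
```

Using a relative symlink target mirrors the "private/foo" target shown in the listing, which is what lets the same link work on the host and inside the unit.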
This takes inspiration from how container trees are protected below
/var/lib/machines: they generally reuse UIDs/GIDs of the host, but
because /var/lib/machines itself is set to 0700 host users cannot access
files in the container tree even if the UIDs/GIDs are reused. However,
for this commit we add one further trick: inside and outside of the unit
/var/lib/private is a different thing: outside it is a plain,
inaccessible directory, and inside it is a world-readable tmpfs mount
with only the whitelisted subdirs below it, bind mounted in. This
means, from the outside the dir acts as an access barrier, but from the
inside it does not. And the symlink created in /var/lib/foo itself
points across the barrier in both cases, so that root and the unit's
user always have access to these dirs without knowing the details of
this mounting magic.
This logic resolves a major shortcoming of DynamicUser=1 units:
previously they couldn't safely store persistent data. With this change
they can have their own private state, log and data directories, which
they can write to, but which are protected from UID recycling.
With this change, if RootDirectory= or RootImage= are used it is ensured
that the specified state/log/cache directories are always mounted in
from the host. This change of semantics I think is much preferable since
this means the root directory/image logic can be used easily for
read-only resource bundling (as all writable data resides outside of the
image). Note that this is a change of behaviour, but given that we
haven't released any systemd version with StateDirectory= and friends
implemented this should be a safe change to make (in particular as
previously it wasn't clear what would actually happen when used in
combination). Moreover, by making this change we can later add a "+"
modifier to these settings too, working similarly to the same modifier in
ReadOnlyPaths= and friends, making specified paths relative to the
container itself.
Lennart Poettering [Thu, 28 Sep 2017 16:35:51 +0000 (18:35 +0200)]
namespace: if we can create the destination of bind and PrivateTmp= mounts
When putting together the namespace, always create the file or directory
we are supposed to bind mount on, the same way we do it for most other
stuff, for example mount units or systemd-nspawn's --bind= option.
This has the big benefit that we can use namespace bind mounts on dirs
in /tmp or /var/tmp even in conjunction with PrivateTmp=.
Lennart Poettering [Thu, 28 Sep 2017 16:30:55 +0000 (18:30 +0200)]
namespace: properly handle bind mounts from the host
Before this patch we had an ordering problem: if we have no namespacing
enabled except for two bind mounts that intend to swap /a and /b via
bind mounts, then we'd execute the bind mount binding /b to /a, followed
by the bind mount from /a to /b, thus having the effect that /b is now
visible in both /a and /b, which was not intended.
With this change, as soon as any bind mount is configured we'll put
together the service mount namespace in a temporary directory instead of
operating directly in the root. This solves the problem in a
straightforward fashion: the source of bind mounts will always refer to
the host, and thus be unaffected by the bind mounts we already
created.
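The ordering problem and its fix can be modelled with plain dictionaries, where a "view" maps a path to what is visible there. This is a toy model under that assumption, not actual mount code; both function names are made up for the example.

```python
def apply_in_place(host, binds):
    """Old behaviour: each bind source is looked up in the view being modified."""
    view = dict(host)
    for source, destination in binds:
        view[destination] = view[source]   # may already be overmounted
    return view


def apply_staged(host, binds):
    """New behaviour: sources always refer to the untouched host tree."""
    view = dict(host)                      # stands in for the temporary root
    for source, destination in binds:
        view[destination] = host[source]   # unaffected by earlier binds
    return view


HOST = {"/a": "contents of a", "/b": "contents of b"}
SWAP = [("/b", "/a"), ("/a", "/b")]        # intent: swap /a and /b
```

With `SWAP`, `apply_in_place()` ends up showing the contents of /b in both places, while `apply_staged()` produces the intended swap.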
Lennart Poettering [Thu, 28 Sep 2017 16:28:23 +0000 (18:28 +0200)]
namespace: create /dev, /proc, /sys when needed
We already create /dev implicitly if PrivateTmp=yes is on, if it is
missing. Do so too for the other two API VFS, as well as for /dev if
PrivateTmp=yes is off but MountAPIVFS=yes is on (i.e. when /dev is bind
mounted from the host).
Lennart Poettering [Thu, 28 Sep 2017 14:58:43 +0000 (16:58 +0200)]
core: usually our enum's _INVALID and _MAX special values are named after the full type
In most cases we followed the rule that the special _INVALID and _MAX
values we use in our enums use the full type name as prefix (in contrast
to regular values that we often make shorter), do so for
ExecDirectoryType as well.
No functional changes, just a little bit of renaming to make this code
more like the rest.
Lennart Poettering [Thu, 28 Sep 2017 17:13:44 +0000 (19:13 +0200)]
core: chown() StateDirectory= and friends recursively when starting a service
This is particularly useful when used in conjunction with DynamicUser=1,
where the UID might change for every invocation, but is useful in other
cases too, for example, when these directories are shared between
systems where the UID assignments differ slightly.
Lennart Poettering [Thu, 28 Sep 2017 11:01:33 +0000 (13:01 +0200)]
nspawn: properly report all kinds of changed UID/GID when patching things for userns
We forgot to propagate one chmod().
Lennart Poettering [Mon, 2 Oct 2017 15:12:58 +0000 (17:12 +0200)]
Merge pull request #6960 from keszybz/hwdb-update
Hwdb update and sorting
Jouke Witteveen [Mon, 2 Oct 2017 14:35:27 +0000 (16:35 +0200)]
service: better detect when a Type=notify service cannot become active anymore (#6959)
No need to wait for a timeout when we know things are not going to work out.
When the main process goes away and only notifications from the main process are
accepted, then we will not receive any notifications anymore.
Zbigniew Jędrzejewski-Szmek [Mon, 2 Oct 2017 13:08:10 +0000 (15:08 +0200)]
Merge pull request #6941 from andir/use-in_set
use IN_SET where possible
Zbigniew Jędrzejewski-Szmek [Mon, 2 Oct 2017 12:52:12 +0000 (14:52 +0200)]
Minor line wrapping adjustment