Zbigniew Jędrzejewski-Szmek [Sat, 30 Apr 2016 01:18:11 +0000 (21:18 -0400)]
sd-dhcp{,6}-client: change the semantics of DUID setting
Both versions of the code are changed to allow the caller to override
DUID using simple rules: duid type and value may be specified, in
which case the caller is responsible for providing the contents,
or just duid type may be specified as DUID_TYPE_EN, in which case
we fill in the values. In the future more support for other types may
be added, e.g. DUID_TYPE_LLT.
There still remains an ugly discrepancy between dhcp4 and dhcp6 code:
dhcp6 has sd_dhcp6_client_set_duid and sd_dhcp6_client_set_iaid and
requires client->state to be DHCP6_STATE_STOPPED, while dhcp4 has
sd_dhcp_client_set_iaid_duid and will reconfigure the client if it
is not stopped. This commit doesn't touch that part.
This addresses #3127 § 2.
Zbigniew Jędrzejewski-Szmek [Sat, 30 Apr 2016 01:18:02 +0000 (21:18 -0400)]
dhcp-identifier: un-inline dhcp_validate_duid_len
After all it is used in more than one place and is not that short.
Also tweak the test a bit:
- do not check that duid_len > 0, because we want to allow unknown
duid types, and there might be some which are fine with 0 length data,
(also assert should not be called from library code),
- always check that duid_len <= MAX_DUID_LEN, because we could overwrite
available buffer space otherwise.
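A minimal sketch of the length check described above, not the actual
dhcp_validate_duid_len() code. The function name, the per-type minimum
lengths and the MAX_DUID_LEN value of 128 (the RFC 3315 cap on the DUID
payload) are assumptions made for illustration:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: RFC 3315 caps the DUID payload at 128 octets. */
#define MAX_DUID_LEN 128

/* DUID type codes as assigned by RFC 3315. */
enum { DUID_TYPE_LLT = 1, DUID_TYPE_EN = 2, DUID_TYPE_LL = 3 };

/* Hypothetical validator: returns 0 if the length is acceptable,
 * -EINVAL otherwise. It never asserts, since it models library code. */
int validate_duid_len_sketch(uint16_t duid_type, size_t duid_len) {
        /* Always enforce the upper bound, whatever the type, so that a
         * fixed-size buffer can never be overrun. */
        if (duid_len > MAX_DUID_LEN)
                return -EINVAL;

        switch (duid_type) {
        case DUID_TYPE_LLT:
                /* 2 bytes hardware type + 4 bytes time + address (assumed minimum) */
                return duid_len > 6 ? 0 : -EINVAL;
        case DUID_TYPE_EN:
                /* 4 bytes enterprise number + identifier (assumed minimum) */
                return duid_len > 4 ? 0 : -EINVAL;
        case DUID_TYPE_LL:
                /* 2 bytes hardware type + address (assumed minimum) */
                return duid_len > 2 ? 0 : -EINVAL;
        default:
                /* Unknown types are allowed, even with zero-length data. */
                return 0;
        }
}
```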
Zbigniew Jędrzejewski-Szmek [Tue, 3 May 2016 16:08:56 +0000 (12:08 -0400)]
sd-dhcp: change uint8_t *duid to const void*
Zbigniew Jędrzejewski-Szmek [Tue, 3 May 2016 15:52:44 +0000 (11:52 -0400)]
sd-dhcp{,6}-client: use standard indentation for functions args
Zbigniew Jędrzejewski-Szmek [Fri, 29 Apr 2016 02:52:04 +0000 (22:52 -0400)]
networkd: rework headers to avoid circular includes
Header files were organized in a way where the includer would add various
typedefs used by the includee before including it, resulting in a tangled
web of dependencies between files.
Replace this with the following logic:
networkd.h
/ \
networkd-link.h \
networkd-ipv4ll.h--\__\
networkd-fdb.h \
networkd-network.h networkd-netdev-*.h
networkd-route.h \
networkd-netdev.h
If a pointer to a structure defined in a different header file is needed,
use a typedef line instead of including the whole header.
Zbigniew Jędrzejewski-Szmek [Fri, 29 Apr 2016 18:27:23 +0000 (14:27 -0400)]
Merge pull request #3151 from keszybz/pr3149-2
Assorted fixes #3149 + one commit tacked on top
Zbigniew Jędrzejewski-Szmek [Fri, 29 Apr 2016 18:27:04 +0000 (14:27 -0400)]
Merge pull request #3148 from poettering/trigger
core: introduce activation rate limit and parse nice levels and close sockets properly
Lennart Poettering [Thu, 28 Apr 2016 19:02:11 +0000 (21:02 +0200)]
update TODO
Lennart Poettering [Fri, 29 Apr 2016 09:36:00 +0000 (11:36 +0200)]
core: merge service_connection_unref() into service_close_socket_fd()
We always call one after the other anyway, and this way service_set_socket_fd()
and service_close_socket_fd() nicely match each other as one undoes the effect
of the other.
Lennart Poettering [Fri, 29 Apr 2016 09:18:53 +0000 (11:18 +0200)]
core: rerun GC logic for a unit that loses a reference
Let's make sure when we drop a reference to a unit, that we run the GC queue on
it again.
This (together with the previous commit) should deal with the GC issues pointed
out in:
https://github.com/systemd/systemd/pull/2993#issuecomment-215331189
Lennart Poettering [Fri, 29 Apr 2016 09:14:03 +0000 (11:14 +0200)]
core: rework socket/service GC logic
There's no need to set the no_gc bit for service units that socket units
prepare, as we always keep a proper reference (as maintained by unit_ref_set())
on them, and such references are honoured by the GC logic anyway. Moreover,
explicitly setting the no_gc bit is problematic if the socket gets GC'ed for a
reason, as the service might then leak with the bit set.
Lennart Poettering [Fri, 29 Apr 2016 08:46:56 +0000 (10:46 +0200)]
hwdb: add missing newline so the hwdb builds correctly again
Lennart Poettering [Thu, 28 Apr 2016 19:47:20 +0000 (21:47 +0200)]
socket: really always close auxiliary fds when closing socket fds
Lennart Poettering [Thu, 28 Apr 2016 19:00:28 +0000 (21:00 +0200)]
core: make parsing of RLIMIT_NICE aware of actual nice levels
Lennart Poettering [Thu, 28 Apr 2016 15:09:50 +0000 (17:09 +0200)]
core: make sure to close connection fd when we fail to activate a per-connection service
Fixes: #2993 #2691
Lennart Poettering [Thu, 28 Apr 2016 14:51:30 +0000 (16:51 +0200)]
core: minor error path fix
In service_set_socket_fd(), let's make sure that if we can't add the requested
dependencies we take no possession of the passed connection fd.
This way, we follow the strict rule: we take possession of the passed fd on
success, but on failure we don't, and the fd remains in possession of the
caller.
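The ownership rule above can be illustrated with a small sketch. The
struct and both function names are hypothetical, and the dependency step
is replaced by a stub that can be told to fail; this is not the real
service_set_socket_fd():

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a service object; -1 means "no fd owned". */
struct service_sketch {
        int socket_fd;
};

/* Stub for the dependency setup step; fails on request. */
static int add_dependencies_sketch(bool fail) {
        return fail ? -ENOMEM : 0;
}

/* On success the fd belongs to the service, on failure it stays with
 * the caller, who remains responsible for closing it. */
int set_socket_fd_sketch(struct service_sketch *s, int fd, bool fail_deps) {
        int r;

        if (s->socket_fd >= 0)
                return -EBUSY;

        /* Do everything that can fail before touching s->socket_fd. */
        r = add_dependencies_sketch(fail_deps);
        if (r < 0)
                return r;

        /* Nothing can fail anymore: only now take possession. */
        s->socket_fd = fd;
        return 0;
}
```

Doing all fallible work before the assignment means there is no error
path on which the fd would have to be handed back.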
Lennart Poettering [Tue, 26 Apr 2016 18:46:20 +0000 (20:46 +0200)]
core: rename StartLimitInterval= to StartLimitIntervalSec=
We generally follow the rule that for time settings we suffix the setting name
with "Sec" to indicate the default unit if none is specified. The only
exception was the rate limiting interval settings. Fix this, and keep the old
names for compatibility.
Do the same for journald's RateLimitInterval= setting
Lennart Poettering [Tue, 26 Apr 2016 18:34:33 +0000 (20:34 +0200)]
core: move start ratelimiting check after condition checks
With #2564, unit start rate limiting was moved from after the condition checks
to before them, in an attempt to fix #2467. This, however, resulted
in #2684. Since a previous commit added per-socket-unit trigger
rate limiting, which fixes #2467 more comprehensively, the
start limit can be moved after the condition checks again, thus fixing #2684.
Fixes: #2684
Lennart Poettering [Tue, 26 Apr 2016 18:26:15 +0000 (20:26 +0200)]
core: introduce activation rate limiting for socket units
This adds two new settings TriggerLimitIntervalSec= and TriggerLimitBurst= that
define a rate limit for activation of socket units. When the limit is hit, the
socket is put into a failure mode. This is an alternative fix for #2467,
since the original fix resulted in issue #2684.
In a later commit the StartLimitInterval=/StartLimitBurst= rate limiter will be
changed to be applied after any start conditions checks are made. This way,
there are two separate rate limiters enforced: one at triggering time, before
any jobs are queued with this patch, as well as the start limit that is moved
again to be run immediately before the unit is activated. Condition checks are
done in between the two, and thus no longer affect the start limit.
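As a rough illustration of what an interval/burst pair means, here is a
minimal fixed-window limiter. The type and function names are
hypothetical and the time is passed in explicitly; it is a sketch of the
general technique, not the rate limiter systemd uses:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: at most 'burst' events per 'interval_usec'. */
typedef struct {
        uint64_t interval_usec;
        unsigned burst;
        uint64_t begin_usec;   /* start of the current window */
        unsigned count;        /* events seen in the current window */
        bool started;
} RateLimitSketch;

/* Returns true if the event at 'now_usec' is allowed, false if the limit
 * is hit. Timestamps are expected to be monotonically increasing. */
bool ratelimit_allow(RateLimitSketch *r, uint64_t now_usec) {
        /* A zero interval or burst disables limiting. */
        if (r->interval_usec == 0 || r->burst == 0)
                return true;

        /* First event, or the previous window has expired: open a new one. */
        if (!r->started || now_usec - r->begin_usec > r->interval_usec) {
                r->started = true;
                r->begin_usec = now_usec;
                r->count = 1;
                return true;
        }

        if (r->count < r->burst) {
                r->count++;
                return true;
        }

        return false;
}
```

With two such limiters, one would be consulted at trigger time and the
other immediately before activation, with the condition checks in
between.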
Lennart Poettering [Wed, 27 Apr 2016 07:44:49 +0000 (09:44 +0200)]
build-sys: improve compat with older kernel headers
In 4.2 kernel headers, some netlink defines are missing that we need. missing.h
already can add them in, but currently makes this dependent on a definition
that these kernels already have. Hence, change the check to test for the newest
definition in the table, so that the whole bunch of definitions is added in on
all kernels lacking it.
Zbigniew Jędrzejewski-Szmek [Fri, 29 Apr 2016 14:17:43 +0000 (10:17 -0400)]
path-util: also support ".old" and ".new" suffixes and recommend them
The ~ suffix works fine, but looks too much like the file is supposed to be
automatically cleaned up. For new versions of configuration files installers
might want to use something that looks more permanent, like foobar.new.
So let's treat ".old" and ".new" as special.
Update test to match.
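A simplified sketch of this kind of suffix check follows. The function
names are hypothetical, and only the leading-dot rule plus the "~",
".old" and ".new" suffixes mentioned above are covered; the real list in
path-util is longer:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if 's' ends with 'suffix'. */
static bool ends_with(const char *s, const char *suffix) {
        size_t ls = strlen(s), lx = strlen(suffix);

        return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/* Hypothetical, reduced version of a hidden-or-backup check. */
bool hidden_or_backup_sketch(const char *name) {
        static const char *const suffixes[] = { "~", ".old", ".new" };

        /* Classic hidden files start with a period. */
        if (name[0] == '.')
                return true;

        for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++)
                if (ends_with(name, suffixes[i]))
                        return true;

        return false;
}
```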
kayrus [Fri, 29 Apr 2016 13:59:51 +0000 (15:59 +0200)]
core: Filter by unit name behind the D-Bus, instead of on the client side (#3142)
This commit improves systemd performance on the systems which have
thousands of units.
Zbigniew Jędrzejewski-Szmek [Fri, 29 Apr 2016 13:16:45 +0000 (09:16 -0400)]
Merge pull request #3126 from poettering/small-fixes
fsync directory when creating or rotating journal files and other small fixes,
most importantly for the DHCP DUID code.
Lennart Poettering [Fri, 29 Apr 2016 12:25:52 +0000 (14:25 +0200)]
test-copy: never call alloca() in a loop
That's a total no-no, hence rework this to use malloc()-based memory instead of
alloca()-based memory.
Also see CODING_STYLE about this.
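The reason is that alloca() memory is only released when the calling
function returns, so each loop iteration would add to the stack usage.
A minimal sketch of the malloc()-based pattern, with a hypothetical
function that is not taken from test-copy:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical loop that needs a scratch buffer per iteration. The buffer
 * is heap-allocated and freed inside the loop body, so memory use stays
 * constant no matter how many iterations run. */
int fill_buffers_sketch(size_t iterations, size_t size, unsigned char *last_byte) {
        if (size == 0)
                return -EINVAL;

        for (size_t i = 0; i < iterations; i++) {
                unsigned char *buf = malloc(size);
                if (!buf)
                        return -ENOMEM;

                /* Stand-in for real work on the buffer. */
                memset(buf, (int) (i & 0xff), size);
                if (last_byte)
                        *last_byte = buf[size - 1];

                free(buf);
        }

        return 0;
}
```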
Lennart Poettering [Fri, 29 Apr 2016 12:21:22 +0000 (14:21 +0200)]
copy: also copy AF_UNIX sockets
We previously would fail with EOPNOTSUPP when encountering an AF_UNIX socket in
the directory tree to copy. Fix that, and copy them too (even if they are dead
in the result).
Fixes: #2914
Lennart Poettering [Fri, 29 Apr 2016 11:36:38 +0000 (13:36 +0200)]
man: document that RemainAfterExit= doesn't make much sense for repetitive timers
Fixes #3122
Lennart Poettering [Fri, 29 Apr 2016 11:26:12 +0000 (13:26 +0200)]
path-util: document that we shouldn't add further entries to hidden_or_backup_file()
And let's add ".bak" as a generic suffix for backups, that people can use
without having to register their stuff in our list.
Ming Lin [Fri, 29 Apr 2016 11:02:57 +0000 (04:02 -0700)]
rules: add NVMe rules (#3136)
Add NVMe rules using the "wwid" attribute.
root@target:~# cat /sys/block/nvme0n1/wwid
eui.3825004235000591
root@target:~# ls /dev/disk/by-id/ -l |grep nvme
lrwxrwxrwx 1 root root 13 Apr 27 16:08 nvme-eui.3825004235000591 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Apr 27 16:08 nvme-eui.3825004235000591-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Apr 27 16:08 nvme-eui.3825004235000591-part2 -> ../../nvme0n1p2
Lennart Poettering [Fri, 29 Apr 2016 10:50:29 +0000 (12:50 +0200)]
Merge pull request #3069 from Werkov/fix-dependencies-for-bind-mounts
Always create dependencies for bind mounts
Lennart Poettering [Fri, 29 Apr 2016 10:21:52 +0000 (12:21 +0200)]
journal-file: when rotating a journal file, fsync directory too
As suggested by:
https://github.com/systemd/systemd/pull/3126#discussion_r61125474
Lennart Poettering [Tue, 26 Apr 2016 14:19:28 +0000 (16:19 +0200)]
networkd: clean up DUID code a bit
Let's move DUID configuration into the [DHCP] section, since it only makes
sense in a DHCP context, and should be close to the configuration of
ClientIdentifier= and suchlike.
This really shouldn't be a section of its own, we don't have any for any of our
other per-protocol specific identifiers...
Follow-up for #2890 #2943
Lennart Poettering [Tue, 26 Apr 2016 13:47:55 +0000 (15:47 +0200)]
journal: when creating a new journal file, fsync() the directory it is created in too
Fixes: #2831
Lennart Poettering [Tue, 26 Apr 2016 13:47:03 +0000 (15:47 +0200)]
update TODO a bit
Lennart Poettering [Tue, 26 Apr 2016 13:08:06 +0000 (15:08 +0200)]
man: minor wording fixes
As suggested in:
https://github.com/systemd/systemd/pull/3124#discussion_r61068789
Lennart Poettering [Tue, 26 Apr 2016 13:06:28 +0000 (15:06 +0200)]
vimrc: fix indentation logic for our docbook xml files
Make sure TAB results in 2ch indenting as we commonly use for our docbook XML
files.
Lubomir Rintel [Fri, 29 Apr 2016 09:45:07 +0000 (11:45 +0200)]
parse-util: fix conversion from size_t on s390 (#3147)
On s390 size_t is an unsigned long, not an unsigned int. They both are
of the same size and can be cast to each other safely, but the compiler
still seems unhappy about incompatible pointers.
Fixes: 7c2da2ca8
Lennart Poettering [Fri, 29 Apr 2016 08:40:15 +0000 (10:40 +0200)]
Merge pull request #3137 from keszybz/dirent-simplification
Various small cleanups in shared code
Evgeny Vereshchagin [Fri, 29 Apr 2016 08:38:35 +0000 (11:38 +0300)]
nspawn: convert uuid to string (#3146)
Fixes:
cp /etc/machine-id /var/tmp/systemd-test.HccKPa/nspawn-root/etc
systemd-nspawn -D /var/tmp/systemd-test.HccKPa/nspawn-root --link-journal host -b
...
Host and machine ids are equal (P�S!V): refusing to link journals
Susant Sahani [Thu, 28 Apr 2016 23:03:29 +0000 (04:33 +0530)]
networkd: reconfigure IPv6 and static address after link up event (#3105)
Currently we do not set the static address, start the dhcp6 client, or
discover IPv6 routers after the link gains carrier.
This fixes #2912.
Zbigniew Jędrzejewski-Szmek [Thu, 28 Apr 2016 17:49:16 +0000 (13:49 -0400)]
basic/mount-util: recognize pvfs2 as network fs (#3140)
Added to kernel 4.6.
Evgeny Vereshchagin [Thu, 28 Apr 2016 17:48:17 +0000 (20:48 +0300)]
nspawn: initialize the veth_name (#3141)
Fixes:
$ systemd-nspawn -h
...
Failed to remove veth interface ����: Operation not permitted
This is a follow-up for d2773e59de3dd970d861
Naohiro Aota [Thu, 28 Apr 2016 15:41:50 +0000 (00:41 +0900)]
cgtop: initialize `ours' to NULL properly (#3139)
Running cgtop on a system that lacks an expected stat file results in a
segfault. For example, a system with a blkio tree but without the cfq io
scheduler lacks "blkio.io_service_bytes".
When the targeted cgroup's file does not exist, process() returns 0 and
does not modify the `*ret' value (which is `*ours'). As a result,
callers of refresh_one() can end up with a bogus pointer, which results in
a SEGV.
This patch simply initializes the variable to NULL.
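The pattern behind this bug can be sketched as follows. All names are
hypothetical stand-ins, not the cgtop functions themselves:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
        int dummy;
} GroupSketch;

static GroupSketch the_group;

/* Stand-in for a callee that returns 0 without touching *ret when the
 * stat file is missing, and fills in *ret otherwise. */
static int process_sketch(bool stat_file_exists, GroupSketch **ret) {
        if (!stat_file_exists)
                return 0;       /* *ret is left untouched on this path */

        *ret = &the_group;
        return 1;
}

/* Stand-in for the caller. Because 'ours' starts out as NULL, the value
 * handed back is well-defined even if the callee never writes to it. */
GroupSketch *refresh_one_sketch(bool stat_file_exists) {
        GroupSketch *ours = NULL;

        (void) process_sketch(stat_file_exists, &ours);
        return ours;
}
```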
Zbigniew Jędrzejewski-Szmek [Thu, 28 Apr 2016 12:24:53 +0000 (08:24 -0400)]
test: chmod +x sysv-generator-test
Just for convenience.
Zbigniew Jędrzejewski-Szmek [Thu, 28 Apr 2016 12:24:25 +0000 (08:24 -0400)]
test-path-util: add a trivial test for hidden_or_backup_file
Zbigniew Jędrzejewski-Szmek [Wed, 27 Apr 2016 13:24:59 +0000 (09:24 -0400)]
tree-wide: rename hidden_file to hidden_or_backup_file and optimize
In standard linux parlance, "hidden" usually means that the file name starts
with ".", and nothing else. Rename the function to convey what the function does
better to casual readers.
Stop exposing hidden_file_allow_backup which is rather ugly and rewrite
hidden_file to extract the suffix first. Note that hidden_file_allow_backup
excluded files with "~" at the end, which is quite confusing. Let's get
rid of it before it gets used in the wrong place.
muzena [Wed, 27 Apr 2016 12:32:21 +0000 (14:32 +0200)]
Add Croatian translation, hr.po and systemd.hr.catalog files
Zbigniew Jędrzejewski-Szmek [Wed, 27 Apr 2016 12:59:12 +0000 (08:59 -0400)]
basic/dirent-util: do not call hidden_file_allow_backup from dirent_is_file_with_suffix
If the file name is supposed to end in a suffix, there's no need to check the
name against a list of "special" file names, which is slow. Instead, just check
that the name doesn't start with a period.
Zbigniew Jędrzejewski-Szmek [Mon, 25 Apr 2016 01:20:26 +0000 (21:20 -0400)]
networkd: drop unnecessary stmt
Zbigniew Jędrzejewski-Szmek [Sun, 24 Apr 2016 15:31:19 +0000 (11:31 -0400)]
machinectl: simplify option string assignment
It's better to avoid having the option string duplicated, lest we forget
to modify them in sync in the future.
Martin Pitt [Wed, 27 Apr 2016 08:34:24 +0000 (10:34 +0200)]
Stop syslog.socket when entering emergency mode (#3130)
When enabling ForwardToSyslog=yes, the syslog.socket is active when entering
emergency mode. Any log message then triggers the start of rsyslog.service (or
other implementation) along with its dependencies such as local-fs.target and
sysinit.target. As these might fail themselves (e. g. faulty /etc/fstab), this
breaks the emergency mode.
This causes syslog.socket to fail with "Failed to queue service startup job:
Transition is destructive".
Add Conflicts=syslog.socket to emergency.service to make sure the socket is
stopped when emergency.service is started.
Fixes #266
Nalin Dahyabhai [Wed, 27 Apr 2016 08:32:05 +0000 (04:32 -0400)]
Correctly parse OBJECT_PID in journald messages (#3129)
The parse_pid() function doesn't succeed if we don't zero-terminate after the
last digit in the buffer.
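A minimal sketch of the idea, assuming the field value arrives as a
pointer plus length without a trailing NUL. The function name and the
use of strtol() are illustrative choices, not the journald code:

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: parse a PID from a length-delimited field value.
 * The bytes are copied into a local buffer and zero-terminated first,
 * because string parsers need a terminated string. */
int parse_pid_field_sketch(const char *data, size_t len, pid_t *ret) {
        char buf[32];
        char *end;
        long v;

        if (len == 0 || len >= sizeof(buf))
                return -EINVAL;

        memcpy(buf, data, len);
        buf[len] = '\0';        /* terminate right after the last digit */

        errno = 0;
        v = strtol(buf, &end, 10);
        if (errno != 0 || *end != '\0' || v <= 0 || v > INT_MAX)
                return -EINVAL;

        *ret = (pid_t) v;
        return 0;
}
```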
Martin Pitt [Wed, 27 Apr 2016 07:58:42 +0000 (09:58 +0200)]
path-util: Add hidden suffixes for ucf (#3131)
ucf is a standard Debian helper for managing configuration file upgrades which
need more interaction or elaborate merging than conffiles managed by dpkg.
Ignore its temporary and backup files similarly to the *.dpkg-* ones to avoid
creating units for them in generators.
https://bugs.debian.org/775903
Vito Caputo [Wed, 27 Apr 2016 06:29:43 +0000 (23:29 -0700)]
journal: set STATE_ARCHIVED as part of offlining (#2740)
The only code path which makes a journal durable is via
journal_file_set_offline().
When we perform a rotate the journal's header->state is being set to
STATE_ARCHIVED prior to journal_file_set_offline() being called.
In journal_file_set_offline(), we short-circuit the entire offline when
f->header->state != STATE_ONLINE.
This all results in none of the journal_file_set_offline() fsync() calls
being reached when rotate archives a journal, so archived journals are
never explicitly made durable.
What we do now is instead of setting the f->header->state to
STATE_ARCHIVED directly in journal_file_rotate() prior to
journal_file_close(), we set an archive flag in f->archive for the
journal_file_set_offline() machinery to honor by committing
STATE_ARCHIVED instead of STATE_OFFLINE when set.
Prior to this, rotated journals were never getting fsync() explicitly
performed on them, since journal_file_set_offline() short-circuited.
Obviously this is undesirable, and depends entirely on the underlying
filesystem as to how much durability was achieved when simply closing
the file.
Note that this problem existed prior to the recent asynchronous fsync
changes, but those changes do facilitate our performing this durable
offline on rotate without blocking, regardless of the underlying
filesystem sync-on-close semantics.
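The control flow described above can be modelled with a small sketch.
The struct, the function names and the sync counter (which stands in for
real fsync() calls) are assumptions; only the state names come from the
commit message:

```c
#include <stdbool.h>

enum journal_state_sketch { STATE_OFFLINE, STATE_ONLINE, STATE_ARCHIVED };

/* Hypothetical, heavily reduced journal file object. */
struct journal_file_sketch {
        enum journal_state_sketch state;
        bool archive;           /* commit STATE_ARCHIVED instead of STATE_OFFLINE */
        unsigned syncs;         /* counts the points where fsync() would run */
};

/* Only an online file goes through the sync path; the archive flag just
 * selects the final state. */
void set_offline_sketch(struct journal_file_sketch *f) {
        if (f->state != STATE_ONLINE)
                return;         /* short-circuit: nothing is synced */

        f->syncs++;             /* sync data before changing the state */
        f->state = f->archive ? STATE_ARCHIVED : STATE_OFFLINE;
        f->syncs++;             /* sync the updated header */
}

/* Rotation sets the flag rather than the state, so the file is still
 * online when the offlining code looks at it. */
void rotate_sketch(struct journal_file_sketch *f) {
        f->archive = true;
        set_offline_sketch(f);
}
```

Had rotate_sketch() set the state to STATE_ARCHIVED directly, the
short-circuit in set_offline_sketch() would have skipped both sync
points, which is the behaviour this commit removes.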
tblume [Tue, 26 Apr 2016 15:10:36 +0000 (17:10 +0200)]
core: set start job timeout from the kernel commandline (#3112)
Add the boot parameter: systemd.default_timeout_start_sec to allow modification
of the default start job timeout at boot time.
Zbigniew Jędrzejewski-Szmek [Tue, 26 Apr 2016 13:52:55 +0000 (09:52 -0400)]
Merge pull request #3124 from poettering/small-journal-fixes
Zbigniew Jędrzejewski-Szmek [Tue, 26 Apr 2016 13:52:30 +0000 (09:52 -0400)]
Revert "smaller journal fixes (#3124)"
This reverts commit 6e3930c40f3379b7123e505a71ba4cd6db6c372f.
Merge got squashed by mistake.
Lennart Poettering [Tue, 26 Apr 2016 12:57:04 +0000 (14:57 +0200)]
Merge pull request #3093 from poettering/nspawn-userns-magic
nspawn automatic user namespaces
Lennart Poettering [Tue, 26 Apr 2016 12:38:45 +0000 (14:38 +0200)]
smaller journal fixes (#3124)
* sd-journal: detect earlier if we try to read an object from an invalid offset
Specifically, detect early if we try to read from offset 0, i.e. are using
uninitialized offset data.
* journal: when dumping journal contents, react nicer to lines we can't read
If journal files are not cleanly closed it might happen that intermediary
journal entries cannot be read. Handle this nicely, skip over the unreadable
entries, and log a debug message about it; after all we generally follow the
logic that we try to make the best of corrupted files.
* journal-file: always generate the same error when encountering corrupted files
Let's make sure EBADMSG is the one error we throw when we encounter corrupted
data, so that we can neatly test for it.
* journal-file: when iterating through a partly corrupted journal file, treat error like EOF
When we linearly iterate through a corrupted journal file, and we encounter a
read error, don't consider this fatal, but merely as EOF condition (and log
about it).
* journal-file: make seeking in corrupted files work
Previously, when we used a bisection table for seeking through a corrupted
file, and the end of the bisection table was corrupted we'd most likely fail
the entire seek operation. Improve the situation: if we encounter invalid
entries in a bisection table, linearly go backwards until we find a working
entry again.
* man: elaborate on the automatic systemd-journald.socket service dependencies
Fixes: #1603
Martin Pitt [Tue, 26 Apr 2016 10:16:43 +0000 (12:16 +0200)]
tests: document requirements of networkd integration tests (#3125)
Document the necessary dependencies and nspawn/lxd options to run
test/networkd-test.py.
Lennart Poettering [Tue, 26 Apr 2016 09:57:54 +0000 (11:57 +0200)]
man: elaborate on the automatic systemd-journald.socket service dependencies
Fixes: #1603
Lennart Poettering [Tue, 26 Apr 2016 09:39:48 +0000 (11:39 +0200)]
journal-file: make seeking in corrupted files work
Previously, when we used a bisection table for seeking through a corrupted
file, and the end of the bisection table was corrupted we'd most likely fail
the entire seek operation. Improve the situation: if we encounter invalid
entries in a bisection table, linearly go backwards until we find a working
entry again.
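A minimal sketch of the fallback, assuming the table is a plain array of
file offsets. The plausibility test (non-zero, 8-byte aligned, inside
the file) and both function names are illustrative assumptions, not the
journal-file code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed validity test for an entry offset. */
static bool offset_plausible(uint64_t offset, uint64_t file_size) {
        return offset != 0 && (offset & 7) == 0 && offset < file_size;
}

/* Starting at index 'start', walk backwards through the table until an
 * entry passes the check. Returns its index, or -1 if none does. */
long last_usable_entry(const uint64_t *table, size_t n, size_t start, uint64_t file_size) {
        if (n == 0)
                return -1;
        if (start >= n)
                start = n - 1;

        for (size_t i = start + 1; i-- > 0; )
                if (offset_plausible(table[i], file_size))
                        return (long) i;

        return -1;
}
```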
Lennart Poettering [Tue, 26 Apr 2016 09:38:39 +0000 (11:38 +0200)]
journal-file: when iterating through a partly corrupted journal file, treat error like EOF
When we linearly iterate through a corrupted journal file, and we encounter a
read error, don't consider this fatal, but merely as EOF condition (and log
about it).
Lennart Poettering [Tue, 26 Apr 2016 09:37:22 +0000 (11:37 +0200)]
journal-file: always generate the same error when encountering corrupted files
Let's make sure EBADMSG is the one error we throw when we encounter corrupted
data, so that we can neatly test for it.
Lennart Poettering [Mon, 25 Apr 2016 19:43:12 +0000 (21:43 +0200)]
journal: when dumping journal contents, react nicer to lines we can't read
If journal files are not cleanly closed it might happen that intermediary
journal entries cannot be read. Handle this nicely, skip over the unreadable
entries, and log a debug message about it; after all we generally follow the
logic that we try to make the best of corrupted files.
Lennart Poettering [Mon, 25 Apr 2016 19:42:15 +0000 (21:42 +0200)]
sd-journal: detect earlier if we try to read an object from an invalid offset
Specifically, detect early if we try to read from offset 0, i.e. are using
uninitialized offset data.
Zbigniew Jędrzejewski-Szmek [Tue, 26 Apr 2016 09:19:10 +0000 (05:19 -0400)]
systemd --user: call pam_loginuid when creating user@.service (#3120)
This way the user service will have a loginuid, and it will be inherited by
child services. This shouldn't change anything as far as systemd itself is
concerned, but is nice for various services spawned by systemd --user
that expect a loginuid.
pam_loginuid(8) says that it should be enabled for "..., crond and atd".
user@.service should behave similarly to those two as far as audit is
concerned.
https://bugzilla.redhat.com/show_bug.cgi?id=1328947#c28
Zbigniew Jędrzejewski-Szmek [Mon, 25 Apr 2016 19:57:36 +0000 (15:57 -0400)]
Merge pull request #3109 from poettering/journal-by-fd
rework "journalctl -M"
Zbigniew Jędrzejewski-Szmek [Mon, 25 Apr 2016 19:56:17 +0000 (15:56 -0400)]
Merge pull request #3114 from poettering/journalctl-b
Fix endless loops in journalctl --list-boots (closes #617).
EMOziko [Mon, 25 Apr 2016 19:42:35 +0000 (23:42 +0400)]
HP Folio 1040g2 micmute and toggle touchpad fn keys fix (#3118)
Added HP Folio 1040g2 Fn+F8 MICMUTE fix
Lennart Poettering [Mon, 25 Apr 2016 19:38:56 +0000 (21:38 +0200)]
machined: add CAP_MKNOD to capabilities to run with (#3116)
Container images from Debian or suchlike contain device nodes in /dev. Let's
make sure we can clone them properly, hence pass CAP_MKNOD to machined.
Fixes: #2867 #465
Lennart Poettering [Mon, 25 Apr 2016 19:37:51 +0000 (21:37 +0200)]
machined: generate a nicer error when the user tries "machinectl clone" on non-btrfs file systems (#3117)
Fixes: #2060
(Of course, in the long run, we should probably add a copy-based fall-back. But
given how slow that is, this probably requires some asynchronous forking logic
like the CopyFrom() and CopyTo() method calls already implement.)
Lennart Poettering [Mon, 25 Apr 2016 19:36:25 +0000 (21:36 +0200)]
core: fix description of "resources" service error (#3119)
The "resources" error is really just the generic error we return when
we hit some kind of error and we have no more appropriate error for the case to
return, for example because of some OS error.
Hence, reword the explanation and don't claim any relation to resource limits.
Admittedly, the "resources" service error is a bit of a misnomer, but I figure
it's kind of API now.
Fixes: #2716
Lennart Poettering [Mon, 25 Apr 2016 18:02:03 +0000 (20:02 +0200)]
Merge pull request #3113 from ssahani/route-fix
networkd: fix address and route conf
Vito Caputo [Mon, 25 Apr 2016 17:58:16 +0000 (10:58 -0700)]
journal: fix already offline check and thread leak (#2810)
Early in journal_file_set_offline() f->header->state is tested to see if
it's != STATE_ONLINE, and since there's no need to do anything if the
journal isn't online, the function simply returned here.
Since moving part of the offlining process to a separate thread, there
are two problems here:
1. We can't simply check f->header->state, because if there is an
offline thread active it may modify f->header->state.
2. Even if the journal is deemed offline, the thread responsible may
still need joining, so a bare return may leak the thread's resources
like its stack.
To address #1, the helper journal_file_is_offlining() is called prior to
accessing f->header->state.
If journal_file_is_offlining() returns true, f->header->state isn't even
checked, because an offlining journal is obviously online, and we'll
just continue with the normal set offline code path.
If journal_file_is_offlining() returns false, then it's safe to check
f->header->state, because the offline_state is beyond the point of
modifying f->header->state, and there's a memory barrier in the helper.
If we find f->header->state is != STATE_ONLINE, then we call the
idempotent journal_file_set_offline_thread_join() on the way out of the
function, to join a potential lingering offline thread.
Lennart Poettering [Mon, 25 Apr 2016 09:57:56 +0000 (11:57 +0200)]
journalctl: turn --unit= in combination with --user into --user-unit=
Let's be nice to users, and let's turn the nonsensical "--unit=… --user" into
"--user-unit=…" which the user more likely meant.
Fixes #1621
Lennart Poettering [Mon, 25 Apr 2016 09:39:38 +0000 (11:39 +0200)]
man: document the new by-fd journal calls
Also, remove documentation for sd_journal_open_container() as we consider it
deprecated now.
Lennart Poettering [Mon, 25 Apr 2016 09:36:37 +0000 (11:36 +0200)]
man: don't include history sections in man pages
I am pretty sure we shouldn't carry history sections in man pages, since it's
very hard to keep them correctly updated, the current ones are very
out-of-date, and they tend to make APIs appear unnecessarily complex.
Lennart Poettering [Mon, 25 Apr 2016 09:31:47 +0000 (11:31 +0200)]
sd-journal: "soft" deprecate sd_journal_open_container()
Let's document the call as deprecated, since it doesn't cover containers with
directories that aren't visible to the host properly.
Lennart Poettering [Mon, 25 Apr 2016 09:21:46 +0000 (11:21 +0200)]
journalctl: port --machine= switch to use machined's OpenMachineRootDirectory()
This way, the switch becomes compatible with nspawn containers using --image=,
and those which only store journal data in /run (i.e. have persistent logs
off).
Fixes: #49
Lennart Poettering [Mon, 25 Apr 2016 16:08:42 +0000 (18:08 +0200)]
journalctl: don't trust the per-field entry tables when looking for boot IDs
When appending to a journal file, journald will:
a) first, append the actual entry to the end of the journal file
b) second, add an offset reference to it to the global entry array stored at
the beginning of the file
c) third, add offset references to it to the per-field entry array stored at
various places of the file
The global entry array, maintained by b) is used when iterating through the
journal without matches applied.
The per-field entry array maintained by c) is used when iterating through the
journal with a match for that specific field applied.
In the wild, there are journal files where a) and b) were completed, but c)
was not before the files were abandoned. This means, that in some cases log
entries are at the end of these files that appear in the global entry array,
but not in the per-field entry array of the _BOOT_ID= field. Now, the
"journalctl --list-boots" command alternately uses the global entry array
and the per-field entry array of the _BOOT_ID= field. It seeks to the last
entry of a specific _BOOT_ID= field by having the right match installed, and
then jumps to the next following entry with no match installed anymore, under
the assumption this would bring it to the next boot ID. However, if the
per-field entry wasn't written fully, it might actually turn out that the
global entry array might know one more entry with the same _BOOT_ID, thus
resulting in an indefinite loop around the same _BOOT_ID.
This patch fixes that, by updating the boot search logic to always continue
reading entries until the boot ID actually changed from the previous. Thus, the
per-field entry array is used as quick jump index (i.e. as an optimization),
but not trusted otherwise. Only the global entry array is trusted.
This replaces PR #1904, which is actually very similar to this one. However,
this one actually reads the boot ID directly from the entry header, and doesn't
try to read it at all until the read pointer is actually really located on the
first item to read.
Fixes: #617
Replaces: #1904
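The fixed search logic can be modelled with a small sketch. Boot IDs are
reduced to integers, the global entry array to a plain C array, and the
per-field index to a single possibly stale "hint"; the function name is
hypothetical:

```c
#include <stddef.h>

/* 'boot_ids' models the global entry array, one boot ID per entry.
 * 'hint' is the index the per-field array claims to be the last entry
 * of the current boot; it may be too small if that array is incomplete.
 * Returns the index of the first entry with a different boot ID, or 'n'
 * if there is none. */
size_t next_boot_start_sketch(const int *boot_ids, size_t n, size_t hint) {
        if (hint >= n)
                return n;

        int current = boot_ids[hint];
        size_t i = hint + 1;

        /* Use the hint only as a jump target: keep reading until the
         * boot ID really changes. */
        while (i < n && boot_ids[i] == current)
                i++;

        return i;
}
```

With entries {1, 1, 1, 2, 2} and a stale hint of 1, a search that trusted
the hint would land on index 2, see boot 1 again and repeat. The sketch
skips ahead to index 3 instead.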
Lennart Poettering [Mon, 25 Apr 2016 16:06:47 +0000 (18:06 +0200)]
journalctl: improve output of --header a bit
Show the various timestamps in hexadecimal too. This is useful for matching the
timestamps included in cursor strings (which are encoded in hex, too), with the
references in the journal header.
Lennart Poettering [Mon, 25 Apr 2016 15:36:51 +0000 (17:36 +0200)]
nspawn: explicitly remove veth links after use (#3111)
* sd-netlink: permit RTM_DELLINK messages with no ifindex
This is useful for removing network interfaces by name.
* nspawn: explicitly remove veth links we created after use
Sometimes the kernel keeps veth links pinned after the namespace they have been
joined to died. Let's hence explicitly remove veth links after use.
Fixes: #2173
Lennart Poettering [Mon, 25 Apr 2016 14:37:09 +0000 (16:37 +0200)]
journalctl: simplify discover_next_boot() a bit
Drop the "read_realtime" parameter. Getting the realtime timestamp from an
entry is cheap, as it is a normal header field, hence let's just get this
unconditionally, and simplify our code a bit.
Lennart Poettering [Mon, 25 Apr 2016 14:24:05 +0000 (16:24 +0200)]
journalctl: simplify get_boots() a bit, by getting rid of one BootId object
Let's store the reference as simple sd_id128_t, since we don't actually need a
BootId for it.
Lennart Poettering [Mon, 25 Apr 2016 14:23:29 +0000 (16:23 +0200)]
journalctl: add some explanatory comments to get_boots()
Lennart Poettering [Mon, 25 Apr 2016 09:16:08 +0000 (11:16 +0200)]
sd-journal: add logic to open journal files of a specific OS tree
With this change a new flag SD_JOURNAL_OS_ROOT is introduced. If specified
while opening the journal with the per-directory calls (specifically:
sd_journal_open_directory() and sd_journal_open_directory_fd()) the passed
directory is assumed to be the root directory of an OS tree, and the journal
files are searched for in /var/log/journal, /run/log/journal relative to it.
This is useful to allow usage of sd-journal on file descriptors returned by the
OpenRootDirectory() call of machined.
Lennart Poettering [Mon, 25 Apr 2016 09:13:16 +0000 (11:13 +0200)]
machined: add new OpenRootDirectory() call to Machine objects
This new call returns a file descriptor for the root directory of a container.
This file descriptor may then be used to access the rest of the container's
file system, via openat() and similar calls. Since the file descriptor returned
is for the file system namespace inside of the container it may be used to
access all files of the container exactly the way the container itself would
see them. This is particularly useful for containers run directly from
loopback media, for example via systemd-nspawn's --image= switch. It also
provides access to directories such as /run of a container that are normally
not accessible to the outside of a container.
This replaces PR #2870.
Fixes: #2870
Lennart Poettering [Sun, 24 Apr 2016 22:31:24 +0000 (00:31 +0200)]
sd-journal: add API for opening journal files or directories by fd
Also, expose this via the "journalctl --file=-" syntax for STDIN. This feature
remains undocumented though, as it is probably not too useful in real life: it
still requires fds that support mmap() and seeking, i.e. it does not work for
pipes, which is what reading from STDIN is most commonly used with.
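The restriction can be illustrated with a small check that tells a seekable regular file apart from a pipe. This is only a sketch: the function name is made up for the example, and it is an assumption that a regular, seekable file is a good enough proxy for "supports mmap and seeking"; it is not the validation journalctl performs.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns true if fd refers to a regular file whose file offset can
 * be queried. Pipes, sockets and TTYs fail one of the two checks
 * (lseek() on a pipe fails with ESPIPE). */
bool fd_is_seekable_regular(int fd) {
        struct stat st;

        if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode))
                return false;

        return lseek(fd, 0, SEEK_CUR) != (off_t) -1;
}
```

Under this check "journalctl --file=- < some.journal" would pass, while "cat some.journal | journalctl --file=-" would not.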
Michal Koutný [Mon, 25 Apr 2016 11:25:00 +0000 (13:25 +0200)]
Always create dependencies for loop device mounts
In case a file is on a networked filesystem, we may tag the fstab record with
the _netdev option; correct dependencies will still be created for this mount.
Michal Koutný [Tue, 19 Apr 2016 16:44:40 +0000 (18:44 +0200)]
Always create dependencies for bind mounts
Dependencies were not created for _netdev mountpoints; the reasoning for this
is in commit fc676b00, i.e. to avoid adding dependencies for network
mountpoints where What= looks like a path. This commit therefore proposes a
semantically more correct condition: dependencies are added for _actual_ bind
mounts irrespective of the network flag.
Consequently, the _netdev option can now be added to bind mounts, which
includes them in remote-fs.target and simplifies configuration.
Lennart Poettering [Mon, 25 Apr 2016 10:48:05 +0000 (12:48 +0200)]
nspawn: when readjusting UID/GID ownership of OS trees, skip read-only subtrees
This should allow tools like rkt to pre-mount read-only subtrees in the OS
tree, without breaking the patching code.
Note that the code will still fail if the top-level directory is already
read-only.
Lennart Poettering [Fri, 22 Apr 2016 16:10:16 +0000 (18:10 +0200)]
nspawn: don't try to patch UIDs/GIDs of procfs and suchlike
Lennart Poettering [Fri, 22 Apr 2016 12:12:27 +0000 (14:12 +0200)]
units: turn on user namespace by default in systemd-nspawn@.service
Now that user namespacing is supported in a pretty automatic way, actually turn
it on by default if the systemd-nspawn@.service template is used.
Lennart Poettering [Fri, 22 Apr 2016 12:10:09 +0000 (14:10 +0200)]
nspawn: make -U a tiny bit smarter
With this change -U will turn on user namespacing only if the kernel actually
supports it and otherwise gracefully degrade to non-userns mode.
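The graceful degradation could look roughly like the sketch below. Probing for /proc/self/uid_map is an assumption about how kernel support might be detected, and both function names are chosen for this example; the commit message does not say how the check is done.

```c
#include <stdbool.h>
#include <unistd.h>

/* Assumed probe: kernels with user namespace support expose a
 * uid_map file for every process. */
bool userns_supported(void) {
        return access("/proc/self/uid_map", F_OK) == 0;
}

/* -U asks for user namespacing, but only gets it if the kernel can
 * deliver; otherwise the container runs without userns. */
bool resolve_userns_request(bool requested_by_U) {
        return requested_by_U && userns_supported();
}
```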
Lennart Poettering [Fri, 22 Apr 2016 11:46:23 +0000 (13:46 +0200)]
man: document the new user namespacing options
Lennart Poettering [Fri, 22 Apr 2016 11:02:53 +0000 (13:02 +0200)]
nspawn: allow configuration of user namespaces in .nspawn files
In order to implement this we change the bool arg_userns into an enum
UserNamespaceMode, which can take one of NO, PICK or FIXED, and replace the
arg_uid_range_pick bool with it.
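A sketch of the data model change, showing how the two former booleans collapse into one enum. The enumerator spellings and the conversion helper are assumptions for illustration; the commit message only names the values NO, PICK and FIXED.

```c
#include <stdbool.h>

/* Enumerator names are hypothetical spellings of NO, PICK, FIXED. */
typedef enum UserNamespaceMode {
        USER_NAMESPACE_NO,      /* no user namespacing */
        USER_NAMESPACE_PICK,    /* UID/GID range chosen automatically */
        USER_NAMESPACE_FIXED,   /* UID/GID range given by the user */
} UserNamespaceMode;

/* Maps the old pair (arg_userns, arg_uid_range_pick) to the enum.
 * The combination userns=false, pick=true had no meaning before and
 * can no longer be expressed at all. */
UserNamespaceMode user_namespace_mode_from_bools(bool userns, bool uid_range_pick) {
        if (!userns)
                return USER_NAMESPACE_NO;

        return uid_range_pick ? USER_NAMESPACE_PICK : USER_NAMESPACE_FIXED;
}
```

Ruling out the meaningless fourth state is the main benefit of an enum over two independent flags here.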
Lennart Poettering [Fri, 22 Apr 2016 09:47:35 +0000 (11:47 +0200)]
nspawn: add -U as shortcut for --private-users=pick
Given that user namespacing is pretty useful now, let's add a shortcut command
line switch for the logic.
Lennart Poettering [Fri, 22 Apr 2016 09:28:09 +0000 (11:28 +0200)]
nspawn: optionally, automatically allocate a UID/GID range for userns containers
This adds the new value "pick" to --private-users=. When specified, a new
UID/GID range of 65536 users is automatically and randomly allocated from the
host range 0x00080000-0xDFFF0000 and used for the container. The setting
implies --private-users-chown, so that the container directory is recursively
chown()ed to the newly allocated UID/GID range, if that's necessary. As an
optimization, before a randomized UID/GID is picked, the UID of the
container's root directory is tried as the starting point and used if it is
not currently in use otherwise.
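The range arithmetic can be sketched as follows. The macro and function names are invented for this example, and treating both ends of 0x00080000-0xDFFF0000 as inclusive range starts is an assumption; the random source and the "is it in use" checks are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define UID_PICK_MIN    UINT32_C(0x00080000)
#define UID_PICK_MAX    UINT32_C(0xDFFF0000)
#define UID_RANGE_SIZE  UINT32_C(0x10000)       /* 65536 IDs per container */

/* Rounds any 32-bit value down to the start of its 64K block. The
 * value may be a random number or the owner UID of the container's
 * root directory (the "starting point" optimization). */
uint32_t uid_base_from_value(uint32_t value) {
        return value & ~(UID_RANGE_SIZE - 1);
}

/* A usable base is 64K-aligned and lies inside the pick range. */
bool uid_base_is_valid(uint32_t base) {
        return (base & (UID_RANGE_SIZE - 1)) == 0 &&
               base >= UID_PICK_MIN &&
               base <= UID_PICK_MAX;
}
```

A directory owned by UID 1000 yields base 0, which is outside the range, so in that case a random candidate would be drawn instead.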
To protect against using the same UID/GID range multiple times a few mechanisms
are in place:
- The first and the last UID and GID of the range are checked with getpwuid()
and getgrgid(). If an entry already exists a different range is picked. Note
that the "last" UID checked is 65534, since 65535 is the 16-bit (uid_t) -1.
- A lock file for the range is taken in /run/systemd/nspawn-uid/. Since the
ranges are taken in a non-overlapping fashion, and always start on 64K
boundaries this allows us to maintain a single lock file for each range that
can be randomly picked. This protects nspawn from picking the same range in
two parallel instances.
- If possible the /etc/passwd lock file is taken while a new range is selected
until the container is up. This means adduser/addgroup should safely avoid
the range as long as nss-mymachines is used, since the allocated range will
then show up in the user database.
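The first two mechanisms above might be sketched like this. Function names are hypothetical, and naming the lock file after the decimal base UID is an assumption; the commit message only gives the directory. Taking the lock itself and the /etc/passwd lock are not shown.

```c
#include <grp.h>
#include <inttypes.h>
#include <pwd.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* True if neither the first nor the "last" ID (base + 65534) of the
 * range has a passwd or group entry. A NULL result from a failed
 * lookup is not distinguished from "no entry" in this sketch. */
bool uid_range_looks_free(uint32_t base) {
        uint32_t last = base + UINT32_C(65534);

        return getpwuid((uid_t) base) == NULL &&
               getgrgid((gid_t) base) == NULL &&
               getpwuid((uid_t) last) == NULL &&
               getgrgid((gid_t) last) == NULL;
}

/* One lock file per 64K range; the file name format is assumed. */
int uid_range_lock_path(char *buf, size_t size, uint32_t base) {
        int n = snprintf(buf, size, "/run/systemd/nspawn-uid/%" PRIu32, base);

        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Because ranges start on 64K boundaries and never overlap, the base UID alone identifies a range, which is why a single lock file per range suffices.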
The UID/GID range nspawn picks from is compiled in and not configurable at the
moment. That should probably stay that way, since we already provide ways for
users to pick their own ranges manually if they don't like the automatic
logic.
The new --private-users=pick logic makes user namespacing pretty useful now, as
it relieves the user from managing UID/GID ranges.
Lennart Poettering [Wed, 20 Apr 2016 20:53:39 +0000 (22:53 +0200)]
nspawn: optionally fix up OS tree uid/gids for userns
This adds a new --private-users-chown switch that may be used in combination
with --private-users. If it is passed, a recursive chown() operation is run on
the OS tree, fixing all file owner UID/GIDs to the right ranges. This should
make user namespacing pretty workable, as the OS trees don't need to be
prepared manually anymore.
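A sketch of the per-file ownership fix-up. The mapping rule used here (keep the low 16 bits of the ID, replace the upper bits with the container's base) is an assumption about what "the right ranges" means, and the function names are invented for the example. Recursion over the tree is omitted.

```c
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Moves an ID into the 64K range starting at `shift`, preserving its
 * position inside the range. */
uint32_t shift_id(uint32_t shift, uint32_t id) {
        return shift | (id & UINT32_C(0xFFFF));
}

/* Re-owns a single path without following symlinks. Returns 1 if the
 * owner was changed, 0 if it was already right, -1 on error. A full
 * tool would also have to restore mode bits that chown can clear
 * (set-user-ID, set-group-ID). */
int patch_owner(const char *path, uint32_t shift) {
        struct stat st;
        uid_t uid;
        gid_t gid;

        if (lstat(path, &st) < 0)
                return -1;

        uid = (uid_t) shift_id(shift, (uint32_t) st.st_uid);
        gid = (gid_t) shift_id(shift, (uint32_t) st.st_gid);

        if (uid == st.st_uid && gid == st.st_gid)
                return 0;

        return lchown(path, uid, gid) < 0 ? -1 : 1;
}
```

With this rule a tree that was previously shifted to another range can be re-shifted without first resetting it to 0-65535.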
Lennart Poettering [Thu, 21 Apr 2016 10:47:36 +0000 (12:47 +0200)]
util: copy_file_range() returns EBADF when used on a tty
In nspawn we invoke copy_bytes() on a TTY fd. copy_file_range() returns EBADF
on a TTY and this error is considered fatal by copy_bytes() so far. Correct
that, so that nspawn's copy_bytes() operation works again.
This is a follow-up for a44202e98b638024c45e50ad404c7069c7835c04.
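The fix amounts to treating EBADF like the other "cannot use copy_file_range() here" errors and falling back to a plain read()/write() copy. The sketch below is not systemd's copy_bytes(); the function names and the exact set of fallback errors are assumptions, and the system call is made through syscall() so the example does not depend on a libc wrapper.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Errors after which a read()/write() fallback is attempted. EBADF is
 * included because copy_file_range() reports it for TTYs; for a truly
 * invalid fd the fallback read() fails with EBADF as well, so the
 * error is not lost. */
static int should_fall_back(int err) {
        return err == ENOSYS || err == EXDEV || err == EINVAL || err == EBADF;
}

/* Copies up to max bytes from fd_in to fd_out. Returns the number of
 * bytes copied, 0 on end of file, or -1 with errno set. */
ssize_t copy_some(int fd_in, int fd_out, size_t max) {
        char buf[4096];
        ssize_t n, off;

#ifdef SYS_copy_file_range
        n = (ssize_t) syscall(SYS_copy_file_range, fd_in, NULL, fd_out, NULL, max, 0U);
#else
        n = -1;
        errno = ENOSYS;
#endif
        if (n >= 0 || !should_fall_back(errno))
                return n;

        if (max > sizeof(buf))
                max = sizeof(buf);

        n = read(fd_in, buf, max);
        if (n <= 0)
                return n;

        /* Write out everything that was read, retrying on EINTR. */
        for (off = 0; off < n; ) {
                ssize_t w = write(fd_out, buf + off, (size_t) (n - off));

                if (w < 0) {
                        if (errno == EINTR)
                                continue;
                        return -1;
                }
                off += w;
        }

        return n;
}
```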