Josef Bacik [Thu, 6 Nov 2014 15:19:54 +0000 (10:19 -0500)]
Btrfs: make sure we wait on logged extents when fsycning two subvols
commit
9dba8cf128ef98257ca719722280c9634e7e9dc7 upstream.
If we have two fsync()'s race on different subvols one will do all of its work
to get into the log_tree, wait on it's outstanding IO, and then allow the
log_tree to finish it's commit. The problem is we were just free'ing that
subvols logged extents instead of waiting on them, so whoever lost the race
wouldn't really have their data on disk. Fix this by waiting properly instead
of freeing the logged extents. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Halcrow [Wed, 26 Nov 2014 17:09:16 +0000 (09:09 -0800)]
eCryptfs: Remove buggy and unnecessary write in file name decode routine
commit
942080643bce061c3dd9d5718d3b745dcb39a8bc upstream.
Dmitry Chernenkov used KASAN to discover that eCryptfs writes past the
end of the allocated buffer during encrypted filename decoding. This
fix corrects the issue by getting rid of the unnecessary 0 write when
the current bit offset is 2.
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Reported-by: Dmitry Chernenkov <dmitryc@google.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tyler Hicks [Tue, 7 Oct 2014 20:51:55 +0000 (15:51 -0500)]
eCryptfs: Force RO mount when encrypted view is enabled
commit
332b122d39c9cbff8b799007a825d94b2e7c12f2 upstream.
The ecryptfs_encrypted_view mount option greatly changes the
functionality of an eCryptfs mount. Instead of encrypting and decrypting
lower files, it provides a unified view of the encrypted files in the
lower filesystem. The presence of the ecryptfs_encrypted_view mount
option is intended to force a read-only mount and modifying files is not
supported when the feature is in use. See the following commit for more
information:
e77a56d [PATCH] eCryptfs: Encrypted passthrough
This patch forces the mount to be read-only when the
ecryptfs_encrypted_view mount option is specified by setting the
MS_RDONLY flag on the superblock. Additionally, this patch removes some
broken logic in ecryptfs_open() that attempted to prevent modifications
of files when the encrypted view feature was in use. The check in
ecryptfs_open() was not sufficient to prevent file modifications using
system calls that do not operate on a file descriptor.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Priya Bansal <p.bansal@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Fri, 19 Dec 2014 13:27:55 +0000 (14:27 +0100)]
udf: Check component length before reading it
commit
e237ec37ec154564f8690c5bd1795339955eeef9 upstream.
Check that length specified in a component of a symlink fits in the
input buffer we are reading. Also properly ignore component length for
component types that do not use it. Otherwise we read memory after end
of buffer for corrupted udf image.
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Fri, 19 Dec 2014 11:21:47 +0000 (12:21 +0100)]
udf: Verify symlink size before loading it
commit
a1d47b262952a45aae62bd49cfaf33dd76c11a2c upstream.
UDF specification allows arbitrarily large symlinks. However we support
only symlinks at most one block large. Check the length of the symlink
so that we don't access memory beyond end of the symlink block.
Reported-by: Carl Henrik Lunde <chlunde@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Fri, 19 Dec 2014 11:03:53 +0000 (12:03 +0100)]
udf: Verify i_size when loading inode
commit
e159332b9af4b04d882dbcfe1bb0117f0a6d4b58 upstream.
Verify that inode size is sane when loading inode with data stored in
ICB. Otherwise we may get confused later when working with the inode and
inode size is too big.
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Thu, 18 Dec 2014 21:37:50 +0000 (22:37 +0100)]
udf: Check path length when reading symlink
commit
0e5cc9a40ada6046e6bc3bdfcd0c0d7e4b706b14 upstream.
Symlink reading code does not check whether the resulting path fits into
the page provided by the generic code. This isn't as easy as just
checking the symlink size because of various encoding conversions we
perform on path. So we have to check whether there is still enough space
in the buffer on the fly.
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oleg Nesterov [Wed, 10 Dec 2014 23:55:25 +0000 (15:55 -0800)]
exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
commit
24c037ebf5723d4d9ab0996433cee4f96c292a4d upstream.
alloc_pid() does get_pid_ns() beforehand but forgets to put_pid_ns() if it
fails because disable_pid_allocation() was called by the exiting
child_reaper.
We could simply move get_pid_ns() down to successful return, but this fix
tries to be as trivial as possible.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Sterling Alexander <stalexan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joonsoo Kim [Wed, 10 Dec 2014 23:41:12 +0000 (15:41 -0800)]
mm/CMA: fix boot regression due to physical address of high_memory
commit
6b101e2a3ce4d2a0312087598bd1ab4a1db2ac40 upstream.
high_memory isn't direct mapped memory so retrieving it's physical address
isn't appropriate. But, it would be useful to check physical address of
highmem boundary so it's justfiable to get physical address from it. In
x86, there is a validation check if CONFIG_DEBUG_VIRTUAL and it triggers
following boot failure reported by Ingo.
...
BUG: Int 6: CR2
00f06f53
...
Call Trace:
dump_stack+0x41/0x52
early_idt_handler+0x6b/0x6b
cma_declare_contiguous+0x33/0x212
dma_contiguous_reserve_area+0x31/0x4e
dma_contiguous_reserve+0x11d/0x125
setup_arch+0x7b5/0xb63
start_kernel+0xb8/0x3e6
i386_start_kernel+0x79/0x7d
To fix boot regression, this patch implements workaround to avoid
validation check in x86 when retrieving physical address of high_memory.
__pa_nodebug() used by this patch is implemented only in x86 so there is
no choice but to use dirty #ifdef.
[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Wed, 10 Dec 2014 23:52:22 +0000 (15:52 -0800)]
ncpfs: return proper error from NCP_IOC_SETROOT ioctl
commit
a682e9c28cac152e6e54c39efcf046e0c8cfcf63 upstream.
If some error happens in NCP_IOC_SETROOT ioctl, the appropriate error
return value is then (in most cases) just overwritten before we return.
This can result in reporting success to userspace although error happened.
This bug was introduced by commit
2e54eb96e2c8 ("BKL: Remove BKL from
ncpfs"). Propagate the errors correctly.
Coverity id:
1226925.
Fixes: 2e54eb96e2c80 ("BKL: Remove BKL from ncpfs")
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rabin Vincent [Fri, 19 Dec 2014 12:36:08 +0000 (13:36 +0100)]
crypto: af_alg - fix backlog handling
commit
7e77bdebff5cb1e9876c561f69710b9ab8fa1f7e upstream.
If a request is backlogged, it's complete() handler will get called
twice: once with -EINPROGRESS, and once with the final error code.
af_alg's complete handler, unlike other users, does not handle the
-EINPROGRESS but instead always completes the completion that recvmsg()
is waiting on. This can lead to a return to user space while the
request is still pending in the driver. If userspace closes the sockets
before the requests are handled by the driver, this will lead to
use-after-frees (and potential crashes) in the kernel due to the tfm
having been freed.
The crashes can be easily reproduced (for example) by reducing the max
queue length in cryptod.c and running the following (from
http://www.chronox.de/libkcapi.html) on AES-NI capable hardware:
$ while true; do kcapi -x 1 -e -c '__ecb-aes-aesni' \
-k
00000000000000000000000000000000 \
-p
00000000000000000000000000000000 >/dev/null & done
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Guy Briggs [Tue, 23 Dec 2014 18:02:04 +0000 (13:02 -0500)]
audit: restore AUDIT_LOGINUID unset ABI
commit
041d7b98ffe59c59fdd639931dea7d74f9aa9a59 upstream.
A regression was caused by commit
780a7654cee8:
audit: Make testing for a valid loginuid explicit.
(which in turn attempted to fix a regression caused by
e1760bd)
When audit_krule_to_data() fills in the rules to get a listing, there was a
missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.
This broke userspace by not returning the same information that was sent and
expected.
The rule:
auditctl -a exit,never -F auid=-1
gives:
auditctl -l
LIST_RULES: exit,never f24=0 syscall=all
when it should give:
LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all
Tag it so that it is reported the same way it was set. Create a new
private flags audit_krule field (pflags) to store it that won't interact with
the public one from the API.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Moore [Fri, 19 Dec 2014 23:35:53 +0000 (18:35 -0500)]
audit: don't attempt to lookup PIDs when changing PID filtering audit rules
commit
3640dcfa4fd00cd91d88bb86250bdb496f7070c0 upstream.
Commit
f1dc4867 ("audit: anchor all pid references in the initial pid
namespace") introduced a find_vpid() call when adding/removing audit
rules with PID/PPID filters; unfortunately this is problematic as
find_vpid() only works if there is a task with the associated PID
alive on the system. The following commands demonstrate a simple
reproducer.
# auditctl -D
# auditctl -l
# autrace /bin/true
# auditctl -l
This patch resolves the problem by simply using the PID provided by
the user without any additional validation, e.g. no calls to check to
see if the task/PID exists.
Cc: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Guy Briggs [Fri, 19 Dec 2014 04:09:27 +0000 (23:09 -0500)]
audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb
commit
54dc77d974a50147d6639dac6f59cb2c29207161 upstream.
Eric Paris explains: Since kauditd_send_multicast_skb() gets called in
audit_log_end(), which can come from any context (aka even a sleeping context)
GFP_KERNEL can't be used. Since the audit_buffer knows what context it should
use, pass that down and use that.
See: https://lkml.org/lkml/2014/12/16/542
BUG: sleeping function called from invalid context at mm/slab.c:2849
in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin
2 locks held by sulogin/885:
#0: (&sig->cred_guard_mutex){+.+.+.}, at: [<
ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b
#1: (tty_files_lock){+.+.+.}, at: [<
ffffffff9123e787>] selinux_bprm_committing_creds+0x55/0x22b
CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-
20141216 #30
Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014
ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375
ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006
0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38
Call Trace:
[<
ffffffff916ba529>] dump_stack+0x50/0xa8
[<
ffffffff91063185>] ___might_sleep+0x1b6/0x1be
[<
ffffffff910632a6>] __might_sleep+0x119/0x128
[<
ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f
[<
ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9
[<
ffffffff914e148d>] __alloc_skb+0x42/0x1a3
[<
ffffffff914e2b62>] skb_copy+0x3e/0xa3
[<
ffffffff910c263e>] audit_log_end+0x83/0x100
[<
ffffffff9123b8d3>] ? avc_audit_pre_callback+0x103/0x103
[<
ffffffff91252a73>] common_lsm_audit+0x441/0x450
[<
ffffffff9123c163>] slow_avc_audit+0x63/0x67
[<
ffffffff9123c42c>] avc_has_perm+0xca/0xe3
[<
ffffffff9123dc2d>] inode_has_perm+0x5a/0x65
[<
ffffffff9123e7ca>] selinux_bprm_committing_creds+0x98/0x22b
[<
ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10
[<
ffffffff911515e6>] install_exec_creds+0xe/0x79
[<
ffffffff911974cf>] load_elf_binary+0xe36/0x10d7
[<
ffffffff9115198e>] search_binary_handler+0x81/0x18c
[<
ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7
[<
ffffffff91153669>] do_execve+0x1f/0x21
[<
ffffffff91153967>] SyS_execve+0x25/0x29
[<
ffffffff916c61a9>] stub_execve+0x69/0xa0
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Tue, 2 Dec 2014 19:56:30 +0000 (13:56 -0600)]
userns: Unbreak the unprivileged remount tests
commit
db86da7cb76f797a1a8b445166a15cb922c6ff85 upstream.
A security fix in caused the way the unprivileged remount tests were
using user namespaces to break. Tweak the way user namespaces are
being used so the test works again.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Sat, 6 Dec 2014 01:36:04 +0000 (19:36 -0600)]
userns: Allow setting gid_maps without privilege when setgroups is disabled
commit
66d2f338ee4c449396b6f99f5e75cd18eb6df272 upstream.
Now that setgroups can be disabled and not reenabled, setting gid_map
without privielge can now be enabled when setgroups is disabled.
This restores most of the functionality that was lost when unprivileged
setting of gid_map was removed. Applications that use this functionality
will need to check to see if they use setgroups or init_groups, and if they
don't they can be fixed by simply disabling setgroups before writing to
gid_map.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Tue, 2 Dec 2014 18:27:26 +0000 (12:27 -0600)]
userns: Add a knob to disable setgroups on a per user namespace basis
commit
9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8 upstream.
- Expose the knob to user space through a proc file /proc/<pid>/setgroups
A value of "deny" means the setgroups system call is disabled in the
current processes user namespace and can not be enabled in the
future in this user namespace.
A value of "allow" means the segtoups system call is enabled.
- Descendant user namespaces inherit the value of setgroups from
their parents.
- A proc file is used (instead of a sysctl) as sysctls currently do
not allow checking the permissions at open time.
- Writing to the proc file is restricted to before the gid_map
for the user namespace is set.
This ensures that disabling setgroups at a user namespace
level will never remove the ability to call setgroups
from a process that already has that ability.
A process may opt in to the setgroups disable for itself by
creating, entering and configuring a user namespace or by calling
setns on an existing user namespace with setgroups disabled.
Processes without privileges already can not call setgroups so this
is a noop. Prodcess with privilege become processes without
privilege when entering a user namespace and as with any other path
to dropping privilege they would not have the ability to call
setgroups. So this remains within the bounds of what is possible
without a knob to disable setgroups permanently in a user namespace.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Tue, 9 Dec 2014 20:03:14 +0000 (14:03 -0600)]
userns: Rename id_map_mutex to userns_state_mutex
commit
f0d62aec931e4ae3333c797d346dc4f188f454ba upstream.
Generalize id_map_mutex so it can be used for more state of a user namespace.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Thu, 27 Nov 2014 05:22:14 +0000 (23:22 -0600)]
userns: Only allow the creator of the userns unprivileged mappings
commit
f95d7918bd1e724675de4940039f2865e5eec5fe upstream.
If you did not create the user namespace and are allowed
to write to uid_map or gid_map you should already have the necessary
privilege in the parent user namespace to establish any mapping
you want so this will not affect userspace in practice.
Limiting unprivileged uid mapping establishment to the creator of the
user namespace makes it easier to verify all credentials obtained with
the uid mapping can be obtained without the uid mapping without
privilege.
Limiting unprivileged gid mapping establishment (which is temporarily
absent) to the creator of the user namespace also ensures that the
combination of uid and gid can already be obtained without privilege.
This is part of the fix for CVE-2014-8989.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Sat, 6 Dec 2014 00:26:30 +0000 (18:26 -0600)]
userns: Check euid no fsuid when establishing an unprivileged uid mapping
commit
80dd00a23784b384ccea049bfb3f259d3f973b9d upstream.
setresuid allows the euid to be set to any of uid, euid, suid, and
fsuid. Therefor it is safe to allow an unprivileged user to map
their euid and use CAP_SETUID privileged with exactly that uid,
as no new credentials can be obtained.
I can not find a combination of existing system calls that allows setting
uid, euid, suid, and fsuid from the fsuid making the previous use
of fsuid for allowing unprivileged mappings a bug.
This is part of a fix for CVE-2014-8989.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Sat, 6 Dec 2014 00:14:19 +0000 (18:14 -0600)]
userns: Don't allow unprivileged creation of gid mappings
commit
be7c6dba2332cef0677fbabb606e279ae76652c3 upstream.
As any gid mapping will allow and must allow for backwards
compatibility dropping groups don't allow any gid mappings to be
established without CAP_SETGID in the parent user namespace.
For a small class of applications this change breaks userspace
and removes useful functionality. This small class of applications
includes tools/testing/selftests/mount/unprivilged-remount-test.c
Most of the removed functionality will be added back with the addition
of a one way knob to disable setgroups. Once setgroups is disabled
setting the gid_map becomes as safe as setting the uid_map.
For more common applications that set the uid_map and the gid_map
with privilege this change will have no affect.
This is part of a fix for CVE-2014-8989.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Sat, 6 Dec 2014 00:01:11 +0000 (18:01 -0600)]
userns: Don't allow setgroups until a gid mapping has been setablished
commit
273d2c67c3e179adb1e74f403d1e9a06e3f841b5 upstream.
setgroups is unique in not needing a valid mapping before it can be called,
in the case of setgroups(0, NULL) which drops all supplemental groups.
The design of the user namespace assumes that CAP_SETGID can not actually
be used until a gid mapping is established. Therefore add a helper function
to see if the user namespace gid mapping has been established and call
that function in the setgroups permission check.
This is part of the fix for CVE-2014-8989, being able to drop groups
without privilege using user namespaces.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Fri, 5 Dec 2014 23:51:47 +0000 (17:51 -0600)]
userns: Document what the invariant required for safe unprivileged mappings.
commit
0542f17bf2c1f2430d368f44c8fcf2f82ec9e53e upstream.
The rule is simple. Don't allow anything that wouldn't be allowed
without unprivileged mappings.
It was previously overlooked that establishing gid mappings would
allow dropping groups and potentially gaining permission to files and
directories that had lesser permissions for a specific group than for
all other users.
This is the rule needed to fix CVE-2014-8989 and prevent any other
security issues with new_idmap_permitted.
The reason for this rule is that the unix permission model is old and
there are programs out there somewhere that take advantage of every
little corner of it. So allowing a uid or gid mapping to be
established without privielge that would allow anything that would not
be allowed without that mapping will result in expectations from some
code somewhere being violated. Violated expectations about the
behavior of the OS is a long way to say a security issue.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Fri, 5 Dec 2014 23:19:27 +0000 (17:19 -0600)]
groups: Consolidate the setgroups permission checks
commit
7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.
Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged. Add a common function so that
they may all share the same permission checking code.
This corrects the current oversight in the current permission checks
and adds a helper to avoid this in the future.
A user namespace security fix will update this new helper, shortly.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Sat, 4 Oct 2014 21:44:03 +0000 (14:44 -0700)]
umount: Disallow unprivileged mount force
commit
b2f5d4dc38e034eecb7987e513255265ff9aa1cf upstream.
Forced unmount affects not just the mount namespace but the underlying
superblock as well. Restrict forced unmount to the global root user
for now. Otherwise it becomes possible a user in a less privileged
mount namespace to force the shutdown of a superblock of a filesystem
in a more privileged mount namespace, allowing a DOS attack on root.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Fri, 22 Aug 2014 21:39:03 +0000 (16:39 -0500)]
mnt: Update unprivileged remount test
commit
4a44a19b470a886997d6647a77bb3e38dcbfa8c5 upstream.
- MNT_NODEV should be irrelevant except when reading back mount flags,
no longer specify MNT_NODEV on remount.
- Test MNT_NODEV on devpts where it is meaningful even for unprivileged mounts.
- Add a test to verify that remount of a prexisting mount with the same flags
is allowed and does not change those flags.
- Cleanup up the definitions of MS_REC, MS_RELATIME, MS_STRICTATIME that are used
when the code is built in an environment without them.
- Correct the test error messages when tests fail. There were not 5 tests
that tested MS_RELATIME.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Wed, 13 Aug 2014 08:33:38 +0000 (01:33 -0700)]
mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount
commit
3e1866410f11356a9fd869beb3e95983dc79c067 upstream.
Now that remount is properly enforcing the rule that you can't remove
nodev at least sandstorm.io is breaking when performing a remount.
It turns out that there is an easy intuitive solution implicitly
add nodev on remount when nodev was implicitly added on mount.
Tested-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Luis Henriques [Wed, 3 Dec 2014 21:20:21 +0000 (21:20 +0000)]
thermal: Fix error path in thermal_init()
commit
9d367e5e7b05c71a8c1ac4e9b6e00ba45a79f2fc upstream.
thermal_unregister_governors() and class_unregister() were being called in
the wrong order.
Fixes: 80a26a5c22b9 ("Thermal: build thermal governors into thermal_sys module")
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Thu, 18 Dec 2014 16:57:19 +0000 (10:57 -0600)]
mnt: Fix a memory stomp in umount
commit
c297abfdf15b4480704d6b566ca5ca9438b12456 upstream.
While reviewing the code of umount_tree I realized that when we append
to a preexisting unmounted list we do not change pprev of the former
first item in the list.
Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on
the former first item of the list will stomp unmounted.first leaving
it set to some random mount point which we are likely to free soon.
This isn't likely to hit, but if it does I don't know how anyone could
track it down.
[ This happened because we don't have all the same operations for
hlist's as we do for normal doubly-linked lists. In particular,
list_splice() is easy on our standard doubly-linked lists, while
hlist_splice() doesn't exist and needs both start/end entries of the
hlist. And commit
38129a13e6e7 incorrectly open-coded that missing
hlist_splice().
We should think about making these kinds of "mindless" conversions
easier to get right by adding the missing hlist helpers - Linus ]
Fixes: 38129a13e6e71f666e0468e99fdd932a687b4d7e switch mnt_hash to hlist
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Berg [Wed, 17 Dec 2014 12:55:49 +0000 (13:55 +0100)]
mac80211: free management frame keys when removing station
commit
28a9bc68124c319b2b3dc861e80828a8865fd1ba upstream.
When writing the code to allow per-station GTKs, I neglected to
take into account the management frame keys (index 4 and 5) when
freeing the station and only added code to free the first four
data frame keys.
Fix this by iterating the array of keys over the right length.
Fixes: e31b82136d1a ("cfg80211/mac80211: allow per-station GTKs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andreas Müller [Fri, 12 Dec 2014 11:11:11 +0000 (12:11 +0100)]
mac80211: fix multicast LED blinking and counter
commit
d025933e29872cb1fe19fc54d80e4dfa4ee5779c upstream.
As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount"
stopped being incremented after the use-after-free fix. Furthermore, the
RX-LED will be triggered by every multicast frame (which wouldn't happen
before) which wouldn't allow the LED to rest at all.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the
patch.
Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation")
Signed-off-by: Andreas Müller <goo@stapelspeicher.org>
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jes Sorensen [Wed, 10 Dec 2014 19:14:07 +0000 (14:14 -0500)]
mac80211: avoid using uninitialized stack data
commit
7e6225a1604d0c6aa4140289bf5761868ffc9c83 upstream.
Avoid a case where we would access uninitialized stack data if the AP
advertises HT support without 40MHz channel support.
Fixes: f3000e1b43f1 ("mac80211: fix broken use of VHT/20Mhz with some APs")
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felix Fietkau [Mon, 24 Nov 2014 17:12:16 +0000 (18:12 +0100)]
mac80211: copy chandef from AP vif to VLANs
commit
2967e031d4d737d9cc8252d878a17924d7b704f0 upstream.
Instead of keeping track of all those special cases where
VLAN interfaces have no bss_conf.chandef, just make sure
they have the same as the AP interface they belong to.
Among others, this fixes a crash getting a VLAN's channel
from userspace since a NULL channel is returned as a good
result (return value 0) for VLANs since the commit below.
Fixes: c12bc4885f4b3 ("mac80211: return the vif's chandef in ieee80211_cfg_get_channel()")
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[rewrite commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Thu, 4 Dec 2014 17:25:19 +0000 (18:25 +0100)]
KEYS: Fix stale key registration at error path
commit
b26bdde5bb27f3f900e25a95e33a0c476c8c2c48 upstream.
When loading encrypted-keys module, if the last check of
aes_get_sizes() in init_encrypted() fails, the driver just returns an
error without unregistering its key type. This results in the stale
entry in the list. In addition to memory leaks, this leads to a kernel
crash when registering a new key type later.
This patch fixes the problem by swapping the calls of aes_get_sizes()
and register_key_type(), and releasing resources properly at the error
paths.
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Mon, 8 Dec 2014 11:08:20 +0000 (12:08 +0100)]
x86/microcode/intel: Fish out the stashed microcode for the BSP
commit
25cdb9c86826f8d035d8aaa07fc36832e76bd8a0 upstream.
I'm such a moron! The simple solution of saving the BSP patch
for use on resume was too simple (and wrong!), hint:
sizeof(struct microcode_intel).
What needs to be done instead is to fish out the microcode patch
we have stashed previously and apply that on the BSP in case the
late loader hasn't been utilized.
So do that instead.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20141208110820.GB20057@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Wed, 3 Dec 2014 16:21:41 +0000 (17:21 +0100)]
x86, microcode: Reload microcode on resume
commit
fbae4ba8c4a387e306adc9c710e5c225cece7678 upstream.
Normally, we do reapply microcode on resume. However, in the cases where
that microcode comes from the early loader and the late loader hasn't
been utilized yet, there's no easy way for us to go and apply the patch
applied during boot by the early loader.
Thus, reuse the patch stashed by the early loader for the BSP.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Boris Ostrovsky [Mon, 1 Dec 2014 21:27:44 +0000 (16:27 -0500)]
x86, microcode: Don't initialize microcode code on paravirt
commit
a18a0f6850d4b286a5ebf02cd5b22fe496b86349 upstream.
Paravirtual guests are not expected to load microcode into processors
and therefore it is not necessary to initialize microcode loading
logic.
In fact, under certain circumstances initializing this logic may cause
the guest to crash. Specifically, 32-bit kernels use __pa_nodebug()
macro which does not work in Xen (the code path that leads to this macro
happens during resume when we call mc_bp_resume()->load_ucode_ap()
->check_loader_disabled_ap())
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1417469264-31470-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Mon, 1 Dec 2014 16:50:16 +0000 (17:50 +0100)]
x86, microcode, intel: Drop unused parameter
commit
47768626c6db42cd06ff077ba12dd2cb10ab818b upstream.
apply_microcode_early() doesn't use mc_saved_data, kill it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Mon, 1 Dec 2014 10:12:21 +0000 (11:12 +0100)]
x86, microcode, AMD: Do not use smp_processor_id() in preemtible context
commit
2ef84b3bb97f03332f0c1edb4466b1750dcf97b5 upstream.
Hand down the cpu number instead, otherwise lockdep screams when doing
echo 1 > /sys/devices/system/cpu/microcode/reload.
BUG: using smp_processor_id() in preemptible [
00000000] code: amd64-microcode/2470
caller is debug_smp_processor_id+0x12/0x20
CPU: 1 PID: 2470 Comm: amd64-microcode Not tainted 3.18.0-rc6+ #26
...
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1417428741-4501-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Thu, 18 Dec 2014 16:26:10 +0000 (17:26 +0100)]
isofs: Fix unchecked printing of ER records
commit
4e2024624e678f0ebb916e6192bd23c1f9fdf696 upstream.
We didn't check length of rock ridge ER records before printing them.
Thus corrupted isofs image can cause us to access and print some memory
behind the buffer with obvious consequences.
Reported-and-tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Wed, 17 Dec 2014 22:48:30 +0000 (14:48 -0800)]
x86/tls: Don't validate lm in set_thread_area() after all
commit
3fb2f4237bb452eb4e98f6a5dbd5a445b4fed9d0 upstream.
It turns out that there's a lurking ABI issue. GCC, when
compiling this in a 32-bit program:
struct user_desc desc = {
.entry_number = idx,
.base_addr = base,
.limit = 0xfffff,
.seg_32bit = 1,
.contents = 0, /* Data, grow-up */
.read_exec_only = 0,
.limit_in_pages = 1,
.seg_not_present = 0,
.useable = 0,
};
will leave .lm uninitialized. This means that anything in the
kernel that reads user_desc.lm for 32-bit tasks is unreliable.
Revert the .lm check in set_thread_area(). The value never did
anything in the first place.
Fixes: 0e58af4e1d21 ("x86/tls: Disallow unusual TLS segments")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Tue, 25 Nov 2014 01:39:06 +0000 (17:39 -0800)]
x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
commit
7ddc6a2199f1da405a2fb68c40db8899b1a8cd87 upstream.
These functions can be executed on the int3 stack, so kprobes
are dangerous. Tracing is probably a bad idea, too.
Fixes: b645af2d5905 ("x86_64, traps: Rework bad_iret")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/50e33d26adca60816f3ba968875801652507d0c4.1416870125.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Uwe Kleine-König [Fri, 14 Nov 2014 20:43:33 +0000 (21:43 +0100)]
ARM: mvebu: fix ordering in Armada 370 .dtsi
commit
ab1e85372168892387dd1ac171158fc8c3119be4 upstream.
Commit
a095b1c78a35 ("ARM: mvebu: sort DT nodes by address")
missed placing the system-controller in the correct order.
Fixes: a095b1c78a35 ("ARM: mvebu: sort DT nodes by address")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lkml.kernel.org/r/20141114204333.GS27002@pengutronix.de
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Petazzoni [Tue, 28 Oct 2014 16:08:42 +0000 (17:08 +0100)]
ARM: mvebu: remove conflicting muxing on Armada 370 DB
commit
b4607572ef86b288a856b9df410ea593c5371dec upstream.
Back when audio was enabled, the muxing of some MPP pins was causing
problems. However, since commit
fea038ed55ae ("ARM: mvebu: Add proper
pin muxing on the Armada 370 DB board"), those problematic MPP pins
have been assigned a proper muxing for the Ethernet interfaces. This
proper muxing is now conflicting with the hog pins muxing that had
been added as part of
249f3822509b ("ARM: mvebu: add audio support to
Armada 370 DB").
Therefore, this commit simply removes the hog pins muxing, which
solves a warning a boot time due to the conflicting muxing
requirements.
Fixes: fea038ed55ae ("ARM: mvebu: Add proper pin muxing on the Armada 370 DB board")
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lkml.kernel.org/r/1414512524-24466-5-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Petazzoni [Thu, 13 Nov 2014 09:38:57 +0000 (10:38 +0100)]
ARM: mvebu: disable I/O coherency on non-SMP situations on Armada 370/375/38x/XP
commit
e55355453600a33bb5ca4f71f2d7214875f3b061 upstream.
Enabling the hardware I/O coherency on Armada 370, Armada 375, Armada
38x and Armada XP requires a certain number of conditions:
- On Armada 370, the cache policy must be set to write-allocate.
- On Armada 375, 38x and XP, the cache policy must be set to
write-allocate, the pages must be mapped with the shareable
attribute, and the SMP bit must be set
Currently, on Armada XP, when CONFIG_SMP is enabled, those conditions
are met. However, when Armada XP is used in a !CONFIG_SMP kernel, none
of these conditions are met. With Armada 370, the situation is worse:
since the processor is single core, regardless of whether CONFIG_SMP
or !CONFIG_SMP is used, the cache policy will be set to write-back by
the kernel and not write-allocate.
Since solving this problem turns out to be quite complicated, and we
don't want to let users with a mainline kernel known to have
infrequent but existing data corruptions, this commit proposes to
simply disable hardware I/O coherency in situations where it is known
not to work.
And basically, the is_smp() function of the kernel tells us whether it
is OK to enable hardware I/O coherency or not, so this commit slightly
refactors the coherency_type() function to return
COHERENCY_FABRIC_TYPE_NONE when is_smp() is false, or the appropriate
type of the coherency fabric in the other case.
Thanks to this, the I/O coherency fabric will no longer be used at all
in !CONFIG_SMP configurations. It will continue to be used in
CONFIG_SMP configurations on Armada XP, Armada 375 and Armada 38x
(which are multiple cores processors), but will no longer be used on
Armada 370 (which is a single core processor).
In the process, it simplifies the implementation of the
coherency_type() function, and adds a missing call to of_node_put().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: e60304f8cb7bb545e79fe62d9b9762460c254ec2 ("arm: mvebu: Add hardware I/O Coherency support")
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Link: https://lkml.kernel.org/r/1415871540-20302-3-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Petazzoni [Thu, 13 Nov 2014 09:38:56 +0000 (10:38 +0100)]
ARM: mvebu: make the coherency_ll.S functions work with no coherency fabric
commit
30cdef97107370a7f63ab5d80fd2de30540750c8 upstream.
The ll_add_cpu_to_smp_group(), ll_enable_coherency() and
ll_disable_coherency() are used on Armada XP to control the coherency
fabric. However, they make the assumption that the coherency fabric is
always available, which is currently a correct assumption but will no
longer be true with a followup commit that disables the usage of the
coherency fabric when the conditions are not met to use it.
Therefore, this commit modifies those functions so that they check the
return value of ll_get_coherency_base(), and if the return value is 0,
they simply return without configuring anything in the coherency
fabric.
The ll_get_coherency_base() function is also modified to properly
return 0 when the function is called with the MMU disabled. In this
case, it normally returns the physical address of the coherency
fabric, but we now check if the virtual address is 0, and if that's
case, return a physical address of 0 to indicate that the coherency
fabric is not enabled.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Link: https://lkml.kernel.org/r/1415871540-20302-2-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Osipenko [Fri, 10 Oct 2014 13:24:47 +0000 (17:24 +0400)]
ARM: tegra: Re-add removed SoC id macro to tegra_resume()
commit
e4a680099a6e97ecdbb81081cff9e4a489a4dc44 upstream.
Commit
d127e9c ("ARM: tegra: make tegra_resume can work with current and later
chips") removed tegra_get_soc_id macro leaving used cpu register corrupted after
branching to v7_invalidate_l1() and as result causing execution of unintended
code on tegra20. Possibly it was expected that r6 would be SoC id func argument
since common cpu reset handler is setting r6 before branching to tegra_resume(),
but neither tegra20_lp1_reset() nor tegra30_lp1_reset() aren't setting r6
register before jumping to resume function. Fix it by re-adding macro.
Fixes: d127e9c (ARM: tegra: make tegra_resume can work with current and later chips)
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thierry Reding [Thu, 30 Oct 2014 14:32:56 +0000 (15:32 +0100)]
drm/tegra: gem: dumb: pitch and size are outputs
commit
dc6057ecb39edb34b0461ca55382094410bd257a upstream.
When creating a dumb buffer object using the DRM_IOCTL_MODE_CREATE_DUMB
IOCTL, only the width, height, bpp and flags parameters are inputs. The
caller is not guaranteed to zero out or set handle, pitch and size, so
the driver must not treat these values as possible inputs.
Fixes a bug where running the Weston compositor on Tegra DRM would cause
an attempt to allocate a 3 GiB framebuffer to be allocated.
Fixes: de2ba664c30f ("gpu: host1x: drm: Add memory manager and fb")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zi Shen Lim [Wed, 3 Dec 2014 08:38:01 +0000 (08:38 +0000)]
arm64: bpf: lift restriction on last instruction
commit
51c9fbb1b146f3336a93d398c439b6fbfe5ab489 upstream.
Earlier implementation assumed last instruction is BPF_EXIT.
Since this is no longer a restriction in eBPF, we remove this
limitation.
Per Alexei Starovoitov [1]:
> classic BPF has a restriction that last insn is always BPF_RET.
> eBPF doesn't have BPF_RET instruction and this restriction.
> It has BPF_EXIT insn which can appear anywhere in the program
> one or more times and it doesn't have to be last insn.
[1] https://lkml.org/lkml/2014/11/27/2
Fixes: e54bcde3d69d ("arm64: eBPF JIT compiler")
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Catalin Marinas [Mon, 17 Nov 2014 10:37:40 +0000 (10:37 +0000)]
arm64: Add COMPAT_HWCAP_LPAE
commit
7d57511d2dba03a8046c8b428dd9192a4bfc1e73 upstream.
Commit
a469abd0f868 (ARM: elf: add new hwcap for identifying atomic
ldrd/strd instructions) introduces HWCAP_ELF for 32-bit ARM
applications. As LPAE is always present on arm64, report the
corresponding compat HWCAP to user space.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mikulas Patocka [Wed, 5 Nov 2014 22:00:13 +0000 (17:00 -0500)]
dm thin: fix a race in thin_dtr
commit
17181fb7a0c3a279196c0eeb2caba65a1519614b upstream.
As long as struct thin_c is in the list, anyone can grab a reference of
it. Consequently, we must wait for the reference count to drop to zero
*after* we remove the structure from the list, not before.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Thu, 11 Dec 2014 11:12:19 +0000 (11:12 +0000)]
dm thin: fix missing out-of-data-space to write mode transition if blocks are released
commit
2c43fd26e46734430122b8d2ad3024bb532df3ef upstream.
Discard bios and thin device deletion have the potential to release data
blocks. If the thin-pool is in out-of-data-space mode, and blocks were
released, transition the thin-pool back to full write mode.
The correct time to do this is just after the thin-pool metadata commit.
It cannot be done before the commit because the space maps will not
allow immediate reuse of the data blocks in case there's a rollback
following power failure.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Wed, 10 Dec 2014 17:06:57 +0000 (17:06 +0000)]
dm thin: fix inability to discard blocks when in out-of-data-space mode
commit
45ec9bd0fd7abf8705e7cf12205ff69fe9d51181 upstream.
When the pool was in PM_OUT_OF_SPACE mode its process_prepared_discard
function pointer was incorrectly being set to
process_prepared_discard_passdown rather than process_prepared_discard.
This incorrect function pointer meant the discard was being passed down,
but not effecting the mapping. As such any discard that was issued, in
an attempt to reclaim blocks, would not successfully free data space.
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Sat, 29 Nov 2014 12:50:21 +0000 (15:50 +0300)]
dm space map metadata: fix sm_bootstrap_get_nr_blocks()
commit
c1c6156fe4d4577444b769d7edd5dd503e57bbc9 upstream.
This function isn't right and it causes a static checker warning:
drivers/md/dm-thin.c:3016 maybe_resize_data_dev()
error: potentially using uninitialized 'sb_data_size'.
It should set "*count" and return zero on success the same as the
sm_metadata_get_nr_blocks() function does earlier.
Fixes: 3241b1d3e0aa ('dm: add persistent data library')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Fri, 28 Nov 2014 09:48:25 +0000 (09:48 +0000)]
dm cache: fix spurious cell_defer when dealing with partial block at end of device
commit
f824a2af3dfbbb766c02e19df21f985bceadf0ee upstream.
We never bother caching a partial block that is at the back end of the
origin device. No cell ever gets locked, but the calling code was
assuming it was and trying to release it.
Now the code only releases if the cell has been set to a non NULL
value.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Thu, 27 Nov 2014 12:26:46 +0000 (12:26 +0000)]
dm cache: dirty flag was mistakenly being cleared when promoting via overwrite
commit
1e32134a5a404e80bfb47fad8a94e9bbfcbdacc5 upstream.
If the incoming bio is a WRITE and completely covers a block then we
don't bother to do any copying for a promotion operation. Once this is
done the cache block and origin block will be different, so we need to
set it to 'dirty'.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Thu, 27 Nov 2014 12:21:08 +0000 (12:21 +0000)]
dm cache: only use overwrite optimisation for promotion when in writeback mode
commit
f29a3147e251d7ae20d3194ff67f109d71e501b4 upstream.
Overwrite causes the cache block and origin blocks to diverge, which
is only allowed in writeback mode.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Milan Broz [Sat, 22 Nov 2014 08:36:04 +0000 (09:36 +0100)]
dm crypt: use memzero_explicit for on-stack buffer
commit
1a71d6ffe18c0d0f03fc8531949cc8ed41d702ee upstream.
Use memzero_explicit to cleanup sensitive data allocated on stack
to prevent the compiler from optimizing and removing memset() calls.
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Darrick J. Wong [Wed, 26 Nov 2014 01:45:15 +0000 (17:45 -0800)]
dm bufio: fix memleak when using a dm_buffer's inline bio
commit
445559cdcb98a141f5de415b94fd6eaccab87e6d upstream.
When dm-bufio sets out to use the bio built into a struct dm_buffer to
issue an IO, it needs to call bio_reset after it's done with the bio
so that we can free things attached to the bio such as the integrity
payload. Therefore, inject our own endio callback to take care of
the bio_reset after calling submit_io's end_io callback.
Test case:
1. modprobe scsi_debug delay=0 dif=1 dix=199 ato=1 dev_size_mb=300
2. Set up a dm-bufio client, e.g. dm-verity, on the scsi_debug device
3. Repeatedly read metadata and watch kmalloc-192 leak!
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mikulas Patocka [Fri, 5 Sep 2014 16:16:01 +0000 (12:16 -0400)]
dcache: fix kmemcheck warning in switch_names
commit
08d4f7722268755ee34ed1c9e8afee7dfff022bb upstream.
This patch fixes kmemcheck warning in switch_names. The function
switch_names swaps inline names of two dentries. It swaps full arrays
d_iname, no matter how many bytes are really used by the strings. Reading
data beyond string ends results in kmemcheck warning.
We fix the bug by marking both arrays as fully initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peng Tao [Mon, 17 Nov 2014 03:05:17 +0000 (11:05 +0800)]
nfs41: fix nfs4_proc_layoutget error handling
commit
4bd5a980de87d2b5af417485bde97b8eb3d6cf6a upstream.
nfs4_layoutget_release() drops layout hdr refcnt. Grab the refcnt
early so that it is safe to call .release in case nfs4_alloc_pages
fails.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Fixes: a47970ff78147 ("NFSv4.1: Hold reference to layout hdr in layoutget")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Wed, 22 Oct 2014 13:21:47 +0000 (15:21 +0200)]
f2fs: fix possible data corruption in f2fs_write_begin()
commit
9234f3190bf8b25b11b105191d408ac50a107948 upstream.
f2fs_write_begin() doesn't initialize the 'dn' variable if the inode has
inline data. However it uses its contents to decide whether it should
just zero out the page or load data to it. Thus if we are unlucky we can
zero out page contents instead of loading inline data into a page.
CC: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Tue, 21 Oct 2014 12:07:33 +0000 (14:07 +0200)]
f2fs: avoid returning uninitialized value to userspace from f2fs_trim_fs()
commit
9bd27ae4aafc9bfee6c8791f7d801ea16cc5622b upstream.
If user specifies too low end sector for trimming, f2fs_trim_fs() will
use uninitialized value as a number of trimmed blocks and returns it to
userspace. Initialize number of trimmed blocks early to avoid the
problem.
Coverity-id:
1248809
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hannes Reinecke [Thu, 30 Oct 2014 08:44:36 +0000 (09:44 +0100)]
scsi: correct return values for .eh_abort_handler implementations
commit
b6c92b7e0af575e2b8b05bdf33633cf9e1661cbf upstream.
The .eh_abort_handler needs to return SUCCESS, FAILED, or
FAST_IO_FAIL. So fixup all callers to adhere to this requirement.
Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Markus Pargmann [Mon, 6 Oct 2014 19:33:36 +0000 (21:33 +0200)]
regulator: anatop: Set default voltage selector for vddpu
commit
fe08be3ec8672ed92b3ed1b85810df9fa0f98931 upstream.
The code reads the default voltage selector from its register. If the
bootloader disables the regulator, the default voltage selector will be
0 which results in faulty behaviour of this regulator driver.
This patch sets a default voltage selector for vddpu if it is not set in
the register.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sumit.Saxena@avagotech.com [Mon, 17 Nov 2014 09:54:28 +0000 (15:24 +0530)]
megaraid_sas: dndinaness related bug fixes
commit
6e755ddc2935d970574263db3eca547eb70e67d7 upstream.
This patch addresses few endianness related bug fixes.
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sumit.Saxena@avagotech.com [Mon, 17 Nov 2014 09:54:23 +0000 (15:24 +0530)]
megaraid_sas: corrected return of wait_event from abort frame path
commit
170c238701ec38b1829321b17c70671c101bac55 upstream.
Corrected wait_event() call which was waiting for wrong completion
status (0xFF).
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Guo [Wed, 24 Sep 2014 02:29:04 +0000 (04:29 +0200)]
mmc: sdhci-pci-o2micro: Fix Dell E5440 issue
commit
6380ea099cdd46d7377b6fbec0291cf2aa387bad upstream.
Fix Dell E5440 when reboot Linux, can't find o2micro sd host chip issue.
Fixes: 01acf6917aed (mmc: sdhci-pci: add support of O2Micro/BayHubTech SD hosts)
Signed-off-by: Peter Guo <peter.guo@bayhubtech.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Baruch Siach [Mon, 22 Sep 2014 07:12:51 +0000 (10:12 +0300)]
mmc: block: add newline to sysfs display of force_ro
commit
0031a98a85e9fca282624bfc887f9531b2768396 upstream.
Make force_ro consistent with other sysfs entries.
Fixes: 371a689f64b0d ('mmc: MMC boot partitions support')
Cc: Andrei Warkentin <andrey.warkentin@gmail.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ulf Hansson [Tue, 25 Nov 2014 12:05:13 +0000 (13:05 +0100)]
mmc: omap_hsmmc: Fix UHS card with DDR50 support
commit
903101a83949d6fc77c092cef07e9c1e10c07e46 upstream.
The commit, mmc: omap: clarify DDR timing mode between SD-UHS and eMMC,
switched omap_hsmmc to support MMC DDR mode instead of UHS DDR50 mode.
Add UHS DDR50 mode again and this time let's also keep the MMC DDR mode.
Fixes: 5438ad95a57c (mmc: omap: clarify DDR timing mode between SD-UHS and eMMC)
Reported-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Hogan [Mon, 17 Nov 2014 17:49:05 +0000 (17:49 +0000)]
mmc: dw_mmc: avoid write to CDTHRCTL on older versions
commit
66dfd10173159cafa9cb0d39936b8daeaab8e3e0 upstream.
Commit
f1d2736c8156 (mmc: dw_mmc: control card read threshold) added
dw_mci_ctrl_rd_thld() with an unconditional write to the CDTHRCTL
register at offset 0x100. However before version 240a, the FIFO region
started at 0x100, so the write messes with the FIFO and completely
breaks the driver.
If the version id < 240A, return early from dw_mci_ctl_rd_thld() so as
not to hit this problem.
Fixes: f1d2736c8156 (mmc: dw_mmc: control card read threshold)
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Eremin-Solenikov [Fri, 24 Oct 2014 17:19:57 +0000 (21:19 +0400)]
mfd: tc6393xb: Fail ohci suspend if full state restore is required
commit
1a5fb99de4850cba710d91becfa2c65653048589 upstream.
Some boards with TC6393XB chip require full state restore during system
resume thanks to chip's VCC being cut off during suspend (Sharp SL-6000
tosa is one of them). Failing to do so would result in ohci Oops on
resume due to internal memory contentes being changed. Fail ohci suspend
on tc6393xb is full state restore is required.
Recommended workaround is to unbind tmio-ohci driver before suspend and
rebind it after resume.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tony Lindgren [Sun, 2 Nov 2014 18:09:38 +0000 (10:09 -0800)]
mfd: twl4030-power: Fix regression with missing compatible flag
commit
1b9b46d05f887aec418b3a5f4f55abf79316fcda upstream.
Commit
e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset
configuration") accidentally removed the compatible flag for
"ti,twl4030-power" that should be there as documented in the
binding.
If "ti,twl4030-power" only the poweroff configuration is done
by the driver.
Fixes: e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset configuration")
Reported-by: "Dr. H. Nikolaus Schaller" <hns@goldelico.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sonny Rao [Mon, 24 Nov 2014 07:02:44 +0000 (23:02 -0800)]
clocksource: arch_timer: Fix code to use physical timers when requested
commit
0b46b8a718c6e90910a1b1b0fe797be3c167e186 upstream.
This is a bug fix for using physical arch timers when
the arch_timer_use_virtual boolean is false. It restores the
arch_counter_get_cntpct() function after removal in
0d651e4e "clocksource: arch_timer: use virtual counters"
We need this on certain ARMv7 systems which are architected like this:
* The firmware doesn't know and doesn't care about hypervisor mode and
we don't want to add the complexity of hypervisor there.
* The firmware isn't involved in SMP bringup or resume.
* The ARCH timer come up with an uninitialized offset between the
virtual and physical counters. Each core gets a different random
offset.
* The device boots in "Secure SVC" mode.
* Nothing has touched the reset value of CNTHCTL.PL1PCEN or
CNTHCTL.PL1PCTEN (both default to 1 at reset)
One example of such as system is RK3288 where it is much simpler to
use the physical counter since there's nobody managing the offset and
each time a core goes down and comes back up it will get reinitialized
to some other random value.
Fixes: 0d651e4e65e9 ("clocksource: arch_timer: use virtual counters")
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hante Meuleman [Wed, 3 Dec 2014 20:05:27 +0000 (21:05 +0100)]
brcmfmac: Fix bitmap malloc bug in msgbuf.
commit
333c2aa029b847051a2db76a6ca59f699a520030 upstream.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Sat, 6 Dec 2014 03:03:28 +0000 (19:03 -0800)]
x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
commit
29fa6825463c97e5157284db80107d1bfac5d77b upstream.
paravirt_enabled has the following effects:
- Disables the F00F bug workaround warning. There is no F00F bug
workaround any more because Linux's standard IDT handling already
works around the F00F bug, but the warning still exists. This
is only cosmetic, and, in any event, there is no such thing as
KVM on a CPU with the F00F bug.
- Disables 32-bit APM BIOS detection. On a KVM paravirt system,
there should be no APM BIOS anyway.
- Disables tboot. I think that the tboot code should check the
CPUID hypervisor bit directly if it matters.
- paravirt_enabled disables espfix32. espfix32 should *not* be
disabled under KVM paravirt.
The last point is the purpose of this patch. It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests. Fixes CVE-2014-8134.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Mon, 8 Dec 2014 21:55:20 +0000 (13:55 -0800)]
x86_64, switch_to(): Load TLS descriptors before switching DS and ES
commit
f647d7c155f069c1a068030255c300663516420e upstream.
Otherwise, if buggy user code points DS or ES into the TLS
array, they would be corrupted after a context switch.
This also significantly improves the comments and documents some
gotchas in the code.
Before this patch, the both tests below failed. With this
patch, the es test passes, although the gsbase test still fails.
----- begin es test -----
/*
* Copyright (c) 2014 Andy Lutomirski
* GPL v2
*/
static unsigned short GDT3(int idx)
{
return (idx << 3) | 3;
}
static int create_tls(int idx, unsigned int base)
{
struct user_desc desc = {
.entry_number = idx,
.base_addr = base,
.limit = 0xfffff,
.seg_32bit = 1,
.contents = 0, /* Data, grow-up */
.read_exec_only = 0,
.limit_in_pages = 1,
.seg_not_present = 0,
.useable = 0,
};
if (syscall(SYS_set_thread_area, &desc) != 0)
err(1, "set_thread_area");
return desc.entry_number;
}
int main()
{
int idx = create_tls(-1, 0);
printf("Allocated GDT index %d\n", idx);
unsigned short orig_es;
asm volatile ("mov %%es,%0" : "=rm" (orig_es));
int errors = 0;
int total = 1000;
for (int i = 0; i < total; i++) {
asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
usleep(100);
unsigned short es;
asm volatile ("mov %%es,%0" : "=rm" (es));
asm volatile ("mov %0,%%es" : : "rm" (orig_es));
if (es != GDT3(idx)) {
if (errors == 0)
printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
GDT3(idx), es);
errors++;
}
}
if (errors) {
printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
return 1;
} else {
printf("[OK]\tES was preserved\n");
return 0;
}
}
----- end es test -----
----- begin gsbase test -----
/*
* gsbase.c, a gsbase test
* Copyright (c) 2014 Andy Lutomirski
* GPL v2
*/
static unsigned char *testptr, *testptr2;
static unsigned char read_gs_testvals(void)
{
unsigned char ret;
asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
return ret;
}
int main()
{
int errors = 0;
testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
if (testptr == MAP_FAILED)
err(1, "mmap");
testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
if (testptr2 == MAP_FAILED)
err(1, "mmap");
*testptr = 0;
*testptr2 = 1;
if (syscall(SYS_arch_prctl, ARCH_SET_GS,
(unsigned long)testptr2 - (unsigned long)testptr) != 0)
err(1, "ARCH_SET_GS");
usleep(100);
if (read_gs_testvals() == 1) {
printf("[OK]\tARCH_SET_GS worked\n");
} else {
printf("[FAIL]\tARCH_SET_GS failed\n");
errors++;
}
asm volatile ("mov %0,%%gs" : : "r" (0));
if (read_gs_testvals() == 0) {
printf("[OK]\tWriting 0 to gs worked\n");
} else {
printf("[FAIL]\tWriting 0 to gs failed\n");
errors++;
}
usleep(100);
if (read_gs_testvals() == 0) {
printf("[OK]\tgsbase is still zero\n");
} else {
printf("[FAIL]\tgsbase was corrupted\n");
errors++;
}
return errors == 0 ? 0 : 1;
}
----- end gsbase test -----
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Fri, 5 Dec 2014 00:48:17 +0000 (16:48 -0800)]
x86/tls: Disallow unusual TLS segments
commit
0e58af4e1d2166e9e33375a0f121e4867010d4f8 upstream.
Users have no business installing custom code segments into the
GDT, and segments that are not present but are otherwise valid
are a historical source of interesting attacks.
For completeness, block attempts to set the L bit. (Prior to
this patch, the L bit would have been silently dropped.)
This is an ABI break. I've checked glibc, musl, and Wine, and
none of them look like they'll have any trouble.
Note to stable maintainers: this is a hardening patch that fixes
no known bugs. Given the possibility of ABI issues, this
probably shouldn't be backported quickly.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Fri, 5 Dec 2014 00:48:16 +0000 (16:48 -0800)]
x86/tls: Validate TLS entries to protect espfix
commit
41bdc78544b8a93a9c6814b8bbbfef966272abbe upstream.
Installing a 16-bit RW data segment into the GDT defeats espfix.
AFAICT this will not affect glibc, Wine, or dosemu at all.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Mon, 15 Dec 2014 13:22:46 +0000 (14:22 +0100)]
isofs: Fix infinite looping over CE entries
commit
f54e18f1b831c92f6512d2eedb224cd63d607d3d upstream.
Rock Ridge extensions define so called Continuation Entries (CE) which
define where is further space with Rock Ridge data. Corrupted isofs
image can contain arbitrarily long chain of these, including a one
containing loop and thus causing kernel to end in an infinite loop when
traversing these entries.
Limit the traversal to 32 entries which should be more than enough space
to store all the Rock Ridge data.
Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Tue, 16 Dec 2014 17:39:45 +0000 (09:39 -0800)]
Linux 3.18.1
Takashi Iwai [Sat, 6 Dec 2014 17:02:55 +0000 (18:02 +0100)]
ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery
commit
66139a48cee1530c91f37c145384b4ee7043f0b7 upstream.
In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
URBs to reactivate the MIDI stream, but this causes the error when
some of URBs are still pending like:
WARNING: CPU: 0 PID: 0 at ../drivers/usb/core/urb.c:339 usb_submit_urb+0x5f/0x70()
URB
ef705c40 submitted while active
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.6-2-desktop #1
Hardware name: FOXCONN TPS01/TPS01, BIOS 080015 03/23/2010
c0984bfa f4009ed4 c078deaf f4009ee4 c024c884 c09a135c f4009f00 00000000
c0984bfa 00000153 c061ac4f c061ac4f 00000009 00000001 ef705c40 e854d1c0
f4009eec c024c8d3 00000009 f4009ee4 c09a135c f4009f00 f4009f04 c061ac4f
Call Trace:
[<
c0205df6>] try_stack_unwind+0x156/0x170
[<
c020482a>] dump_trace+0x5a/0x1b0
[<
c0205e56>] show_trace_log_lvl+0x46/0x50
[<
c02049d1>] show_stack_log_lvl+0x51/0xe0
[<
c0205eb7>] show_stack+0x27/0x50
[<
c078deaf>] dump_stack+0x45/0x65
[<
c024c884>] warn_slowpath_common+0x84/0xa0
[<
c024c8d3>] warn_slowpath_fmt+0x33/0x40
[<
c061ac4f>] usb_submit_urb+0x5f/0x70
[<
f7974104>] snd_usbmidi_submit_urb+0x14/0x60 [snd_usbmidi_lib]
[<
f797483a>] snd_usbmidi_error_timer+0x6a/0xa0 [snd_usbmidi_lib]
[<
c02570c0>] call_timer_fn+0x30/0x130
[<
c0257442>] run_timer_softirq+0x1c2/0x260
[<
c0251493>] __do_softirq+0xc3/0x270
[<
c0204732>] do_softirq_own_stack+0x22/0x30
[<
c025186d>] irq_exit+0x8d/0xa0
[<
c0795228>] smp_apic_timer_interrupt+0x38/0x50
[<
c0794a3c>] apic_timer_interrupt+0x34/0x3c
[<
c0673d9e>] cpuidle_enter_state+0x3e/0xd0
[<
c028bb8d>] cpu_idle_loop+0x29d/0x3e0
[<
c028bd23>] cpu_startup_entry+0x53/0x60
[<
c0bfac1e>] start_kernel+0x415/0x41a
For avoiding these errors, check the pending URBs and skip
resubmitting such ones.
Reported-and-tested-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Thu, 13 Nov 2014 06:11:38 +0000 (07:11 +0100)]
ALSA: hda - Fix built-in mic at resume on Lenovo Ideapad S210
commit
fedb2245cbb8d823e449ebdd48ba9bb35c071ce0 upstream.
The built-in mic boost volume gets almost muted after suspend/resume
on Lenovo Ideapad S210.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88121
Reported-and-tested-by: Roman Kagan <rkagan@mail.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Tue, 9 Dec 2014 18:58:53 +0000 (19:58 +0100)]
ALSA: hda - Add EAPD fixup for ASUS Z99He laptop
commit
f62f5eff3d40a56ad1cf0d81a6cac8dd8743e8a1 upstream.
The same fixup to enable EAPD is needed for ASUS Z99He with
AD1986A
codec like another ASUS machine.
Reported-and-tested-by: Dmitry V. Zimin <pfzim@mail.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al Viro [Sun, 26 Oct 2014 23:31:10 +0000 (19:31 -0400)]
deal with deadlock in d_walk()
commit
ca5358ef75fc69fee5322a38a340f5739d997c10 upstream.
... by not hitting rename_retry for reasons other than rename having
happened. In other words, do _not_ restart when finding that
between unlocking the child and locking the parent the former got
into __dentry_kill(). Skip the killed siblings instead...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al Viro [Sun, 26 Oct 2014 23:19:16 +0000 (19:19 -0400)]
move d_rcu from overlapping d_child to overlapping d_alias
commit
946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Larry Finger [Fri, 28 Nov 2014 16:41:16 +0000 (10:41 -0600)]
rtlwifi: rtl8192ce: Fix missing interrupt ready flag
commit
87141db0848aa20c43d453f5545efc8f390d4372 upstream.
Proper operation with the rewritten PCI mini driver requires that a flag be set
when interrupts are enabled. This flag was missed. This patch is one of three needed to
fix the kernel regression reported at https://bugzilla.kernel.org/show_bug.cgi?id=88951.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Catalin Iacob <iacobcatalin@gmail.com>
Tested-by: Catalin Iacob <iacobcatalin@gmail.com>
Cc: Catalin Iacob <iacobcatalin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Larry Finger [Fri, 28 Nov 2014 16:41:15 +0000 (10:41 -0600)]
rtlwifi: rtl8192ce: Fix kernel crashes due to missing callback entry
commit
f892914c03131a445b926b82815b03162c19288e upstream.
In the major update of the rtlwifi-family of drivers, one of the callback entries
was missed, which leads to memory corruption. Unfortunately, this corruption
never caused a kernel oops, but showed up in other parts of the system.
This patch is one of three needed to fix the kernel regression reported at
https://bugzilla.kernel.org/show_bug.cgi?id=88951.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Catalin Iacob <iacobcatalin@gmail.com>
Tested-by: Catalin Iacob <iacobcatalin@gmail.com>
Cc: Catalin Iacob <iacobcatalin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Larry Finger [Fri, 28 Nov 2014 16:41:14 +0000 (10:41 -0600)]
rtlwifi: rtl8192ce: Fix editing error that causes silent memory corruption
commit
99a82f734aa6c6d397e029e6dfa933f04e0fa8c8 upstream.
In the major update of the rtlwifi-family of drivers, there was an editing
mistake. Unfortunately, this particular error leads to memory corruption that
silently leads to failure of the system. This patch is one of three needed to
fix the kernel regression reported at https://bugzilla.kernel.org/show_bug.cgi?id=88951.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Catalin Iacob <iacobcatalin@gmail.com>
Tested-by: Catalin Iacob <iacobcatalin@gmail.com>
Cc: Catalin Iacob <iacobcatalin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Wed, 10 Dec 2014 15:33:10 +0000 (16:33 +0100)]
netlink: use jhash as hashfn for rhashtable
[ Upstream commit
7f19fc5e0b617593dcda0d9956adc78b559ef1f5 ]
For netlink, we shouldn't be using arch_fast_hash() as a hashing
discipline, but rather jhash() instead.
Since netlink sockets can be opened by any user, a local attacker
would be able to easily create collisions with the DPDK-derived
arch_fast_hash(), which trades off performance for security by
using crc32 CPU instructions on x86_64.
While it might have a legimite use case in other places, it should
be avoided in netlink context, though. As rhashtable's API is very
flexible, we could later on still decide on other hashing disciplines,
if legitimate.
Reference: http://thread.gmane.org/gmane.linux.kernel/
1844123
Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Valdis.Kletnieks@vt.edu [Tue, 9 Dec 2014 21:15:50 +0000 (16:15 -0500)]
net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.c
[ Upstream commit
69204cf7eb9c5a72067ce6922d4699378251d053 ]
commit
46e5da40ae (net: qdisc: use rcu prefix and silence
sparse warnings) triggers a spurious warning:
net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage!
The code should be using the _bh variant of rcu_dereference.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Vrabel [Tue, 9 Dec 2014 18:43:28 +0000 (18:43 +0000)]
xen-netfront: use correct linear area after linearizing an skb
[ Upstream commit
11d3d2a16cc1f05c6ece69a4392e99efb85666a6 ]
Commit
97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix
handling packets on compound pages with skb_linearize) attempted to
fix a problem where an skb that would have required too many slots
would be dropped causing TCP connections to stall.
However, it filled in the first slot using the original buffer and not
the new one and would use the wrong offset and grant access to the
wrong page.
Netback would notice the malformed request and stop all traffic on the
VIF, reporting:
vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144
vif vif-3-0 vif3.0: fatal error; disabling device
Reported-by: Anthony Wright <anthony@overnetdata.com>
Tested-by: Anthony Wright <anthony@overnetdata.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 9 Dec 2014 17:56:08 +0000 (09:56 -0800)]
tcp: fix more NULL deref after prequeue changes
[ Upstream commit
0f85feae6b710ced3abad5b2b47d31dfcb956b62 ]
When I cooked commit
c3658e8d0f1 ("tcp: fix possible NULL dereference in
tcp_vX_send_reset()") I missed other spots we could deref a NULL
skb_dst(skb)
Again, if a socket is provided, we do not need skb_dst() to get a
pointer to network namespace : sock_net(sk) is good enough.
Reported-by: Dann Frazier <dann.frazier@canonical.com>
Bisected-by: Dann Frazier <dann.frazier@canonical.com>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Wed, 3 Dec 2014 11:13:58 +0000 (12:13 +0100)]
net: sctp: use MAX_HEADER for headroom reserve in output path
[ Upstream commit
9772b54c55266ce80c639a80aa68eeb908f8ecf5 ]
To accomodate for enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs
of trinity an skb_under_panic() via SCTP output path (see reference).
I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere
in other protocols might be one possible cause for this.
In any case, it looks like accounting on chunks themself seems to look
good as the skb already passed the SCTP output path and did not hit
any skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.
Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507
Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 2 Dec 2014 12:30:59 +0000 (04:30 -0800)]
net: mvneta: fix race condition in mvneta_tx()
[ Upstream commit
5f478b41033606d325e420df693162e2524c2b94 ]
mvneta_tx() dereferences skb to get skb->len too late,
as hardware might have completed the transmit and TX completion
could have freed the skb from another cpu.
Fixes: 71f6d1b31fb1 ("net: mvneta: replace Tx timer with a real interrupt")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
willy tarreau [Tue, 2 Dec 2014 07:13:04 +0000 (08:13 +0100)]
net: mvneta: fix Tx interrupt delay
[ Upstream commit
aebea2ba0f7495e1a1c9ea5e753d146cb2f6b845 ]
The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.
The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.
In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.
The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.
Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.
No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.
This fix needs to be applied to stable kernels starting with 3.10.
Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Denis Kirjanov [Mon, 1 Dec 2014 09:57:02 +0000 (12:57 +0300)]
mips: bpf: Fix broken BPF_MOD
[ Upstream commit
2e46477a12f6fd273e31a220b155d66e8352198c ]
Remove optimize_div() from BPF_MOD | BPF_K case
since we don't know the dividend and fix the
emit_mod() by reading the mod operation result from HI register
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pravin B Shelar [Mon, 1 Dec 2014 07:04:17 +0000 (23:04 -0800)]
openvswitch: Fix flow mask validation.
[ Upstream commit
f2a01517f2a1040a0b156f171a7cefd748f2fd03 ]
Following patch fixes typo in the flow validation. This prevented
installation of ARP and IPv6 flows.
Fixes: 19e7a3df72 ("openvswitch: Fix NDP flow mask validation")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tom Herbert [Sat, 29 Nov 2014 17:59:45 +0000 (09:59 -0800)]
gre: Set inner mac header in gro complete
[ Upstream commit
6fb2a756739aa507c1fd5b8126f0bfc2f070dc46 ]
Set the inner mac header to point to the GRE payload when
doing GRO. This is needed if we proceed to send the packet
through GRE GSO which now uses the inner mac header instead
of inner network header to determine the length of encapsulation
headers.
Fixes: 14051f0452a2 ("gre: Use inner mac length when computing tunnel length")
Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcelo Leitner [Thu, 11 Dec 2014 12:02:22 +0000 (10:02 -0200)]
Fix race condition between vxlan_sock_add and vxlan_sock_release
[ Upstream commit
00c83b01d58068dfeb2e1351cca6fccf2a83fa8f ]
Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.
But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic but as currently we have
deferred works which hold vs->refcnt each, this might happen, leading to
a use after free (specially after vxlan_igmp_leave):
CPU 1 CPU 2
deferred work vxlan_sock_add
... ...
spin_lock(&vn->sock_lock)
vs = vxlan_find_sock();
vxlan_sock_release
dec vs->refcnt, reaches 0
spin_lock(&vn->sock_lock)
vxlan_sock_hold(vs), refcnt=1
spin_unlock(&vn->sock_lock)
hlist_del_rcu(&vs->hlist);
vxlan_notify_del_rx_port(vs)
spin_unlock(&vn->sock_lock)
So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Fixes: 7c47cedf43a8b3 ("vxlan: move IGMP join/leave to work queue")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>