Ingo Molnar [Mon, 8 Jun 2015 19:20:26 +0000 (21:20 +0200)]
x86/asm/entry: (Re-)rename __NR_entry_INT80_compat_max to __NR_syscall_compat_max
Brian Gerst noticed that I did a weird rename in the following commit:
b2502b418e63 ("x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32")
which renamed __NR_ia32_syscall_max to __NR_entry_INT80_compat_max.
Now the original name was a misnomer, but the new one is a misnomer as well,
as all the 32-bit compat syscall entry points (sysenter, syscall) share the
system call table, not just the INT80 based one.
Rename it to __NR_syscall_compat_max.
Reported-by: Brian Gerst <brgerst@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Sun, 7 Jun 2015 18:24:30 +0000 (20:24 +0200)]
x86/asm/entry/32: Reinstate clearing of pt_regs->r8..r11 on EFAULT path
I broke this recently when I changed pt_regs->r8..r11 clearing
logic in INT 80 code path.
There is a branch from SYSENTER/SYSCALL code to INT 80 code:
if we fail to retrieve arg6, we return EFAULT. Before this
patch, in this case we don't clear pt_regs->r8..r11.
This patch fixes this. The resulting code is smaller and
simpler.
While at it, remove incorrect comment about syscall dispatching
CALL insn: it does not use RIP-relative addressing form (the
comment was meant to be "TODO: make this rip-relative", and
morphed since then, dropping "TODO").
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433701470-28800-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 8 Jun 2015 18:43:07 +0000 (20:43 +0200)]
x86/asm/entry/64: Clean up entry_64.S
Make the 64-bit syscall entry code a bit more readable:
- use consistent assembly coding style similar to the other entry_*.S files
- remove old comments that are not true anymore
- eliminate whitespace noise
- use consistent vertical spacing
- fix various comments
- reorganize entry point generation tables to be more readable
No code changed:
# arch/x86/entry/entry_64.o:
   text    data     bss     dec     hex filename
  12282       0       0   12282    2ffa entry_64.o.before
  12282       0       0   12282    2ffa entry_64.o.after
md5:
cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.before.asm
cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.after.asm
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 8 Jun 2015 18:48:01 +0000 (20:48 +0200)]
Merge branch 'x86/asm' into x86/core, to prepare for new patch
Collect all changes to arch/x86/entry/entry_64.S, before applying
patch that changes most of the file.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 8 Jun 2015 07:49:11 +0000 (09:49 +0200)]
x86/asm/entry/32: Clean up entry_32.S
Make the 32-bit syscall entry code a bit more readable:
- use consistent assembly coding style similar to entry_64.S
- remove old comments that are not true anymore
- eliminate whitespace noise
- use consistent vertical spacing
- fix various comments
No code changed:
# arch/x86/entry/entry_32.o:
   text    data     bss     dec     hex filename
   6025       0       0    6025    1789 entry_32.o.before
   6025       0       0    6025    1789 entry_32.o.after
md5:
f3fa16b2b0dca804f052deb6b30ba6cb entry_32.o.before.asm
f3fa16b2b0dca804f052deb6b30ba6cb entry_32.o.after.asm
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 8 Jun 2015 06:42:03 +0000 (08:42 +0200)]
x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32
The 'system_call' entry points differ starkly between native 32-bit and 64-bit
kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
64-bit it's the SYSCALL entry point.
This is pretty confusing when looking at generic code, and it also obscures
the nature of the entry point at the assembly level.
So untangle this by splitting the name into its two uses:
system_call (32) -> entry_INT80_32
system_call (64) -> entry_SYSCALL_64
As per the generic naming scheme for x86 system call entry points:
entry_MNEMONIC_qualifier
where 'qualifier' is one of _32, _64 or _compat.
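As a rough illustration of the naming scheme, the C sketch below assembles a name from its two parts. The helper entry_name() is hypothetical and exists only for this example; the kernel spells these symbols out by hand in the entry_*.S files.

```c
#include <stdio.h>
#include <string.h>	/* strcmp() for callers that check the result */

/*
 * Hypothetical helper, for illustration only: builds a name of the
 * form entry_MNEMONIC_qualifier, e.g. entry_SYSCALL_64.
 * 'qualifier' is passed without the leading underscore.
 */
static int entry_name(char *buf, size_t len,
		      const char *mnemonic, const char *qualifier)
{
	return snprintf(buf, len, "entry_%s_%s", mnemonic, qualifier);
}
```

Calling entry_name(buf, sizeof(buf), "INT80", "32") yields the new 32-bit name from the rename list above, and ("SYSCALL", "64") yields the 64-bit one.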
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 8 Jun 2015 06:33:56 +0000 (08:33 +0200)]
x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat
So the SYSENTER instruction is pretty quirky and it has different behavior
depending on bitness and CPU maker.
Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
in both of the cases.
Split the name into its two uses:
ia32_sysenter_target (32) -> entry_SYSENTER_32
ia32_sysenter_target (64) -> entry_SYSENTER_compat
As per the generic naming scheme for x86 system call entry points:
entry_MNEMONIC_qualifier
where 'qualifier' is one of _32, _64 or _compat.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 8 Jun 2015 06:28:07 +0000 (08:28 +0200)]
x86/asm/entry: Rename compat syscall entry points
Rename the following system call entry points:
ia32_cstar_target -> entry_SYSCALL_compat
ia32_syscall -> entry_INT80_compat
The generic naming scheme for x86 system call entry points is:
entry_MNEMONIC_qualifier
where 'qualifier' is one of _32, _64 or _compat.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Borislav Petkov [Thu, 4 Jun 2015 16:55:26 +0000 (18:55 +0200)]
x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers
This header containing all MSRs and respective bit definitions
got exported to userspace in conjunction with the big UAPI
shuffle.
But, it doesn't belong in the UAPI headers because userspace can
do its own MSR defines and exporting them from the kernel blocks
us from doing cleanups/renames in that header. Which is
ridiculous - it is not kernel's job to export such a header and
keep MSRs list and their names stable.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433436928-31903-19-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Sun, 7 Jun 2015 13:35:27 +0000 (15:35 +0200)]
Merge branch 'x86/ras' into x86/core, to fix conflicts
Conflicts:
arch/x86/include/asm/irq_vectors.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Borislav Petkov [Thu, 4 Jun 2015 16:55:25 +0000 (18:55 +0200)]
x86: Kill CONFIG_X86_HT
In talking to Aravind recently about making certain AMD topology
attributes available to the MCE injection module, it seemed like
that CONFIG_X86_HT thing is more or less superfluous. It is
def_bool y, depends on SMP and gets enabled in the majority of
.configs - distro and otherwise - out there.
So let's kill it and make code behind it depend directly on SMP.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Walter <dwalter@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433436928-31903-18-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ashok Raj [Thu, 4 Jun 2015 16:55:24 +0000 (18:55 +0200)]
x86/mce: Handle Local MCE events
Add the necessary changes to do_machine_check() to be able to
process MCEs signaled as local MCEs. Typically, only recoverable
errors (SRAR type) will be signaled as LMCE. The architecture
does not restrict LMCE to those errors, however.
When errors are signaled as LMCE, there is no need for the MCE
handler to perform rendezvous with other logical processors
unlike earlier processors that would broadcast machine check
errors.
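The decision described above can be sketched in C roughly as follows. This is not the code from the patch: the helper name is hypothetical, and the bit position used for the "LMCE signaled" flag in MCG_STATUS is an assumption that should be checked against the SDM. The real handler takes more state into account than this.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed position of the "LMCE signaled" flag in MCG_STATUS.
 * Verify against the SDM before relying on it.
 */
#define SKETCH_MCG_STATUS_LMCES	(1ULL << 3)

/*
 * Hypothetical helper: a broadcast machine check needs a rendezvous
 * with the other logical processors, a local one does not.
 */
static bool mce_needs_rendezvous(uint64_t mcg_status)
{
	return !(mcg_status & SKETCH_MCG_STATUS_LMCES);
}
```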
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1433436928-31903-17-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ashok Raj [Thu, 4 Jun 2015 16:55:23 +0000 (18:55 +0200)]
x86/mce: Add infrastructure to support Local MCE
Initialize and prepare for handling LMCEs. Add a boot-time
option to disable LMCEs.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Simplify stuff, align statements for better readability, reflow comments; kill
unused lmce_clear(); save us an MSR write if LMCE is already enabled. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1433436928-31903-16-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ashok Raj [Thu, 4 Jun 2015 16:55:22 +0000 (18:55 +0200)]
x86/mce: Add Local MCE definitions
Add required definitions to support Local Machine Check
Exceptions.
Historically, machine check exceptions on Intel x86 processors
have been broadcast to all logical processors in the system.
Upcoming CPUs will support an opt-in mechanism to request some
machine check exceptions be delivered to a single logical
processor experiencing the fault.
See http://www.intel.com/sdm Volume 3, System Programming Guide,
chapter 15 for more information on MSRs and documentation on
Local MCE.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1433436928-31903-15-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 5 Jun 2015 12:11:49 +0000 (14:11 +0200)]
x86/asm/entry/64/compat: Rename ia32entry.S -> entry_64_compat.S
So we now have the following system entry code related
files, which define the following system call instruction
and other entry paths:
entry_32.S # 32-bit binaries on 32-bit kernels
entry_64.S # 64-bit binaries on 64-bit kernels
entry_64_compat.S # 32-bit binaries on 64-bit kernels
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Wed, 3 Jun 2015 13:58:50 +0000 (15:58 +0200)]
x86/asm/entry/32: Remove unnecessary optimization in stub32_clone
Really swap arguments #4 and #5 in stub32_clone instead of
"optimizing" it into a move.
Yes, tls_val is currently unused. Yes, on some CPUs XCHG is a
little bit more expensive than MOV. But a cycle or two on an
expensive syscall like clone() is way below noise floor, and
this optimization is simply not worth the obfuscation of logic.
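The difference between the two forms can be shown with a small C model. The struct and function names are made up for this sketch; the real code operates on registers in assembly.

```c
#include <stdint.h>

/* Stand-in for the two argument registers involved. */
struct clone_args_sketch {
	uint64_t arg4;
	uint64_t arg5;
};

/*
 * The "optimized" form: only one direction is copied, so the
 * original arg4 value is lost.
 */
static void move_only(struct clone_args_sketch *a)
{
	a->arg4 = a->arg5;
}

/* A real exchange (XCHG in assembly): both values survive. */
static void swap_both(struct clone_args_sketch *a)
{
	uint64_t tmp = a->arg4;

	a->arg4 = a->arg5;
	a->arg5 = tmp;
}
```

The move is only correct as long as nobody reads the old arg4 afterwards, which is the hidden assumption the patch removes.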
[ There's also ongoing work on the clone() ABI by Josh Triplett
that will depend on this change later on. ]
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433339930-20880-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Wed, 3 Jun 2015 13:58:49 +0000 (15:58 +0200)]
x86/asm/entry/32: Explain the stub32_clone logic
The reason for copying of %r8 to %rcx is quite non-obvious.
Add a comment which explains why it is done.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433339930-20880-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 5 Jun 2015 11:02:28 +0000 (13:02 +0200)]
x86/asm/entry/32: Improve code readability
Make the 32-bit compat syscall entry code of 64-bit kernels a bit more readable:
- eliminate whitespace noise
- use consistent vertical spacing
- use consistent assembly coding style similar to entry_64.S
- fix various comments
No code changed:
arch/x86/entry/ia32entry.o:
   text    data     bss     dec     hex filename
   1391       0       0    1391     56f ia32entry.o.before
   1391       0       0    1391     56f ia32entry.o.after
md5:
f28501dcc366e68b557313942c6496d6 ia32entry.o.before.asm
f28501dcc366e68b557313942c6496d6 ia32entry.o.after.asm
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Wed, 3 Jun 2015 12:56:09 +0000 (14:56 +0200)]
x86/asm/entry/32: Do not use R9 in SYSCALL32 entry point
SYSENTER and SYSCALL 32-bit entry points differ in handling of
arg2 and arg6.
SYSENTER:
* ecx arg2
* ebp user stack
* 0(%ebp) arg6
SYSCALL:
* ebp arg2
* esp user stack
* 0(%esp) arg6
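The two register layouts listed above can be modelled in C as a rough sketch. All names here are invented for the example, and user memory is modelled as a plain array of 32-bit words; the real entry code reads the user stack in assembly and has to handle a faulting access, which this sketch leaves out.

```c
#include <stdint.h>

enum entry32_kind { ENTRY32_SYSENTER, ENTRY32_SYSCALL };

/* Register snapshot at entry, using the 32-bit register names. */
struct entry32_regs {
	uint32_t ecx;
	uint32_t ebp;
	uint32_t esp;
};

/* arg2 is in ecx for SYSENTER and in ebp for SYSCALL. */
static uint32_t entry32_arg2(enum entry32_kind k,
			     const struct entry32_regs *r)
{
	return k == ENTRY32_SYSENTER ? r->ecx : r->ebp;
}

/* The user stack pointer is in ebp for SYSENTER, in esp for SYSCALL. */
static uint32_t entry32_user_sp(enum entry32_kind k,
				const struct entry32_regs *r)
{
	return k == ENTRY32_SYSENTER ? r->ebp : r->esp;
}

/*
 * arg6 sits at offset 0 of the user stack in both cases.
 * 'user_words' models user memory, indexed by address / 4.
 */
static uint32_t entry32_arg6(enum entry32_kind k,
			     const struct entry32_regs *r,
			     const uint32_t *user_words)
{
	return user_words[entry32_user_sp(k, r) / 4];
}
```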
Sysenter code loads 0(%ebp) to %ebp right away.
(This destroys %ebp. It means we do not preserve it on return.
It's not causing problems since userspace VDSO code does not
depend on it, and SYSENTER insn can't be sanely used outside of
VDSO).
Syscall code loads 0(%esp) to %r9. This eliminates one
MOV insn (r9 is a register where arg6 should be for 64-bit ABI),
but on audit/ptrace code paths this requires juggling of r9 and
ebp: (1) ptrace expects arg6 to be in pt_regs->bp;
(2) r9 is callee-clobbered register and needs to be
saved/restored around calls to C functions.
This patch changes syscall code to load 0(%esp) to %ebp, making
it more similar to sysenter code. It's a bit smaller:
   text    data     bss     dec     hex filename
   1407       0       0    1407     57f ia32entry.o.before
   1391       0       0    1391     56f ia32entry.o
To preserve ABI compat, we restore ebp on exit.
Run-tested.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433336169-18964-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Tue, 2 Jun 2015 19:04:02 +0000 (21:04 +0200)]
x86/asm/entry/32: Open-code LOAD_ARGS32
This macro is small, has only three callsites, and one of them
is slightly different using a conditional parameter.
A few saved lines aren't worth the resulting obfuscation.
Generated machine code is identical.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433271842-9139-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Tue, 2 Jun 2015 19:04:01 +0000 (21:04 +0200)]
x86/asm/entry/32: Open-code CLEAR_RREGS
This macro is small, has only four callsites, and one of them is
slightly different using a conditional parameter.
A few saved lines aren't worth the resulting obfuscation.
Generated machine code is identical.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
[ Added comments. ]
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433271842-9139-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Denys Vlasenko [Tue, 2 Jun 2015 17:35:10 +0000 (19:35 +0200)]
x86/asm/entry/32: Simplify the zeroing of pt_regs->r8..r11 in the int80 code path
32-bit syscall entry points do not save the complete pt_regs struct,
they leave some fields uninitialized. However, they must be
careful to not leak uninitialized data in pt_regs->r8..r11 to
ptrace users.
CLEAR_RREGS macro is used to zero these fields out when needed.
However, in the int80 code path this zeroing is unconditional.
This patch simplifies it by storing zeroes there right away,
when pt_regs is constructed on stack.
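In C terms the idea looks roughly like the sketch below. The struct is a cut-down stand-in for pt_regs and the function name is hypothetical; the real frame is built with push instructions in assembly.

```c
#include <stdint.h>

/* Cut-down stand-in for pt_regs: only the fields discussed here. */
struct pt_regs_sketch {
	uint64_t r11, r10, r9, r8;
	uint64_t ax, cx, dx;
};

/*
 * Build the frame for a 32-bit entry. r8..r11 have no 32-bit
 * counterpart, so zeroes are stored while the frame is constructed
 * instead of being cleared later on selected paths.
 */
static void build_int80_frame(struct pt_regs_sketch *regs,
			      uint32_t eax, uint32_t ecx, uint32_t edx)
{
	regs->r8 = 0;
	regs->r9 = 0;
	regs->r10 = 0;
	regs->r11 = 0;
	regs->ax = eax;	/* 32-bit values are zero-extended */
	regs->cx = ecx;
	regs->dx = edx;
}
```

Whatever was in the frame's memory before, a ptrace user inspecting r8..r11 afterwards sees zeroes.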
This uses shorter instructions:
   text    data     bss     dec     hex filename
   1423       0       0    1423     58f ia32entry.o.before
   1407       0       0    1407     57f ia32entry.o
Compile-tested.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1433266510-2938-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andy Lutomirski [Thu, 4 Jun 2015 20:24:29 +0000 (13:24 -0700)]
x86/asm/entry/64: Remove pointless jump to irq_return
INTERRUPT_RETURN turns into a jmp instruction. There's no need
for extra indirection.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2f2318653dbad284a59311f13f08cea71298fd7c.1433449436.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andy Lutomirski [Fri, 5 Jun 2015 00:13:44 +0000 (17:13 -0700)]
x86/asm/msr: Make wrmsrl_safe() a function
The wrmsrl_safe macro performs invalid shifts if the value
argument is 32 bits. This makes it unnecessarily awkward to
write code that puts an unsigned long into an MSR.
Convert it to a real inline function.
For inspiration, see:
7c74d5b7b7b6 ("x86/asm/entry/64: Fix MSR_IA32_SYSENTER_CS MSR value").
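A user-space sketch of why the function form helps, under stated assumptions: wrmsr_safe_stub() is a stand-in that only records its arguments, and wrmsrl_safe_sketch() is a hypothetical name, not the kernel's definition.

```c
#include <stdint.h>

/* Stand-in for the real MSR write: records what it was given. */
struct msr_write {
	uint32_t msr;
	uint32_t lo;
	uint32_t hi;
};

static struct msr_write last_write;

static int wrmsr_safe_stub(uint32_t msr, uint32_t lo, uint32_t hi)
{
	last_write.msr = msr;
	last_write.lo = lo;
	last_write.hi = hi;
	return 0;
}

/*
 * Function form: 'val' is converted to 64 bits at the call boundary,
 * so the shift by 32 is well defined whatever type the caller passes.
 * A macro expanding to (val) >> 32 shifts the caller's own type
 * instead, which is undefined when that type is only 32 bits wide.
 */
static inline int wrmsrl_safe_sketch(uint32_t msr, uint64_t val)
{
	return wrmsr_safe_stub(msr, (uint32_t)val, (uint32_t)(val >> 32));
}
```

With this form a caller can pass an unsigned long on a 32-bit build and get a zero high half, with no cast needed at the call site.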
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
[ Applied small improvements. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 16:41:06 +0000 (18:41 +0200)]
x86/asm/entry: Move the vsyscall code to arch/x86/entry/vsyscall/
The vsyscall code is entry code too, so move it to arch/x86/entry/vsyscall/.
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 16:36:41 +0000 (18:36 +0200)]
x86/asm/entry: Move the arch/x86/syscalls/ definitions to arch/x86/entry/syscalls/
The build time generated syscall definitions are entry code related, move
them into the arch/x86/entry/ directory.
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 16:29:26 +0000 (18:29 +0200)]
x86/asm/entry: Move arch/x86/include/asm/calling.h to arch/x86/entry/
asm/calling.h is private to the entry code, make this more apparent
by moving it to the new arch/x86/entry/ directory.
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 16:10:43 +0000 (18:10 +0200)]
x86/asm/entry: Move the 'thunk' functions to arch/x86/entry/
These are all calling x86 entry code functions, so move them close
to other entry code.
Change lib-y to obj-y: there's no real difference between the two
as we don't really drop any of them during the linking stage, and
obj-y is the more common approach for core kernel object code.
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 16:05:44 +0000 (18:05 +0200)]
x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 16:00:59 +0000 (18:00 +0200)]
x86/asm/entry: Move the compat syscall entry code to arch/x86/entry/
Move the ia32entry.S file over into arch/x86/entry/.
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 11:37:36 +0000 (13:37 +0200)]
x86/asm/entry: Move entry_64.S and entry_32.S to arch/x86/entry/
Create a new directory hierarchy for the low level x86 entry code:
arch/x86/entry/*
This will host all the low level glue that is currently scattered
all across arch/x86/.
Start with entry_64.S and entry_32.S.
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 08:00:13 +0000 (10:00 +0200)]
x86/kconfig: Reorganize arch feature Kconfig select's
Peter Zijlstra noticed that in arch/x86/Kconfig there are a lot
of X86_{32,64} clauses in the X86 symbol, plus there are a number
of similar selects in the X86_32 and X86_64 config definitions
as well - which all overlap in an inconsistent mess.
So:
- move all select's from X86_32 and X86_64 to the X86 config
option
- sort their names, so that duplications are easier to spot
- align their if clauses, so that they are easier to identify
at a glance - and so that weirdnesses stand out more
No change in functionality:
105 insertions(+)
105 deletions(-)
Originally-from: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150602153027.GU3644@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 08:07:35 +0000 (10:07 +0200)]
Merge branch 'locking/core' into x86/core, to prepare for dependent patch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Borislav Petkov [Tue, 2 Jun 2015 13:38:27 +0000 (15:38 +0200)]
lockdep: Do not break user-visible string
Remove the line-break in the user-visible string and add the
missing space in this error message:
WARNING: lockdep init error! lock-(console_sem).lock was acquiredbefore lockdep_init
Also:
- don't yell, it's just a debug warning
- denote references to function calls with '()'
- standardize the lock name quoting
- and finish the sentence.
The result:
WARNING: lockdep init error: lock '(console_sem).lock' was acquired before lockdep_init().
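The missing space comes from how C joins adjacent string literals. The sketch below reproduces the effect with shortened, illustrative strings; it is not the lockdep source, and the helper name is made up.

```c
#include <string.h>

/*
 * Adjacent string literals are concatenated with nothing in between,
 * so breaking a message across source lines silently drops the space
 * unless one is written explicitly.
 */
static const char broken_msg[] =
	"lock-(console_sem).lock was acquired"
	"before lockdep_init";

/* Keeping the user-visible string on one line avoids the problem. */
static const char fixed_msg[] =
	"lock '(console_sem).lock' was acquired before lockdep_init().";

/* Illustrative check for the run-together words. */
static int has_run_together_words(const char *msg)
{
	return strstr(msg, "acquiredbefore") != NULL;
}
```

An unbroken string also stays greppable: searching the tree for the message as the user saw it finds the line that printed it.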
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150602133827.GD19887@pd.tnic
[ Added a few more stylistic tweaks to the error message. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 3 Jun 2015 08:05:18 +0000 (10:05 +0200)]
Merge branches 'x86/mm', 'x86/build', 'x86/apic' and 'x86/platform' into x86/core, to apply dependent patch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jan Beulich [Mon, 1 Jun 2015 12:03:59 +0000 (13:03 +0100)]
x86/asm/entry/64: Fold identical code paths
retint_kernel doesn't require %rcx to be pointing to thread info
(anymore?), and the code on the two alternative paths is - not
really surprisingly - identical.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/556C664F020000780007FB64@mail.emea.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jan Beulich [Mon, 1 Jun 2015 12:02:55 +0000 (13:02 +0100)]
x86/asm/entry/64: Use negative immediates for stack adjustments
Doing so allows adjustments by 128 bytes (occurring for
REMOVE_PT_GPREGS_FROM_STACK 8 uses) to be expressed with a
single byte immediate.
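The saving comes from the asymmetric range of a sign-extended 8-bit
immediate (-128..127): +128 does not fit, -128 does. A small sketch of that
arithmetic, with hypothetical helper names and no claim about the exact
instruction bytes the assembler emits:

```c
#include <stdint.h>

/* x86 ALU instructions can encode an immediate either as a sign-extended
 * 8-bit value (range -128..127) or as a full 32-bit value. */
int demo_fits_in_imm8(int32_t imm)
{
	return imm >= INT8_MIN && imm <= INT8_MAX;
}

/* Immediate size in bytes for "add $delta, %rsp". */
int demo_add_imm_bytes(int32_t delta)
{
	return demo_fits_in_imm8(delta) ? 1 : 4;
}

/* Immediate size in bytes for the equivalent "sub $-delta, %rsp".
 * delta must not be INT32_MIN, since it is negated here. */
int demo_sub_imm_bytes(int32_t delta)
{
	return demo_fits_in_imm8(-delta) ? 1 : 4;
}
```

For an adjustment of 128 bytes the add form needs a 4-byte immediate while
the sub form with -128 needs only 1 byte.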
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/556C660F020000780007FB60@mail.emea.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Thu, 28 May 2015 10:21:47 +0000 (12:21 +0200)]
x86/debug: Remove perpetually broken, unmaintainable dwarf annotations
So the dwarf2 annotations in low level assembly code have
become an increasing hindrance: unreadable, messy macros
mixed into some of the most security sensitive code paths
of the Linux kernel.
These debug info annotations don't even buy the upstream
kernel anything: dwarf driven stack unwinding has caused
problems in the past so it's out of tree, and the upstream
kernel only uses the much more robust framepointers based
stack unwinding method.
In addition to that there's a steady, slow bitrot going
on with these annotations, requiring frequent fixups.
There's no tooling and no functionality upstream that
keeps it correct.
So burn down the sick forest, allowing new, healthier growth:
27 files changed, 350 insertions(+), 1101 deletions(-)
Someone who has the willingness and time to do this
properly can attempt to reintroduce dwarf debuginfo in x86
assembly code plus dwarf unwinding from first principles,
with the following conditions:
- it should be maximally readable, and maximally low-key to
'ordinary' code reading and maintenance.
- find a build time method to insert dwarf annotations
automatically in the most common cases, for pop/push
instructions that manipulate the stack pointer. This could
be done for example via a preprocessing step that just
looks for common patterns - plus special annotations for
the few cases where we want to depart from the default.
We have hundreds of CFI annotations, so automating most of
that makes sense.
- it should come with build tooling checks that ensure that
CFI annotations are sensible. We've seen such efforts from
the framepointer side, and there's no reason it couldn't be
done on the dwarf side.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Tue, 2 Jun 2015 03:51:18 +0000 (20:51 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Various VTI tunnel (mark handling, PMTU) bug fixes from Alexander
Duyck and Steffen Klassert.
2) Revert ethtool PHY query change, it wasn't correct. The PHY address
selected by the driver running the PHY to MAC connection decides
what PHY address GET ethtool operations return information from.
3) Fix handling of sequence number bits for encryption IV generation in
ESP driver, from Herbert Xu.
4) UDP can return -EAGAIN when we hit a bad checksum on receive, even
when there are other packets in the receive queue which is wrong.
Just respect the error returned from the generic socket recv
datagram helper. From Eric Dumazet.
5) Fix BNA driver firmware loading on big-endian systems, from Ivan
Vecera.
6) Fix regression in that we were inheriting the congestion control of
the listening socket for new connections, the intended behavior
always was to use the default in this case. From Neal Cardwell.
7) Fix NULL deref in brcmfmac driver, from Arend van Spriel.
8) OTP parsing fix in iwlwifi from Liad Kaufman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
vti6: Add pmtu handling to vti6_xmit.
Revert "net: core: 'ethtool' issue with querying phy settings"
bnx2x: Move statistics implementation into semaphores
xen: netback: read hotplug script once at start of day.
xen: netback: fix printf format string warning
Revert "netfilter: ensure number of counters is >0 in do_replace()"
net: dsa: Properly propagate errors from dsa_switch_setup_one
tcp: fix child sockets to use system default congestion control if not set
udp: fix behavior of wrong checksums
sfc: free multiple Rx buffers when required
bna: fix soft lock-up during firmware initialization failure
bna: remove unreasonable iocpf timer start
bna: fix firmware loading on big-endian machines
bridge: fix br_multicast_query_expired() bug
via-rhine: Resigning as maintainer
brcmfmac: avoid null pointer access when brcmf_msgbuf_get_pktid() fails
mac80211: Fix mac80211.h docbook comments
iwlwifi: nvm: fix otp parsing in 8000 hw family
iwlwifi: pcie: fix tracking of cmd_in_flight
ip_vti/ip6_vti: Preserve skb->mark after rcv_cb call
...
Linus Torvalds [Tue, 2 Jun 2015 03:44:51 +0000 (20:44 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull Sparc fixes from David Miller:
1) Setup the core/threads/sockets bitmaps correctly so that 'lscpus'
and friends operate properly. From Chris Hyser.
2) The bit that normally means "Cached Virtually" on sun4v systems,
actually changes meaning in M7 and later chips. Fix from Khalid
Aziz.
3) On some PCI-E systems we need to probe different OF properties to
fill in the PCI slot information properly, from Eric Snowberg.
4) Kill an extraneous memset after kzalloc(), from Christophe Jaillet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE
sparc64: pci slots information is not populated in sysfs
sparc: kernel: GRPCI2: Remove a useless memset
sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly
Linus Torvalds [Tue, 2 Jun 2015 01:49:45 +0000 (18:49 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fix from Michael Tsirkin:
"Last-minute virtio fix for 4.1
This tweaks an exported user-space header to fix build breakage for
userspace using it"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h
David S. Miller [Mon, 1 Jun 2015 23:56:43 +0000 (16:56 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fix for net
The following patch reverts the ebtables chunk that enforces counters that was
introduced in the recently applied d26e2c9ffa38 ('Revert "netfilter: ensure
number of counters is >0 in do_replace()"') since this breaks ebtables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 1 Jun 2015 23:06:29 +0000 (16:06 -0700)]
Merge tag 'wireless-drivers-for-davem-2015-06-01' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
iwlwifi:
* fix OTP parsing 8260
* fix powersave handling for 8260
brcmfmac:
* fix null pointer crash
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Fri, 29 May 2015 18:28:26 +0000 (11:28 -0700)]
vti6: Add pmtu handling to vti6_xmit.
We currently rely on the PMTU discovery of xfrm.
However, if a packet is locally sent, the PMTU mechanism
of xfrm tries to do local socket notification, which
might not work for applications like ping that don't
check for this. So add PMTU handling to vti6_xmit to
report MTU changes immediately.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 1 Jun 2015 21:43:50 +0000 (14:43 -0700)]
Revert "net: core: 'ethtool' issue with querying phy settings"
This reverts commit f96dee13b8e10f00840124255bed1d8b4c6afd6f.
It isn't right: ethtool is meant to manage one PHY instance
per netdevice at a time, and this is selected by the SET
command. Therefore by definition the GET command must only
return the settings for the configured and selected PHY.
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 1 Jun 2015 12:08:18 +0000 (15:08 +0300)]
bnx2x: Move statistics implementation into semaphores
Commit dff173de84958 ("bnx2x: Fix statistics locking scheme") changed the
bnx2x locking around statistics state into using a mutex - but the lock
is being accessed via a timer, which is forbidden.
[If compiled with CONFIG_DEBUG_MUTEXES, logs show a warning about
accessing the mutex in interrupt context]
This moves the implementation into using a semaphore [with size '1']
instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Mon, 1 Jun 2015 10:30:24 +0000 (11:30 +0100)]
xen: netback: read hotplug script once at start of day.
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).
In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.
A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).
Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.
The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shut down). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principle it might be possible to
arrange to unregister the device sooner (e.g. on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...
A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Mon, 1 Jun 2015 10:30:04 +0000 (11:30 +0100)]
xen: netback: fix printf format string warning
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
(txreq.offset&~PAGE_MASK) + txreq.size);
^
PAGE_MASK's type can vary by arch, so a cast is needed.
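A userspace sketch of the same fix pattern follows. DEMO_PAGE_SIZE,
DEMO_PAGE_MASK and format_end_offset() are hypothetical stand-ins, not the
netback code; the point is only that an explicit cast pins the argument to
the type that %lu expects, whatever type the mask has on a given arch.

```c
#include <stdio.h>

/* Hypothetical stand-ins: here the mask is built from an int constant,
 * so expressions using it need not have type unsigned long. */
#define DEMO_PAGE_SIZE 4096
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Formats "offset within the page + size". The cast makes the argument
 * an unsigned long, so the %lu conversion is correct regardless of the
 * type of the mask. */
int format_end_offset(char *buf, size_t len,
		      unsigned int offset, unsigned int size)
{
	return snprintf(buf, len, "%lu",
			(unsigned long)(offset & ~DEMO_PAGE_MASK) + size);
}
```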
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
----
v2: Cast to unsigned long, since PAGE_MASK can vary by arch.
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bernhard Thaler [Thu, 28 May 2015 08:26:18 +0000 (10:26 +0200)]
Revert "netfilter: ensure number of counters is >0 in do_replace()"
This partially reverts commit 1086bbe97a07 ("netfilter: ensure number of
counters is >0 in do_replace()") in net/bridge/netfilter/ebtables.c.
Setting rules with ebtables does not work any more with 1086bbe97a07 in
place.
There is an error message and no rules set in the end.
e.g.
~# ebtables -t nat -A POSTROUTING --src 12:34:56:78:9a:bc -j DROP
Unable to update the kernel. Two possible causes:
1. Multiple ebtables programs were executing simultaneously. The ebtables
userspace tool doesn't by default support multiple ebtables programs
running
Reverting the ebtables part of 1086bbe97a07 makes this work again.
Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Mikko Rapeli [Sat, 30 May 2015 15:39:25 +0000 (17:39 +0200)]
include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h
Fixes userspace compilation error:
error: unknown type name ‘__virtio16’
__virtio16 tag;
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Khalid Aziz [Wed, 27 May 2015 16:00:46 +0000 (10:00 -0600)]
sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE
Bit 9 of TTE is CV (Cacheable in V-cache) on sparc v9 processor while
the same bit 9 is MCDE (Memory Corruption Detection Enable) on M7
processor. This creates a conflicting usage of the same bit. Kernel
sets TTE.cv bit on all pages for sun4v architecture which works well
for sparc v9 but enables memory corruption detection on M7 processor
which is not the intent. This patch adds code to determine if kernel
is running on M7 processor and takes steps to not enable memory
corruption detection in TTE erroneously.
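The decision described above can be sketched as follows. The enum, macro
and function names are illustrative assumptions and do not come from the
sparc code; only the meaning of bit 9 is taken from the text.

```c
#include <stdint.h>

/* Bit 9 of the TTE: CV on sparc v9, MCDE on M7 (per the text above). */
#define DEMO_TTE_BIT9 (UINT64_C(1) << 9)

enum demo_cpu { DEMO_CPU_SPARC_V9, DEMO_CPU_M7 };

/* Returns the bit-9 value to OR into a new TTE for the given CPU. */
uint64_t demo_tte_bit9_for(enum demo_cpu cpu)
{
	/* On M7, setting bit 9 would enable memory corruption detection,
	 * which is not the intent, so leave it clear there. */
	return cpu == DEMO_CPU_M7 ? 0 : DEMO_TTE_BIT9;
}
```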
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Snowberg [Wed, 27 May 2015 15:59:19 +0000 (11:59 -0400)]
sparc64: pci slots information is not populated in sysfs
Add PCI slot numbers within sysfs for PCIe hardware. Larger
PCIe systems with nested PCI bridges and slots further
down on these bridges were not being populated within sysfs.
This will add ACPI style PCI slot numbers for these systems
since the OF 'slot-names' information is not available on
all PCIe platforms.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Fri, 1 May 2015 12:05:39 +0000 (14:05 +0200)]
sparc: kernel: GRPCI2: Remove a useless memset
grpci2priv is allocated using kzalloc, so there is no need to memset it.
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 29 May 2015 17:29:46 +0000 (10:29 -0700)]
net: dsa: Properly propagate errors from dsa_switch_setup_one
While shuffling some code around, dsa_switch_setup_one() was introduced,
and it was modified to return either an error code using ERR_PTR() or a
NULL pointer when running out of memory or failing to setup a switch.
This is a problem for its caller, dsa_switch_setup(), which uses IS_ERR()
and expects to find an error code, not a NULL pointer, so we still try
to proceed with dsa_switch_setup() and operate on invalid memory
addresses. This can be easily reproduced by having e.g. the bcm_sf2
driver built-in, but having no such switch, such that drv->setup will
fail.
Fix this by using ERR_PTR() consistently, which is both more informative
and spares the caller from having to use IS_ERR_OR_NULL().
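Why a NULL return slips past IS_ERR() can be shown with userspace
re-creations of the kernel helpers. Everything prefixed demo_ below is a
hypothetical stand-in written for this sketch, not the kernel's definition
or the DSA code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Error pointers live in the top DEMO_MAX_ERRNO bytes of the address
 * space, mirroring the idea behind the kernel helpers. */
#define DEMO_MAX_ERRNO 4095

void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Hypothetical setup function. On failure it returns either an encoded
 * error (consistent convention) or NULL (the mixed, buggy convention). */
void *demo_setup(int fail, int encode_error)
{
	static int dummy_switch;

	if (!fail)
		return &dummy_switch;
	return encode_error ? demo_err_ptr(-ENOMEM) : NULL;
}
```

A caller that checks only demo_is_err() treats the NULL failure as success
and goes on to dereference it, which is the failure mode described above.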
Fixes: df197195a5248 ("net: dsa: split dsa_switch_setup into two functions")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Fri, 29 May 2015 17:47:07 +0000 (13:47 -0400)]
tcp: fix child sockets to use system default congestion control if not set
Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.
This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.
In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.
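The selection rule in the summary can be written as a tiny decision
function. This is a sketch of the stated semantics only: the function name
and its parameters are assumptions, and it does not model how the TCP code
tracks whether a choice was made.

```c
#include <stddef.h>

/* Picks the congestion control name for a child socket.
 *
 * explicit_cc:    non-NULL if the listener set TCP_CONGESTION or the
 *                 route locks the CC module; NULL otherwise.
 * sysctl_default: current value of net.ipv4.tcp_congestion_control. */
const char *demo_child_cc(const char *explicit_cc,
			  const char *sysctl_default)
{
	/* An explicit choice is inherited by the child; otherwise the
	 * child takes the system default in effect right now, not the
	 * one in effect when the listener was created. */
	return explicit_cc != NULL ? explicit_cc : sysctl_default;
}
```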
Fixes: 55d8694fa82c ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 30 May 2015 16:16:53 +0000 (09:16 -0700)]
udp: fix behavior of wrong checksums
We have two problems in the UDP stack related to bogus checksums:
1) We return -EAGAIN to the application even if the receive queue is not
empty. This breaks applications using edge-triggered epoll().
2) Under UDP flood, we can loop forever without yielding to other
processes, potentially hanging the host, especially on non SMP.
This patch is an attempt to make things better.
We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 1 Jun 2015 02:01:07 +0000 (19:01 -0700)]
Linux 4.1-rc6
Daniel Pieczko [Fri, 29 May 2015 11:25:54 +0000 (12:25 +0100)]
sfc: free multiple Rx buffers when required
When Rx packet data must be dropped, all the buffers
associated with that Rx packet must be freed. Extend
and rename efx_free_rx_buffer() to efx_free_rx_buffers()
and loop through all the fragments.
By doing so this patch fixes a possible memory leak.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 31 May 2015 23:00:34 +0000 (16:00 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
"Off-by-one in d_walk()/__dentry_kill() race fix.
It's very hard to hit; possible in the same conditions as the original
bug, except that you need the skipped branch to contain all the
remaining evictables, so that the d_walk()-calling loop in
d_invalidate() decides there's nothing more to do and doesn't go for
another pass - otherwise that next pass will sweep the sucker.
So it's not too urgent, but seeing that the fix is obvious and the
original commit has spread into all -stable branches..."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
d_walk() might skip too much
Linus Torvalds [Sun, 31 May 2015 19:20:59 +0000 (12:20 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Three fixes this time around:
- fix a memory leak which occurs when probing performance monitoring
unit interrupts
- fix handling of non-PMD aligned end of RAM causing boot failures
- fix missing syscall trace exit path with syscall tracing enabled
causing a kernel oops in the audit code"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8357/1: perf: fix memory leak when probing PMU PPIs
ARM: fix missing syscall trace exit
ARM: 8356/1: mm: handle non-pmd-aligned end of RAM
Linus Torvalds [Sun, 31 May 2015 19:03:42 +0000 (12:03 -0700)]
Merge branch 'upstream' of git://git./linux/kernel/git/ralf/linux
Pull MIPS fixes from Ralf Baechle:
"MIPS fixes for 4.1 all across the tree"
* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux:
MIPS: strnlen_user.S: Fix a CPU_DADDI_WORKAROUNDS regression
MIPS: BMIPS: Fix bmips_wr_vec()
MIPS: ath79: fix build problem if CONFIG_BLK_DEV_INITRD is not set
MIPS: Fuloong 2E: Replace CONFIG_USB_ISP1760_HCD by CONFIG_USB_ISP1760
MIPS: irq: Use DECLARE_BITMAP
ttyFDC: Fix to use native endian MMIO reads
MIPS: Fix CDMM to use native endian MMIO reads
Linus Torvalds [Sun, 31 May 2015 18:39:25 +0000 (11:39 -0700)]
Merge branch 'turbostat' of git://git./linux/kernel/git/lenb/linux
Pull turbostat tool fixes from Len Brown:
"Just one minor kernel dependency in this batch -- added a #define to
msr-index.h"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: update version number to 4.7
tools/power turbostat: allow running without cpu0
tools/power turbostat: correctly decode of ENERGY_PERFORMANCE_BIAS
tools/power turbostat: enable turbostat to support Knights Landing (KNL)
tools/power turbostat: correctly display more than 2 threads/core
Linus Torvalds [Sun, 31 May 2015 18:31:42 +0000 (11:31 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"These are mostly minor fixes, with the exception of the following that
address fall-out from recent v4.1-rc1 changes:
- regression fix related to the big fabric API registration changes
and configfs_depend_item() usage, that required cherry-picking one
of HCH's patches from for-next to address the issue for v4.1 code.
- remaining TCM-USER -v2 related changes to enforce full CDB
passthrough from Andy + Ilias.
Also included is a target_core_pscsi driver fix from Andy that
addresses a long standing issue with a Scsi_Host reference being
leaked on PSCSI device shutdown"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iser-target: Fix error path in isert_create_pi_ctx()
target: Use a PASSTHROUGH flag instead of transport_types
target: Move passthrough CDB parsing into a common function
target/user: Only support full command pass-through
target/user: Update example code for new ABI requirements
target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST
target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem
target: Drop signal_pending checks after interruptible lock acquire
target: Add missing parentheses
target: Fix bidi command handling
target/user: Disallow full passthrough (pass_level=0)
ISCSI: fix minor memory leak
Linus Torvalds [Sun, 31 May 2015 18:24:49 +0000 (11:24 -0700)]
Merge tag 'hwmon-for-linus-v4.1-rc6' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Some late hwmon patches, all headed for -stable
- fix sysfs attribute initialization in nct6775 and nct6683 drivers
- do not attempt to auto-detect tmp435 on I2C address 0x37
- ensure iio channel is of type IIO_VOLTAGE in ntc_thermistor driver"
* tag 'hwmon-for-linus-v4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6683) Add missing sysfs attribute initialization
hwmon: (nct6775) Add missing sysfs attribute initialization
hwmon: (tmp401) Do not auto-detect chip on I2C address 0x37
hwmon: (ntc_thermistor) Ensure iio channel is of type IIO_VOLTAGE
David S. Miller [Sun, 31 May 2015 06:46:54 +0000 (23:46 -0700)]
Merge branch 'bna-fixes'
Ivan Vecera says:
====================
bna: misc bugfixes
These patches fix several bugs found during device initialization debugging.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 28 May 2015 21:10:08 +0000 (23:10 +0200)]
bna: fix soft lock-up during firmware initialization failure
A bug in the driver initialization causes a soft lockup if the firmware
initialization timeout is reached. The polling function bfa_ioc_poll_fwinit()
incorrectly calls bfa_nw_iocpf_timeout() when the timeout is reached.
The problem is that bfa_nw_iocpf_timeout() calls
bfa_ioc_poll_fwinit() again... etc. The bfa_ioc_poll_fwinit() should directly
send timeout event for iocpf and the same should be done if firmware
download into HW fails.
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 28 May 2015 21:10:07 +0000 (23:10 +0200)]
bna: remove unreasonable iocpf timer start
Driver starts iocpf timer prior to the bnad_ioceth_enable() call and this is
unreasonable. This piece of code probably originates from Brocade/Qlogic
out-of-box driver during initial import into upstream. This driver uses
only one timer and queue to implement multiple timers and this timer is
started at this place. The upstream driver uses multiple timers instead
of this.
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 28 May 2015 21:10:06 +0000 (23:10 +0200)]
bna: fix firmware loading on big-endian machines
Firmware required by bna is stored in appropriate files as a sequence
of LE32 integers. After loading by request_firmware() they need to be
byte-swapped on big-endian arches. Without this conversion the NIC
is unusable on big-endian machines.
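A portable way to express the conversion is to assemble each word from its
bytes, which yields the same value on little- and big-endian hosts. The
helper names below are hypothetical, and the sketch reads from a byte
buffer rather than swapping in place as a driver might.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian 32-bit word from raw bytes. The result does
 * not depend on the byte order of the host. */
uint32_t demo_le32_to_cpu(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Convert a firmware image of nwords LE32 words into host-order words. */
void demo_fw_to_host(const uint8_t *image, uint32_t *out, size_t nwords)
{
	for (size_t i = 0; i < nwords; i++)
		out[i] = demo_le32_to_cpu(image + 4 * i);
}
```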
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 31 May 2015 06:37:46 +0000 (23:37 -0700)]
Merge tag 'mac80211-for-davem-2015-05-28' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
This just has a single docbook build fix. In my confusion
I'd already sent the same fix for -next, but Ben Hutchings
noted it's necessary in 4.1.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 28 May 2015 11:42:54 +0000 (04:42 -0700)]
bridge: fix br_multicast_query_expired() bug
br_multicast_query_expired() querier argument is a pointer to
a struct bridge_mcast_querier :
struct bridge_mcast_querier {
struct br_ip addr;
struct net_bridge_port __rcu *port;
};
Intent of the code was to clear port field, not the pointer to querier.
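The difference between the two assignments can be seen in a userspace
sketch. The demo_ types are simplified stand-ins, and plain assignment is
used where kernel code would use an RCU helper; the shape of the buggy
version is an assumption drawn from the description above.

```c
#include <stddef.h>

struct demo_port {
	int id;
};

struct demo_querier {
	int addr;			/* stand-in for struct br_ip */
	struct demo_port *port;
};

/* Buggy shape: only the function's local copy of the pointer is
 * cleared, so the caller's structure is left untouched. */
void demo_expired_buggy(struct demo_querier *querier)
{
	querier = NULL;
	(void)querier;
}

/* Intended shape: clear the port field inside the structure. */
void demo_expired_fixed(struct demo_querier *querier)
{
	querier->port = NULL;
}
```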
Fixes: 2cd4143192e8 ("bridge: memorize and export selected IGMP/MLD querier port")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Cc: Linus Lüssing <linus.luessing@web.de>
Cc: Steinar H. Gunderson <sesse@samfundet.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roland Dreier [Sat, 30 May 2015 06:12:10 +0000 (23:12 -0700)]
iser-target: Fix error path in isert_create_pi_ctx()
We don't assign pi_ctx to desc->pi_ctx until we're certain to succeed
in the function. That means the cleanup path should use the local
pi_ctx variable, not desc->pi_ctx.
This was detected by Coverity (CID 1260062).
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Tue, 19 May 2015 21:44:41 +0000 (14:44 -0700)]
target: Use a PASSTHROUGH flag instead of transport_types
It seems like we only care if a transport is passthrough or not. Convert
transport_type to a flags field and replace TRANSPORT_PLUGIN_* with a
flag, TRANSPORT_FLAG_PASSTHROUGH.
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Tue, 19 May 2015 21:44:40 +0000 (14:44 -0700)]
target: Move passthrough CDB parsing into a common function
Aside from whether they handle BIDI ops or not, parsing of the CDB by
kernel and user SCSI passthrough modules should be identical. Move this
into a new passthrough_parse_cdb() and call it from tcm-pscsi and tcm-user.
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Tue, 19 May 2015 21:44:39 +0000 (14:44 -0700)]
target/user: Only support full command pass-through
After much discussion, give up on only passing a subset of SCSI commands
to userspace and pass them all. Based on what pscsi is doing, make sure
to set SCF_SCSI_DATA_CDB for I/O ops, and define attributes identical to
pscsi.
Make hw_block_size configurable via dev param.
Remove mention of command filtering from tcmu-design.txt.
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Tue, 19 May 2015 21:44:38 +0000 (14:44 -0700)]
target/user: Update example code for new ABI requirements
We now require that the userspace handler set a bit if the command is not
handled.
Update calls to tcmu_hdr_get_op for v2.
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Fri, 22 May 2015 21:07:44 +0000 (14:07 -0700)]
target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST
See https://bugzilla.redhat.com/show_bug.cgi?id=1025672
We need to put() the reference to the scsi host that we got in
pscsi_configure_device(). In VIRTUAL_HOST mode it is associated with
the dev_virt, not the hba_virt.
Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: stable@vger.kernel.org # 2.6.38+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Christoph Hellwig [Sun, 3 May 2015 06:50:52 +0000 (08:50 +0200)]
target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem
There is just one configfs subsystem in the target code, so we might as
well add two helpers to reference / unreference it from the core code
instead of passing pointers to it around.
This fixes a regression introduced for v4.1-rc1 with commit 9ac8928e6,
where configfs_depend_item() callers using se_tpg_tfo->tf_subsys would
fail, because the assignment from the original target_core_subsystem[]
is no longer happening at target_register_template() time.
(Fix target_core_exit_configfs pointer dereference - Sagi)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Guenter Roeck [Thu, 28 May 2015 16:12:23 +0000 (09:12 -0700)]
hwmon: (nct6683) Add missing sysfs attribute initialization
The following error message is seen when loading the nct6683 driver
with DEBUG_LOCK_ALLOC enabled.
BUG: key ffff88040b2f0030 not in .data!
------------[ cut here ]------------
WARNING: CPU: 0 PID: 186 at kernel/locking/lockdep.c:2988
lockdep_init_map+0x469/0x630()
DEBUG_LOCKS_WARN_ON(1)
Caused by a missing call to sysfs_attr_init() when initializing
sysfs attributes.
Reported-by: Alexey Orishko <alexey.orishko@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Guenter Roeck [Thu, 28 May 2015 16:08:09 +0000 (09:08 -0700)]
hwmon: (nct6775) Add missing sysfs attribute initialization
The following error message is seen when loading the nct6775 driver
with DEBUG_LOCK_ALLOC enabled.
BUG: key ffff88040b2f0030 not in .data!
------------[ cut here ]------------
WARNING: CPU: 0 PID: 186 at kernel/locking/lockdep.c:2988
lockdep_init_map+0x469/0x630()
DEBUG_LOCKS_WARN_ON(1)
Caused by a missing call to sysfs_attr_init() when initializing
sysfs attributes.
Reported-by: Alexey Orishko <alexey.orishko@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org # v3.12+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Sat, 30 May 2015 00:09:39 +0000 (17:09 -0700)]
Merge tag 'acpi-pci-4.1-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull PCI / ACPI fix from Rafael Wysocki:
"This fixes a bug uncovered by a recent driver core change that
modified the implementation of the ACPI_COMPANION_SET() macro to
strictly rely on its second argument to be either NULL or a valid
pointer to struct acpi_device.
As it turns out, pcibios_root_bridge_prepare() on x86 and ia64 works
with the assumption that the only code path calling pci_create_root_bus()
is pci_acpi_scan_root() and therefore the sysdata argument passed to
it will always match the expectations of pcibios_root_bridge_prepare().
That need not be the case, however, and in particular it is not the
case for the Xen pcifront driver that passes a pointer to its own
private data structure as sysdata to pci_scan_bus_parented() which then
passes it to pci_create_root_bus() and it ends up being used incorrectly
by pcibios_root_bridge_prepare()"
* tag 'acpi-pci-4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PCI / ACPI: Do not set ACPI companions for host bridges with parents
Linus Torvalds [Fri, 29 May 2015 23:45:45 +0000 (16:45 -0700)]
Merge tag 'xfs-for-linus-4.1-rc6' of git://git./linux/kernel/git/dgc/linux-xfs
Pull xfs fixes from Dave Chinner:
"This is a little larger than I'd like late in the release cycle, but
all the fixes are for regressions introduced in the 4.1-rc1 merge, or
are needed back in -stable kernels fairly quickly as they are
filesystem corruption or userspace visible correctness issues.
Changes in this update:
- regression fix for new rename whiteout code
- regression fixes for new superblock generic per-cpu counter code
- fix for incorrect error return sign introduced in 3.17
- metadata corruption fixes that need to go back to -stable kernels"
* tag 'xfs-for-linus-4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: fix broken i_nlink accounting for whiteout tmpfile inode
xfs: xfs_iozero can return positive errno
xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
xfs: extent size hints can round up extents past MAXEXTLEN
xfs: inode and free block counters need to use __percpu_counter_compare
percpu_counter: batch size aware __percpu_counter_compare()
xfs: use percpu_counter_read_positive for mp->m_icount
Linus Torvalds [Fri, 29 May 2015 21:48:58 +0000 (14:48 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Two weeks' worth of small bug fixes, nothing sticking out this time:
- one defconfig change to adapt to a modified Kconfig symbol
- two fixes for i.MX for backwards compatibility with older DT files
that was accidentally broken
- one regression fix for irq handling on pxa
- three small dt fixes on omap, and one each for imx and exynos"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: multi_v7_defconfig: Replace CONFIG_USB_ISP1760_HCD by CONFIG_USB_ISP1760
ARM: imx6: gpc: don't register power domain if DT data is missing
ARM: imx6: allow booting with old DT
ARM: dts: set display clock correctly for exynos4412-trats2
ARM: pxa: pxa_cplds: signedness bug in probe
ARM: dts: Fix WLAN interrupt line for AM335x EVM-SK
ARM: dts: omap3-devkit8000: Fix NAND DT node
ARM: dts: am335x-boneblack: disable RTC-only sleep
ARM: dts: fix imx27 dtb build rule
ARM: dts: imx27: only map 4 Kbyte for fec registers
Linus Torvalds [Fri, 29 May 2015 21:39:24 +0000 (14:39 -0700)]
Merge tag 'dm-4.1-fixes-3' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device-mapper fixes from Mike Snitzer:
"Quite a few fixes for DM's blk-mq support thanks to extra DM multipath
testing from Junichi Nomura and Bart Van Assche.
Also fix a casting bug in dm_merge_bvec() that could cause only a
single page to be added to a bio (Joe identified this while testing
dm-cache writeback)"
* tag 'dm-4.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix casting bug in dm_merge_bvec()
dm: fix reload failure of 0 path multipath mapping on blk-mq devices
dm: fix false warning in free_rq_clone() for unmapped requests
dm: requeue from blk-mq dm_mq_queue_rq() using BLK_MQ_RQ_QUEUE_BUSY
dm mpath: fix leak of dm_mpath_io structure in blk-mq .queue_rq error path
dm: fix NULL pointer when clone_and_map_rq returns !DM_MAPIO_REMAPPED
dm: run queue on re-queue
Guenter Roeck [Wed, 27 May 2015 23:11:48 +0000 (16:11 -0700)]
hwmon: (tmp401) Do not auto-detect chip on I2C address 0x37
I2C address 0x37 may be used by EEPROMs, which can result in false
positives. Do not attempt to detect a chip at this address.
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Fri, 29 May 2015 20:28:57 +0000 (13:28 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: fix lx-lsmod refcnt
omfs: fix potential integer overflow in allocator
omfs: fix sign confusion for bitmap loop counter
omfs: set error return when d_make_root() fails
fs, omfs: add NULL terminator in the end up the token list
MAINTAINERS: update CAPABILITIES pattern
fs/binfmt_elf.c:load_elf_binary(): return -EINVAL on zero-length mappings
tracing/mm: don't trace mm_page_pcpu_drain on offline cpus
tracing/mm: don't trace mm_page_free on offline cpus
tracing/mm: don't trace kmem_cache_free on offline cpus
Linus Torvalds [Fri, 29 May 2015 18:24:28 +0000 (11:24 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull fixes for cpumask and modules from Rusty Russell:
"** NOW WITH TESTING! **
Two fixes which got lost in my recent distraction. One is a weird
cpumask function which needed to be rewritten, the other is a module
bug which is cc:stable"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
cpumask_set_cpu_local_first => cpumask_local_spread, lament
module: Call module notifier on failure after complete_formation()
Maciej W. Rozycki [Thu, 28 May 2015 16:46:49 +0000 (17:46 +0100)]
MIPS: strnlen_user.S: Fix a CPU_DADDI_WORKAROUNDS regression
Correct a regression introduced with 8453eebd [MIPS: Fix strnlen_user()
return value in case of overlong strings.] causing assembler warnings
and broken code generated in __strnlen_kernel_nocheck_asm:
arch/mips/lib/strnlen_user.S: Assembler messages:
arch/mips/lib/strnlen_user.S:64: Warning: Macro instruction expanded into multiple instructions in a branch delay slot
with the CPU_DADDI_WORKAROUNDS option set, resulting in the function
looping indefinitely upon mounting NFS root.
Use conditional assembly to avoid a microMIPS code size regression.
Using $at unconditionally would cause such a regression as there are no
16-bit instruction encodings available for ALU operations using this
register. Using $v1 unconditionally would produce short microMIPS
encodings, but would prevent this register from being used across calls
to this function.
The extra LI operation introduced is free, replacing a NOP originally
scheduled into the delay slot of the branch that follows.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10205/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Petri Gynther [Wed, 27 May 2015 06:25:08 +0000 (23:25 -0700)]
MIPS: BMIPS: Fix bmips_wr_vec()
bmips_wr_vec() copies exception vector code from start to dst.
The call to dma_cache_wback() needs to flush (end-start) bytes,
starting at dst, from write-back cache to memory.
Signed-off-by: Petri Gynther <pgynther@google.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kevin Cernekee <cernekee@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10193/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
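The rule stated in the bmips_wr_vec() fix above (copy from start to dst, then write back (end-start) bytes beginning at dst) can be sketched in user space. This is only an illustration under stated assumptions: fake_cache_wback(), wr_vec_sketch() and struct flush_record are hypothetical names, and the fake flush merely records its arguments instead of touching any cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of the last flush request, for illustration only. */
struct flush_record {
	const void *addr;
	size_t len;
};

struct flush_record last_flush;

/* Stand-in for dma_cache_wback(); it only records its arguments. */
void fake_cache_wback(const void *addr, size_t len)
{
	last_flush.addr = addr;
	last_flush.len = len;
}

/* Copy [start, end) to dst, then write back exactly the copied bytes. */
void wr_vec_sketch(void *dst, const char *start, const char *end)
{
	size_t len;

	assert(end >= start);
	len = (size_t)(end - start);
	memcpy(dst, start, len);
	/* The flush covers the destination, not the source, of the copy. */
	fake_cache_wback(dst, len);
}
```

The point of the sketch is that both the address and the length passed to the write-back describe the destination range that was just written.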
Laurent Fasnacht [Wed, 27 May 2015 17:50:00 +0000 (19:50 +0200)]
MIPS: ath79: fix build problem if CONFIG_BLK_DEV_INITRD is not set
initrd_start is defined in init/do_mounts_initrd.c, which is only
included in kernel if CONFIG_BLK_DEV_INITRD=y.
Signed-off-by: Laurent Fasnacht <l@libres.ch>
Cc: linux-mips@linux-mips.org
Cc: trivial@kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10198/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Fri, 29 May 2015 17:52:15 +0000 (10:52 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This is made up of 4 groups of fixes detailed below.
vgem:
Due to some misgivings about possible bad use cases this allows,
backout a chunk of the interface to stop those use cases for now.
radeon:
Fix for an oops regression in the audio code, and a partial revert
for a fix that was causing problems.
nouveau:
regression fix for Fermi, and display-less Maxwell boot fixes.
drm core:
a fix for i915 cursor vblank waiting in the atomic helpers"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau/gr/gm204: remove a stray printk
drm/nouveau/devinit/gm100-: force devinit table execution on boards without PDISP
drm/nouveau/devinit/gf100: make the force-post condition more obvious
drm/nouveau/gr/gf100-: fix wrong constant definition
drm/radeon: partially revert "fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling"
drm/radeon/audio: make sure connector is valid in hotplug case
Revert "drm/radeon: only mark audio as connected if the monitor supports it (v3)"
drm/radeon: don't share plls if monitors differ in audio support
drm/vgem: drop DRIVER_PRIME (v2)
drm/plane-helper: Adapt cursor hack to transitional helpers
Linus Torvalds [Fri, 29 May 2015 17:43:27 +0000 (10:43 -0700)]
Merge tag 'sound-4.1-rc6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"No big surprise here, just a bunch of small fixes for HD-audio and
USB-audio:
- partial revert of widget power-saving for IDT codecs
- revert mute-LED enum ctl for Thinkpads due to confusion
- a quirk for a new Radeon HDMI controller
- Realtek codec name fix for Dell
- a workaround for headphone mic boost on some laptops
- stream_pm ops setup (and its fix for regression)
- another quirk for MS LifeCam USB-audio"
* tag 'sound-4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix lost sound due to stream_pm ops cleanup
ALSA: hda - Disable Headphone Mic boost for ALC662
ALSA: hda - Disable power_save_node for IDT92HD71bxx
ALSA: hda - Fix noise on AMD radeon 290x controller
ALSA: hda - Set stream_pm ops automatically by generic parser
ALSA: hda/realtek - Add ALC256 alias name for Dell
Revert "ALSA: hda - Add mute-LED mode control to Thinkpad"
ALSA: usb-audio: Add quirk for MS LifeCam HD-3000
Joe Thornber [Fri, 29 May 2015 13:52:51 +0000 (14:52 +0100)]
dm: fix casting bug in dm_merge_bvec()
dm_merge_bvec() was originally added in f6fccb ("dm: introduce
merge_bvec_fn"). In that commit a value in sectors is converted to
bytes using << 9, and then assigned to an int. This code made
assumptions about the value of BIO_MAX_SECTORS.
A later commit 148e51 ("dm: improve documentation and code clarity in
dm_merge_bvec") was meant to have no functional change but it removed
the use of BIO_MAX_SECTORS in favor of using queue_max_sectors(). At
this point the cast from sector_t to int resulted in a zero value. The
fallout being dm_merge_bvec() would only allow a single page to be added
to a bio.
This interim fix is minimal for the benefit of stable@ because the more
comprehensive cleanup of passing a sector_t to all DM targets' merge
function will impact quite a few DM targets.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.19+
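The truncation described in the dm_merge_bvec() fix above can be reproduced in a small user-space sketch. The assumptions are labeled: sector_t is modeled as a 64-bit type, the helper names are hypothetical, and the narrowing to int is modeled as a well-defined conversion to uint32_t. The input value used in the note below is chosen for illustration and is not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, i.e. the "<< 9" conversion */

/* Hypothetical stand-in for the kernel's sector_t on a 64-bit build. */
typedef uint64_t sector_t;

/* Byte count kept at full width: no information is lost. */
uint64_t sectors_to_bytes(sector_t sectors)
{
	assert(sectors <= (UINT64_MAX >> SECTOR_SHIFT));
	return sectors << SECTOR_SHIFT;
}

/*
 * Byte count squeezed into 32 bits, which models storing the shifted
 * value in a 32-bit int: the upper bits are silently discarded.
 */
uint32_t sectors_to_bytes_truncated(sector_t sectors)
{
	return (uint32_t)(sectors << SECTOR_SHIFT);
}
```

For example, 1 << 23 sectors is 2^32 bytes, so the truncated helper yields 0 while the full-width helper yields 4294967296. Small sector counts convert identically in both helpers, which is why this kind of bug can stay hidden until the limit grows.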
Junichi Nomura [Fri, 29 May 2015 08:51:03 +0000 (08:51 +0000)]
dm: fix reload failure of 0 path multipath mapping on blk-mq devices
dm-multipath accepts 0 path mapping.
# echo '0 2097152 multipath 0 0 0 0' | dmsetup create newdev
Such a mapping can be used to release underlying devices while still
holding requests in its queue until working paths come back.
However, once the multipath device is created over blk-mq devices,
it rejects reloading of 0 path mapping:
# echo '0 2097152 multipath 0 0 1 1 queue-length 0 1 1 /dev/sda 1' \
| dmsetup create mpath1
# echo '0 2097152 multipath 0 0 0 0' | dmsetup load mpath1
device-mapper: reload ioctl on mpath1 failed: Invalid argument
Command failed
With following kernel message:
device-mapper: ioctl: can't change device type after initial table load.
DM tries to inherit the current table type using dm_table_set_type()
but it doesn't work as expected because of an unnecessary check for
whether the target type is hybrid or not.
Hybrid type is for targets that work as either request-based or bio-based
and not required for blk-mq or non blk-mq checking.
Fixes: 65803c205983 ("dm table: train hybrid target type detection to select blk-mq if appropriate")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Linus Torvalds [Fri, 29 May 2015 17:35:21 +0000 (10:35 -0700)]
Merge tag 'md/4.1-rc5-fixes' of git://neil.brown.name/md
Pull more md bugfixes from Neil Brown:
"Assorted fixes for new RAID5 stripe-batching functionality.
Unfortunately this functionality was merged a little prematurely. The
necessary testing and code review is now complete (or as complete as
it can be) and the code passes a variety of tests and looks quite
sensible.
Also a fix for some recent locking changes - a race was introduced
which causes a reshape request to sometimes fail. No data safety
issues"
* tag 'md/4.1-rc5-fixes' of git://neil.brown.name/md:
md: fix race when unfreezing sync_action
md/raid5: break stripe-batches when the array has failed.
md/raid5: call break_stripe_batch_list from handle_stripe_clean_event
md/raid5: be more selective about distributing flags across batch.
md/raid5: add handle_flags arg to break_stripe_batch_list.
md/raid5: duplicate some more handle_stripe_clean_event code in break_stripe_batch_list
md/raid5: remove condition test from check_break_stripe_batch_list.
md/raid5: Ensure a batch member is not handled prematurely.
md/raid5: close race between STRIPE_BIT_DELAY and batching.
md/raid5: ensure whole batch is delayed for all required bitmap updates.
Mike Snitzer [Thu, 28 May 2015 19:12:52 +0000 (15:12 -0400)]
dm: fix false warning in free_rq_clone() for unmapped requests
When stacking a request-based dm device on a non blk-mq device, and the
device-mapper target cannot map the request (error target is used,
multipath target with all paths down, etc), the WARN_ON_ONCE() in
free_rq_clone() will trigger when it shouldn't.
The warning was added by commit aa6df8d ("dm: fix free_rq_clone() NULL
pointer when requeueing unmapped request"). But free_rq_clone() with
clone->q == NULL is valid usage for the case where
dm_kill_unmapped_request() initiates request cleanup.
Fix this false warning by just removing the WARN_ON -- it only generated
false positives and was never useful in catching the intended case
(completing clone request not being mapped e.g. clone->q being NULL).
Fixes: aa6df8d ("dm: fix free_rq_clone() NULL pointer when requeueing unmapped request")
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Geert Uytterhoeven [Tue, 26 May 2015 10:58:37 +0000 (12:58 +0200)]
ARM: multi_v7_defconfig: Replace CONFIG_USB_ISP1760_HCD by CONFIG_USB_ISP1760
Since commit 100832abf065bc18 ("usb: isp1760: Make HCD support
optional"), CONFIG_USB_ISP1760_HCD is automatically selected when
needed. Enabling that option in the defconfig is now a no-op, and no
longer enables ISP1760 HCD support.
Re-enable the ISP1760 driver in the defconfig by enabling
USB_ISP1760_HOST_ROLE instead.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 29 May 2015 12:30:35 +0000 (14:30 +0200)]
Merge tag 'imx-fixes-4.1-3' of git://git./linux/kernel/git/shawnguo/linux into fixes
Merge "The i.MX fixes for 4.1, 3rd round" from Shawn Guo:
It includes a couple of fixes for i.MX6 GPC code to let the new kernel
be able to boot with old DTBs:
- Booting v4.1-rc kernel with old DTBs will fail with a fat warning
(require low-level debug to be seen), due to the adoption of stacked
IRQ domain. The first fix improves the situation by allowing kernel
boot up with old DTBs, although suspend/resume still breaks.
- Booting new kernel with old DTBs that do not have power-domain info
will result in a hang. The second patch fixes the hang by skipping
the kernel power-domain registration if DTB has no power-domain info.
* tag 'imx-fixes-4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx6: gpc: don't register power domain if DT data is missing
ARM: imx6: allow booting with old DT
Arnd Bergmann [Fri, 29 May 2015 12:29:37 +0000 (14:29 +0200)]
Merge tag 'samsung-fixes-3' of git://git./linux/kernel/git/kgene/linux-samsung into fixes
Merge "Samsung fix for v4.1" from Kukjin Kim:
- Set display clock correctly for exynos4412-trats2
: fix the following error
exynos-drm: No connectors reported connected with modes
[drm] Cannot find any crtc or sizes - going 1024x768
* tag 'samsung-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: dts: set display clock correctly for exynos4412-trats2
Takashi Iwai [Fri, 29 May 2015 07:43:29 +0000 (09:43 +0200)]
ALSA: hda - Fix lost sound due to stream_pm ops cleanup
The commit [49fb18972581: ALSA: hda - Set stream_pm ops automatically
by generic parser] resulted in regressions on some Realtek and VIA
codecs because these drivers set patch_ops after calling the generic
parser, thus stream_pm got cleared to NULL again. I haven't noticed
since I tested with IDT codec.
Restore (partial revert) the stream_pm ops for them to fix the
regression.
Fixes: 49fb18972581 ('ALSA: hda - Set stream_pm ops automatically by generic parser')
Reported-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
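The ordering problem described in the stream_pm fix above (an ops pointer installed by the generic parser and then lost when the driver assigns patch_ops afterwards) can be sketched as follows. The structures and function names here are hypothetical, heavily reduced stand-ins and not the real HD-audio definitions; the "fixed" variant shows one plausible way to restore the pointer, not necessarily the exact change made in the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the codec structures. */
struct codec_ops {
	int (*init)(void);
	void (*stream_pm)(int on);
};

struct codec {
	struct codec_ops patch_ops;
};

int sketch_init(void) { return 0; }
void sketch_stream_pm(int on) { (void)on; }

/* The generic parser installs stream_pm on the codec. */
void generic_parse(struct codec *codec)
{
	assert(codec != NULL);
	codec->patch_ops.stream_pm = sketch_stream_pm;
}

/* Regressed ordering: the later whole-struct assignment drops stream_pm. */
void probe_regressed(struct codec *codec)
{
	static const struct codec_ops driver_ops = { .init = sketch_init };

	generic_parse(codec);
	codec->patch_ops = driver_ops;	/* stream_pm is NULL again */
}

/* Restored behavior: set stream_pm again after assigning patch_ops. */
void probe_fixed(struct codec *codec)
{
	static const struct codec_ops driver_ops = { .init = sketch_init };

	generic_parse(codec);
	codec->patch_ops = driver_ops;
	codec->patch_ops.stream_pm = sketch_stream_pm;
}
```

After probe_regressed() the stream_pm pointer is NULL, while after probe_fixed() it points at the handler again.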
Al Viro [Fri, 29 May 2015 03:09:19 +0000 (23:09 -0400)]
d_walk() might skip too much
When we find that a child has died while we'd been trying to ascend,
we should go into the first live sibling itself, rather than its sibling.
Off-by-one in question had been introduced in "deal with deadlock in
d_walk()" and the fix needs to be backported to all branches this one
has been backported to.
Cc: stable@vger.kernel.org # 3.2 and later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
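The off-by-one described in the d_walk() fix above can be illustrated with a toy sibling list. This is a sketch under loud assumptions: struct node and both helper functions are hypothetical and far simpler than the kernel's dentry handling (no locking, no RCU, no parent pointers). It only contrasts "resume at the first live sibling" with "resume one past it".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sibling list node; not the kernel's struct dentry. */
struct node {
	struct node *next;
	bool dead;
};

/* Intended behavior: return the first live sibling after 'child'. */
struct node *first_live_sibling(const struct node *child)
{
	struct node *n;

	assert(child != NULL);
	for (n = child->next; n != NULL && n->dead; n = n->next)
		;
	return n;
}

/* Off-by-one: finds the first live sibling, then steps past it. */
struct node *first_live_sibling_buggy(const struct node *child)
{
	struct node *n = first_live_sibling(child);

	return n ? n->next : NULL;
}
```

With a list a -> b -> c -> d where a and b are dead, the intended helper resumes at c, whereas the buggy one resumes at d and so never visits c.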