platform/kernel/linux-exynos.git
5 years agos390/qdio: reset old sbal_state flags
Julian Wiedmann [Wed, 16 May 2018 07:37:25 +0000 (09:37 +0200)]
s390/qdio: reset old sbal_state flags

commit 64e03ff72623b8c2ea89ca3cb660094e019ed4ae upstream.

When allocating a new AOB fails, handle_outbound() is still capable of
transmitting the selected buffer (just without async completion).

But if a previous transfer on this queue slot used async completion, its
sbal_state flags field is still set to QDIO_OUTBUF_STATE_FLAG_PENDING.
So when the upper layer driver sees this stale flag, it expects an async
completion that never happens.

Fix this by unconditionally clearing the flags field.

Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agos390: fix br_r1_trampoline for machines without exrl
Martin Schwidefsky [Mon, 6 Aug 2018 12:26:39 +0000 (14:26 +0200)]
s390: fix br_r1_trampoline for machines without exrl

commit 26f843848bae973817b3587780ce6b7b0200d3e4 upstream.

For machines without the exrl instruction the BFP jit generates
code that uses an "br %r1" instruction located in the lowcore page.
Unfortunately there is a cut & paste error that puts an additional
"larl %r1,.+14" instruction in the code that clobbers the branch
target address in %r1. Remove the larl instruction.

Cc: <stable@vger.kernel.org> # v4.17+
Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agos390/mm: fix addressing exception after suspend/resume
Gerald Schaefer [Tue, 7 Aug 2018 16:57:11 +0000 (18:57 +0200)]
s390/mm: fix addressing exception after suspend/resume

commit 37a366face294facb9c9d9fdd9f5b64a27456cbd upstream.

Commit c9b5ad546e7d "s390/mm: tag normal pages vs pages used in page tables"
accidentally changed the logic in arch_set_page_states(), which is used by
the suspend/resume code. set_page_stable(page, order) was changed to
set_page_stable_dat(page, 0). After this, only the first page of higher order
pages will be set to stable, and a write to one of the unstable pages will
result in an addressing exception.

Fix this by using "order" again, instead of "0".

Fixes: c9b5ad546e7d ("s390/mm: tag normal pages vs pages used in page tables")
Cc: stable@vger.kernel.org # 4.14+
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
Jann Horn [Tue, 28 Aug 2018 18:40:33 +0000 (20:40 +0200)]
x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()

commit f12d11c5c184626b4befdee3d573ec8237405a33 upstream.

Reset the KASAN shadow state of the task stack before rewinding RSP.
Without this, a kernel oops will leave parts of the stack poisoned, and
code running under do_exit() can trip over such poisoned regions and cause
nonsensical false-positive KASAN reports about stack-out-of-bounds bugs.

This does not wipe the exception stacks; if an oops happens on an exception
stack, it might result in random KASAN false-positives from other tasks
afterwards. This is probably relatively uninteresting, since if the kernel
oopses on an exception stack, there are most likely bigger things to worry
about. It'd be more interesting if vmapped stacks and KASAN were
compatible, since then handle_stack_overflow() would oops from exception
stack context.

Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: kasan-dev@googlegroups.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180828184033.93712-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agohwmon: (nct6775) Fix potential Spectre v1
Gustavo A. R. Silva [Wed, 15 Aug 2018 13:14:37 +0000 (08:14 -0500)]
hwmon: (nct6775) Fix potential Spectre v1

commit d49dbfade96d5b0863ca8a90122a805edd5ef50a upstream.

val can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

vers/hwmon/nct6775.c:2698 store_pwm_weight_temp_sel() warn: potential
spectre issue 'data->temp_src' [r]

Fix this by sanitizing val before using it to index data->temp_src

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
Andi Kleen [Fri, 24 Aug 2018 17:03:50 +0000 (10:03 -0700)]
x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+

commit cc51e5428ea54f575d49cfcede1d4cb3a72b4ec4 upstream.

On Nehalem and newer core CPUs the CPU cache internally uses 44 bits
physical address space. The L1TF workaround is limited by this internal
cache address width, and needs to have one bit free there for the
mitigation to work.

Older client systems report only 36bit physical address space so the range
check decides that L1TF is not mitigated for a 36bit phys/32GB system with
some memory holes.

But since these actually have the larger internal cache width this warning
is bogus because it would only really be needed if the system had more than
43bits of memory.

Add a new internal x86_cache_bits field. Normally it is the same as the
physical bits field reported by CPUID, but for Nehalem and newerforce it to
be at least 44bits.

Change the L1TF memory size warning to use the new cache_bits field to
avoid bogus warnings and remove the bogus comment about memory size.

Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Reported-by: George Anchev <studio@anchev.net>
Reported-by: Christopher Snowhill <kode54@gmail.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Michael Hocko <mhocko@suse.com>
Cc: vbabka@suse.cz
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180824170351.34874-1-andi@firstfloor.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/spectre: Add missing family 6 check to microcode check
Andi Kleen [Fri, 24 Aug 2018 17:03:51 +0000 (10:03 -0700)]
x86/spectre: Add missing family 6 check to microcode check

commit 1ab534e85c93945f7862378d8c8adcf408205b19 upstream.

The check for Spectre microcodes does not check for family 6, only the
model numbers.

Add a family 6 check to avoid ambiguity with other families.

Fixes: a5b296636453 ("x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180824170351.34874-2-andi@firstfloor.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/irqflags: Mark native_restore_fl extern inline
Nick Desaulniers [Mon, 27 Aug 2018 21:40:09 +0000 (14:40 -0700)]
x86/irqflags: Mark native_restore_fl extern inline

commit 1f59a4581b5ecfe9b4f049a7a2cf904d8352842d upstream.

This should have been marked extern inline in order to pick up the out
of line definition in arch/x86/kernel/irqflags.S.

Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180827214011.55428-1-ndesaulniers@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/nmi: Fix NMI uaccess race against CR3 switching
Andy Lutomirski [Wed, 29 Aug 2018 15:47:18 +0000 (08:47 -0700)]
x86/nmi: Fix NMI uaccess race against CR3 switching

commit 4012e77a903d114f915fc607d6d2ed54a3d6c9b1 upstream.

A NMI can hit in the middle of context switching or in the middle of
switch_mm_irqs_off().  In either case, CR3 might not match current->mm,
which could cause copy_from_user_nmi() and friends to read the wrong
memory.

Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/vdso: Fix lsl operand order
Samuel Neves [Sat, 1 Sep 2018 20:14:52 +0000 (21:14 +0100)]
x86/vdso: Fix lsl operand order

commit e78e5a91456fcecaa2efbb3706572fe043766f4d upstream.

In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.

Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agopinctrl: freescale: off by one in imx1_pinconf_group_dbg_show()
Dan Carpenter [Fri, 13 Jul 2018 14:55:15 +0000 (17:55 +0300)]
pinctrl: freescale: off by one in imx1_pinconf_group_dbg_show()

commit 19da44cd33a3a6ff7c97fff0189999ff15b241e4 upstream.

The info->groups[] array is allocated in imx1_pinctrl_parse_dt().  It
has info->ngroups elements.  Thus the > here should be >= to prevent
reading one element beyond the end of the array.

Cc: stable@vger.kernel.org
Fixes: 30612cd90005 ("pinctrl: imx1 core driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Uwe Kleine-König <u.kleine-könig@pengutronix.de>
Acked-by: Dong Aisheng <Aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoASoC: sirf: Fix potential NULL pointer dereference
Gustavo A. R. Silva [Thu, 26 Jul 2018 20:49:10 +0000 (15:49 -0500)]
ASoC: sirf: Fix potential NULL pointer dereference

commit ae1c696a480c67c45fb23b35162183f72c6be0e1 upstream.

There is a potential execution path in which function
platform_get_resource() returns NULL. If this happens,
we will end up having a NULL pointer dereference.

Fix this by replacing devm_ioremap with devm_ioremap_resource,
which has the NULL check and the memory region request.

This code was detected with the help of Coccinelle.

Cc: stable@vger.kernel.org
Fixes: 2bd8d1d5cf89 ("ASoC: sirf: Add audio usp interface driver")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoASoC: zte: Fix incorrect PCM format bit usages
Takashi Iwai [Wed, 25 Jul 2018 20:40:49 +0000 (22:40 +0200)]
ASoC: zte: Fix incorrect PCM format bit usages

commit c889a45d229938a94b50aadb819def8bb11a6a54 upstream.

zx-tdm driver sets the DAI driver definitions with the format bits
wrongly set with SNDRV_PCM_FORMAT_*, instead of SNDRV_PCM_FMTBIT_*.

This patch corrects the definitions.

Spotted by a sparse warning:
  sound/soc/zte/zx-tdm.c:363:35: warning: restricted snd_pcm_format_t degrades to integer

Fixes: 870e0ddc4345 ("ASoC: zx-tdm: add zte's tdm controller driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoASoC: dpcm: don't merge format from invalid codec dai
Jerome Brunet [Wed, 27 Jun 2018 15:36:38 +0000 (17:36 +0200)]
ASoC: dpcm: don't merge format from invalid codec dai

commit 4febced15ac8ddb9cf3e603edb111842e4863d9a upstream.

When merging codec formats, dpcm_runtime_base_format() should skip
the codecs which are not supporting the current stream direction.

At the moment, if a BE link has more than one codec, and only one
of these codecs has no capture DAI, it becomes impossible to start
a capture stream because the merged format would be 0.

Skipping invalid codec DAI solves the problem.

Fixes: b073ed4e2126 ("ASoC: soc-pcm: DPCM cares BE format")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agob43/leds: Ensure NUL-termination of LED name string
Michael Buesch [Tue, 31 Jul 2018 19:14:04 +0000 (21:14 +0200)]
b43/leds: Ensure NUL-termination of LED name string

commit 2aa650d1950fce94f696ebd7db30b8830c2c946f upstream.

strncpy might not NUL-terminate the string, if the name equals the buffer size.
Use strlcpy instead.

Signed-off-by: Michael Buesch <m@bues.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agob43legacy/leds: Ensure NUL-termination of LED name string
Michael Buesch [Tue, 31 Jul 2018 19:14:16 +0000 (21:14 +0200)]
b43legacy/leds: Ensure NUL-termination of LED name string

commit 4d77a89e3924b12f4a5628b21237e57ab4703866 upstream.

strncpy might not NUL-terminate the string, if the name equals the buffer size.
Use strlcpy instead.

Signed-off-by: Michael Buesch <m@bues.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoudl-kms: avoid division
Mikulas Patocka [Sun, 3 Jun 2018 14:41:00 +0000 (16:41 +0200)]
udl-kms: avoid division

commit 91ba11fb7d7ca0a3bbe8a512e65e666e2ec1e889 upstream.

Division is slow, so it shouldn't be done by the pixel generating code.
The driver supports only 2 or 4 bytes per pixel, so we can replace
division with a shift.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoudl-kms: fix crash due to uninitialized memory
Mikulas Patocka [Sun, 3 Jun 2018 14:40:57 +0000 (16:40 +0200)]
udl-kms: fix crash due to uninitialized memory

commit 09a00abe3a9941c2715ca83eb88172cd2f54d8fd upstream.

We must use kzalloc when allocating the fb_deferred_io structure.
Otherwise, the field first_io is undefined and it causes a crash.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoudl-kms: handle allocation failure
Mikulas Patocka [Sun, 3 Jun 2018 14:40:56 +0000 (16:40 +0200)]
udl-kms: handle allocation failure

commit 542bb9788a1f485eb1a2229178f665d8ea166156 upstream.

Allocations larger than PAGE_ALLOC_COSTLY_ORDER are unreliable and they
may fail anytime. This patch fixes the udl kms driver so that when a large
alloactions fails, it tries to do multiple smaller allocations.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoudl-kms: change down_interruptible to down
Mikulas Patocka [Sun, 3 Jun 2018 14:40:55 +0000 (16:40 +0200)]
udl-kms: change down_interruptible to down

commit 8456b99c16d193c4c3b7df305cf431e027f0189c upstream.

If we leave urbs around, it causes not only leak, but also memory
corruption. This patch fixes the function udl_free_urb_list, so that it
always waits for all urbs that are in progress.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: Add missed unlock_page() to fuse_readpages_fill()
Kirill Tkhai [Thu, 19 Jul 2018 12:49:39 +0000 (15:49 +0300)]
fuse: Add missed unlock_page() to fuse_readpages_fill()

commit 109728ccc5933151c68d1106e4065478a487a323 upstream.

The above error path returns with page unlocked, so this place seems also
to behave the same.

Fixes: f8dbdf81821b ("fuse: rework fuse_readpages()")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: Fix oops at process_init_reply()
Miklos Szeredi [Thu, 26 Jul 2018 14:13:11 +0000 (16:13 +0200)]
fuse: Fix oops at process_init_reply()

commit e8f3bd773d22f488724dffb886a1618da85c2966 upstream.

syzbot is hitting NULL pointer dereference at process_init_reply().
This is because deactivate_locked_super() is called before response for
initial request is processed.

Fix this by aborting and waiting for all requests (including FUSE_INIT)
before resetting fc->sb.

Original patch by Tetsuo Handa <penguin-kernel@I-love.SKAURA.ne.jp>.

Reported-by: syzbot <syzbot+b62f08f4d5857755e3bc@syzkaller.appspotmail.com>
Fixes: e27c9d3877a0 ("fuse: fuse: add time_gran to INIT_OUT")
Cc: <stable@vger.kernel.org> # v3.19
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: umount should wait for all requests
Miklos Szeredi [Thu, 26 Jul 2018 14:13:11 +0000 (16:13 +0200)]
fuse: umount should wait for all requests

commit b8f95e5d13f5f0191dcb4b9113113d241636e7cb upstream.

fuse_abort_conn() does not guarantee that all async requests have actually
finished aborting (i.e. their ->end() function is called).  This could
actually result in still used inodes after umount.

Add a helper to wait until all requests are fully done.  This is done by
looking at the "num_waiting" counter.  When this counter drops to zero, we
can be sure that no more requests are outstanding.

Fixes: 0d8e84b0432b ("fuse: simplify request abort")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: fix unlocked access to processing queue
Miklos Szeredi [Thu, 26 Jul 2018 14:13:11 +0000 (16:13 +0200)]
fuse: fix unlocked access to processing queue

commit 45ff350bbd9d0f0977ff270a0d427c71520c0c37 upstream.

fuse_dev_release() assumes that it's the only one referencing the
fpq->processing list, but that's not true, since fuse_abort_conn() can be
doing the same without any serialization between the two.

Fixes: c3696046beb3 ("fuse: separate pqueue for clones")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: fix double request_end()
Miklos Szeredi [Thu, 26 Jul 2018 14:13:11 +0000 (16:13 +0200)]
fuse: fix double request_end()

commit 87114373ea507895a62afb10d2910bd9adac35a8 upstream.

Refcounting of request is broken when fuse_abort_conn() is called and
request is on the fpq->io list:

 - ref is taken too late
 - then it is not dropped

Fixes: 0d8e84b0432b ("fuse: simplify request abort")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: fix initial parallel dirops
Miklos Szeredi [Thu, 26 Jul 2018 14:13:11 +0000 (16:13 +0200)]
fuse: fix initial parallel dirops

commit 63576c13bd17848376c8ba4a98f5d5151140c4ac upstream.

If parallel dirops are enabled in FUSE_INIT reply, then first operation may
leave fi->mutex held.

Reported-by: syzbot <syzbot+3f7b29af1baa9d0a55be@syzkaller.appspotmail.com>
Fixes: 5c672ab3f0ee ("fuse: serialize dirops by default")
Cc: <stable@vger.kernel.org> # v4.7
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: Don't access pipe->buffers without pipe_lock()
Andrey Ryabinin [Tue, 17 Jul 2018 16:00:33 +0000 (19:00 +0300)]
fuse: Don't access pipe->buffers without pipe_lock()

commit a2477b0e67c52f4364a47c3ad70902bc2a61bd4c upstream.

fuse_dev_splice_write() reads pipe->buffers to determine the size of
'bufs' array before taking the pipe_lock(). This is not safe as
another thread might change the 'pipe->buffers' between the allocation
and taking the pipe_lock(). So we end up with too small 'bufs' array.

Move the bufs allocations inside pipe_lock()/pipe_unlock() to fix this.

Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org> # v2.6.35
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/kvm/vmx: Remove duplicate l1d flush definitions
Josh Poimboeuf [Tue, 14 Aug 2018 17:32:19 +0000 (12:32 -0500)]
x86/kvm/vmx: Remove duplicate l1d flush definitions

commit 94d7a86c21a3d6046bf4616272313cb7d525075a upstream.

These are already defined higher up in the file.

Fixes: 7db92e165ac8 ("x86/kvm: Move l1tf setup function")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/d7ca03ae210d07173452aeed85ffe344301219a5.1534253536.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
Thomas Gleixner [Sun, 12 Aug 2018 18:41:45 +0000 (20:41 +0200)]
KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled

commit 024d83cadc6b2af027e473720f3c3da97496c318 upstream.

Mikhail reported the following lockdep splat:

WARNING: possible irq lock inversion dependency detected
CPU 0/KVM/10284 just changed the state of lock:
  000000000d538a88 (&st->lock){+...}, at:
  speculative_store_bypass_update+0x10b/0x170

but this lock was taken by another, HARDIRQ-safe lock
in the past:

(&(&sighand->siglock)->rlock){-.-.}

   and interrupts could create inverse lock ordering between them.

Possible interrupt unsafe locking scenario:

    CPU0                    CPU1
    ----                    ----
   lock(&st->lock);
                           local_irq_disable();
                           lock(&(&sighand->siglock)->rlock);
                           lock(&st->lock);
    <Interrupt>
     lock(&(&sighand->siglock)->rlock);
     *** DEADLOCK ***

The code path which connects those locks is:

   speculative_store_bypass_update()
   ssb_prctl_set()
   do_seccomp()
   do_syscall_64()

In svm_vcpu_run() speculative_store_bypass_update() is called with
interupts enabled via x86_virt_spec_ctrl_set_guest/host().

This is actually a false positive, because GIF=0 so interrupts are
disabled even if IF=1; however, we can easily move the invocations of
x86_virt_spec_ctrl_set_guest/host() into the interrupt disabled region to
cure it, and it's a good idea to keep the GIF=0/IF=1 area as small
and self-contained as possible.

Fixes: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD")
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: x86@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/process: Re-export start_thread()
Rian Hunter [Sun, 19 Aug 2018 23:08:53 +0000 (16:08 -0700)]
x86/process: Re-export start_thread()

commit dc76803e57cc86589c4efcb5362918f9b0c0436f upstream.

The consolidation of the start_thread() functions removed the export
unintentionally. This breaks binfmt handlers built as a module.

Add it back.

Fixes: e634d8fc792c ("x86-64: merge the standard and compat start_thread() functions")
Signed-off-by: Rian Hunter <rian@alum.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180819230854.7275-1-rian@alum.mit.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/vdso: Fix vDSO build if a retpoline is emitted
Andy Lutomirski [Thu, 16 Aug 2018 19:41:15 +0000 (12:41 -0700)]
x86/vdso: Fix vDSO build if a retpoline is emitted

commit 2e549b2ee0e358bc758480e716b881f9cabedb6a upstream.

Currently, if the vDSO ends up containing an indirect branch or
call, GCC will emit the "external thunk" style of retpoline, and it
will fail to link.

Fix it by building the vDSO with inline retpoline thunks.

I haven't seen any reports of this triggering on an unpatched
kernel.

Fixes: commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Matt Rickard <matt@softrans.com.au>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Vas Dias <jason.vas.dias@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/c76538cd3afbe19c6246c2d1715bc6a60bd63985.1534448381.git.luto@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation/l1tf: Suggest what to do on systems with too much RAM
Vlastimil Babka [Thu, 23 Aug 2018 14:21:29 +0000 (16:21 +0200)]
x86/speculation/l1tf: Suggest what to do on systems with too much RAM

commit 6a012288d6906fee1dbc244050ade1dafe4a9c8d upstream.

Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective.

Make the warning more helpful by suggesting the proper mem=X kernel boot
parameter to make it effective and a link to the L1TF document to help
decide if the mitigation is worth the unusable RAM.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/966571f0-9d7f-43dc-92c6-a10eec7a1254@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM
Vlastimil Babka [Thu, 23 Aug 2018 13:44:18 +0000 (15:44 +0200)]
x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM

commit b0a182f875689647b014bc01d36b340217792852 upstream.

Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective. In
fact it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to
holes in the e820 map, the main region is almost 500MB over the 32GB limit:

[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable

Suggestions to use 'mem=32G' to enable the L1TF mitigation while losing the
500MB revealed, that there's an off-by-one error in the check in
l1tf_select_mitigation().

l1tf_pfn_limit() returns the last usable pfn (inclusive) and the range
check in the mitigation path does not take this into account.

Instead of amending the range check, make l1tf_pfn_limit() return the first
PFN which is over the limit which is less error prone. Adjust the other
users accordingly.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536

Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Reported-by: George Anchev <studio@anchev.net>
Reported-by: Christopher Snowhill <kode54@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180823134418.17008-1-vbabka@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
Vlastimil Babka [Mon, 20 Aug 2018 09:58:35 +0000 (11:58 +0200)]
x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit

commit 9df9516940a61d29aedf4d91b483ca6597e7d480 upstream.

On 32bit PAE kernels on 64bit hardware with enough physical bits,
l1tf_pfn_limit() will overflow unsigned long. This in turn affects
max_swapfile_size() and can lead to swapon returning -EINVAL. This has been
observed in a 32bit guest with 42 bits physical address size, where
max_swapfile_size() overflows exactly to 1 << 32, thus zero, and produces
the following warning to dmesg:

[    6.396845] Truncating oversized swap area, only using 0k out of 2047996k

Fix this by using unsigned long long instead.

Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Fixes: 377eeaa8e11f ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Reported-by: Dominique Leuenberger <dimstar@suse.de>
Reported-by: Adrian Schroeter <adrian@suse.de>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180820095835.5298-1-vbabka@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
Peter Zijlstra [Wed, 22 Aug 2018 15:30:15 +0000 (17:30 +0200)]
mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

commit d86564a2f085b79ec046a5cba90188e612352806 upstream.

Jann reported that x86 was missing required TLB invalidates when he
hit the !*batch slow path in tlb_remove_table().

This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
invalidates, the PowerPC-hash where this code originated and the
Sparc-hash where this was subsequently used did not need that. ARM
which later used this put an explicit TLB invalidate in their
__p*_free_tlb() functions, and PowerPC-radix followed that example.

But when we hooked up x86 we failed to consider this. Fix this by
(optionally) hooking tlb_remove_table() into the TLB invalidate code.

NOTE: s390 was also needing something like this and might now
      be able to use the generic code again.

[ Modified to be on top of Nick's cleanups, which simplified this patch
  now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]

Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm: move tlb_table_flush to tlb_flush_mmu_free
Nicholas Piggin [Thu, 23 Aug 2018 08:47:08 +0000 (18:47 +1000)]
mm: move tlb_table_flush to tlb_flush_mmu_free

commit db7ddef301128dad394f1c0f77027f86ee9a4edb upstream.

There is no need to call this from tlb_flush_mmu_tlbonly, it logically
belongs with tlb_flush_mmu_free.  This makes future fixes simpler.

[ This was originally done to allow code consolidation for the
  mmu_notifier fix, but it also ends up helping simplify the
  HAVE_RCU_TABLE_INVALIDATE fix.    - Linus ]

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoplatform/x86: ideapad-laptop: Apply no_hw_rfkill to Y20-15IKBM, too
Takashi Iwai [Fri, 22 Jun 2018 08:59:17 +0000 (10:59 +0200)]
platform/x86: ideapad-laptop: Apply no_hw_rfkill to Y20-15IKBM, too

commit 58e73aa177850babb947555257fd4f79e5275cf1 upstream.

The commit 5d9f40b56630 ("platform/x86: ideapad-laptop: Add
Y520-15IKBN to no_hw_rfkill") added the entry for Y20-15IKBN, and it
turned out that another variant, Y20-15IKBM, also requires the
no_hw_rfkill.

Trim the last letter from the string so that it matches to both
Y20-15IKBN and Y20-15IKBM models.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1098626
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
Michal Wnukowski [Wed, 15 Aug 2018 22:51:57 +0000 (15:51 -0700)]
nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event

commit f1ed3df20d2d223e0852cc4ac1f19bba869a7e3c upstream.

In many architectures loads may be reordered with older stores to
different locations.  In the nvme driver the following two operations
could be reordered:

 - Write shadow doorbell (dbbuf_db) into memory.
 - Read EventIdx (dbbuf_ei) from memory.

This can result in a potential race condition between driver and VM host
processing requests (if given virtual NVMe controller has a support for
shadow doorbell).  If that occurs, then the NVMe controller may decide to
wait for MMIO doorbell from guest operating system, and guest driver may
decide not to issue MMIO doorbell on any of subsequent commands.

This issue is purely timing-dependent one, so there is no easy way to
reproduce it. Currently the easiest known approach is to run "Oracle IO
Numbers" (orion) that is shipped with Oracle DB:

orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
concat -write 40 -duration 120 -matrix row -testname nvme_test

Where nvme_test is a .lun file that contains a list of NVMe block
devices to run test against. Limiting number of vCPUs assigned to given
VM instance seems to increase chances for this bug to occur. On test
environment with VM that got 4 NVMe drives and 1 vCPU assigned the
virtual NVMe controller hang could be observed within 10-20 minutes.
That correspond to about 400-500k IO operations processed (or about
100GB of IO read/writes).

Orion tool was used as a validation and set to run in a loop for 36
hours (equivalent of pushing 550M IO operations). No issues were
observed. That suggest that the patch fixes the issue.

Fixes: f9f38e33389c ("nvme: improve performance for virtual NVMe devices")
Signed-off-by: Michal Wnukowski <wnukowski@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: updated changelog and comment a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: reset error code in ext4_find_entry in fallback
Eric Sandeen [Sun, 29 Jul 2018 21:13:42 +0000 (17:13 -0400)]
ext4: reset error code in ext4_find_entry in fallback

commit f39b3f45dbcb0343822cce31ea7636ad66e60bc2 upstream.

When ext4_find_entry() falls back to "searching the old fashioned
way" due to a corrupt dx dir, it needs to reset the error code
to NULL so that the nonstandard ERR_BAD_DX_DIR code isn't returned
to userspace.

https://bugzilla.kernel.org/show_bug.cgi?id=199947

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@yandex.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: sysfs: print ext4_super_block fields as little-endian
Arnd Bergmann [Sun, 29 Jul 2018 19:48:00 +0000 (15:48 -0400)]
ext4: sysfs: print ext4_super_block fields as little-endian

commit a4d2aadca184ece182418950d45ba4ffc7b652d2 upstream.

While working on extended rand for last_error/first_error timestamps,
I noticed that the endianess is wrong; we access the little-endian
fields in struct ext4_super_block as native-endian when we print them.

This adds a special case in ext4_attr_show() and ext4_attr_store()
to byteswap the superblock fields if needed.

In older kernels, this code was part of super.c, it got moved to
sysfs.c in linux-4.4.

Cc: stable@vger.kernel.org
Fixes: 52c198c6820f ("ext4: add sysfs entry showing whether the fs contains errors")
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: check for NUL characters in extended attribute's name
Theodore Ts'o [Wed, 1 Aug 2018 16:36:52 +0000 (12:36 -0400)]
ext4: check for NUL characters in extended attribute's name

commit 7d95178c77014dbd8dce36ee40bbbc5e6c121ff5 upstream.

Extended attribute names are defined to be NUL-terminated, so the name
must not contain a NUL character.  This is important because there are
places when remove extended attribute, the code uses strlen to
determine the length of the entry.  That should probably be fixed at
some point, but code is currently really messy, so the simplest fix
for now is to simply validate that the extended attributes are sane.

https://bugzilla.kernel.org/show_bug.cgi?id=200401

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostop_machine: Atomically queue and wake stopper threads
Prasad Sodagudi [Fri, 3 Aug 2018 20:56:06 +0000 (13:56 -0700)]
stop_machine: Atomically queue and wake stopper threads

commit cfd355145c32bb7ccb65fccbe2d67280dc2119e1 upstream.

When cpu_stop_queue_work() releases the lock for the stopper
thread that was queued into its wake queue, preemption is
enabled, which leads to the following deadlock:

CPU0                              CPU1
sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)              stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)       cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
                                  -spins with preemption disabled,
                                   waiting for migration/0's lock to be
                                   released-

-adds work items for migration/0
and queues migration/0 to its
wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted,
and __set_cpus_allowed_ptr
has changed the thread's
cpu allowed mask to CPU1 only-

                                  -acquires migration/0 and migration/1's
                                   locks-

                                  -adds work for migration/0 but does not
                                   add migration/0 to wake_q, since it is
                                   already in a wake_q-

                                  -adds work for migration/1 and adds
                                   migration/1 to its wake_q-

                                  -releases migration/0 and migration/1's
                                   locks, wakes migration/1, and enables
                                   preemption-

                                  -since migration/1 is requested to run,
                                   migration/1 begins to run and waits on
                                   migration/0, but migration/0 will never
                                   be able to run, since the thread that
                                   can wake it is affine to CPU1-

Disable preemption in cpu_stop_queue_work() before queueing works for
stopper threads, and queueing the stopper thread in the wake queue, to
ensure that the operation of queueing the works and waking the stopper
threads is atomic.

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: matt@codeblueprint.co.uk
Cc: bigeasy@linutronix.de
Cc: gregkh@linuxfoundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1533329766-4856-1-git-send-email-isaacm@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Co-Developed-by: Isaac J. Manjarres <isaacm@codeaurora.org>
5 years agostop_machine: Reflow cpu_stop_queue_two_works()
Peter Zijlstra [Mon, 30 Jul 2018 11:21:40 +0000 (13:21 +0200)]
stop_machine: Reflow cpu_stop_queue_two_works()

commit b80a2bfce85e1051056d98d04ecb2d0b55cbbc1c upstream.

The code flow in cpu_stop_queue_two_works() is a little arcane; fix this by
lifting the preempt_disable() to the top to create more natural nesting wrt
the spinlocks and make the wake_up_q() and preempt_enable() unconditional
at the end.

Furthermore, enable preemption in the -EDEADLK case, such that we spin-wait
with preemption enabled.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: isaacm@codeaurora.org
Cc: matt@codeblueprint.co.uk
Cc: psodagud@codeaurora.org
Cc: gregkh@linuxfoundation.org
Cc: pkondeti@codeaurora.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180730112140.GH2494@hirez.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agos390/kvm: fix deadlock when killed by oom
Claudio Imbrenda [Mon, 16 Jul 2018 08:38:57 +0000 (10:38 +0200)]
s390/kvm: fix deadlock when killed by oom

commit 306d6c49ac9ded11114cb53b0925da52f2c2ada1 upstream.

When the oom killer kills a userspace process in the page fault handler
while in guest context, the fault handler fails to release the mm_sem
if the FAULT_FLAG_RETRY_NOWAIT option is set. This leads to a deadlock
when tearing down the mm when the process terminates. This bug can only
happen when pfault is enabled, so only KVM clients are affected.

The problem arises in the rare cases in which handle_mm_fault does not
release the mm_sem. This patch fixes the issue by manually releasing
the mm_sem when needed.

Fixes: 24eb3a824c4f3 ("KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault")
Cc: <stable@vger.kernel.org> # 3.15+
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: arm/arm64: Skip updating PTE entry if no change
Punit Agrawal [Mon, 13 Aug 2018 10:43:51 +0000 (11:43 +0100)]
KVM: arm/arm64: Skip updating PTE entry if no change

commit 976d34e2dab10ece5ea8fe7090b7692913f89084 upstream.

When there is contention on faulting in a particular page table entry
at stage 2, the break-before-make requirement of the architecture can
lead to additional refaulting due to TLB invalidation.

Avoid this by skipping a page table update if the new value of the PTE
matches the previous value.

Cc: stable@vger.kernel.org
Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: arm/arm64: Skip updating PMD entry if no change
Punit Agrawal [Mon, 13 Aug 2018 10:43:50 +0000 (11:43 +0100)]
KVM: arm/arm64: Skip updating PMD entry if no change

commit 86658b819cd0a9aa584cd84453ed268a6f013770 upstream.

Contention on updating a PMD entry by a large number of vcpus can lead
to duplicate work when handling stage 2 page faults. As the page table
update follows the break-before-make requirement of the architecture,
it can lead to repeated refaults due to clearing the entry and
flushing the tlbs.

This problem is more likely when -

* there are large number of vcpus
* the mapping is large block mapping

such as when using PMD hugepages (512MB) with 64k pages.

Fix this by skipping the page table update if there is no change in
the entry being updated.

Cc: stable@vger.kernel.org
Fixes: ad361f093c1e ("KVM: ARM: Support hugetlbfs backed huge pages")
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarm64: dts: rockchip: corrected uart1 clock-names for rk3328
Huibin Hong [Fri, 6 Jul 2018 08:03:57 +0000 (16:03 +0800)]
arm64: dts: rockchip: corrected uart1 clock-names for rk3328

commit d0414fdd58eb51ffd6528280fd66705123663964 upstream.

Corrected the uart clock-names or the uart driver might fail.

Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
Cc: stable@vger.kernel.org
Signed-off-by: Huibin Hong <huibin.hong@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
Greg Hackmann [Wed, 15 Aug 2018 19:51:21 +0000 (12:51 -0700)]
arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()

commit 5ad356eabc47d26a92140a0c4b20eba471c10de3 upstream.

ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input
before seeing if the PFN is valid.  This leads to false positives when
some of the upper bits are set, but the lower bits match a valid PFN.

For example, the following userspace code looks up a bogus entry in
/proc/kpageflags:

    int pagemap = open("/proc/self/pagemap", O_RDONLY);
    int pageflags = open("/proc/kpageflags", O_RDONLY);
    uint64_t pfn, val;

    lseek64(pagemap, [...], SEEK_SET);
    read(pagemap, &pfn, sizeof(pfn));
    if (pfn & (1UL << 63)) {        /* valid PFN */
        pfn &= ((1UL << 55) - 1);   /* clear flag bits */
        pfn |= (1UL << 55);
        lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET);
        read(pageflags, &val, sizeof(val));
    }

On ARM64 this causes the userspace process to crash with SIGSEGV rather
than reading (1 << KPF_NOPAGE).  kpageflags_read() treats the offset as
valid, and stable_page_flags() will try to access an address between the
user and kernel address ranges.

Fixes: c1cc1552616d ("arm64: MMU initialisation")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agokprobes/arm64: Fix %p uses in error messages
Masami Hiramatsu [Sat, 28 Apr 2018 12:38:04 +0000 (21:38 +0900)]
kprobes/arm64: Fix %p uses in error messages

commit 0722867dcbc28cc9b269b57acd847c7c1aa638d6 upstream.

Fix %p uses in error messages by removing it because
those are redundant or meaningless.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Howells <dhowells@redhat.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jon Medhurst <tixy@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tobin C . Harding <me@tobin.cc>
Cc: acme@kernel.org
Cc: akpm@linux-foundation.org
Cc: brueckner@linux.vnet.ibm.com
Cc: linux-arch@vger.kernel.org
Cc: rostedt@goodmis.org
Cc: schwidefsky@de.ibm.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/lkml/152491908405.9916.12425053035317241111.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoprintk/nmi: Prevent deadlock when accessing the main log buffer in NMI
Petr Mladek [Wed, 27 Jun 2018 14:20:28 +0000 (16:20 +0200)]
printk/nmi: Prevent deadlock when accessing the main log buffer in NMI

commit 03fc7f9c99c1e7ae2925d459e8487f1a6f199f79 upstream.

The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI
when logbuf_lock is available") brought back the possible deadlocks
in printk() and NMI.

The check of logbuf_lock is done only in printk_nmi_enter() to prevent
mixed output. But another CPU might take the lock later, enter NMI, and:

      + Both NMIs might be serialized by yet another lock, for example,
the one in nmi_cpu_backtrace().

      + The other CPU might get stopped in NMI, see smp_send_stop()
in panic().

The only safe solution is to use trylock when storing the message
into the main log-buffer. It might cause reordering when some lines
go to the main lock buffer directly and others are delayed via
the per-CPU buffer. It means that it is not useful in general.

This patch replaces the problematic NMI deferred context with NMI
direct context. It can be used to mark a code that might produce
many messages in NMI and the risk of losing them is more critical
than problems with eventual reordering.

The context is then used when dumping trace buffers on oops. It was
the primary motivation for the original fix. Also the reordering is
even smaller issue there because some traces have their own time stamps.

Finally, nmi_cpu_backtrace() need not longer be serialized because
it will always us the per-CPU buffers again.

Fixes: 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available")
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20180627142028.11259-1-pmladek@suse.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoprintk: Create helper function to queue deferred console handling
Petr Mladek [Wed, 27 Jun 2018 14:08:16 +0000 (16:08 +0200)]
printk: Create helper function to queue deferred console handling

commit a338f84dc196f44b63ba0863d2f34fd9b1613572 upstream.

It is just a preparation step. The patch does not change
the existing behavior.

Link: http://lkml.kernel.org/r/20180627140817.27764-3-pmladek@suse.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoprintk: Split the code for storing a message into the log buffer
Petr Mladek [Wed, 27 Jun 2018 14:08:15 +0000 (16:08 +0200)]
printk: Split the code for storing a message into the log buffer

commit ba552399954dde1b388f7749fecad5c349216981 upstream.

It is just a preparation step. The patch does not change
the existing behavior.

Link: http://lkml.kernel.org/r/20180627140817.27764-2-pmladek@suse.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoiommu/arm-smmu: Error out only if not enough context interrupts
Vivek Gautam [Thu, 19 Jul 2018 17:53:56 +0000 (23:23 +0530)]
iommu/arm-smmu: Error out only if not enough context interrupts

commit d1e20222d5372e951bbb2fd3f6489ec4a6ea9b11 upstream.

Currently we check if the number of context banks is not equal to
num_context_interrupts. However, there are booloaders such as, one
on sdm845 that reserves few context banks and thus kernel views
less than the total available context banks.
So, although the hardware definition in device tree would mention
the correct number of context interrupts, this number can be
greater than the number of context banks visible to smmu in kernel.
We should therefore error out only when the number of context banks
is greater than the available number of context interrupts.

Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Suggested-by: Tomasz Figa <tfiga@chromium.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[will: drop useless printk]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoBtrfs: fix btrfs_write_inode vs delayed iput deadlock
Josef Bacik [Fri, 20 Jul 2018 18:46:10 +0000 (11:46 -0700)]
Btrfs: fix btrfs_write_inode vs delayed iput deadlock

commit 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f upstream.

We recently ran into the following deadlock involving
btrfs_write_inode():

[  +0.005066]  __schedule+0x38e/0x8c0
[  +0.007144]  schedule+0x36/0x80
[  +0.006447]  bit_wait+0x11/0x60
[  +0.006446]  __wait_on_bit+0xbe/0x110
[  +0.007487]  ? bit_wait_io+0x60/0x60
[  +0.007319]  __inode_wait_for_writeback+0x96/0xc0
[  +0.009568]  ? autoremove_wake_function+0x40/0x40
[  +0.009565]  inode_wait_for_writeback+0x21/0x30
[  +0.009224]  evict+0xb0/0x190
[  +0.006099]  iput+0x1a8/0x210
[  +0.006103]  btrfs_run_delayed_iputs+0x73/0xc0
[  +0.009047]  btrfs_commit_transaction+0x799/0x8c0
[  +0.009567]  btrfs_write_inode+0x81/0xb0
[  +0.008008]  __writeback_single_inode+0x267/0x320
[  +0.009569]  writeback_sb_inodes+0x25b/0x4e0
[  +0.008702]  wb_writeback+0x102/0x2d0
[  +0.007487]  wb_workfn+0xa4/0x310
[  +0.006794]  ? wb_workfn+0xa4/0x310
[  +0.007143]  process_one_work+0x150/0x410
[  +0.008179]  worker_thread+0x6d/0x520
[  +0.007490]  kthread+0x12c/0x160
[  +0.006620]  ? put_pwq_unlocked+0x80/0x80
[  +0.008185]  ? kthread_park+0xa0/0xa0
[  +0.007484]  ? do_syscall_64+0x53/0x150
[  +0.007837]  ret_from_fork+0x29/0x40

Writeback calls:

btrfs_write_inode
  btrfs_commit_transaction
    btrfs_run_delayed_iputs

If iput() is called on that same inode, evict() will wait for writeback
forever.

btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.

CC: stable@vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik@fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agobtrfs: don't leak ret from do_chunk_alloc
Josef Bacik [Thu, 19 Jul 2018 14:49:51 +0000 (10:49 -0400)]
btrfs: don't leak ret from do_chunk_alloc

commit 4559b0a71749c442d34f7cfb9e72c9e58db83948 upstream.

If we're trying to make a data reservation and we have to allocate a
data chunk we could leak ret == 1, as do_chunk_alloc() will return 1 if
it allocated a chunk.  Since the end of the function is the success path
just return 0.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agobtrfs: use correct compare function of dirty_metadata_bytes
Ethan Lien [Mon, 2 Jul 2018 07:44:58 +0000 (15:44 +0800)]
btrfs: use correct compare function of dirty_metadata_bytes

commit d814a49198eafa6163698bdd93961302f3a877a4 upstream.

We use customized, nodesize batch value to update dirty_metadata_bytes.
We should also use batch version of compare function or we will easily
goto fast path and get false result from percpu_counter_compare().

Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Ethan Lien <ethanlien@synology.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosmb3: fill in statfs fsid and correct namelen
Steve French [Mon, 25 Jun 2018 04:18:52 +0000 (23:18 -0500)]
smb3: fill in statfs fsid and correct namelen

commit 21ba3845b59c733a79ed4fe1c4f3732e7ece9df7 upstream.

Fil in the correct namelen (typically 255 not 4096) in the
statfs response and also fill in a reasonably unique fsid
(in this case taken from the volume id, and the creation time
of the volume).

In the case of the POSIX statfs all fields are now filled in,
and in the case of non-POSIX mounts, all fields are filled
in which can be.

Signed-off-by: Steve French <stfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosmb3: don't request leases in symlink creation and query
Steve French [Sat, 28 Jul 2018 03:01:49 +0000 (22:01 -0500)]
smb3: don't request leases in symlink creation and query

commit 22783155f4bf956c346a81624ec9258930a6fe06 upstream.

Fixes problem pointed out by Pavel in discussions about commit
729c0c9dd55204f0c9a823ac8a7bfa83d36c7e78

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org> # 3.18.x+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosmb3: Do not send SMB3 SET_INFO if nothing changed
Steve French [Fri, 3 Aug 2018 01:28:18 +0000 (20:28 -0500)]
smb3: Do not send SMB3 SET_INFO if nothing changed

commit fd09b7d3b352105f08b8e02f7afecf7e816380ef upstream.

An earlier commit had a typo which prevented the
optimization from working:

commit 18dd8e1a65dd ("Do not send SMB3 SET_INFO request if nothing is changing")

Thank you to Metze for noticing this.  Also clear a
reserved field in the FILE_BASIC_INFO struct we send
that should be zero (all the other fields in that
struct were set or cleared explicitly already in
cifs_set_file_info).

Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org> # 4.9.x+
Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosmb3: enumerating snapshots was leaving part of the data off end
Steve French [Thu, 9 Aug 2018 19:33:12 +0000 (14:33 -0500)]
smb3: enumerating snapshots was leaving part of the data off end

commit e02789a53d71334b067ad72eee5d4e88a0158083 upstream.

When enumerating snapshots, the last few bytes of the final
snapshot could be left off since we were miscalculating the
length returned (leaving off the sizeof struct SRV_SNAPSHOT_ARRAY)
See MS-SMB2 section 2.2.32.2. In addition fixup the length used
to allow smaller buffer to be passed in, in order to allow
returning the size of the whole snapshot array more easily.

Sample userspace output with a kernel patched with this
(mounted to a Windows volume with two snapshots).
Before this patch, the second snapshot would be missing a
few bytes at the end.

~/cifs-2.6# ~/enum-snapshots /mnt/file
press enter to issue the ioctl to retrieve snapshot information ...

size of snapshot array = 102
Num snapshots: 2 Num returned: 2 Array Size: 102

Snapshot 0:@GMT-2018.06.30-19.34.17
Snapshot 1:@GMT-2018.06.30-19.33.37

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocifs: check kmalloc before use
Nicholas Mc Guire [Thu, 23 Aug 2018 10:24:02 +0000 (12:24 +0200)]
cifs: check kmalloc before use

commit 126c97f4d0d1b5b956e8b0740c81a2b2a2ae548c upstream.

The kmalloc was not being checked - if it fails issue a warning
and return -ENOMEM to the caller.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Fixes: b8da344b74c8 ("cifs: dynamic allocation of ntlmssp blob")
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
cc: Stable <stable@vger.kernel.org>`
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocifs: add missing debug entries for kconfig options
Steve French [Thu, 28 Jun 2018 23:46:40 +0000 (18:46 -0500)]
cifs: add missing debug entries for kconfig options

commit 950132afd59385caf6e2b84e5235d069fa10681d upstream.

/proc/fs/cifs/DebugData displays the features (Kconfig options)
used to build cifs.ko but it was missing some, and needed comma
separator.  These can be useful in debugging certain problems
so we know which optional features were enabled in the user's build.
Also clarify them, by making them more closely match the
corresponding CONFIG_CIFS_* parm.

Old format:
Features: dfs fscache posix spnego xattr acl

New format:
Features: DFS,FSCACHE,SMB_DIRECT,STATS,DEBUG2,ALLOW_INSECURE_LEGACY,CIFS_POSIX,UPCALL(SPNEGO),XATTR,ACL

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomei: don't update offset in write
Alexander Usyskin [Mon, 9 Jul 2018 09:21:44 +0000 (12:21 +0300)]
mei: don't update offset in write

commit a103af1b64d74853a5e08ca6c86aeb0e5c6ca4f1 upstream.

MEI enables writes of complete messages only
while read can be performed in parts, hence
write should not update the file offset to
not break interleaving partial reads with writes.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm/memory.c: check return value of ioremap_prot
jie@chenjie6@huwei.com [Sat, 11 Aug 2018 00:23:06 +0000 (17:23 -0700)]
mm/memory.c: check return value of ioremap_prot

[ Upstream commit 24eee1e4c47977bdfb71d6f15f6011e7b6188d04 ]

ioremap_prot() can return NULL which could lead to an oops.

Link: http://lkml.kernel.org/r/1533195441-58594-1-git-send-email-chenjie6@huawei.com
Signed-off-by: chen jie <chenjie6@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: chenjie <chenjie6@huawei.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: vmw_pvscsi: Return DID_RESET for status SAM_STAT_COMMAND_TERMINATED
Jim Gill [Thu, 2 Aug 2018 21:13:30 +0000 (14:13 -0700)]
scsi: vmw_pvscsi: Return DID_RESET for status SAM_STAT_COMMAND_TERMINATED

[ Upstream commit e95153b64d03c2b6e8d62e51bdcc33fcad6e0856 ]

Commands that are reset are returned with status
SAM_STAT_COMMAND_TERMINATED. PVSCSI currently returns DID_OK |
SAM_STAT_COMMAND_TERMINATED which fails the command. Instead, set hostbyte
to DID_RESET to allow upper layers to retry.

Tested by copying a large file between two pvscsi disks on same adapter
while performing a bus reset at 1-second intervals. Before fix, commands
sometimes fail with DID_OK. After fix, commands observed to fail with
DID_RESET.

Signed-off-by: Jim Gill <jgill@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO
Johannes Thumshirn [Tue, 31 Jul 2018 13:46:03 +0000 (15:46 +0200)]
scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO

[ Upstream commit 1550ec458e0cf1a40a170ab1f4c46e3f52860f65 ]

When receiving a LOGO request we forget to clear the FC_RP_STARTED flag
before starting the rport delete routine.

As the started flag was not cleared, we're not deleting the rport but
waiting for a restart and thus are keeping the reference count of the rdata
object at 1.

This leads to the following kmemleak report:
unreferenced object 0xffff88006542aa00 (size 512):
  comm "kworker/0:2", pid 24, jiffies 4294899222 (age 226.880s)
  hex dump (first 32 bytes):
    68 96 fe 65 00 88 ff ff 00 00 00 00 00 00 00 00  h..e............
    01 00 00 00 08 00 00 00 02 c5 45 24 ac b8 00 10  ..........E$....
  backtrace:
    [<(____ptrval____)>] fcoe_ctlr_vn_add.isra.5+0x7f/0x770 [libfcoe]
    [<(____ptrval____)>] fcoe_ctlr_vn_recv+0x12af/0x27f0 [libfcoe]
    [<(____ptrval____)>] fcoe_ctlr_recv_work+0xd01/0x32f0 [libfcoe]
    [<(____ptrval____)>] process_one_work+0x7ff/0x1420
    [<(____ptrval____)>] worker_thread+0x87/0xef0
    [<(____ptrval____)>] kthread+0x2db/0x390
    [<(____ptrval____)>] ret_from_fork+0x35/0x40
    [<(____ptrval____)>] 0xffffffffffffffff

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: ard <ard@kwaak.net>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: fcoe: drop frames in ELS LOGO error path
Johannes Thumshirn [Tue, 31 Jul 2018 13:46:02 +0000 (15:46 +0200)]
scsi: fcoe: drop frames in ELS LOGO error path

[ Upstream commit 63d0e3dffda311e77b9a8c500d59084e960a824a ]

Drop the frames in the ELS LOGO error path instead of just returning an
error.

This fixes the following kmemleak report:
unreferenced object 0xffff880064cb1000 (size 424):
  comm "kworker/0:2", pid 24, jiffies 4294904293 (age 68.504s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] _fc_frame_alloc+0x2c/0x180 [libfc]
    [<(____ptrval____)>] fc_lport_enter_logo+0x106/0x360 [libfc]
    [<(____ptrval____)>] fc_fabric_logoff+0x8c/0xc0 [libfc]
    [<(____ptrval____)>] fcoe_if_destroy+0x79/0x3b0 [fcoe]
    [<(____ptrval____)>] fcoe_destroy_work+0xd2/0x170 [fcoe]
    [<(____ptrval____)>] process_one_work+0x7ff/0x1420
    [<(____ptrval____)>] worker_thread+0x87/0xef0
    [<(____ptrval____)>] kthread+0x2db/0x390
    [<(____ptrval____)>] ret_from_fork+0x35/0x40
    [<(____ptrval____)>] 0xffffffffffffffff

which can be triggered by issuing
echo eth0 > /sys/bus/fcoe/ctlr_destroy

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: fcoe: fix use-after-free in fcoe_ctlr_els_send
Johannes Thumshirn [Tue, 31 Jul 2018 13:46:01 +0000 (15:46 +0200)]
scsi: fcoe: fix use-after-free in fcoe_ctlr_els_send

[ Upstream commit 2d7d4fd35e6e15b47c13c70368da83add19f01e7 ]

KASAN reports a use-after-free in fcoe_ctlr_els_send() when we're sending a
LOGO and have FIP debugging enabled. This is because we're first freeing
the skb and then printing the frame's DID. But the DID is a member of the
FC frame header which in turn is the skb's payload.

Exchange the debug print and kfree_skb() calls so we're not touching the
freed data.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogpiolib-acpi: make sure we trigger edge events at least once on boot
Benjamin Tissoires [Thu, 12 Jul 2018 15:25:06 +0000 (17:25 +0200)]
gpiolib-acpi: make sure we trigger edge events at least once on boot

[ Upstream commit ca876c7483b697b498868b1f575997191b077885 ]

On some systems using edge triggered ACPI Event Interrupts, the initial
state at boot is not setup by the firmware, instead relying on the edge
irq event handler running at least once to setup the initial state.

2 known examples of this are:

1) The Surface 3 has its _LID state controlled by an ACPI operation region
 triggered by a GPIO event:

 OperationRegion (GPOR, GeneralPurposeIo, Zero, One)
 Field (GPOR, ByteAcc, NoLock, Preserve)
 {
     Connection (
         GpioIo (Shared, PullNone, 0x0000, 0x0000, IoRestrictionNone,
             "\\_SB.GPO0", 0x00, ResourceConsumer, ,
             )
             {   // Pin list
                 0x004C
             }
     ),
     HELD,   1
 }

 Method (_E4C, 0, Serialized)  // _Exx: Edge-Triggered GPE
 {
     If ((HELD == One))
     {
         ^^LID.LIDB = One
     }
     Else
     {
         ^^LID.LIDB = Zero
         Notify (LID, 0x80) // Status Change
     }

     Notify (^^PCI0.SPI1.NTRG, One) // Device Check
 }

 Currently, the state of LIDB is wrong until the user actually closes or
 open the cover. We need to trigger the GPIO event once to update the
 internal ACPI state.

 Coincidentally, this also enables the Surface 2 integrated HID sensor hub
 which also requires an ACPI gpio operation region to start initialization.

2) Various Bay Trail based tablets come with an external USB mux and
 TI T1210B USB phy to enable USB gadget mode. The mux is controlled by a
 GPIO which is controlled by an edge triggered ACPI Event Interrupt which
 monitors the micro-USB ID pin.

 When the tablet is connected to a PC (or no cable is plugged in), the ID
 pin is high and the tablet should be in gadget mode. But the GPIO
 controlling the mux is initialized by the firmware so that the USB data
 lines are muxed to the host controller.

 This means that if the user wants to use gadget mode, the user needs to
 first plug in a host-cable to force the ID pin low and then unplug it
 and connect the tablet to a PC, to get the ACPI event handler to run and
 switch the mux to device mode,

This commit fixes both by running the event-handler once on boot.

Note that the running of the event-handler is done from a late_initcall,
this is done because the handler AML code may rely on OperationRegions
registered by other builtin drivers. This avoids errors like these:

[    0.133026] ACPI Error: No handler for Region [XSCG] ((____ptrval____)) [GenericSerialBus] (20180531/evregion-132)
[    0.133036] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20180531/exfldio-265)
[    0.133046] ACPI Error: Method parse/execution failed \_SB.GPO2._E12, AE_NOT_EXIST (20180531/psparse-516)

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
[hdegoede: Document BYT USB mux reliance on initial trigger]
[hdegoede: Run event handler from a late_initcall, rather then immediately]
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomemcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
Kirill Tkhai [Thu, 2 Aug 2018 22:36:01 +0000 (15:36 -0700)]
memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure

[ Upstream commit 7e97de0b033bcac4fa9a35cef72e0c06e6a22c67 ]

In case of memcg_online_kmem() failure, memcg_cgroup::id remains hashed
in mem_cgroup_idr even after memcg memory is freed.  This leads to leak
of ID in mem_cgroup_idr.

This patch adds removal into mem_cgroup_css_alloc(), which fixes the
problem.  For better readability, it adds a generic helper which is used
in mem_cgroup_alloc() and mem_cgroup_id_put_many() as well.

Link: http://lkml.kernel.org/r/152354470916.22460.14397070748001974638.stgit@localhost.localdomain
Fixes 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrivers: net: lmc: fix case value for target abort error
Colin Ian King [Wed, 1 Aug 2018 17:22:41 +0000 (18:22 +0100)]
drivers: net: lmc: fix case value for target abort error

[ Upstream commit afb41bb039656f0cecb54eeb8b2e2088201295f5 ]

Current value for a target abort error is 0x010, however, this value
should in fact be 0x002.  As it stands, the range of error is 0..7 so
it is currently never being detected.  This bug has been in the driver
since the early 2.6.12 days (or before).

Detected by CoverityScan, CID#744290 ("Logically dead code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoSquashfs: Compute expected length from inode size rather than block length
Phillip Lougher [Thu, 2 Aug 2018 15:45:15 +0000 (16:45 +0100)]
Squashfs: Compute expected length from inode size rather than block length

[ Upstream commit a3f94cb99a854fa381fe7fadd97c4f61633717a5 ]

Previously in squashfs_readpage() when copying data into the page
cache, it used the length of the datablock read from the filesystem
(after decompression).  However, if the filesystem has been corrupted
this data block may be short, which will leave pages unfilled.

The fix for this is to compute the expected number of bytes to copy
from the inode size, and use this to detect if the block is short.

Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: Анатолий Тросиненко <anatoly.trosinenko@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm: delete historical BUG from zap_pmd_range()
Hugh Dickins [Wed, 1 Aug 2018 18:31:52 +0000 (11:31 -0700)]
mm: delete historical BUG from zap_pmd_range()

[ Upstream commit 53406ed1bcfdabe4b5bc35e6d17946c6f9f563e2 ]

Delete the old VM_BUG_ON_VMA() from zap_pmd_range(), which asserted
that mmap_sem must be held when splitting an "anonymous" vma there.
Whether that's still strictly true nowadays is not entirely clear,
but the danger of sometimes crashing on the BUG is now fairly clear.

Even with the new stricter rules for anonymous vma marking, the
condition it checks for can possible trigger. Commit 44960f2a7b63
("staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem
pages") is good, and originally I thought it was safe from that
VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
insists on VM_SHARED.

But after I read John's earlier mail, drawing attention to the
vfs_fallocate() in there: I may be wrong, and I don't know if Android
has THP in the config anyway, but it looks to me like an
unmap_mapping_range() from ashmem's vfs_fallocate() could hit precisely
the VM_BUG_ON_VMA(), once it's vma_is_anonymous().

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosquashfs metadata 2: electric boogaloo
Linus Torvalds [Wed, 1 Aug 2018 17:38:43 +0000 (10:38 -0700)]
squashfs metadata 2: electric boogaloo

[ Upstream commit cdbb65c4c7ead680ebe54f4f0d486e2847a500ea ]

Anatoly continues to find issues with fuzzed squashfs images.

This time, corrupt, missing, or undersized data for the page filling
wasn't checked for, because the squashfs_{copy,read}_cache() functions
did the squashfs_copy_data() call without checking the resulting data
size.

Which could result in the page cache pages being incompletely filled in,
and no error indication to the user space reading garbage data.

So make a helper function for the "fill in pages" case, because the
exact same incomplete sequence existed in two places.

[ I should have made a squashfs branch for these things, but I didn't
  intend to start doing them in the first place.

  My historical connection through cramfs is why I got into looking at
  these issues at all, and every time I (continue to) think it's a
  one-off.

  Because _this_ time is always the last time. Right?   - Linus ]

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoenic: do not call enic_change_mtu in enic_probe
Govindarajulu Varadarajan [Mon, 30 Jul 2018 16:56:54 +0000 (09:56 -0700)]
enic: do not call enic_change_mtu in enic_probe

[ Upstream commit cb5c6568867325f9905e80c96531d963bec8e5ea ]

In commit ab123fe071c9 ("enic: handle mtu change for vf properly")
ASSERT_RTNL() is added to _enic_change_mtu() to prevent it from being
called without rtnl held. enic_probe() calls enic_change_mtu()
without rtnl held. At this point netdev is not registered yet.
Remove call to enic_change_mtu and assign the mtu to netdev->mtu.

Fixes: ab123fe071c9 ("enic: handle mtu change for vf properly")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosparc: use asm-generic version of msi.h
Thomas Petazzoni [Tue, 24 Jul 2018 11:53:05 +0000 (13:53 +0200)]
sparc: use asm-generic version of msi.h

[ Upstream commit 12be1036c536f849ad6f9bba73cffa708aa965c3 ]

This is necessary to be able to include <linux/msi.h> when
CONFIG_GENERIC_MSI_IRQ_DOMAIN is enabled. Without this, a build with
CONFIG_GENERIC_MSI_IRQ_DOMAIN fails with:

   In file included from drivers//ata/ahci.c:45:0:
>> include/linux/msi.h:226:10: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
             msi_alloc_info_t *arg);
             ^~~~~~~~~~~~~~~~
             sg_alloc_fn
   include/linux/msi.h:230:9: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
            msi_alloc_info_t *arg);
            ^~~~~~~~~~~~~~~~
            sg_alloc_fn
   include/linux/msi.h:239:12: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
               msi_alloc_info_t *arg);
               ^~~~~~~~~~~~~~~~
               sg_alloc_fn
   include/linux/msi.h:240:22: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
     void  (*msi_finish)(msi_alloc_info_t *arg, int retval);
                         ^~~~~~~~~~~~~~~~
                         sg_alloc_fn
   include/linux/msi.h:241:20: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
     void  (*set_desc)(msi_alloc_info_t *arg,
                       ^~~~~~~~~~~~~~~~
                       sg_alloc_fn
   include/linux/msi.h:316:18: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
           int nvec, msi_alloc_info_t *args);
                     ^~~~~~~~~~~~~~~~
                     sg_alloc_fn
   include/linux/msi.h:318:29: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
            int virq, int nvec, msi_alloc_info_t *args);
                                ^~~~~~~~~~~~~~~~
                                sg_alloc_fn

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosparc/time: Add missing __init to init_tick_ops()
Steven Rostedt (VMware) [Wed, 6 Jun 2018 14:11:10 +0000 (10:11 -0400)]
sparc/time: Add missing __init to init_tick_ops()

[ Upstream commit 6f57ed681ed817a4ec444e83f3aa2ad695d5ef34 ]

Code that was added to force gcc not to inline any function that isn't
explicitly declared as inline uncovered that init_tick_ops() isn't
marked as "__init". It is only called by __init functions and more
importantly it too calls an __init function which would require it to be
__init as well.

Link: http://lkml.kernel.org/r/201806060444.hdHcKOBy%fengguang.wu@intel.com
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarc: fix type warnings in arc/mm/cache.c
Randy Dunlap [Fri, 27 Jul 2018 03:16:35 +0000 (20:16 -0700)]
arc: fix type warnings in arc/mm/cache.c

[ Upstream commit ec837d620c750c0d4996a907c8c4f7febe1bbeee ]

Fix type warnings in arch/arc/mm/cache.c.

../arch/arc/mm/cache.c: In function 'flush_anon_page':
../arch/arc/mm/cache.c:1062:55: warning: passing argument 2 of '__flush_dcache_page' makes integer from pointer without a cast [-Wint-conversion]
  __flush_dcache_page((phys_addr_t)page_address(page), page_address(page));
                                                       ^~~~~~~~~~~~~~~~~~
../arch/arc/mm/cache.c:1013:59: note: expected 'long unsigned int' but argument is of type 'void *'
 void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr)
                                             ~~~~~~~~~~~~~~^~~~~

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarc: fix build errors in arc/include/asm/delay.h
Randy Dunlap [Fri, 27 Jul 2018 03:16:35 +0000 (20:16 -0700)]
arc: fix build errors in arc/include/asm/delay.h

[ Upstream commit 2423665ec53f2a29191b35382075e9834288a975 ]

Fix build errors in arch/arc/'s delay.h:
- add "extern unsigned long loops_per_jiffy;"
- add <asm-generic/types.h> for "u64"

In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32:
../arch/arc/include/asm/delay.h: In function '__udelay':
../arch/arc/include/asm/delay.h:61:12: error: 'u64' undeclared (first use in this function)
  loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
            ^~~

In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32:
../arch/arc/include/asm/delay.h: In function '__udelay':
../arch/arc/include/asm/delay.h:63:37: error: 'loops_per_jiffy' undeclared (first use in this function)
  loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
                                     ^~~~~~~~~~~~~~~

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c
Randy Dunlap [Fri, 27 Jul 2018 03:16:35 +0000 (20:16 -0700)]
arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c

[ Upstream commit 9e2ea405543d9ddfe05b351f1679e53bd9c11f80 ]

Fix printk format warning in arch/arc/plat-eznps/mtm.c:

In file included from ../include/linux/printk.h:7,
                 from ../include/linux/kernel.h:14,
                 from ../include/linux/list.h:9,
                 from ../include/linux/smp.h:12,
                 from ../arch/arc/plat-eznps/mtm.c:17:
../arch/arc/plat-eznps/mtm.c: In function 'set_mtm_hs_ctr':
../include/linux/kern_levels.h:5:18: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Wformat=]
 #define KERN_SOH "\001"  /* ASCII Start Of Header */
                  ^~~~~~
../include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH'
 #define KERN_ERR KERN_SOH "3" /* error conditions */
                  ^~~~~~~~
../include/linux/printk.h:308:9: note: in expansion of macro 'KERN_ERR'
  printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
         ^~~~~~~~
../arch/arc/plat-eznps/mtm.c:166:3: note: in expansion of macro 'pr_err'
   pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n",
   ^~~~~~
../arch/arc/plat-eznps/mtm.c:166:40: note: format string is defined here
   pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n",
                                       ~^
                                       %ld
The hs_ctr variable can just be int instead of long, so also change
kstrtol() to kstrtoint() and leave the format string as %d.

Also add 2 header files since they are used in mtm.c and we prefer
not to depend on accidental/indirect #includes.

Cc: linux-snps-arc@lists.infradead.org
Cc: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarc: [plat-eznps] fix data type errors in platform headers
Randy Dunlap [Sun, 29 Jul 2018 18:10:51 +0000 (11:10 -0700)]
arc: [plat-eznps] fix data type errors in platform headers

[ Upstream commit b1f32ce1c3d2c11959b7e6a2c58dc5197c581966 ]

Add <linux/types.h> to fix build errors.
Both ctop.h and <soc/nps/common.h> use u32 types and cause many
errors.

Examples:
../include/soc/nps/common.h:71:4: error: unknown type name 'u32'
    u32 __reserved:20, cluster:4, core:4, thread:4;
../include/soc/nps/common.h:76:3: error: unknown type name 'u32'
   u32 value;
../include/soc/nps/common.h:124:4: error: unknown type name 'u32'
    u32 base:8, cl_x:4, cl_y:4,
../include/soc/nps/common.h:127:3: error: unknown type name 'u32'
   u32 value;

../arch/arc/plat-eznps/include/plat/ctop.h:83:4: error: unknown type name 'u32'
    u32 gen:1, gdis:1, clk_gate_dis:1, asb:1,
../arch/arc/plat-eznps/include/plat/ctop.h:86:3: error: unknown type name 'u32'
   u32 value;
../arch/arc/plat-eznps/include/plat/ctop.h:93:4: error: unknown type name 'u32'
    u32 csa:22, dmsid:6, __reserved:3, cs:1;
../arch/arc/plat-eznps/include/plat/ctop.h:95:3: error: unknown type name 'u32'
   u32 value;

Cc: linux-snps-arc@lists.infradead.org
Cc: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc
Ofer Levi [Sat, 28 Jul 2018 07:54:41 +0000 (10:54 +0300)]
ARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc

[ Upstream commit 05b466bf846d2e8d2f0baf8dfd81a42cc933e237 ]

Fixing compilation issue caused by missing struct nps_host_reg_aux_dpc
definition.

Fixes: 3f9cd874dcc87 ("ARC: [plat-eznps] avoid toggling of DPC register")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoenic: handle mtu change for vf properly
Govindarajulu Varadarajan [Fri, 27 Jul 2018 18:19:29 +0000 (11:19 -0700)]
enic: handle mtu change for vf properly

[ Upstream commit ab123fe071c9aa9680ecd62eb080eb26cff4892c ]

When driver gets notification for mtu change, driver does not handle it for
all RQs. It handles only RQ[0].

Fix is to use enic_change_mtu() interface to change mtu for vf.

Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonfp: flower: fix port metadata conversion bug
John Hurley [Sat, 28 Jul 2018 03:56:52 +0000 (20:56 -0700)]
nfp: flower: fix port metadata conversion bug

[ Upstream commit ee614c871014045b45fae149b7245fc22a0bbdd8 ]

Function nfp_flower_repr_get_type_and_port expects an enum nfp_repr_type
return value but, if the repr type is unknown, returns a value of type
enum nfp_flower_cmsg_port_type.  This means that if FW encodes the port
ID in a way the driver does not understand instead of dropping the frame
driver may attribute it to a physical port (uplink) provided the port
number is less than physical port count.

Fix this and ensure a net_device of NULL is returned if the repr can not
be determined.

Fixes: 1025351a88a4 ("nfp: add flower app")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agobpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog()
Taehee Yoo [Sat, 28 Jul 2018 15:28:31 +0000 (00:28 +0900)]
bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog()

[ Upstream commit 71eb5255f55bdb484d35ff7c9a1803f453dfbf82 ]

bpf_parse_prog() is protected by rcu_read_lock().
so that GFP_KERNEL is not allowed in the bpf_parse_prog().

[51015.579396] =============================
[51015.579418] WARNING: suspicious RCU usage
[51015.579444] 4.18.0-rc6+ #208 Not tainted
[51015.579464] -----------------------------
[51015.579488] ./include/linux/rcupdate.h:303 Illegal context switch in RCU read-side critical section!
[51015.579510] other info that might help us debug this:
[51015.579532] rcu_scheduler_active = 2, debug_locks = 1
[51015.579556] 2 locks held by ip/1861:
[51015.579577]  #0: 00000000a8c12fd1 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x2e0/0x910
[51015.579711]  #1: 00000000bf815f8e (rcu_read_lock){....}, at: lwtunnel_build_state+0x96/0x390
[51015.579842] stack backtrace:
[51015.579869] CPU: 0 PID: 1861 Comm: ip Not tainted 4.18.0-rc6+ #208
[51015.579891] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[51015.579911] Call Trace:
[51015.579950]  dump_stack+0x74/0xbb
[51015.580000]  ___might_sleep+0x16b/0x3a0
[51015.580047]  __kmalloc_track_caller+0x220/0x380
[51015.580077]  kmemdup+0x1c/0x40
[51015.580077]  bpf_parse_prog+0x10e/0x230
[51015.580164]  ? kasan_kmalloc+0xa0/0xd0
[51015.580164]  ? bpf_destroy_state+0x30/0x30
[51015.580164]  ? bpf_build_state+0xe2/0x3e0
[51015.580164]  bpf_build_state+0x1bb/0x3e0
[51015.580164]  ? bpf_parse_prog+0x230/0x230
[51015.580164]  ? lock_is_held_type+0x123/0x1a0
[51015.580164]  lwtunnel_build_state+0x1aa/0x390
[51015.580164]  fib_create_info+0x1579/0x33d0
[51015.580164]  ? sched_clock_local+0xe2/0x150
[51015.580164]  ? fib_info_update_nh_saddr+0x1f0/0x1f0
[51015.580164]  ? sched_clock_local+0xe2/0x150
[51015.580164]  fib_table_insert+0x201/0x1990
[51015.580164]  ? lock_downgrade+0x610/0x610
[51015.580164]  ? fib_table_lookup+0x1920/0x1920
[51015.580164]  ? lwtunnel_valid_encap_type.part.6+0xcb/0x3a0
[51015.580164]  ? rtm_to_fib_config+0x637/0xbd0
[51015.580164]  inet_rtm_newroute+0xed/0x1b0
[51015.580164]  ? rtm_to_fib_config+0xbd0/0xbd0
[51015.580164]  rtnetlink_rcv_msg+0x331/0x910
[ ... ]

Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size
Eugeniy Paltsev [Thu, 26 Jul 2018 13:15:43 +0000 (16:15 +0300)]
ARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size

[ Upstream commit eb2777397fd83a4a7eaa26984d09d3babb845d2a ]

As for today we don't setup SMP_CACHE_BYTES and cache_line_size for
ARC, so they are set to L1_CACHE_BYTES by default. L1 line length
(L1_CACHE_BYTES) might be easily smaller than L2 line (which is
usually the case BTW). This breaks code.

For example this breaks ethernet infrastructure on HSDK/AXS103 boards
with IOC disabled, involving manual cache flushes
Functions which alloc and manage sk_buff packet data area rely on
SMP_CACHE_BYTES define. In the result we can share last L2 cache
line in sk_buff linear packet data area between DMA buffer and
some useful data in other structure. So we can lose this data when
we invalidate DMA buffer.

   sk_buff linear packet data area
                |
                |
                |         skb->end        skb->tail
                V            |                |
                             V                V
----------------------------------------------.
      packet data            | <tail padding> |  <useful data in other struct>
----------------------------------------------.

---------------------.--------------------------------------------------.
     SLC line        |             SLC (L2 cache) line (128B)           |
---------------------.--------------------------------------------------.
        ^                                     ^
        |                                     |
     These cache lines will be invalidated when we invalidate skb
     linear packet data area before DMA transaction starting.

This leads to issues painful to debug as it reproduces only if
(sk_buff->end - sk_buff->tail) < SLC_LINE_SIZE and
if we have some useful data right after sk_buff->end.

Fix that by hardcode SMP_CACHE_BYTES to max line length we may have.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoRevert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"
Rafał Miłecki [Fri, 27 Jul 2018 11:13:39 +0000 (13:13 +0200)]
Revert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"

[ Upstream commit d5ea019f8a381f88545bb26993b62ec24a2796b7 ]

This reverts commit 2a027b47dba6 ("MIPS: BCM47XX: Enable 74K Core
ExternalSync for PCIe erratum").

Enabling ExternalSync caused a regression for BCM4718A1 (used e.g. in
Netgear E3000 and ASUS RT-N16): it simply hangs during PCIe
initialization. It's likely that BCM4717A1 is also affected.

I didn't notice that earlier as the only BCM47XX devices with PCIe I
own are:
1) BCM4706 with 2 x 14e4:4331
2) BCM4706 with 14e4:4360 and 14e4:4331
it appears that BCM4706 is unaffected.

While BCM5300X-ES300-RDS.pdf seems to document that erratum and its
workarounds (according to quotes provided by Tokunori) it seems not even
Broadcom follows them.

According to the provided info Broadcom should define CONF7_ES in their
SDK's mipsinc.h and implement workaround in the si_mips_init(). Checking
both didn't reveal such code. It *could* mean Broadcom also had some
problems with the given workaround.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Michael Marley <michael@michaelmarley.com>
Patchwork: https://patchwork.linux-mips.org/patch/20032/
URL: https://bugs.openwrt.org/index.php?do=details&task_id=1688
Cc: Tokunori Ikegami <ikegami@allied-telesis.co.jp>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agotools/power turbostat: Read extended processor family from CPUID
Calvin Walton [Fri, 27 Jul 2018 11:50:53 +0000 (07:50 -0400)]
tools/power turbostat: Read extended processor family from CPUID

[ Upstream commit 5aa3d1a20a233d4a5f1ec3d62da3f19d9afea682 ]

This fixes the reported family on modern AMD processors (e.g. Ryzen,
which is family 0x17). Previously these processors all showed up as
family 0xf.

See the document
https://support.amd.com/TechDocs/56255_OSRR.pdf
section CPUID_Fn00000001_EAX for how to calculate the family
from the BaseFamily and ExtFamily values.

This matches the code in arch/x86/lib/cpu.c

Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agozswap: re-check zswap_is_full() after do zswap_shrink()
Li Wang [Thu, 26 Jul 2018 23:37:42 +0000 (16:37 -0700)]
zswap: re-check zswap_is_full() after do zswap_shrink()

[ Upstream commit 16e536ef47f567289a5699abee9ff7bb304bc12d ]

/sys/../zswap/stored_pages keeps rising in a zswap test with
"zswap.max_pool_percent=0" parameter.  But it should not compress or
store pages any more since there is no space in the compressed pool.

Reproduce steps:
  1. Boot kernel with "zswap.enabled=1"
  2. Set the max_pool_percent to 0
      # echo 0 > /sys/module/zswap/parameters/max_pool_percent
  3. Do memory stress test to see if some pages have been compressed
      # stress --vm 1 --vm-bytes $mem_available"M" --timeout 60s
  4. Watching the 'stored_pages' number increasing or not

The root cause is:

  When zswap_max_pool_percent is set to 0 via kernel parameter,
  zswap_is_full() will always return true due to zswap_shrink().  But if
  the shinking is able to reclain a page successfully the code then
  proceeds to compressing/storing another page, so the value of
  stored_pages will keep changing.

To solve the issue, this patch adds a zswap_is_full() check again after
  zswap_shrink() to make sure it's now under the max_pool_percent, and to
  not compress/store if we reached the limit.

Link: http://lkml.kernel.org/r/20180530103936.17812-1-liwang@redhat.com
Signed-off-by: Li Wang <liwang@redhat.com>
Acked-by: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Huang Ying <huang.ying.caritas@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoipc/sem.c: prevent queue.status tearing in semop
Davidlohr Bueso [Thu, 26 Jul 2018 23:37:19 +0000 (16:37 -0700)]
ipc/sem.c: prevent queue.status tearing in semop

[ Upstream commit f075faa300acc4f6301e348acde0a4580ed5f77c ]

In order for load/store tearing prevention to work, _all_ accesses to
the variable in question need to be done around READ and WRITE_ONCE()
macros.  Ensure everyone does so for q->status variable for
semtimedop().

Link: http://lkml.kernel.org/r/20180717052654.676-1-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agohinic: Link the logical network device to the pci device in sysfs
dann frazier [Mon, 23 Jul 2018 22:55:40 +0000 (16:55 -0600)]
hinic: Link the logical network device to the pci device in sysfs

[ Upstream commit 7856e8616273098dc6c09a6e084afd98a283ff0d ]

Otherwise interfaces get exposed under /sys/devices/virtual, which
doesn't give udev the context it needs for PCI-based predictable
interface names.

Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoselftests/ftrace: Add snapshot and tracing_on test case
Masami Hiramatsu [Fri, 13 Jul 2018 16:28:44 +0000 (01:28 +0900)]
selftests/ftrace: Add snapshot and tracing_on test case

[ Upstream commit 82f4f3e69c5c29bce940dd87a2c0f16c51d48d17 ]

Add a testcase for checking snapshot and tracing_on
relationship. This ensures that the snapshotting doesn't
affect current tracing on/off settings.

Link: http://lkml.kernel.org/r/153149932412.11274.15289227592627901488.stgit@devbox
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocachefiles: Wait rather than BUG'ing on "Unexpected object collision"
Kiran Kumar Modukuri [Thu, 21 Jun 2018 20:25:53 +0000 (13:25 -0700)]
cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

[ Upstream commit c2412ac45a8f8f1cd582723c1a139608694d410d ]

If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
the active object tree, we have been emitting a BUG after logging
information about it and the new object.

Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
on the old object (or return an error).  The ACTIVE flag should be cleared
after it has been removed from the active object tree.  A timeout of 60s is
used in the wait, so we shouldn't be able to get stuck there.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocachefiles: Fix refcounting bug in backing-file read monitoring
Kiran Kumar Modukuri [Tue, 18 Jul 2017 23:25:49 +0000 (16:25 -0700)]
cachefiles: Fix refcounting bug in backing-file read monitoring

[ Upstream commit 934140ab028713a61de8bca58c05332416d037d1 ]

cachefiles_read_waiter() has the right to access a 'monitor' object by
virtue of being called under the waitqueue lock for one of the pages in its
purview.  However, it has no ref on that monitor object or on the
associated operation.

What it is allowed to do is to move the monitor object to the operation's
to_do list, but once it drops the work_lock, it's actually no longer
permitted to access that object.  However, it is trying to enqueue the
retrieval operation for processing - but it can only do this via a pointer
in the monitor object, something it shouldn't be doing.

If it doesn't enqueue the operation, the operation may not get processed.
If the order is flipped so that the enqueue is first, then it's possible
for the work processor to look at the to_do list before the monitor is
enqueued upon it.

Fix this by getting a ref on the operation so that we can trust that it
will still be there once we've added the monitor to the to_do list and
dropped the work_lock.  The op can then be enqueued after the lock is
dropped.

The bug can manifest in one of a couple of ways.  The first manifestation
looks like:

 FS-Cache:
 FS-Cache: Assertion failed
 FS-Cache: 6 == 5 is false
 ------------[ cut here ]------------
 kernel BUG at fs/fscache/operation.c:494!
 RIP: 0010:fscache_put_operation+0x1e3/0x1f0
 ...
 fscache_op_work_func+0x26/0x50
 process_one_work+0x131/0x290
 worker_thread+0x45/0x360
 kthread+0xf8/0x130
 ? create_worker+0x190/0x190
 ? kthread_cancel_work_sync+0x10/0x10
 ret_from_fork+0x1f/0x30

This is due to the operation being in the DEAD state (6) rather than
INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through
fscache_put_operation().

The bug can also manifest like the following:

 kernel BUG at fs/fscache/operation.c:69!
 ...
    [exception RIP: fscache_enqueue_operation+246]
 ...
 #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
 #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
 #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028

I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not
entirely clear which assertion failed.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Lei Xue <carmark.dlut@gmail.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Reported-by: Anthony DeRobertis <aderobertis@metrics.net>
Reported-by: NeilBrown <neilb@suse.com>
Reported-by: Daniel Axtens <dja@axtens.net>
Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofscache: Allow cancelled operations to be enqueued
Kiran Kumar Modukuri [Wed, 25 Jul 2018 13:31:20 +0000 (14:31 +0100)]
fscache: Allow cancelled operations to be enqueued

[ Upstream commit d0eb06afe712b7b103b6361f40a9a0c638524669 ]

Alter the state-check assertion in fscache_enqueue_operation() to allow
cancelled operations to be given processing time so they can be cleaned up.

Also fix a debugging statement that was requiring such operations to have
an object assigned.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/boot: Fix if_changed build flip/flop bug
Kees Cook [Tue, 24 Jul 2018 23:08:27 +0000 (16:08 -0700)]
x86/boot: Fix if_changed build flip/flop bug

[ Upstream commit 92a4728608a8fd228c572bc8ff50dd98aa0ddf2a ]

Dirk Gouders reported that two consecutive "make" invocations on an
already compiled tree will show alternating behaviors:

$ make
  CALL    scripts/checksyscalls.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  DATAREL arch/x86/boot/compressed/vmlinux
Kernel: arch/x86/boot/bzImage is ready  (#48)
  Building modules, stage 2.
  MODPOST 165 modules

$ make
  CALL    scripts/checksyscalls.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  LD      arch/x86/boot/compressed/vmlinux
  ZOFFSET arch/x86/boot/zoffset.h
  AS      arch/x86/boot/header.o
  LD      arch/x86/boot/setup.elf
  OBJCOPY arch/x86/boot/setup.bin
  OBJCOPY arch/x86/boot/vmlinux.bin
  BUILD   arch/x86/boot/bzImage
Setup is 15644 bytes (padded to 15872 bytes).
System is 6663 kB
CRC 3eb90f40
Kernel: arch/x86/boot/bzImage is ready  (#48)
  Building modules, stage 2.
  MODPOST 165 modules

He bisected it back to:

    commit 98f78525371b ("x86/boot: Refuse to build with data relocations")

The root cause was the use of the "if_changed" kbuild function multiple
times for the same target. It was designed to only be used once per
target, otherwise it will effectively always trigger, flipping back and
forth between the two commands getting recorded by "if_changed". Instead,
this patch merges the two commands into a single function to get stable
build artifacts (i.e. .vmlinux.cmd), and a single build behavior.

Bisected-and-Reported-by: Dirk Gouders <dirk@gouders.net>
Fix-Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180724230827.GA37823@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE
Hailong Liu [Wed, 18 Jul 2018 00:46:55 +0000 (08:46 +0800)]
sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE

[ Upstream commit f3d133ee0a17d5694c6f21873eec9863e11fa423 ]

NO_RT_RUNTIME_SHARE feature is used to prevent a CPU borrow enough
runtime with a spin-rt-task.

However, if RT_RUNTIME_SHARE feature is enabled and rt_rq has borrowd
enough rt_runtime at the beginning, rt_runtime can't be restored to
its initial bandwidth rt_runtime after we disable RT_RUNTIME_SHARE.

E.g. on my PC with 4 cores, procedure to reproduce:
1) Make sure  RT_RUNTIME_SHARE is enabled
 cat /sys/kernel/debug/sched_features
  GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY
  CACHE_HOT_BUDDY WAKEUP_PREEMPTION NO_HRTICK NO_DOUBLE_TICK
  LB_BIAS NONTASK_CAPACITY TTWU_QUEUE NO_SIS_AVG_CPU SIS_PROP
  NO_WARN_DOUBLE_CLOCK RT_PUSH_IPI RT_RUNTIME_SHARE NO_LB_MIN
  ATTACH_AGE_LOAD WA_IDLE WA_WEIGHT WA_BIAS
2) Start a spin-rt-task
 ./loop_rr &
3) set affinity to the last cpu
 taskset -p 8 $pid_of_loop_rr
4) Observe that last cpu have borrowed enough runtime.
 cat /proc/sched_debug | grep rt_runtime
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 900.000000
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 1000.000000
5) Disable RT_RUNTIME_SHARE
 echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features
6) Observe that rt_runtime can not been restored
 cat /proc/sched_debug | grep rt_runtime
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 900.000000
  .rt_runtime                    : 950.000000
  .rt_runtime                    : 1000.000000

This patch help to restore rt_runtime after we disable
RT_RUNTIME_SHARE.

Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/1531874815-39357-1-git-send-email-liu.hailong6@zte.com.cn
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoi2c/mux, locking/core: Annotate the nested rt_mutex usage
Peter Rosin [Fri, 20 Jul 2018 08:39:14 +0000 (10:39 +0200)]
i2c/mux, locking/core: Annotate the nested rt_mutex usage

[ Upstream commit 7b94ea50514d1a0dc94f02723b603c27bc0ea597 ]

If an i2c topology has instances of nested muxes, then a lockdep splat
is produced when when i2c_parent_lock_bus() is called.  Here is an
example:

  ============================================
  WARNING: possible recursive locking detected
  --------------------------------------------
  insmod/68159 is trying to acquire lock:
    (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux]

  but task is already holding lock:
    (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux]

  other info that might help us debug this:
    Possible unsafe locking scenario:

          CPU0
          ----
     lock(i2c_register_adapter#2);
     lock(i2c_register_adapter#2);

    *** DEADLOCK ***

    May be due to missing lock nesting notation

  1 lock held by insmod/68159:
    #0:  (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux]

  stack backtrace:
  CPU: 13 PID: 68159 Comm: insmod Tainted: G           O
  Call Trace:
    dump_stack+0x67/0x98
    __lock_acquire+0x162e/0x1780
    lock_acquire+0xba/0x200
    rt_mutex_lock+0x44/0x60
    i2c_parent_lock_bus+0x32/0x50 [i2c_mux]
    i2c_parent_lock_bus+0x3e/0x50 [i2c_mux]
    i2c_smbus_xfer+0xf0/0x700
    i2c_smbus_read_byte+0x42/0x70
    my2c_init+0xa2/0x1000 [my2c]
    do_one_initcall+0x51/0x192
    do_init_module+0x62/0x216
    load_module+0x20f9/0x2b50
    SYSC_init_module+0x19a/0x1c0
    SyS_init_module+0xe/0x10
    do_syscall_64+0x6c/0x1a0
    entry_SYSCALL_64_after_hwframe+0x42/0xb7

Reported-by: John Sperbeck <jsperbeck@google.com>
Tested-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Deepa Dinamani <deepadinamani@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Chang <dpf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Link: http://lkml.kernel.org/r/20180720083914.1950-3-peda@axentia.se
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agolocking/rtmutex: Allow specifying a subclass for nested locking
Peter Rosin [Fri, 20 Jul 2018 08:39:13 +0000 (10:39 +0200)]
locking/rtmutex: Allow specifying a subclass for nested locking

[ Upstream commit 62cedf3e60af03e47849fe2bd6a03ec179422a8a ]

Needed for annotating rt_mutex locks.

Tested-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Deepa Dinamani <deepadinamani@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Chang <dpf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Link: http://lkml.kernel.org/r/20180720083914.1950-2-peda@axentia.se
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: axienet: Fix double deregister of mdio
Shubhrajyoti Datta [Tue, 24 Jul 2018 04:39:53 +0000 (10:09 +0530)]
net: axienet: Fix double deregister of mdio

[ Upstream commit 03bc7cab7d7218088412a75e141696a89059ab00 ]

If the registration fails then mdio_unregister is called.
However at unbind the unregister ia attempted again resulting
in the below crash

[   73.544038] kernel BUG at drivers/net/phy/mdio_bus.c:415!
[   73.549362] Internal error: Oops - BUG: 0 [#1] SMP
[   73.554127] Modules linked in:
[   73.557168] CPU: 0 PID: 2249 Comm: sh Not tainted 4.14.0 #183
[   73.562895] Hardware name: xlnx,zynqmp (DT)
[   73.567062] task: ffffffc879e41180 task.stack: ffffff800cbe0000
[   73.572973] PC is at mdiobus_unregister+0x84/0x88
[   73.577656] LR is at axienet_mdio_teardown+0x18/0x30
[   73.582601] pc : [<ffffff80085fa4cc>] lr : [<ffffff8008616858>]
pstate: 20000145
[   73.589981] sp : ffffff800cbe3c30
[   73.593277] x29: ffffff800cbe3c30 x28: ffffffc879e41180
[   73.598573] x27: ffffff8008a21000 x26: 0000000000000040
[   73.603868] x25: 0000000000000124 x24: ffffffc879efe920
[   73.609164] x23: 0000000000000060 x22: ffffffc879e02000
[   73.614459] x21: ffffffc879e02800 x20: ffffffc87b0b8870
[   73.619754] x19: ffffffc879e02800 x18: 000000000000025d
[   73.625050] x17: 0000007f9a719ad0 x16: ffffff8008195bd8
[   73.630345] x15: 0000007f9a6b3d00 x14: 0000000000000010
[   73.635640] x13: 74656e7265687465 x12: 0000000000000030
[   73.640935] x11: 0000000000000030 x10: 0101010101010101
[   73.646231] x9 : 241f394f42533300 x8 : ffffffc8799f6e98
[   73.651526] x7 : ffffffc8799f6f18 x6 : ffffffc87b0ba318
[   73.656822] x5 : ffffffc87b0ba498 x4 : 0000000000000000
[   73.662117] x3 : 0000000000000000 x2 : 0000000000000008
[   73.667412] x1 : 0000000000000004 x0 : ffffffc8799f4000
[   73.672708] Process sh (pid: 2249, stack limit = 0xffffff800cbe0000)

Fix the same by making the bus NULL on unregister.

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>