platform/kernel/linux-rpi.git
5 years agoMerge tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-block
Linus Torvalds [Mon, 25 Nov 2019 19:15:41 +0000 (11:15 -0800)]
Merge tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:
 "Here are the main block driver updates for 5.5. Nothing major in here,
  mostly just fixes. This contains:

   - a set of bcache changes via Coly

   - MD changes from Song

   - loop unmap write-zeroes fix (Darrick)

   - spelling fixes (Geert)

   - zoned additions cleanups to null_blk/dm (Ajay)

   - allow null_blk online submit queue changes (Bart)

   - NVMe changes via Keith, nothing major here either"

* tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-block: (56 commits)
  Revert "bcache: fix fifo index swapping condition in journal_pin_cmp()"
  drivers/md/raid5-ppl.c: use the new spelling of RWH_WRITE_LIFE_NOT_SET
  drivers/md/raid5.c: use the new spelling of RWH_WRITE_LIFE_NOT_SET
  bcache: don't export symbols
  bcache: remove the extra cflags for request.o
  bcache: at least try to shrink 1 node in bch_mca_scan()
  bcache: add idle_max_writeback_rate sysfs interface
  bcache: add code comments in bch_btree_leaf_dirty()
  bcache: fix deadlock in bcache_allocator
  bcache: add code comment bch_keylist_pop() and bch_keylist_pop_front()
  bcache: deleted code comments for dead code in bch_data_insert_keys()
  bcache: add more accurate error messages in read_super()
  bcache: fix static checker warning in bcache_device_free()
  bcache: fix a lost wake-up problem caused by mca_cannibalize_lock
  bcache: fix fifo index swapping condition in journal_pin_cmp()
  md/raid10: prevent access of uninitialized resync_pages offset
  md: avoid invalid memory access for array sb->dev_roles
  md/raid1: avoid soft lockup under high load
  null_blk: add zone open, close, and finish support
  dm: add zone open, close and finish support
  ...

5 years agoMerge tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block
Linus Torvalds [Mon, 25 Nov 2019 18:59:41 +0000 (10:59 -0800)]
Merge tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
 "Due to more granular branches, this one is small and will be followed
  with other core branches that add specific features. I meant to just
  have a core and drivers branch, but external dependencies we ended up
  adding a few more that are also core.

  The changes are:

   - Fixes and improvements for the zoned device support (Ajay, Damien)

   - sed-opal table writing and datastore UID (Revanth)

   - blk-cgroup (and bfq) blk-cgroup stat fixes (Tejun)

   - Improvements to the block stats tracking (Pavel)

   - Fix for overruning sysfs buffer for large number of CPUs (Ming)

   - Optimization for small IO (Ming, Christoph)

   - Fix typo in RWH lifetime hint (Eugene)

   - Dead code removal and documentation (Bart)

   - Reduction in memory usage for queue and tag set (Bart)

   - Kerneldoc header documentation (André)

   - Device/partition revalidation fixes (Jan)

   - Stats tracking for flush requests (Konstantin)

   - Various other little fixes here and there (et al)"

* tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block: (48 commits)
  Revert "block: split bio if the only bvec's length is > SZ_4K"
  block: add iostat counters for flush requests
  block,bfq: Skip tracing hooks if possible
  block: sed-opal: Introduce SUM_SET_LIST parameter and append it using 'add_token_u64'
  blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1
  block: Don't disable interrupts in trigger_softirq()
  sbitmap: Delete sbitmap_any_bit_clear()
  blk-mq: Delete blk_mq_has_free_tags() and blk_mq_can_queue()
  block: split bio if the only bvec's length is > SZ_4K
  block: still try to split bio if the bvec crosses pages
  blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT
  blk-cgroup: reimplement basic IO stats using cgroup rstat
  blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive()
  blk-throtl: stop using blkg->stat_bytes and ->stat_ios
  bfq-iosched: stop using blkg->stat_bytes and ->stat_ios
  bfq-iosched: relocate bfqg_*rwstat*() helpers
  block: add zone open, close and finish ioctl support
  block: add zone open, close and finish operations
  block: Simplify REQ_OP_ZONE_RESET_ALL handling
  block: Remove REQ_OP_ZONE_RESET plugging
  ...

5 years agoMerge tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-block
Linus Torvalds [Mon, 25 Nov 2019 18:57:53 +0000 (10:57 -0800)]
Merge tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-block

Pull libata updates from Jens Axboe:
 "Just a few fixes all over the place, support for the Annapurna SATA
  controller, and a patchset that cleans up the error defines and
  ultimately fixes anissue with sata_mv"

* tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-block:
  ata: pata_artop: make arrays static const, makes object smaller
  ata_piix: remove open-coded dmi_match(DMI_OEM_STRING)
  ata: sata_mv, avoid trigerrable BUG_ON
  ata: make qc_prep return ata_completion_errors
  ata: define AC_ERR_OK
  ata: Documentation, fix function names
  libata: Ensure ata_port probe has completed before detach
  ahci: tegra: use regulator_bulk_set_supply_names()
  ahci: Add support for Amazon's Annapurna Labs SATA controller

5 years agoMerge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block
Linus Torvalds [Mon, 25 Nov 2019 18:40:27 +0000 (10:40 -0800)]
Merge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:
 "A lot of stuff has been going on this cycle, with improving the
  support for networked IO (and hence unbounded request completion
  times) being one of the major themes. There's been a set of fixes done
  this week, I'll send those out as well once we're certain we're fully
  happy with them.

  This contains:

   - Unification of the "normal" submit path and the SQPOLL path (Pavel)

   - Support for sparse (and bigger) file sets, and updating of those
     file sets without needing to unregister/register again.

   - Independently sized CQ ring, instead of just making it always 2x
     the SQ ring size. This makes it more flexible for networked
     applications.

   - Support for overflowed CQ ring, never dropping events but providing
     backpressure on submits.

   - Add support for absolute timeouts, not just relative ones.

   - Support for generic cancellations. This divorces io_uring from
     workqueues as well, which additionally gets us one step closer to
     generic async system call support.

   - With cancellations, we can support grabbing the process file table
     as well, just like we do mm context. This allows support for system
     calls that create file descriptors, like accept4() support that's
     built on top of that.

   - Support for io_uring tracing (Dmitrii)

   - Support for linked timeouts. These abort an operation if it isn't
     completed by the time noted in the linke timeout.

   - Speedup tracking of poll requests

   - Various cleanups making the coder easier to follow (Jackie, Pavel,
     Bob, YueHaibing, me)

   - Update MAINTAINERS with new io_uring list"

* tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits)
  io_uring: make POLL_ADD/POLL_REMOVE scale better
  io-wq: remove now redundant struct io_wq_nulls_list
  io_uring: Fix getting file for non-fd opcodes
  io_uring: introduce req_need_defer()
  io_uring: clean up io_uring_cancel_files()
  io-wq: ensure free/busy list browsing see all items
  io-wq: ensure we have a stable view of ->cur_work for cancellations
  io_wq: add get/put_work handlers to io_wq_create()
  io_uring: check for validity of ->rings in teardown
  io_uring: fix potential deadlock in io_poll_wake()
  io_uring: use correct "is IO worker" helper
  io_uring: fix -ENOENT issue with linked timer with short timeout
  io_uring: don't do flush cancel under inflight_lock
  io_uring: flag SQPOLL busy condition to userspace
  io_uring: make ASYNC_CANCEL work with poll and timeout
  io_uring: provide fallback request for OOM situations
  io_uring: convert accept4() -ERESTARTSYS into -EINTR
  io_uring: fix error clear of ->file_table in io_sqe_files_register()
  io_uring: separate the io_free_req and io_free_req_find_next interface
  io_uring: keep io_put_req only responsible for release and put req
  ...

5 years agoMerge tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmdd
Linus Torvalds [Mon, 25 Nov 2019 18:29:42 +0000 (10:29 -0800)]
Merge tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmdd

Pull tpmd updates from Jarkko Sakkinen:

 - support for Cr50 fTPM

 - support for fTPM on AMD Zen+ CPUs

 - TPM 2.0 trusted keys code relocated from drivers/char/tpm to
   security/keys

* tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmdd:
  KEYS: trusted: Remove set but not used variable 'keyhndl'
  tpm: Switch to platform_get_irq_optional()
  tpm_crb: fix fTPM on AMD Zen+ CPUs
  KEYS: trusted: Move TPM2 trusted keys code
  KEYS: trusted: Create trusted keys subsystem
  KEYS: Use common tpm_buf for trusted and asymmetric keys
  tpm: Move tpm_buf code to include/linux/
  tpm: use GFP_KERNEL instead of GFP_HIGHMEM for tpm_buf
  tpm: add check after commands attribs tab allocation
  tpm: tpm_tis_spi: Drop THIS_MODULE usage from driver struct
  tpm: tpm_tis_spi: Cleanup includes
  tpm: tpm_tis_spi: Support cr50 devices
  tpm: tpm_tis_spi: Introduce a flow control callback
  tpm: Add a flag to indicate TPM power is managed by firmware
  dt-bindings: tpm: document properties for cr50
  tpm_tis: override durations for STM tpm with firmware 1.2.8.28
  tpm: provide a way to override the chip returned durations
  tpm: Remove duplicate code from caps_show() in tpm-sysfs.c

5 years agovfs: properly and reliably lock f_pos in fdget_pos()
Linus Torvalds [Mon, 11 Nov 2019 23:51:03 +0000 (15:51 -0800)]
vfs: properly and reliably lock f_pos in fdget_pos()

fdget_pos() is used by file operations that will read and update f_pos:
things like "read()", "write()" and "lseek()" (but not, for example,
"pread()/pwrite" that get their file positions elsewhere).

However, it had two separate escape clauses for this, because not
everybody wants or needs serialization of the file position.

The first and most obvious case is the "file descriptor doesn't have a
position at all", ie a stream-like file.  Except we didn't actually use
FMODE_STREAM, but instead used FMODE_ATOMIC_POS.  The reason for that
was that FMODE_STREAM didn't exist back in the days, but also that we
didn't want to mark all the special cases, so we only marked the ones
that _required_ position atomicity according to POSIX - regular files
and directories.

The case one was intentionally lazy, but now that we _do_ have
FMODE_STREAM we could and should just use it.  With the change to use
FMODE_STREAM, there are no remaining uses for FMODE_ATOMIC_POS, and all
the code to set it is deleted.

Any cases where we don't want the serialization because the driver (or
subsystem) doesn't use the file position should just be updated to do
"stream_open()".  We've done that for all the obvious and common
situations, we may need a few more.  Quoting Kirill Smelkov in the
original FMODE_STREAM thread (see link below for full email):

 "And I appreciate if people could help at least somehow with "getting
  rid of mixed case entirely" (i.e. always lock f_pos_lock on
  !FMODE_STREAM), because this transition starts to diverge from my
  particular use-case too far. To me it makes sense to do that
  transition as follows:

   - convert nonseekable_open -> stream_open via stream_open.cocci;
   - audit other nonseekable_open calls and convert left users that
     truly don't depend on position to stream_open;
   - extend stream_open.cocci to analyze alloc_file_pseudo as well (this
     will cover pipes and sockets), or maybe convert pipes and sockets
     to FMODE_STREAM manually;
   - extend stream_open.cocci to analyze file_operations that use
     no_llseek or noop_llseek, but do not use nonseekable_open or
     alloc_file_pseudo. This might find files that have stream semantic
     but are opened differently;
   - extend stream_open.cocci to analyze file_operations whose
     .read/.write do not use ppos at all (independently of how file was
     opened);
   - ...
   - after that remove FMODE_ATOMIC_POS and always take f_pos_lock if
     !FMODE_STREAM;
   - gather bug reports for deadlocked read/write and convert missed
     cases to FMODE_STREAM, probably extending stream_open.cocci along
     the road to catch similar cases

  i.e. always take f_pos_lock unless a file is explicitly marked as
  being stream, and try to find and cover all files that are streams"

We have not done the "extend stream_open.cocci to analyze
alloc_file_pseudo" as well, but the previous commit did manually handle
the case of pipes and sockets.

The other case where we can avoid locking f_pos is the "this file
descriptor only has a single user and it is us, and thus there is no
need to lock it".

The second test was correct, although a bit subtle and worth just
re-iterating here.  There are two kinds of other sources of references
to the same file descriptor: file descriptors that have been explicitly
shared across fork() or with dup(), and file tables having elevated
reference counts due to threading (or explicit file sharing with
clone()).

The first case would have incremented the file count explicitly, and in
the second case the previous __fdget() would have incremented it for us
and set the FDPUT_FPUT flag.

But in both cases the file count would be greater than one, so the
"file_count(file) > 1" test catches both situations.  Also note that if
file_count is 1, that also means that no other thread can have access to
the file table, so there also cannot be races with concurrent calls to
dup()/fork()/clone() that would increment the file count any other way.

Link: https://lore.kernel.org/linux-fsdevel/20190413184404.GA13490@deco.navytux.spb.ru
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Eic Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Marco Elver <elver@google.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Paul McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agovfs: mark pipes and sockets as stream-like file descriptors
Linus Torvalds [Sun, 17 Nov 2019 19:20:48 +0000 (11:20 -0800)]
vfs: mark pipes and sockets as stream-like file descriptors

In commit 3975b097e577 ("convert stream-like files -> stream_open, even
if they use noop_llseek") Kirill used a coccinelle script to change
"nonseekable_open()" to "stream_open()", which changed the trivial cases
of stream-like file descriptors to the new model with FMODE_STREAM.

However, the two big cases - sockets and pipes - don't actually have
that trivial pattern at all, and were thus never converted to
FMODE_STREAM even though it makes lots of sense to do so.

That's particularly true when looking forward to the next change:
getting rid of FMODE_ATOMIC_POS entirely, and just using FMODE_STREAM to
decide whether f_pos updates are needed or not.  And if they are, we'll
always do them atomically.

This came up because KCSAN (correctly) noted that the non-locked f_pos
updates are data races: they are clearly benign for the case where we
don't care, but it would be good to just not have that issue exist at
all.

Note that the reason we used FMODE_ATOMIC_POS originally is that only
doing it for the minimal required case is "safer" in that it's possible
that the f_pos locking can cause unnecessary serialization across the
whole write() call.  And in the worst case, that kind of serialization
can cause deadlock issues: think writers that need readers to empty the
state using the same file descriptor.

[ Note that the locking is per-file descriptor - because it protects
  "f_pos", which is obviously per-file descriptor - so it only affects
  cases where you literally use the same file descriptor to both read
  and write.

  So a regular pipe that has separate reading and writing file
  descriptors doesn't really have this situation even though it's the
  obvious case of "reader empties what a bit writer concurrently fills"

  But we want to make pipes as being stream-line anyway, because we
  don't want the unnecessary overhead of locking, and because a named
  pipe can be (ab-)used by reading and writing to the same file
  descriptor. ]

There are likely a lot of other cases that might want FMODE_STREAM, and
looking for ".llseek = no_llseek" users and other cases that don't have
an lseek file operation at all and making them use "stream_open()" might
be a good idea.  But pipes and sockets are likely to be the two main
cases.

Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Eic Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Marco Elver <elver@google.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Paul McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoLinux 5.4 v5.4
Linus Torvalds [Mon, 25 Nov 2019 00:32:01 +0000 (16:32 -0800)]
Linux 5.4

5 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sun, 24 Nov 2019 20:36:39 +0000 (12:36 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs

Pull cramfs fix from Al Viro:
 "Regression fix, fallen through the cracks"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cramfs: fix usage on non-MTD device

5 years agocramfs: fix usage on non-MTD device
Maxime Bizon [Sat, 19 Oct 2019 19:24:11 +0000 (15:24 -0400)]
cramfs: fix usage on non-MTD device

When both CONFIG_CRAMFS_MTD and CONFIG_CRAMFS_BLOCKDEV are enabled, if
we fail to mount on MTD, we don't try on block device.

Note: this relies upon cramfs_mtd_fill_super() leaving no side
effects on fc state in case of failure; in general, failing
get_tree_...() does *not* mean "fine to try again"; e.g. parsed
options might've been consumed by fill_super callback and freed
on failure.

Fixes: 74f78fc5ef43 ("vfs: Convert cramfs to use the new mount API")
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
5 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Sat, 23 Nov 2019 21:02:18 +0000 (13:02 -0800)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull last minute virtio bugfixes from Michael Tsirkin:
 "Minor bugfixes all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_balloon: fix shrinker count
  virtio_balloon: fix shrinker scan number of pages
  virtio_console: allocate inbufs in add_port() only if it is needed
  virtio_ring: fix return code on DMA mapping fails

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sat, 23 Nov 2019 00:57:26 +0000 (16:57 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fix from Dmitry Torokhov:
 "Just a single revert as RMI mode should not have been enabled for this
  model [yet?]"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"

5 years agoRevert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"
Lyude Paul [Fri, 22 Nov 2019 22:52:54 +0000 (14:52 -0800)]
Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"

This reverts commit 68b9c5066e39af41d3448abfc887c77ce22dd64d.

Ugh, I really dropped the ball on this one :\. So as it turns out RMI4
works perfectly fine on the X1 Extreme Gen 2 except for one thing I
didn't notice because I usually use the trackpoint: clicking with the
touchpad. Somehow this is broken, in fact we don't even seem to indicate
BTN_LEFT as a valid event type for the RMI4 touchpad. And, I don't even
see any RMI4 events coming from the touchpad when I press down on it.
This only seems to work for PS/2 mode.

Since that means we have a regression, and PS/2 mode seems to work fine
for the time being - revert this for now. We'll have to do a more
thorough investigation on this.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20191119234534.10725-1-lyude@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 22 Nov 2019 22:28:14 +0000 (14:28 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

 1) Validate tunnel options length in act_tunnel_key, from Xin Long.

 2) Fix DMA sync bug in gve driver, from Adi Suresh.

 3) TSO kills performance on some r8169 chips due to HW issues, disable
    by default in that case, from Corinna Vinschen.

 4) Fix clock disable mismatch in fec driver, from Chubong Yuan.

 5) Fix interrupt status bits define in hns3 driver, from Huazhong Tan.

 6) Fix workqueue deadlocks in qeth driver, from Julian Wiedmann.

 7) Don't napi_disable() twice in r8152 driver, from Hayes Wang.

 8) Fix SKB extension memory leak, from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  r8152: avoid to call napi_disable twice
  MAINTAINERS: Add myself as maintainer of virtio-vsock
  udp: drop skb extensions before marking skb stateless
  net: rtnetlink: prevent underflows in do_setvfinfo()
  can: m_can_platform: remove unnecessary m_can_class_resume() call
  can: m_can_platform: set net_device structure as driver data
  hv_netvsc: Fix send_table offset in case of a host bug
  hv_netvsc: Fix offset usage in netvsc_send_table()
  net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN
  sfc: Only cancel the PPS workqueue if it exists
  nfc: port100: handle command failure cleanly
  net-sysfs: fix netdev_queue_add_kobject() breakage
  r8152: Re-order napi_disable in rtl8152_close
  net: qca_spi: Move reset_count to struct qcaspi
  net: qca_spi: fix receive buffer size check
  net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
  Revert "net/ibmvnic: Fix EOI when running in XIVE mode"
  net/mlxfw: Verify FSM error code translation doesn't exceed array size
  net/mlx5: Update the list of the PCI supported devices
  net/mlx5: Fix auto group size calculation
  ...

5 years agoafs: Fix large file support
Marc Dionne [Thu, 21 Nov 2019 15:37:26 +0000 (15:37 +0000)]
afs: Fix large file support

By default s_maxbytes is set to MAX_NON_LFS, which limits the usable
file size to 2GB, enforced by the vfs.

Commit b9b1f8d5930a ("AFS: write support fixes") added support for the
64-bit fetch and store server operations, but did not change this value.
As a result, attempts to write past the 2G mark result in EFBIG errors:

 $ dd if=/dev/zero of=foo bs=1M count=1 seek=2048
 dd: error writing 'foo': File too large

Set s_maxbytes to MAX_LFS_FILESIZE.

Fixes: b9b1f8d5930a ("AFS: write support fixes")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoafs: Fix possible assert with callbacks from yfs servers
Marc Dionne [Thu, 21 Nov 2019 15:26:15 +0000 (15:26 +0000)]
afs: Fix possible assert with callbacks from yfs servers

Servers sending callback breaks to the YFS_CM_SERVICE service may
send up to YFSCBMAX (1024) fids in a single RPC.  Anything over
AFSCBMAX (50) will cause the assert in afs_break_callbacks to trigger.

Remove the assert, as the count has already been checked against
the appropriate max values in afs_deliver_cb_callback and
afs_deliver_yfs_cb_callback.

Fixes: 35dbfba3111a ("afs: Implement the YFS cache manager service")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agor8152: avoid to call napi_disable twice
Hayes Wang [Fri, 22 Nov 2019 08:21:09 +0000 (16:21 +0800)]
r8152: avoid to call napi_disable twice

Call napi_disable() twice would cause dead lock. There are three situations
may result in the issue.

1. rtl8152_pre_reset() and set_carrier() are run at the same time.
2. Call rtl8152_set_tunable() after rtl8152_close().
3. Call rtl8152_set_ringparam() after rtl8152_close().

For #1, use the same solution as commit 84811412464d ("r8152: Re-order
napi_disable in rtl8152_close"). For #2 and #3, add checking the flag
of IFF_UP and using napi_disable/napi_enable during mutex.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 22 Nov 2019 17:49:08 +0000 (09:49 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "Three fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
  mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
  Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"

5 years agoMerge tag 'linux-can-fixes-for-5.4-20191122' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Fri, 22 Nov 2019 17:42:11 +0000 (09:42 -0800)]
Merge tag 'linux-can-fixes-for-5.4-20191122' of git://git./linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2019-11-22

this is a pull request of 2 patches for net/master, if possible for the
current release cycle. Otherwise these patches should hit v5.4 via the
stable tree.

Both patches of this pull request target the m_can driver. Pankaj Sharma
fixes the fallout in the m_can_platform part, which appeared with the
introduction of the m_can platform framework.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMAINTAINERS: Add myself as maintainer of virtio-vsock
Stefano Garzarella [Fri, 22 Nov 2019 10:20:10 +0000 (11:20 +0100)]
MAINTAINERS: Add myself as maintainer of virtio-vsock

Since I'm actively working on vsock and virtio/vhost transports,
Stefan suggested to help him to maintain it.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoudp: drop skb extensions before marking skb stateless
Florian Westphal [Thu, 21 Nov 2019 05:56:23 +0000 (06:56 +0100)]
udp: drop skb extensions before marking skb stateless

Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free
assumes all skb head state has been dropped already.

This will leak the extension memory in case the skb has extensions other
than the ipsec secpath, e.g. bridge nf data.

To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have
extensions or if the extension space can be free'd.

Fixes: 895b5c9f206eb7d25dc1360a ("netfilter: drop bridge nf reset from nf_reset")
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: Byron Stanoszek <gandalf@winds.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: rtnetlink: prevent underflows in do_setvfinfo()
Dan Carpenter [Wed, 20 Nov 2019 12:34:38 +0000 (15:34 +0300)]
net: rtnetlink: prevent underflows in do_setvfinfo()

The "ivm->vf" variable is a u32, but the problem is that a number of
drivers cast it to an int and then forget to check for negatives.  An
example of this is in the cxgb4 driver.

drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
  2890  static int cxgb4_mgmt_get_vf_config(struct net_device *dev,
  2891                                      int vf, struct ifla_vf_info *ivi)
                                            ^^^^^^
  2892  {
  2893          struct port_info *pi = netdev_priv(dev);
  2894          struct adapter *adap = pi->adapter;
  2895          struct vf_info *vfinfo;
  2896
  2897          if (vf >= adap->num_vfs)
                    ^^^^^^^^^^^^^^^^^^^
  2898                  return -EINVAL;
  2899          vfinfo = &adap->vfinfo[vf];
                ^^^^^^^^^^^^^^^^^^^^^^^^^^

There are 48 functions affected.

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646'
drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254'
drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646'
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646'

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 22 Nov 2019 17:18:16 +0000 (09:18 -0800)]
Merge tag 'pm-5.4-final' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management regression fix from Rafael Wysocki:
 "Fix problems with switching cpufreq drivers on some x86 systems with
  ACPI (and with changing the operation modes of the intel_pstate driver
  on those systems) introduced by recent changes related to the
  management of frequency limits in cpufreq"

* tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: QoS: Invalidate frequency QoS requests after removal

5 years agoMerge tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 22 Nov 2019 17:14:30 +0000 (09:14 -0800)]
Merge tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Two sets of fixes in here, one for amdgpu, and one for i915.

  The amdgpu ones are pretty small, i915's CI system seems to have a few
  problems in the last week or so, there is one major regression fix for
  fb_mmap, but there are a bunch of other issues fixed in there as well,
  oops, screen flashes and rcu related.

  amdgpu:
   - Remove experimental flag for navi14
   - Fix confusing power message failures on older VI parts
   - Hang fix for gfxoff when using the read register interface
   - Two stability regression fixes for Raven

  i915:
   - Fix kernel oops on dumb_create ioctl on no crtc situation
   - Fix bad ugly colored flash on VLV/CHV related to gamma LUT update
   - Fix unity of the frequencies reported on PMU
   - Fix kernel oops on set_page_dirty using better locks around it
   - Protect the request pointer with RCU to prevent it being freed
     while we might need still
   - Make pool objects read-only
   - Restore physical addresses for fb_map to avoid corrupted page
     table"

* tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/fbdev: Restore physical addresses for fb_mmap()
  Revert "drm/amd/display: enable S/G for RAVEN chip"
  drm/amdgpu: disable gfxoff on original raven
  drm/amdgpu: disable gfxoff when using register read interface
  drm/amd/powerplay: correct fine grained dpm force level setting
  drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs
  drm/amdgpu: remove experimental flag for Navi14
  drm/i915: make pool objects read-only
  drm/i915: Protect request peeking with RCU
  drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
  drm/i915/pmu: "Frequency" is reported as accumulated cycles
  drm/i915: Preload LUTs if the hw isn't currently using them
  drm/i915: Don't oops in dumb_create ioctl if we have no crtcs

5 years agomm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
Andrey Ryabinin [Fri, 22 Nov 2019 01:54:01 +0000 (17:54 -0800)]
mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()

It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in
remove_stable_node() when it races with __mmput() and squeezes in
between ksm_exit() and exit_mmap().

  WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150

  Call Trace:
   remove_all_stable_nodes+0x12b/0x330
   run_store+0x4ef/0x7b0
   kernfs_fop_write+0x200/0x420
   vfs_write+0x154/0x450
   ksys_write+0xf9/0x1d0
   do_syscall_64+0x99/0x510
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Remove the warning as there is nothing scary going on.

Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com
Fixes: cbf86cfe04a6 ("ksm: remove old stable nodes more thoroughly")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
David Hildenbrand [Fri, 22 Nov 2019 01:53:56 +0000 (17:53 -0800)]
mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()

Let's limit shrinking to !ZONE_DEVICE so we can fix the current code.
We should never try to touch the memmap of offline sections where we
could have uninitialized memmaps and could trigger BUGs when calling
page_to_nid() on poisoned pages.

There is no reliable way to distinguish an uninitialized memmap from an
initialized memmap that belongs to ZONE_DEVICE, as we don't have
anything like SECTION_IS_ONLINE we can use similar to
pfn_to_online_section() for !ZONE_DEVICE memory.

E.g., set_zone_contiguous() similarly relies on pfn_to_online_section()
and will therefore never set a ZONE_DEVICE zone consecutive.  Stopping
to shrink the ZONE_DEVICE therefore results in no observable changes,
besides /proc/zoneinfo indicating different boundaries - something we
can totally live with.

Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized with 0 and the node with the right
value.  So the zone might be wrong but not garbage.  After that commit,
both the zone and the node will be garbage when touching uninitialized
memmaps.

Toshiki reported a BUG (race between delayed initialization of
ZONE_DEVICE memmaps without holding the memory hotplug lock and
concurrent zone shrinking).

  https://lkml.org/lkml/2019/11/14/1040

"Iteration of create and destroy namespace causes the panic as below:

      kernel BUG at mm/page_alloc.c:535!
      CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0
      Call Trace:
       memmap_init_zone_device+0x165/0x17c
       memremap_pages+0x4c1/0x540
       devm_memremap_pages+0x1d/0x60
       pmem_attach_disk+0x16b/0x600 [nd_pmem]
       nvdimm_bus_probe+0x69/0x1c0
       really_probe+0x1c2/0x3e0
       driver_probe_device+0xb4/0x100
       device_driver_attach+0x4f/0x60
       bind_store+0xc9/0x110
       kernfs_fop_write+0x116/0x190
       vfs_write+0xa5/0x1a0
       ksys_write+0x59/0xd0
       do_syscall_64+0x5b/0x180
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

  While creating a namespace and initializing memmap, if you destroy the
  namespace and shrink the zone, it will initialize the memmap outside
  the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page),
  pfn), page) in set_pfnblock_flags_mask()."

This BUG is also mitigated by this commit, where we for now stop to
shrink the ZONE_DEVICE zone until we can do it in a safe and clean way.

Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoRevert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
Joseph Qi [Fri, 22 Nov 2019 01:53:52 +0000 (17:53 -0800)]
Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"

This reverts commit 56e94ea132bb5c2c1d0b60a6aeb34dcb7d71a53d.

Commit 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences
in ocfs2_xa_prepare_entry()") introduces a regression that fail to
create directory with mount option user_xattr and acl.  Actually the
reported NULL pointer dereference case can be correctly handled by
loc->xl_ops->xlo_add_entry(), so revert it.

Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com
Fixes: 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Acked-by: Changwei Ge <gechangwei@live.cn>
Cc: Jia-Ju Bai <baijiaju1990@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocan: m_can_platform: remove unnecessary m_can_class_resume() call
Pankaj Sharma [Tue, 19 Nov 2019 10:20:38 +0000 (15:50 +0530)]
can: m_can_platform: remove unnecessary m_can_class_resume() call

The function m_can_runtime_resume() is getting recursively called from
m_can_class_resume(). This results in a lock up.

We need not call m_can_class_resume() during m_can_runtime_resume().

Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework")
Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com>
Signed-off-by: Sriram Dash <sriram.dash@samsung.com>
Acked-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
5 years agocan: m_can_platform: set net_device structure as driver data
Pankaj Sharma [Tue, 19 Nov 2019 10:20:37 +0000 (15:50 +0530)]
can: m_can_platform: set net_device structure as driver data

The current code is failing during clock prepare enable because of not
getting proper clock from platform device.

[    0.852089] Call trace:
[    0.854516]  0xffff0000fa22a668
[    0.857638]  clk_prepare+0x20/0x34
[    0.861019]  m_can_runtime_resume+0x2c/0xe4
[    0.865180]  pm_generic_runtime_resume+0x28/0x38
[    0.869770]  __rpm_callback+0x16c/0x1bc
[    0.873583]  rpm_callback+0x24/0x78
[    0.877050]  rpm_resume+0x428/0x560
[    0.880517]  __pm_runtime_resume+0x7c/0xa8
[    0.884593]  m_can_clk_start.isra.9.part.10+0x1c/0xa8
[    0.889618]  m_can_class_register+0x138/0x370
[    0.893950]  m_can_plat_probe+0x120/0x170
[    0.897939]  platform_drv_probe+0x4c/0xa0
[    0.901924]  really_probe+0xd8/0x31c
[    0.905477]  driver_probe_device+0x58/0xe8
[    0.909551]  device_driver_attach+0x68/0x70
[    0.913711]  __driver_attach+0x9c/0xf8
[    0.917437]  bus_for_each_dev+0x50/0xa0
[    0.921251]  driver_attach+0x20/0x28
[    0.924804]  bus_add_driver+0x148/0x1fc
[    0.928617]  driver_register+0x6c/0x124
[    0.932431]  __platform_driver_register+0x48/0x50
[    0.937113]  m_can_plat_driver_init+0x18/0x20
[    0.941446]  do_one_initcall+0x4c/0x19c
[    0.945259]  kernel_init_freeable+0x1d0/0x280
[    0.949591]  kernel_init+0x10/0x100
[    0.953057]  ret_from_fork+0x10/0x18
[    0.956614] Code: 00000000 00000000 00000000 00000000 (fa22a668)
[    0.962681] ---[ end trace 881f71bd609de763 ]---
[    0.967301] Kernel panic - not syncing: Attempted to kill init!

A device driver for CAN controller hardware registers itself with the
Linux network layer as a network device. So, the driver data for m_can
should ideally be of type net_device.

Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework")
Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com>
Signed-off-by: Sriram Dash <sriram.dash@samsung.com>
Acked-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
5 years agoMerge branch 'hv_netvsc-Fix-send-indirection-table-offset'
David S. Miller [Fri, 22 Nov 2019 03:32:23 +0000 (19:32 -0800)]
Merge branch 'hv_netvsc-Fix-send-indirection-table-offset'

Haiyang Zhang says:

====================
hv_netvsc: Fix send indirection table offset

Fix send indirection table offset issues related to guest and
host bugs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agohv_netvsc: Fix send_table offset in case of a host bug
Haiyang Zhang [Thu, 21 Nov 2019 21:33:41 +0000 (13:33 -0800)]
hv_netvsc: Fix send_table offset in case of a host bug

If negotiated NVSP version <= NVSP_PROTOCOL_VERSION_6, the offset may
be wrong (too small) due to a host bug. This can cause missing the
end of the send indirection table, and add multiple zero entries from
leading zeros before the data region. This bug adds extra burden on
channel 0.

So fix the offset by computing it from the data structure sizes. This
will ensure netvsc driver runs normally on unfixed hosts, and future
fixed hosts.

Fixes: 5b54dac856cb ("hyperv: Add support for virtual Receive Side Scaling (vRSS)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agohv_netvsc: Fix offset usage in netvsc_send_table()
Haiyang Zhang [Thu, 21 Nov 2019 21:33:40 +0000 (13:33 -0800)]
hv_netvsc: Fix offset usage in netvsc_send_table()

To reach the data region, the existing code adds offset in struct
nvsp_5_send_indirect_table on the beginning of this struct. But the
offset should be based on the beginning of its container,
struct nvsp_message. This bug causes the first table entry missing,
and adds an extra zero from the zero pad after the data region.
This can put extra burden on the channel 0.

So, correct the offset usage. Also add a boundary check to ensure
not reading beyond data region.

Fixes: 5b54dac856cb ("hyperv: Add support for virtual Receive Side Scaling (vRSS)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN
Maciej Żenczykowski [Thu, 21 Nov 2019 21:19:08 +0000 (13:19 -0800)]
net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN

NET_RAW is less dangerous, so more likely to be available to a process,
so check it first to prevent some spurious logging.

This matches IP_TRANSPARENT which checks NET_RAW first.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'drm-intel-fixes-2019-11-21' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 22 Nov 2019 00:23:22 +0000 (10:23 +1000)]
Merge tag 'drm-intel-fixes-2019-11-21' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix kernel oops on dumb_create ioctl on no crtc situation
- Fix bad ugly colored flash on VLV/CHV related to gamma LUT update
- Fix unity of the frequencies reported on PMU
- Fix kernel oops on set_page_dirty using better locks around it
- Protect the request pointer with RCU to prevent it being freed while we might need still
- Make pool objects read-only
- Restore physical addresses for fb_map to avoid corrupted page table

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121165339.GA23920@intel.com
5 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 21 Nov 2019 20:15:24 +0000 (12:15 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:
 "Ensure PAN is re-enabled following user fault in uaccess routines.

  After I thought we were done for 5.4, we had a report this week of a
  nasty issue that has been shown to leak data between different user
  address spaces thanks to corruption of entries in the TLB. In
  hindsight, we should have spotted this in review when the PAN code was
  merged back in v4.3, but hindsight is 20/20 and I'm trying not to beat
  myself up too much about it despite being fairly miserable.

  Anyway, the fix is "obvious" but the actual failure is more more
  subtle, and is described in the commit message. I've included a fairly
  mechanical follow-up patch here as well, which moves this checking out
  into the C wrappers which is what we do for {get,put}_user() already
  and allows us to remove these bloody assembly macros entirely. The
  patches have passed kernelci [1] [2] [3] and CKI [4] tests over night,
  as well as some targetted testing [5] for this particular issue.

  The first patch is tagged for stable and should be applied to 4.14,
  4.19 and 5.3. I have separate backports for 4.4 and 4.9, which I'll
  send out once this has landed in your tree (although the original
  patch applies cleanly, it won't build for those two trees).

  Thanks to Pavel Tatashin for reporting this and Mark Rutland for
  helping to diagnose the issue and review/test the solution"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: uaccess: Remove uaccess_*_not_uao asm macros
  arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault

5 years agosfc: Only cancel the PPS workqueue if it exists
Martin Habets [Thu, 21 Nov 2019 17:52:15 +0000 (17:52 +0000)]
sfc: Only cancel the PPS workqueue if it exists

The workqueue only exists for the primary PF. For other functions
we hit a WARN_ON in kernel/workqueue.c.

Fixes: 7c236c43b838 ("sfc: Add support for IEEE-1588 PTP")
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'for-linus-20191121' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 21 Nov 2019 20:04:50 +0000 (12:04 -0800)]
Merge tag 'for-linus-20191121' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
 "Just a single fix for an issue in nbd introduced in this cycle"

* tag 'for-linus-20191121' of git://git.kernel.dk/linux-block:
  nbd:fix memory leak in nbd_get_socket()

5 years agoMerge tag 'gpio-v5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux...
Linus Torvalds [Thu, 21 Nov 2019 20:01:30 +0000 (12:01 -0800)]
Merge tag 'gpio-v5.4-5' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "A last set of small fixes for GPIO, this cycle was quite busy.

   - Fix debounce delays on the MAX77620 GPIO expander

   - Use the correct unit for debounce times on the BD70528 GPIO expander

   - Get proper deps for parallel builds of the GPIO tools

   - Add a specific ACPI quirk for the Terra Pad 1061"

* tag 'gpio-v5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist
  tools: gpio: Correctly add make dependencies for gpio_utils
  gpio: bd70528: Use correct unit for debounce times
  gpio: max77620: Fixup debounce delays

5 years agoMerge tag 'for-linus-2019-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 21 Nov 2019 19:51:49 +0000 (11:51 -0800)]
Merge tag 'for-linus-2019-11-21' of git://git./linux/kernel/git/brauner/linux

Pull pidfd fixlet from Christian Brauner:
 "This contains a simple fix for the pidfd poll method. In the original
  patchset pidfd_poll() was made to return an unsigned int. However, the
  poll method is defined to return a __poll_t. While the unsigned int is
  not a huge deal it's just nicer to return a __poll_t.

  I've decided to send it right before the 5.4 release mainly so that
  stable doesn't need to backport it to both 5.4 and 5.3"

* tag 'for-linus-2019-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fork: fix pidfd_poll()'s return type

5 years agonfc: port100: handle command failure cleanly
Oliver Neukum [Thu, 21 Nov 2019 10:37:10 +0000 (11:37 +0100)]
nfc: port100: handle command failure cleanly

If starting the transfer of a command suceeds but the transfer for the reply
fails, it is not enough to initiate killing the transfer for the
command may still be running. You need to wait for the killing to finish
before you can reuse URB and buffer.

Reported-and-tested-by: syzbot+711468aa5c3a1eabf863@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "block: split bio if the only bvec's length is > SZ_4K"
Jens Axboe [Thu, 21 Nov 2019 17:16:12 +0000 (10:16 -0700)]
Revert "block: split bio if the only bvec's length is > SZ_4K"

We really don't need this, as the slow path will do the right thing
anyway.

This reverts commit 6952a7f8446ee85ea9d10ab87b64797a031eaae3.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoblock: add iostat counters for flush requests
Konstantin Khlebnikov [Thu, 21 Nov 2019 10:40:26 +0000 (13:40 +0300)]
block: add iostat counters for flush requests

Requests that triggers flushing volatile writeback cache to disk (barriers)
have significant effect to overall performance.

Block layer has sophisticated engine for combining several flush requests
into one. But there is no statistics for actual flushes executed by disk.
Requests which trigger flushes usually are barriers - zero-size writes.

This patch adds two iostat counters into /sys/class/block/$dev/stat and
/proc/diskstats - count of completed flush requests and their total time.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agodrm/i915/fbdev: Restore physical addresses for fb_mmap()
Chris Wilson [Wed, 13 Nov 2019 18:06:33 +0000 (18:06 +0000)]
drm/i915/fbdev: Restore physical addresses for fb_mmap()

fbdev uses the physical address of our framebuffer for its fb_mmap()
routine. While we need to adapt this address for the new io BAR, we have
to fix v5.4 first! The simplest fix is to restore the smem back to v5.3
and we will then probably have to implement our fbops->fb_mmap() callback
to handle local memory.

Reported-by: Neil MacLeod <freedesktop@nmacleod.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112256
Fixes: 5f889b9a61dd ("drm/i915: Disregard drm_mode_config.fb_base")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by: Neil MacLeod <freedesktop@nmacleod.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113180633.3947-1-chris@chris-wilson.co.uk
(cherry picked from commit abc5520704ab438099fe352636b30b05c1253bea)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 9faf5fa4d3dad3b0c0fa6e67689c144981a11c27)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
5 years agonet-sysfs: fix netdev_queue_add_kobject() breakage
Eric Dumazet [Thu, 21 Nov 2019 03:19:07 +0000 (19:19 -0800)]
net-sysfs: fix netdev_queue_add_kobject() breakage

kobject_put() should only be called in error path.

Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'drm-fixes-5.4-2019-11-20' of git://people.freedesktop.org/~agd5f/linux...
Dave Airlie [Thu, 21 Nov 2019 05:07:35 +0000 (15:07 +1000)]
Merge tag 'drm-fixes-5.4-2019-11-20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

drm-fixes-5.4-2019-11-20:

amdgpu:
- Remove experimental flag for navi14
- Fix confusing power message failures on older VI parts
- Hang fix for gfxoff when using the read register interface
- Two stability regression fixes for Raven

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120235130.23755-1-alexander.deucher@amd.com
5 years agoRevert "drm/amd/display: enable S/G for RAVEN chip"
Alex Deucher [Fri, 15 Nov 2019 15:26:52 +0000 (10:26 -0500)]
Revert "drm/amd/display: enable S/G for RAVEN chip"

This reverts commit 1c4259159132ae4ceaf7c6db37a6cf76417f73d9.

S/G display is not stable with the IOMMU enabled on some
platforms.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
5 years agodrm/amdgpu: disable gfxoff on original raven
Alex Deucher [Fri, 15 Nov 2019 15:21:23 +0000 (10:21 -0500)]
drm/amdgpu: disable gfxoff on original raven

There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
5 years agodrm/amdgpu: disable gfxoff when using register read interface
Alex Deucher [Thu, 14 Nov 2019 16:39:05 +0000 (11:39 -0500)]
drm/amdgpu: disable gfxoff when using register read interface

When gfxoff is enabled, accessing gfx registers via MMIO
can lead to a hang.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497
Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
5 years agodrm/amd/powerplay: correct fine grained dpm force level setting
Evan Quan [Thu, 14 Nov 2019 08:58:31 +0000 (16:58 +0800)]
drm/amd/powerplay: correct fine grained dpm force level setting

For fine grained dpm, there is only two levels supported. However
to reflect correctly the current clock frequency, there is an
intermediate level faked. Thus on forcing level setting, we
need to treat level 2 correctly as level 1.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 years agodrm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs
Evan Quan [Thu, 14 Nov 2019 07:30:39 +0000 (15:30 +0800)]
drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs

Otherwise, the error message prompted will confuse user.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
5 years agodrm/amdgpu: remove experimental flag for Navi14
Alex Deucher [Fri, 15 Nov 2019 14:38:28 +0000 (09:38 -0500)]
drm/amdgpu: remove experimental flag for Navi14

5.4 and newer works fine with navi14.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 years agoblock,bfq: Skip tracing hooks if possible
Dmitry Monakhov [Fri, 1 Nov 2019 13:11:10 +0000 (13:11 +0000)]
block,bfq: Skip tracing hooks if possible

In most cases blk_tracing is not active, but  bfq_log_bfqq macro
generate pid_str unconditionally, which result in significant overhead.

## Test
modprobe null_blk
echo bfq > /sys/block/nullb0/queue/scheduler
fio --name=t --ioengine=libaio --direct=1 --filename=/dev/nullb0 \
   --runtime=30 --time_based=1 --rw=write --iodepth=128 --bs=4k

# Results
|        | baseline | w/ patch | gain |
| iops   | 113.19K  | 126.42K  | +11% |

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoMerge tag 'mlx5-fixes-2019-11-20' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Wed, 20 Nov 2019 20:56:32 +0000 (12:56 -0800)]
Merge tag 'mlx5-fixes-2019-11-20' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2019-11-20

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.9:
 ('net/mlx5e: Fix set vf link state error flow')

For -stable v4.14
 ('net/mlxfw: Verify FSM error code translation doesn't exceed array size')

For -stable v4.19
 ('net/mlx5: Fix auto group size calculation')

For -stable v5.3
 ('net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6')
 ('net/mlx5e: Do not use non-EXT link modes in EXT mode')
 ('net/mlx5: Update the list of the PCI supported devices')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8152: Re-order napi_disable in rtl8152_close
Prashant Malani [Wed, 20 Nov 2019 19:40:21 +0000 (11:40 -0800)]
r8152: Re-order napi_disable in rtl8152_close

Both rtl_work_func_t() and rtl8152_close() call napi_disable().
Since the two calls aren't protected by a lock, if the close
function starts executing before the work function, we can get into a
situation where the napi_disable() function is called twice in
succession (first by rtl8152_close(), then by set_carrier()).

In such a situation, the second call would loop indefinitely, since
rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
bit.

The rtl8152_close() function in turn issues a
cancel_delayed_work_sync(), and so it would wait indefinitely for the
rtl_work_func_t() to complete. Since rtl8152_close() is called by a
process holding rtnl_lock() which is requested by other processes, this
eventually leads to a system deadlock and crash.

Re-order the napi_disable() call to occur after the work function
disabling and urb cancellation calls are issued.

Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
Reported-by: http://crbug.com/1017928
Signed-off-by: Prashant Malani <pmalani@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'qca_spi-fixes'
David S. Miller [Wed, 20 Nov 2019 20:42:23 +0000 (12:42 -0800)]
Merge branch 'qca_spi-fixes'

Stefan Wahren says:

====================
net: qca_spi: Fix receive and reset issues

This small patch series fixes two major issues in the SPI driver for the
QCA700x.

It has been tested on a Charge Control C 300 (NXP i.MX6ULL +
2x QCA7000).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: qca_spi: Move reset_count to struct qcaspi
Stefan Wahren [Wed, 20 Nov 2019 17:29:13 +0000 (18:29 +0100)]
net: qca_spi: Move reset_count to struct qcaspi

The reset counter is specific for every QCA700x chip. So move this
into the private driver struct. Otherwise we get unpredictable reset
behavior in setups with multiple QCA700x chips.

Fixes: 291ab06ecf67 (net: qualcomm: new Ethernet over SPI driver for QCA7000)
Signed-off-by: Stefan Wahren <stefan.wahren@in-tech.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: qca_spi: fix receive buffer size check
Michael Heimpold [Wed, 20 Nov 2019 17:29:12 +0000 (18:29 +0100)]
net: qca_spi: fix receive buffer size check

When receiving many or larger packets, e.g. when doing a file download,
it was observed that the read buffer size register reports up to 4 bytes
more than the current define allows in the check.
If this is the case, then no data transfer is initiated to receive the
packets (and thus to empty the buffer) which results in a stall of the
interface.

These 4 bytes are a hardware generated frame length which is prepended
to the actual frame, thus we have to respect it during our check.

Fixes: 026b907d58c4 ("net: qca_spi: Add available buffer space verification")
Signed-off-by: Michael Heimpold <michael.heimpold@in-tech.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ibmvnic-regression'
David S. Miller [Wed, 20 Nov 2019 20:37:15 +0000 (12:37 -0800)]
Merge branch 'ibmvnic-regression'

Juliet Kim says:

====================
Support both XIVE and XICS modes in ibmvnic

This series aims to support both XICS and XIVE with avoiding
a regression in behavior when a system runs in XICS mode.

Patch 1 reverts commit 11d49ce9f7946dfed4dcf5dbde865c78058b50ab
(“net/ibmvnic: Fix EOI when running in XIVE mode.”)

Patch 2 Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
Juliet Kim [Wed, 20 Nov 2019 15:50:04 +0000 (10:50 -0500)]
net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode

Reversion of commit 11d49ce9f7946dfed4dcf5dbde865c78058b50ab
(“net/ibmvnic: Fix EOI when running in XIVE mode.”) leaves us
calling H_EOI even in XIVE mode. That will fail with H_FUNCTION
because H_EOI is not supported in that mode. That failure is
harmless. Ignore it so we can use common code for both XICS and
XIVE.

Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "net/ibmvnic: Fix EOI when running in XIVE mode"
Juliet Kim [Wed, 20 Nov 2019 15:50:03 +0000 (10:50 -0500)]
Revert "net/ibmvnic: Fix EOI when running in XIVE mode"

This reverts commit 11d49ce9f7946dfed4dcf5dbde865c78058b50ab
(“net/ibmvnic: Fix EOI when running in XIVE mode.”) since that
has the unintended effect of changing the interrupt priority
and emits warning when running in legacy XICS mode.

Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlxfw: Verify FSM error code translation doesn't exceed array size
Eran Ben Elisha [Sun, 17 Nov 2019 08:18:59 +0000 (10:18 +0200)]
net/mlxfw: Verify FSM error code translation doesn't exceed array size

Array mlxfw_fsm_state_err_str contains value to string translation, when
values are provided by mlxfw_dev. If value is larger than
MLXFW_FSM_STATE_ERR_MAX, return "unknown error" as expected instead of
reading an address than exceed array size.

Fixes: 410ed13cae39 ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Update the list of the PCI supported devices
Shani Shapp [Tue, 12 Nov 2019 13:10:00 +0000 (15:10 +0200)]
net/mlx5: Update the list of the PCI supported devices

Add the upcoming ConnectX-6 LX device ID.

Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Shani Shapp <shanish@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Fix auto group size calculation
Maor Gottlieb [Thu, 5 Sep 2019 06:56:10 +0000 (09:56 +0300)]
net/mlx5: Fix auto group size calculation

Once all the large flow groups (defined by the user when the flow table
is created - max_num_groups) were created, then all the following new
flow groups will have only one flow table entry, even though the flow table
has place to larger groups.
Fix the condition to prefer large flow group.

Fixes: f0d22d187473 ("net/mlx5_core: Introduce flow steering autogrouped flow table")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Add missing capability bit check for IP-in-IP
Marina Varshaver [Tue, 19 Nov 2019 16:52:13 +0000 (18:52 +0200)]
net/mlx5e: Add missing capability bit check for IP-in-IP

Device that doesn't support IP-in-IP offloads has to filter csum and gso
offload support, otherwise kernel will conclude that device is capable of
offloading csum and gso for IP-in-IP tunnels and that might result in
IP-in-IP tunnel not functioning.

Fixes: 25948b87dda2 ("net/mlx5e: Support TSO and TX checksum offloads for IP-in-IP")
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Do not use non-EXT link modes in EXT mode
Eran Ben Elisha [Sun, 17 Nov 2019 13:17:05 +0000 (15:17 +0200)]
net/mlx5e: Do not use non-EXT link modes in EXT mode

On some old Firmwares, connector type value was not supported, and value
read from FW was 0. For those, driver used link mode in order to set
connector type in link_ksetting.

After FW exposed the connector type, driver translated the value to ethtool
definitions. However, as 0 is a valid value, before returning PORT_OTHER,
driver run the check of link mode in order to maintain backward
compatibility.

Cited patch added support to EXT mode.  With both features (connector type
and EXT link modes) ,if connector_type read from FW is 0 and EXT mode is
set, driver mistakenly compare EXT link modes to non-EXT link mode.
Fixed that by skipping this comparison if we are in EXT mode, as connector
type value is valid in this scenario.

Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Fix set vf link state error flow
Roi Dayan [Wed, 13 Nov 2019 12:42:00 +0000 (14:42 +0200)]
net/mlx5e: Fix set vf link state error flow

Before this commit the ndo always returned success.
Fix that.

Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: DR, Limit STE hash table enlarge based on bytemask
Alex Vesker [Sun, 10 Nov 2019 13:39:36 +0000 (15:39 +0200)]
net/mlx5: DR, Limit STE hash table enlarge based on bytemask

When an ste hash table has too many collision we enlarge it
to a bigger hash table (rehash). Rehashing collision improvement
depends on the bytemask value. The more 1 bits we have in bytemask
means better spreading in the table.

Without this fix tables can grow in size without providing any
improvement which can lead to memory depletion and failures.

This patch will limit table rehash to reduce memory and improve
the performance.

Fixes: 41d07074154c ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: DR, Skip rehash for tables with byte mask zero
Alex Vesker [Thu, 31 Oct 2019 13:24:59 +0000 (15:24 +0200)]
net/mlx5: DR, Skip rehash for tables with byte mask zero

The byte mask fields affect on the hash index distribution,
when the byte mask is zero, the hash calculation will always
be equal to the same index.

To avoid unneeded rehash of hash tables mark the table to skip
rehash.

This is needed by the next patch which will limit table rehash
to reduce memory consumption.

Fixes: 41d07074154c ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: DR, Fix invalid EQ vector number on CQ creation
Alex Vesker [Mon, 4 Nov 2019 09:59:21 +0000 (11:59 +0200)]
net/mlx5: DR, Fix invalid EQ vector number on CQ creation

When creating a CQ, the CPU id is used for the vector value.
This would fail in-case the CPU id was higher than the maximum
vector value.

Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Reorder mirrer action parsing to check for encap first
Vlad Buslov [Thu, 7 Nov 2019 11:37:57 +0000 (13:37 +0200)]
net/mlx5e: Reorder mirrer action parsing to check for encap first

Mirred action parsing code in parse_tc_fdb_actions() first checks if
out_dev has same parent id, and only verifies that there is a pending encap
action that was parsed before. Recent change in vxlan module made function
netdev_port_same_parent_id() to return true when called for mlx5 eswitch
representor and vxlan device created explicitly on mlx5 representor
device (vxlan devices created with "external" flag without explicitly
specifying parent interface are not affected). With call to
netdev_port_same_parent_id() returning true, incorrect code path is chosen
and encap rules fail to offload because vxlan dev is not a valid eswitch
forwarding dev. Dmesg log of error:

[ 1784.389797] devices ens1f0_0 vxlan1 not on same switch HW, can't offload forwarding

In order to fix the issue, rearrange conditional in parse_tc_fdb_actions()
to check for pending encap action before checking if out_dev has the same
parent id.

Fixes: 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Fix ingress rate configuration for representors
Eli Cohen [Thu, 7 Nov 2019 07:07:34 +0000 (09:07 +0200)]
net/mlx5e: Fix ingress rate configuration for representors

Current code uses the old method of prio encoding in
flow_cls_common_offload. Fix to follow the changes introduced in
commit ef01adae0e43 ("net: sched: use major priority number as hardware priority").

Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6
Eli Cohen [Thu, 31 Oct 2019 07:00:43 +0000 (09:00 +0200)]
net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6

Be sure to release the neighbour in case of failures after successful
route lookup.

Fixes: 101f4de9dd52 ("net/mlx5e: Move TC tunnel offloading code to separate source file")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoMerge branch 's390-fixes'
David S. Miller [Wed, 20 Nov 2019 20:29:47 +0000 (12:29 -0800)]
Merge branch 's390-fixes'

Julian Wiedmann says:

====================
s390/qeth: fixes 2019-11-20

please apply two late qeth fixes to your net tree.

The first fixes a deadlock that can occur if a qeth device is set
offline while in the middle of processing deferred HW events.
The second patch converts the return value of an error path to
use -EIO, so that it can be passed back to userspace.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: return proper errno on IO error
Julian Wiedmann [Wed, 20 Nov 2019 13:20:57 +0000 (14:20 +0100)]
s390/qeth: return proper errno on IO error

When propagating IO errors back to userspace, one error path in
qeth_irq() currently returns '1' instead of a proper errno.

Fixes: 54daaca7024d ("s390/qeth: cancel cmd on early error")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: fix potential deadlock on workqueue flush
Julian Wiedmann [Wed, 20 Nov 2019 13:20:56 +0000 (14:20 +0100)]
s390/qeth: fix potential deadlock on workqueue flush

The L2 bridgeport code uses the coarse 'conf_mutex' for guarding access
to its configuration state.
This can result in a deadlock when qeth_l2_stop_card() - called under the
conf_mutex - blocks on flush_workqueue() to wait for the completion of
pending bridgeport workers. Such workers would also need to aquire
the conf_mutex, stalling indefinitely.

Introduce a lock that specifically guards the bridgeport configuration,
so that the workers no longer need the conf_mutex.
Wrapping qeth_l2_promisc_to_bridge() in this fine-grained lock then also
fixes a theoretical race against a concurrent qeth_bridge_port_role_store()
operation.

Fixes: c0a2e4d10d93 ("s390/qeth: conclude all event processing before offlining a card")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6/route: return if there is no fib_nh_gw_family
Hangbin Liu [Wed, 20 Nov 2019 07:39:06 +0000 (15:39 +0800)]
ipv6/route: return if there is no fib_nh_gw_family

Previously we will return directly if (!rt || !rt->fib6_nh.fib_nh_gw_family)
in function rt6_probe(), but after commit cc3a86c802f0
("ipv6: Change rt6_probe to take a fib6_nh"), the logic changed to
return if there is fib_nh_gw_family.

Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject
Jouni Hogander [Wed, 20 Nov 2019 07:08:16 +0000 (09:08 +0200)]
net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject

kobject_init_and_add takes reference even when it fails. This has
to be given up by the caller in error handling. Otherwise memory
allocated by kobject_init_and_add is never freed. Originally found
by Syzkaller:

BUG: memory leak
unreferenced object 0xffff8880679f8b08 (size 8):
  comm "netdev_register", pid 269, jiffies 4294693094 (age 12.132s)
  hex dump (first 8 bytes):
    72 78 2d 30 00 36 20 d4                          rx-0.6 .
  backtrace:
    [<000000008c93818e>] __kmalloc_track_caller+0x16e/0x290
    [<000000001f2e4e49>] kvasprintf+0xb1/0x140
    [<000000007f313394>] kvasprintf_const+0x56/0x160
    [<00000000aeca11c8>] kobject_set_name_vargs+0x5b/0x140
    [<0000000073a0367c>] kobject_init_and_add+0xd8/0x170
    [<0000000088838e4b>] net_rx_queue_update_kobjects+0x152/0x560
    [<000000006be5f104>] netdev_register_kobject+0x210/0x380
    [<00000000e31dab9d>] register_netdevice+0xa1b/0xf00
    [<00000000f68b2465>] __tun_chr_ioctl+0x20d5/0x3dd0
    [<000000004c50599f>] tun_chr_ioctl+0x2f/0x40
    [<00000000bbd4c317>] do_vfs_ioctl+0x1c7/0x1510
    [<00000000d4c59e8f>] ksys_ioctl+0x99/0xb0
    [<00000000946aea81>] __x64_sys_ioctl+0x78/0xb0
    [<0000000038d946e5>] do_syscall_64+0x16f/0x580
    [<00000000e0aa5d8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<00000000285b3d1a>] 0xffffffffffffffff

Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoarm64: uaccess: Remove uaccess_*_not_uao asm macros
Pavel Tatashin [Wed, 20 Nov 2019 17:07:40 +0000 (12:07 -0500)]
arm64: uaccess: Remove uaccess_*_not_uao asm macros

It is safer and simpler to drop the uaccess assembly macros in favour of
inline C functions. Although this bloats the Image size slightly, it
aligns our user copy routines with '{get,put}_user()' and generally
makes the code a lot easier to reason about.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
[will: tweaked commit message and changed temporary variable names]
Signed-off-by: Will Deacon <will@kernel.org>
5 years agoarm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
Pavel Tatashin [Tue, 19 Nov 2019 22:10:06 +0000 (17:10 -0500)]
arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault

A number of our uaccess routines ('__arch_clear_user()' and
'__arch_copy_{in,from,to}_user()') fail to re-enable PAN if they
encounter an unhandled fault whilst accessing userspace.

For CPUs implementing both hardware PAN and UAO, this bug has no effect
when both extensions are in use by the kernel.

For CPUs implementing hardware PAN but not UAO, this means that a kernel
using hardware PAN may execute portions of code with PAN inadvertently
disabled, opening us up to potential security vulnerabilities that rely
on userspace access from within the kernel which would usually be
prevented by this mechanism. In other words, parts of the kernel run the
same way as they would on a CPU without PAN implemented/emulated at all.

For CPUs not implementing hardware PAN and instead relying on software
emulation via 'CONFIG_ARM64_SW_TTBR0_PAN=y', the impact is unfortunately
much worse. Calling 'schedule()' with software PAN disabled means that
the next task will execute in the kernel using the page-table and ASID
of the previous process even after 'switch_mm()', since the actual
hardware switch is deferred until return to userspace. At this point, or
if there is a intermediate call to 'uaccess_enable()', the page-table
and ASID of the new process are installed. Sadly, due to the changes
introduced by KPTI, this is not an atomic operation and there is a very
small window (two instructions) where the CPU is configured with the
page-table of the old task and the ASID of the new task; a speculative
access in this state is disastrous because it would corrupt the TLB
entries for the new task with mappings from the previous address space.

As Pavel explains:

  | I was able to reproduce memory corruption problem on Broadcom's SoC
  | ARMv8-A like this:
  |
  | Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
  | stack is accessed and copied.
  |
  | The test program performed the following on every CPU and forking
  | many processes:
  |
  | unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
  |   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  | map[0] = getpid();
  | sched_yield();
  | if (map[0] != getpid()) {
  | fprintf(stderr, "Corruption detected!");
  | }
  | munmap(map, PAGE_SIZE);
  |
  | From time to time I was getting map[0] to contain pid for a
  | different process.

Ensure that PAN is re-enabled when returning after an unhandled user
fault from our uaccess routines.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Fixes: 338d4f49d6f7 ("arm64: kernel: Add support for Privileged Access Never")
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
[will: rewrote commit message]
Signed-off-by: Will Deacon <will@kernel.org>
5 years agofork: fix pidfd_poll()'s return type
Luc Van Oostenryck [Wed, 20 Nov 2019 00:33:20 +0000 (01:33 +0100)]
fork: fix pidfd_poll()'s return type

pidfd_poll() is defined as returning 'unsigned int' but the
.poll method is declared as returning '__poll_t', a bitwise type.

Fix this by using the proper return type and using the EPOLL
constants instead of the POLL ones, as required for __poll_t.

Fixes: b53b0b9d9a61 ("pidfd: add polling support")
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: stable@vger.kernel.org # 5.3
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191120003320.31138-1-luc.vanoostenryck@gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
5 years agoPM: QoS: Invalidate frequency QoS requests after removal
Rafael J. Wysocki [Wed, 20 Nov 2019 09:33:34 +0000 (10:33 +0100)]
PM: QoS: Invalidate frequency QoS requests after removal

Switching cpufreq drivers (or switching operation modes of the
intel_pstate driver from "active" to "passive" and vice versa)
does not work on some x86 systems with ACPI after commit
3000ce3c52f8 ("cpufreq: Use per-policy frequency QoS"), because
the ACPI _PPC and thermal code uses the same frequency QoS request
object for a given CPU every time a cpufreq driver is registered
and freq_qos_remove_request() does not invalidate the request after
removing it from its QoS list, so freq_qos_add_request() complains
and fails when that request is passed to it again.

Fix the issue by modifying freq_qos_remove_request() to clear the qos
and type fields of the frequency request pointed to by its argument
after removing it from its QoS list so as to invalidate it.

Fixes: 3000ce3c52f8 ("cpufreq: Use per-policy frequency QoS")
Reported-and-tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
5 years agovirtio_balloon: fix shrinker count
Wei Wang [Tue, 19 Nov 2019 10:02:33 +0000 (05:02 -0500)]
virtio_balloon: fix shrinker count

Instead of multiplying by page order, virtio balloon divided by page
order. The result is that it can return 0 if there are a bit less
than MAX_ORDER - 1 pages in use, and then shrinker scan won't be called.

Cc: stable@vger.kernel.org
Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
5 years agovirtio_balloon: fix shrinker scan number of pages
Michael S. Tsirkin [Tue, 19 Nov 2019 09:50:35 +0000 (04:50 -0500)]
virtio_balloon: fix shrinker scan number of pages

virtio_balloon_shrinker_scan should return number of system pages freed,
but because it's calling functions that deal with balloon pages, it gets
confused and sometimes returns the number of balloon pages.

It does not matter practically as the exact number isn't
used, but it seems better to be consistent in case someone
starts using this API.

Further, if we ever tried to iteratively leak pages as
virtio_balloon_shrinker_scan tries to do, we'd run into issues - this is
because freed_pages was accumulating total freed pages, but was also
subtracted on each iteration from pages_to_free, which can result in
either leaking less memory than we were supposed to free, or more if
pages_to_free underruns.

On a system with 4K pages we are lucky that we are never asked to leak
more than 128 pages while we can leak up to 256 at a time,
but it looks like a real issue for systems with page size != 4K.

Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
Reported-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 years agomdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n
Geert Uytterhoeven [Tue, 19 Nov 2019 11:25:24 +0000 (12:25 +0100)]
mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n

Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization
to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS,
causing failures if reset controller support is not enabled.  E.g. on
r7s72100/rskrza1:

    sh-eth e8203000.ethernet: MDIO init failed: -524
    sh-eth: probe of e8203000.ethernet failed with error -524

Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb.

Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled"
David S. Miller [Wed, 20 Nov 2019 03:16:49 +0000 (19:16 -0800)]
Revert "mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled"

This reverts commit 075e238d12c21c8bde700d21fb48be7a3aa80194.

Going to go with Geert's fix instead, which also has a
correct Fixes tag.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: fix a wrong reset interrupt status mask
Huazhong Tan [Wed, 20 Nov 2019 02:07:15 +0000 (10:07 +0800)]
net: hns3: fix a wrong reset interrupt status mask

According to hardware user manual, bits5~7 in register
HCLGE_MISC_VECTOR_INT_STS means reset interrupts status,
but HCLGE_RESET_INT_M is defined as bits0~2 now. So it
will make hclge_reset_err_handle() read the wrong reset
interrupt status.

This patch fixes this wrong bit mask.

Fixes: 2336f19d7892 ("net: hns3: check reset interrupt status when reset fails")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: fec: fix clock count mis-match
Chuhong Yuan [Wed, 20 Nov 2019 01:25:13 +0000 (09:25 +0800)]
net: fec: fix clock count mis-match

pm_runtime_put_autosuspend in probe will call runtime suspend to
disable clks automatically if CONFIG_PM is defined. (If CONFIG_PM
is not defined, its implementation will be empty, then runtime
suspend will not be called.)

Therefore, we can call pm_runtime_get_sync to runtime resume it
first to enable clks, which matches the runtime suspend. (Only when
CONFIG_PM is defined, otherwise pm_runtime_get_sync will also be
empty, then runtime resume will not be called.)

Then it is fine to disable clks without causing clock count mis-match.

Fixes: c43eab3eddb4 ("net: fec: add missed clk_disable_unprepare in remove")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: act_pedit: fix WARN() in the traffic path
Davide Caratti [Tue, 19 Nov 2019 22:47:33 +0000 (23:47 +0100)]
net/sched: act_pedit: fix WARN() in the traffic path

when configuring act_pedit rules, the number of keys is validated only on
addition of a new entry. This is not sufficient to avoid hitting a WARN()
in the traffic path: for example, it is possible to replace a valid entry
with a new one having 0 extended keys, thus causing splats in dmesg like:

 pedit BUG: index 42
 WARNING: CPU: 2 PID: 4054 at net/sched/act_pedit.c:410 tcf_pedit_act+0xc84/0x1200 [act_pedit]
 [...]
 RIP: 0010:tcf_pedit_act+0xc84/0x1200 [act_pedit]
 Code: 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ac 00 00 00 48 8b 44 24 10 48 c7 c7 a0 c4 e4 c0 8b 70 18 e8 1c 30 95 ea <0f> 0b e9 a0 fa ff ff e8 00 03 f5 ea e9 14 f4 ff ff 48 89 58 40 e9
 RSP: 0018:ffff888077c9f320 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffac2983a2
 RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888053927bec
 RBP: dffffc0000000000 R08: ffffed100a726209 R09: ffffed100a726209
 R10: 0000000000000001 R11: ffffed100a726208 R12: ffff88804beea780
 R13: ffff888079a77400 R14: ffff88804beea780 R15: ffff888027ab2000
 FS:  00007fdeec9bd740(0000) GS:ffff888053900000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffdb3dfd000 CR3: 000000004adb4006 CR4: 00000000001606e0
 Call Trace:
  tcf_action_exec+0x105/0x3f0
  tcf_classify+0xf2/0x410
  __dev_queue_xmit+0xcbf/0x2ae0
  ip_finish_output2+0x711/0x1fb0
  ip_output+0x1bf/0x4b0
  ip_send_skb+0x37/0xa0
  raw_sendmsg+0x180c/0x2430
  sock_sendmsg+0xdb/0x110
  __sys_sendto+0x257/0x2b0
  __x64_sys_sendto+0xdd/0x1b0
  do_syscall_64+0xa5/0x4e0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
 RIP: 0033:0x7fdeeb72e993
 Code: 48 8b 0d e0 74 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 0d d6 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24
 RSP: 002b:00007ffdb3de8a18 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 000055c81972b700 RCX: 00007fdeeb72e993
 RDX: 0000000000000040 RSI: 000055c81972b700 RDI: 0000000000000003
 RBP: 00007ffdb3dea130 R08: 000055c819728510 R09: 0000000000000010
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
 R13: 000055c81972b6c0 R14: 000055c81972969c R15: 0000000000000080

Fix this moving the check on 'nkeys' earlier in tcf_pedit_init(), so that
attempts to install rules having 0 keys are always rejected with -EINVAL.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phylink: fix link mode modification in PHY mode
Russell King [Tue, 19 Nov 2019 17:28:14 +0000 (17:28 +0000)]
net: phylink: fix link mode modification in PHY mode

Modifying the link settings via phylink_ethtool_ksettings_set() and
phylink_ethtool_set_pauseparam() didn't always work as intended for
PHY based setups, as calling phylink_mac_config() would result in the
unresolved configuration being committed to the MAC, rather than the
configuration with the speed and duplex setting.

This would work fine if the update caused the link to renegotiate,
but if no settings have changed, phylib won't trigger a renegotiation
cycle, and the MAC will be left incorrectly configured.

Avoid calling phylink_mac_config() unless we are using an inband mode
in phylink_ethtool_ksettings_set(), and use phy_set_asym_pause() as
introduced in 4.20 to set the PHY settings in
phylink_ethtool_set_pauseparam().

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phylink: update documentation on create and destroy
Russell King [Tue, 19 Nov 2019 17:18:52 +0000 (17:18 +0000)]
net: phylink: update documentation on create and destroy

Update the documentation on phylink's create and destroy functions to
explicitly state that the rtnl lock must not be held while calling
these.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: disable TSO on a single version of RTL8168c to fix performance
Corinna Vinschen [Tue, 19 Nov 2019 09:09:39 +0000 (10:09 +0100)]
r8169: disable TSO on a single version of RTL8168c to fix performance

During performance testing, I found that one of my r8169 NICs suffered
a major performance loss, a 8168c model.

Running netperf's TCP_STREAM test didn't return the expected
throughput of > 900 Mb/s, but rather only about 22 Mb/s.  Strange
enough, running the TCP_MAERTS and UDP_STREAM tests all returned with
throughput > 900 Mb/s, as did TCP_STREAM with the other r8169 NICs I can
test (either one of 8169s, 8168e, 8168f).

Bisecting turned up commit 93681cd7d94f83903cb3f0f95433d10c28a7e9a5,
"r8169: enable HW csum and TSO" as the culprit.

I added my 8168c version, RTL_GIGA_MAC_VER_22, to the code
special-casing the 8168evl as per the patch below.  This fixed the
performance problem for me.

Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMAINTAINERS: forcedeth: Change Zhu Yanjun's email address
Zhu Yanjun [Tue, 19 Nov 2019 08:03:48 +0000 (03:03 -0500)]
MAINTAINERS: forcedeth: Change Zhu Yanjun's email address

I prefer to use my personal email address for kernel related work.

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Rain River <rain.1986.08.12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotaprio: don't reject same mqprio settings
Ivan Khoronzhuk [Tue, 19 Nov 2019 00:23:12 +0000 (02:23 +0200)]
taprio: don't reject same mqprio settings

The taprio qdisc allows to set mqprio setting but only once. In case
if mqprio settings are provided next time the error is returned as
it's not allowed to change traffic class mapping in-flignt and that
is normal. But if configuration is absolutely the same - no need to
return error. It allows to provide same command couple times,
changing only base time for instance, or changing only scheds maps,
but leaving mqprio setting w/o modification. It more corresponds the
message: "Changing the traffic mapping of a running schedule is not
supported", so reject mqprio if it's really changed.

Also corrected TC_BITMASK + 1 for consistency, as proposed.

Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: enable sk_msg redirect to tls socket egress
Willem de Bruijn [Mon, 18 Nov 2019 15:40:51 +0000 (10:40 -0500)]
net/tls: enable sk_msg redirect to tls socket egress

Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket
with TLS_TX takes the following path:

  tcp_bpf_sendmsg_redir
    tcp_bpf_push_locked
      tcp_bpf_push
        kernel_sendpage_locked
          sock->ops->sendpage_locked

Also update the flags test in tls_sw_sendpage_locked to allow flag
MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.

Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
Link: https://github.com/wdebruij/kerneltools/commits/icept.2
Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP")
Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoafs: Fix missing timeout reset
David Howells [Tue, 19 Nov 2019 21:00:36 +0000 (21:00 +0000)]
afs: Fix missing timeout reset

In afs_wait_for_call_to_complete(), rather than immediately aborting an
operation if a signal occurs, the code attempts to wait for it to
complete, using a schedule timeout of 2*RTT (or min 2 jiffies) and a
check that we're still receiving relevant packets from the server before
we consider aborting the call.  We may even ping the server to check on
the status of the call.

However, there's a missing timeout reset in the event that we do
actually get a packet to process, such that if we then get a couple of
short stalls, we then time out when progress is actually being made.

Fix this by resetting the timeout any time we get something to process.
If it's the failure of the call then the call state will get changed and
we'll exit the loop shortly thereafter.

A symptom of this is data fetches and stores failing with EINTR when
they really shouldn't.

Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agogve: fix dma sync bug where not all pages synced
Adi Suresh [Tue, 19 Nov 2019 16:02:47 +0000 (08:02 -0800)]
gve: fix dma sync bug where not all pages synced

The previous commit had a bug where the last page in the memory range
could not be synced. This change fixes the behavior so that all the
required pages are synced.

Fixes: 9cfeeb576d49 ("gve: Fixes DMA synchronization")
Signed-off-by: Adi Suresh <adisuresh@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodrm/i915: make pool objects read-only
Matthew Auld [Tue, 19 Nov 2019 15:01:54 +0000 (15:01 +0000)]
drm/i915: make pool objects read-only

For our current users we don't expect pool objects to be writable from
the gpu.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119150154.18249-1-matthew.auld@intel.com
(cherry picked from commit d18580b08b92ec4105eb0ede2d676e8b1f5a66c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
5 years agomdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n
Geert Uytterhoeven [Tue, 19 Nov 2019 11:25:24 +0000 (12:25 +0100)]
mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n

Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization
to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS,
causing failures if reset controller support is not enabled.  E.g. on
r7s72100/rskrza1:

    sh-eth e8203000.ethernet: MDIO init failed: -524
    sh-eth: probe of e8203000.ethernet failed with error -524

Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb.

Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agonbd:fix memory leak in nbd_get_socket()
Sun Ke [Tue, 19 Nov 2019 06:09:11 +0000 (14:09 +0800)]
nbd:fix memory leak in nbd_get_socket()

Before returning NULL, put the sock first.

Cc: stable@vger.kernel.org
Fixes: cf1b2326b734 ("nbd: verify socket is supported during setup")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Sun Ke <sunke32@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agovirtio_console: allocate inbufs in add_port() only if it is needed
Laurent Vivier [Thu, 14 Nov 2019 12:25:48 +0000 (13:25 +0100)]
virtio_console: allocate inbufs in add_port() only if it is needed

When we hot unplug a virtserialport and then try to hot plug again,
it fails:

(qemu) chardev-add socket,id=serial0,path=/tmp/serial0,server,nowait
(qemu) device_add virtserialport,bus=virtio-serial0.0,nr=2,\
                  chardev=serial0,id=serial0,name=serial0
(qemu) device_del serial0
(qemu) device_add virtserialport,bus=virtio-serial0.0,nr=2,\
                  chardev=serial0,id=serial0,name=serial0
kernel error:
  virtio-ports vport2p2: Error allocating inbufs
qemu error:
  virtio-serial-bus: Guest failure in adding port 2 for device \
                     virtio-serial0.0

This happens because buffers for the in_vq are allocated when the port is
added but are not released when the port is unplugged.

They are only released when virtconsole is removed (see a7a69ec0d8e4)

To avoid the problem and to be symmetric, we could allocate all the buffers
in init_vqs() as they are released in remove_vqs(), but it sounds like
a waste of memory.

Rather than that, this patch changes add_port() logic to ignore ENOSPC
error in fill_queue(), which means queue has already been filled.

Fixes: a7a69ec0d8e4 ("virtio_console: free buffers after reset")
Cc: mst@redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>