platform/kernel/linux-rpi.git
2 years ago  net: 802: Use memset_startat() to clear struct fields
Kees Cook [Thu, 18 Nov 2021 20:30:45 +0000 (12:30 -0800)]
net: 802: Use memset_startat() to clear struct fields

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Use memset_startat() so memset() doesn't get confused about writing
beyond the destination member that is intended to be the starting point
of zeroing through the end of the struct.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
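
For illustration, a minimal userspace sketch of what a memset_startat()-style
helper does: it zeroes from a named member through the end of the struct, with
the destination expressed relative to the whole object. The macro body below
only approximates the kernel helper (it relies on the GCC/Clang __typeof__
extension), and struct demo_sock and demo_sock_reset() are hypothetical names
that do not come from the 802 code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace approximation of memset_startat(): fill from `member` through
 * the end of *obj. The pointer is derived from the whole object, so the
 * write is bounded by sizeof(*obj) rather than by a single member.
 */
#define memset_startat(obj, v, member)                                  \
	do {                                                            \
		unsigned char *ptr_ = (unsigned char *)(obj);           \
		size_t off_ = offsetof(__typeof__(*(obj)), member);     \
		memset(ptr_ + off_, (v), sizeof(*(obj)) - off_);        \
	} while (0)

/* Hypothetical struct, not taken from the kernel sources. */
struct demo_sock {
	int keep_a;
	int keep_b;
	int clear_from_here;
	int also_cleared;
};

/* Reset everything from clear_from_here onwards, keep the leading fields. */
void demo_sock_reset(struct demo_sock *sk)
{
	assert(sk != NULL);
	memset_startat(sk, 0, clear_from_here);
}
```

Calling demo_sock_reset() leaves keep_a and keep_b untouched and zeroes the
remaining members, which is the pattern the commit converts to.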
2 years ago  net: dccp: Use memset_startat() for TP zeroing
Kees Cook [Thu, 18 Nov 2021 20:30:19 +0000 (12:30 -0800)]
net: dccp: Use memset_startat() for TP zeroing

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Use memset_startat() so memset() doesn't get confused about writing
beyond the destination member that is intended to be the starting point
of zeroing through the end of the struct.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  sky2: use PCI VPD API in eeprom ethtool ops
Heiner Kallweit [Thu, 18 Nov 2021 20:04:23 +0000 (21:04 +0100)]
sky2: use PCI VPD API in eeprom ethtool ops

Recently pci_read/write_vpd_any() have been added to the PCI VPD API.
These functions allow access to VPD address space outside the
auto-detected VPD, and they can be used to significantly simplify the
eeprom ethtool ops.

Tested with a 88E8070 card with 1KB EEPROM.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  net: ipa: Use 'for_each_clear_bit' when possible
Christophe JAILLET [Thu, 18 Nov 2021 19:37:15 +0000 (20:37 +0100)]
net: ipa: Use 'for_each_clear_bit' when possible

Use 'for_each_clear_bit()' instead of hand writing it. It is much less
verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
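
As a rough illustration of the idiom, the sketch below rebuilds a
for_each_clear_bit()-style loop in userspace C. The find_next_zero_bit() here
is a simplified bit-by-bit stand-in written for this example, not the kernel's
optimised implementation, and count_clear_bits() is a hypothetical caller.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Index of the first clear bit at or after `start`, or `size` if none. */
size_t find_next_zero_bit(const unsigned long *addr, size_t size, size_t start)
{
	size_t bit;

	for (bit = start; bit < size; bit++) {
		if (!(addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG))))
			return bit;
	}
	return size;
}

/* Visit every clear bit in the first `size` bits of the bitmap. */
#define for_each_clear_bit(bit, addr, size)                             \
	for ((bit) = find_next_zero_bit((addr), (size), 0);             \
	     (bit) < (size);                                            \
	     (bit) = find_next_zero_bit((addr), (size), (bit) + 1))

/* Hypothetical caller: count the clear bits among the first `size` bits. */
size_t count_clear_bits(const unsigned long *addr, size_t size)
{
	size_t bit, count = 0;

	assert(addr != NULL);
	for_each_clear_bit(bit, addr, size)
		count++;
	return count;
}
```

The loop body only sees indices of clear bits, so the caller no longer needs
to open-code the find-first/find-next pair.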
2 years ago  bnx2x: Use struct_group() for memcpy() region
Kees Cook [Thu, 18 Nov 2021 18:42:53 +0000 (10:42 -0800)]
bnx2x: Use struct_group() for memcpy() region

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct nig_stats around members egress_mac_pkt0_lo,
egress_mac_pkt0_hi, egress_mac_pkt1_lo, and egress_mac_pkt1_hi (and the
respective members in struct bnx2x_eth_stats), so they can be referenced
together. This will allow memcpy() and sizeof() to more easily reason
about sizes, improve readability, and avoid future warnings about writing
beyond the end of struct bnx2x_eth_stats's rx_stat_ifhcinbadoctets_hi.

"pahole" shows no size nor member offset changes to either struct.
"objdump -d" shows no meaningful object code changes (i.e. only source
line number induced differences and optimizations).

Additionally adds BUILD_BUG_ON() to compare the separate struct group
sizes.

Reviewed-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Link: https://lore.kernel.org/lkml/DM5PR18MB2229B0413C372CC6E49D59A3B2C59@DM5PR18MB2229.namprd18.prod.outlook.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
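
To make the struct_group() idea concrete, here is a simplified userspace
sketch, assuming a C11 compiler with anonymous struct/union support. The macro
is a reduced version written for this example (the kernel helper has more
variants), and struct demo_stats, struct demo_hw_stats and demo_copy_egress()
are hypothetical names loosely modelled on the lo/hi counter pairs described
above. _Static_assert stands in for BUILD_BUG_ON().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Declare the members twice inside an anonymous union: once anonymously, so
 * existing code can keep writing s->pkt0_lo, and once as a named sub-struct,
 * so the whole region can be handed to memcpy() and sizeof() as one object.
 */
#define struct_group(NAME, ...)                 \
	union {                                 \
		struct { __VA_ARGS__ };         \
		struct { __VA_ARGS__ } NAME;    \
	}

/* Hypothetical driver-side statistics. */
struct demo_stats {
	uint32_t rx_bytes;
	struct_group(egress,
		uint32_t pkt0_lo;
		uint32_t pkt0_hi;
		uint32_t pkt1_lo;
		uint32_t pkt1_hi;
	);
	uint32_t tx_bytes;
};

/* Hypothetical hardware-side statistics with the same group. */
struct demo_hw_stats {
	struct_group(egress,
		uint32_t pkt0_lo;
		uint32_t pkt0_hi;
		uint32_t pkt1_lo;
		uint32_t pkt1_hi;
	);
};

/* Compile-time check that both groups have the same size. */
_Static_assert(sizeof(((struct demo_stats *)0)->egress) ==
	       sizeof(((struct demo_hw_stats *)0)->egress),
	       "egress groups differ in size");

/* Copy the four counters as one object instead of starting at pkt0_lo. */
void demo_copy_egress(struct demo_stats *dst, const struct demo_hw_stats *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(&dst->egress, &src->egress, sizeof(dst->egress));
}
```

Because the destination of the memcpy() is the named group, the copy length
matches the destination object and the neighbouring members are not involved.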
2 years ago  cxgb4: Use struct_group() for memcpy() region
Kees Cook [Thu, 18 Nov 2021 18:42:35 +0000 (10:42 -0800)]
cxgb4: Use struct_group() for memcpy() region

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct fw_eth_tx_pkt_vm_wr around members ethmacdst,
ethmacsrc, ethtype, and vlantci, so they can be referenced together. This
will allow memcpy() and sizeof() to more easily reason about sizes,
improve readability, and avoid future warnings about writing beyond the
end of ethmacdst.

"pahole" shows no size nor member offset changes to struct
fw_eth_tx_pkt_vm_wr. "objdump -d" shows no object code changes.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  cxgb3: Use struct_group() for memcpy() region
Kees Cook [Thu, 18 Nov 2021 18:41:42 +0000 (10:41 -0800)]
cxgb3: Use struct_group() for memcpy() region

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct rss_hdr around members imm_data and intr_gen,
so they can be referenced together. This will allow memcpy() and sizeof()
to more easily reason about sizes, improve readability, and avoid future
warnings about writing beyond the end of imm_data.

"pahole" shows no size nor member offset changes to struct rss_hdr.
"objdump -d" shows no object code changes.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  net: phylink: add 1000base-KX to phylink_caps_to_linkmodes()
Russell King (Oracle) [Thu, 18 Nov 2021 18:07:06 +0000 (18:07 +0000)]
net: phylink: add 1000base-KX to phylink_caps_to_linkmodes()

1000base-KX was missed in phylink_caps_to_linkmodes(), add it. This
will be necessary to convert stmmac with xpcs to ensure we don't drop
any supported linkmodes.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  Merge branch 's390-next'
David S. Miller [Fri, 19 Nov 2021 11:12:30 +0000 (11:12 +0000)]
Merge branch 's390-next'

Karsten Graul says:

====================
s390/net: updates 2021-11-18

Please apply the following patches to netdev's net-next tree.

Heiko provided fixes for kernel doc comments and solved some
other compiler warnings.
Julian's qeth patch simplifies the rx queue handling in the code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  s390/lcs: add braces around empty function body
Heiko Carstens [Thu, 18 Nov 2021 16:06:07 +0000 (17:06 +0100)]
s390/lcs: add braces around empty function body

Fix allmodconfig + W=1 compile breakage:

drivers/s390/net/lcs.c: In function ‘lcs_get_frames_cb’:
drivers/s390/net/lcs.c:1823:25: error: suggest braces around empty body in an ‘else’ statement [-Werror=empty-body]
 1823 |                         ; // FIXME: error message ?
      |                         ^

Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  s390/ctcm: add __printf format attribute to ctcm_dbf_longtext
Heiko Carstens [Thu, 18 Nov 2021 16:06:06 +0000 (17:06 +0100)]
s390/ctcm: add __printf format attribute to ctcm_dbf_longtext

Allow the compiler to recognize and check format strings and parameters.

As reported with allmodconfig and W=1:

drivers/s390/net/ctcm_dbug.c: In function ‘ctcm_dbf_longtext’:
drivers/s390/net/ctcm_dbug.c:73:9: error: function ‘ctcm_dbf_longtext’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format]
   73 |         vsnprintf(dbf_txt_buf, sizeof(dbf_txt_buf), fmt, args);
      |         ^~~~~~~~~

Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
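
A small userspace sketch of the same annotation, assuming GCC or Clang. The
DEMO_PRINTF() macro and demo_dbf_text() are hypothetical stand-ins for the
kernel's __printf() annotation and the ctcm debug helper; only the
format(printf, ...) attribute itself is a real compiler feature.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a __printf(a, b)-style annotation (GCC/Clang attribute). */
#define DEMO_PRINTF(a, b) __attribute__((format(printf, a, b)))

/* Argument 3 is the format string; the variadic arguments start at 4. */
int demo_dbf_text(char *buf, size_t len, const char *fmt, ...) DEMO_PRINTF(3, 4);

/* Format into buf; returns what vsnprintf() returns. */
int demo_dbf_text(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(buf != NULL && fmt != NULL);
	va_start(args, fmt);
	n = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return n;
}
```

With the attribute in place, a call that passes a string where the format
says %d can be flagged by -Wformat at compile time instead of misbehaving at
run time.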
2 years ago  s390/ctcm: fix format string
Heiko Carstens [Thu, 18 Nov 2021 16:06:05 +0000 (17:06 +0100)]
s390/ctcm: fix format string

The second parameter as specified by the format string is actually a
string not an integer.

Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  net/af_iucv: fix kernel doc comments
Heiko Carstens [Thu, 18 Nov 2021 16:06:04 +0000 (17:06 +0100)]
net/af_iucv: fix kernel doc comments

Fix kernel doc comments where appropriate, or remove incorrect kernel
doc indicators.

Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  net/iucv: fix kernel doc comments
Heiko Carstens [Thu, 18 Nov 2021 16:06:03 +0000 (17:06 +0100)]
net/iucv: fix kernel doc comments

Fix kernel doc comments where appropriate or remove incorrect kernel
doc indicators.
Also move kernel doc comments directly before functions.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  s390/qeth: allocate RX queue at probe time
Julian Wiedmann [Thu, 18 Nov 2021 16:06:02 +0000 (17:06 +0100)]
s390/qeth: allocate RX queue at probe time

We always need an RX queue, and there's no reconfig situation either
where we would need to free & rebuild the queue.

So allocate the RX queue right from the start, and avoid freeing it
during unrelated qeth_free_qdio_queues() calls.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  Merge branch 'hw_addr_set-arch'
David S. Miller [Fri, 19 Nov 2021 11:05:22 +0000 (11:05 +0000)]
Merge branch 'hw_addr_set-arch'

Jakub Kicinski says:

====================
net: use eth_hw_addr_set() in arch-specific drivers

Fixups for more arch-specific drivers.

With these (and another patch which didn't fit) the build is more or
less clean with all cross-compilers available on kernel.org. I say
more or less because around half of the arches fail to build for
unrelated reasons right now.

Most of the changes here are for m68k, 32bit x86 and alpha.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  natsemi: macsonic: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:33 +0000 (23:10 -0800)]
natsemi: macsonic: use eth_hw_addr_set()

Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  cirrus: mac89x0: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:32 +0000 (23:10 -0800)]
cirrus: mac89x0: use eth_hw_addr_set()

Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  apple: macmace: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:31 +0000 (23:10 -0800)]
apple: macmace: use eth_hw_addr_set()

Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  lasi_82594: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:30 +0000 (23:10 -0800)]
lasi_82594: use eth_hw_addr_set()

dev_addr is set from IO reads, passed to an arch-specific helper.
Note that the helper never reads it so uninitialized temp is fine.

Fixes build on parisc.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  smc9194: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:29 +0000 (23:10 -0800)]
smc9194: use eth_hw_addr_set()

dev_addr is set from IO reads, and broken from a u16 value.

Fixes build on Alpha.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  8390: wd: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:28 +0000 (23:10 -0800)]
8390: wd: use eth_hw_addr_set()

IO reads, so save to an array then eth_hw_addr_set().

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
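
The "save to an array, then eth_hw_addr_set()" pattern can be sketched in
userspace as follows. Everything here is a hypothetical model: struct
demo_net_device is a heavily reduced stand-in for struct net_device,
demo_eth_hw_addr_set() imitates the role of eth_hw_addr_set(), and
demo_read_station_reg() replaces the port I/O a real driver would perform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, heavily reduced model of a network device. */
struct demo_net_device {
	const uint8_t *dev_addr;        /* read-only view used by drivers */
	uint8_t addr_storage[ETH_ALEN]; /* backing storage owned by the core */
};

/* Stand-in for eth_hw_addr_set(): the one place that writes the address. */
void demo_eth_hw_addr_set(struct demo_net_device *dev, const uint8_t *addr)
{
	memcpy(dev->addr_storage, addr, ETH_ALEN);
	dev->dev_addr = dev->addr_storage;
}

/* Hypothetical register read; a real driver would do port or MMIO reads. */
uint8_t demo_read_station_reg(const uint8_t *regs, unsigned int i)
{
	return regs[i];
}

/*
 * Instead of storing each byte straight into dev->dev_addr[i], collect the
 * bytes in a temporary array and hand the complete address over in one call.
 */
void demo_probe_addr(struct demo_net_device *dev, const uint8_t *regs)
{
	uint8_t addr[ETH_ALEN];
	unsigned int i;

	assert(dev != NULL && regs != NULL);
	for (i = 0; i < ETH_ALEN; i++)
		addr[i] = demo_read_station_reg(regs, i);
	demo_eth_hw_addr_set(dev, addr);
}
```

Since dev_addr is a pointer to const in this model, a direct byte-by-byte
store through it would not compile, which mirrors the build failures these
commits fix.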
2 years ago  8390: mac8390: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:27 +0000 (23:10 -0800)]
8390: mac8390: use eth_hw_addr_set()

Use temp to pass to the reading function, the function is generic
so can't fix there.

Fixes m68k build.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  8390: hydra: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:26 +0000 (23:10 -0800)]
8390: hydra: use eth_hw_addr_set()

Loop with offsetting to every second byte, so use a temp buffer.

Fixes m68k build.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  8390: smc-ultra: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:25 +0000 (23:10 -0800)]
8390: smc-ultra: use eth_hw_addr_set()

IO reads, so save to an array then eth_hw_addr_set().

Fixes build on Alpha.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  amd: mvme147: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:24 +0000 (23:10 -0800)]
amd: mvme147: use eth_hw_addr_set()

Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  amd: atarilance: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:23 +0000 (23:10 -0800)]
amd: atarilance: use eth_hw_addr_set()

Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  amd: hplance: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:22 +0000 (23:10 -0800)]
amd: hplance: use eth_hw_addr_set()

Byte by byte assignments.

Fixes build on m68k.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  amd: a2065/ariadne: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:21 +0000 (23:10 -0800)]
amd: a2065/ariadne: use eth_hw_addr_set()

dev_addr is initialized byte by byte from series.

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  amd: ni65: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:20 +0000 (23:10 -0800)]
amd: ni65: use eth_hw_addr_set()

IO reads, so save to an array then eth_hw_addr_set().

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  amd: lance: use eth_hw_addr_set()
Jakub Kicinski [Fri, 19 Nov 2021 07:10:19 +0000 (23:10 -0800)]
amd: lance: use eth_hw_addr_set()

IO reads, so save to an array then eth_hw_addr_set().

Fixes build on x86 (32bit).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  Merge branch 'dev_addr-const-x86'
David S. Miller [Fri, 19 Nov 2021 10:46:04 +0000 (10:46 +0000)]
Merge branch 'dev_addr-const-x86'

Jakub Kicinski says:

====================
net: constify netdev->dev_addr - x86 changes

Resending these so they can get merged while I battle random cross builds.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  ipw2200: constify address in ipw_send_adapter_address
Jakub Kicinski [Thu, 18 Nov 2021 14:27:20 +0000 (06:27 -0800)]
ipw2200: constify address in ipw_send_adapter_address

Add const to the address param of ipw_send_adapter_address();
all the functions down the chain have already been changed.

Not sure how I lost this in the rebase.

Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Stanislav Yakovlev <stas.yakovlev@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  wilc1000: copy address before calling wilc_set_mac_address
Jakub Kicinski [Thu, 18 Nov 2021 14:27:19 +0000 (06:27 -0800)]
wilc1000: copy address before calling wilc_set_mac_address

wilc_set_mac_address() calls IO routines which don't guarantee
the pointer won't be written to. Make a copy.

Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  mlxsw: constify address in mlxsw_sp_port_dev_addr_set
Jakub Kicinski [Thu, 18 Nov 2021 14:27:18 +0000 (06:27 -0800)]
mlxsw: constify address in mlxsw_sp_port_dev_addr_set

Argument comes from netdev->dev_addr directly, it needs a const.

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  net: ax88796c: don't write to netdev->dev_addr directly
Jakub Kicinski [Thu, 18 Nov 2021 14:27:17 +0000 (06:27 -0800)]
net: ax88796c: don't write to netdev->dev_addr directly

The future is here, convert the new driver as we are about
to make netdev->dev_addr const.

Acked-by: Lukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago  Merge tag 'regmap-no-bus-update-bits' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 19 Nov 2021 01:50:18 +0000 (17:50 -0800)]
Merge tag 'regmap-no-bus-update-bits' of git://git./linux/kernel/git/broonie/regmap

Mark Brown says:

===================
regmap: Allow regmap_update_bits() to be offloaded with no bus

Some hardware can do this so let's use that capability.
===================

Link: https://lore.kernel.org/all/YZWDOidBOssP10yS@sirena.org.uk/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 18 Nov 2021 21:13:16 +0000 (13:13 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years ago  Merge tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 18 Nov 2021 20:54:24 +0000 (12:54 -0800)]
Merge tag 'net-5.16-rc2' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, mac80211.

  Current release - regressions:

   - devlink: don't throw an error if flash notification sent before
     devlink visible

   - page_pool: Revert "page_pool: disable dma mapping support...",
     turns out there are active arches who need it

  Current release - new code bugs:

   - amt: cancel delayed_work synchronously in amt_fini()

  Previous releases - regressions:

   - xsk: fix crash on double free in buffer pool

   - bpf: fix inner map state pruning regression causing program
     rejections

   - mac80211: drop check for DONT_REORDER in __ieee80211_select_queue,
     preventing mis-selecting the best effort queue

   - mac80211: do not access the IV when it was stripped

   - mac80211: fix radiotap header generation, off-by-one

   - nl80211: fix getting radio statistics in survey dump

   - e100: fix device suspend/resume

  Previous releases - always broken:

   - tcp: fix uninitialized access in skb frags array for Rx 0cp

   - bpf: fix toctou on read-only map's constant scalar tracking

   - bpf: forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing
     progs

   - tipc: only accept encrypted MSG_CRYPTO msgs

   - smc: transfer remaining wait queue entries during fallback, fix
     missing wake ups

   - udp: validate checksum in udp_read_sock() (when sockmap is used)

   - sched: act_mirred: drop dst for the direction from egress to
     ingress

   - virtio_net_hdr_to_skb: count transport header in UFO, prevent
     allowing bad skbs into the stack

   - nfc: reorder the logic in nfc_{un,}register_device, fix unregister

   - ipsec: check return value of ipv6_skip_exthdr

   - usb: r8152: add MAC passthrough support for more Lenovo Docks"

* tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
  ptp: ocp: Fix a couple NULL vs IS_ERR() checks
  net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
  net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
  ipv6: check return value of ipv6_skip_exthdr
  e100: fix device suspend/resume
  devlink: Don't throw an error if flash notification sent before devlink visible
  page_pool: Revert "page_pool: disable dma mapping support..."
  ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
  octeontx2-af: debugfs: don't corrupt user memory
  NFC: add NCI_UNREG flag to eliminate the race
  NFC: reorder the logic in nfc_{un,}register_device
  NFC: reorganize the functions in nci_request
  tipc: check for null after calling kmemdup
  i40e: Fix display error code in dmesg
  i40e: Fix creation of first queue by omitting it if is not power of two
  i40e: Fix warning message and call stack during rmmod i40e driver
  i40e: Fix ping is lost after configuring ADq on VF
  i40e: Fix changing previously set num_queue_pairs for PFs
  i40e: Fix NULL ptr dereference on VSI filter sync
  i40e: Fix correct max_pkt_size on VF RX queue
  ...

2 years ago  Merge tag 'for-5.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 18 Nov 2021 20:41:14 +0000 (12:41 -0800)]
Merge tag 'for-5.16-rc1-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "Several fixes and one old ioctl deprecation. Namely there's a fix for
  crashes/warnings with lzo compression that was suspected to be caused
  by first pull merge resolution, but it was a different bug.

  Summary:

   - regression fix for a crash in lzo due to missing boundary checks of
     the page array

   - fix crashes on ARM64 due to missing barriers when synchronizing
     status bits between work queues

   - silence lockdep when reading chunk tree during mount

   - fix false positive warning in integrity checker on devices with
     disabled write caching

   - fix signedness of bitfields in scrub

   - start deprecation of balance v1 ioctl"

* tag 'for-5.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: deprecate BTRFS_IOC_BALANCE ioctl
  btrfs: make 1-bit bit-fields of scrub_page unsigned int
  btrfs: check-integrity: fix a warning on write caching disabled disk
  btrfs: silence lockdep when reading chunk tree during mount
  btrfs: fix memory ordering between normal and ordered work functions
  btrfs: fix a out-of-bound access in copy_compressed_data_to_page()

2 years ago  Merge tag 'fs_for_v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack...
Linus Torvalds [Thu, 18 Nov 2021 20:31:29 +0000 (12:31 -0800)]
Merge tag 'fs_for_v5.16-rc2' of git://git./linux/kernel/git/jack/linux-fs

Pull UDF fix from Jan Kara:
 "A fix for a long-standing UDF bug where we were not properly
  validating directory position inside readdir"

* tag 'fs_for_v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix crash after seekdir

2 years ago  Merge tag 'fs.idmapped.v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 18 Nov 2021 20:17:33 +0000 (12:17 -0800)]
Merge tag 'fs.idmapped.v5.16-rc2' of git://git./linux/kernel/git/brauner/linux

Pull setattr idmapping fix from Christian Brauner:
 "This contains a simple fix for setattr. When determining the validity
  of the attributes the ia_{g,u}id fields contain the value that will be
  written to inode->i_{g,u}id. When the {g,u}id attribute of the file
  isn't altered and the caller's fs{g,u}id matches the current {g,u}id
  attribute the attribute change is allowed.

  The value in ia_{g,u}id does already account for idmapped mounts and
  will have taken the relevant idmapping into account. So in order to
  verify that the {g,u}id attribute isn't changed we simply need to
  compare the ia_{g,u}id value against the inode's i_{g,u}id value.

  This only has any meaning for idmapped mounts as idmapping helpers are
  idempotent without them. And for idmapped mounts this really only has
  a meaning when circular idmappings are used, i.e. mappings where e.g.
  id 1000 is mapped to id 1001 and id 1001 is mapped to id 1000. Such
  circular mappings can e.g. be useful when sharing the same home
  directory between multiple users at the same time.

  Before this patch we could end up denying legitimate attribute changes
  and allowing invalid attribute changes when circular mappings are
  used. To even get into this situation the caller must've been
  privileged both to create that mapping and to create that idmapped
  mount.

  This hasn't been seen in the wild anywhere but came up when expanding
  the fstest suite during work on a series of hardening patches. All
  idmapped fstests pass without any regressions and we're adding new
  tests to verify the behavior of circular mappings.

  The new tests can be found at [1]"

Link: https://lore.kernel.org/linux-fsdevel/20211109145713.1868404-2-brauner@kernel.org
* tag 'fs.idmapped.v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: handle circular mappings correctly

2 years ago  Merge tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Thu, 18 Nov 2021 20:13:24 +0000 (12:13 -0800)]
Merge tag 'for-5.16/parisc-4' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
 "parisc bug and warning fixes and wire up futex_waitv.

  Fix some warnings which showed up with allmodconfig builds, a revert
  of a change to the sigreturn trampoline which broke signal handling,
  wire up futex_waitv and add CONFIG_PRINTK_TIME=y to 32bit defconfig"

* tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig
  Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
  parisc: Wrap assembler related defines inside __ASSEMBLY__
  parisc: Wire up futex_waitv
  parisc: Include stringify.h to avoid build error in crypto/api.c
  parisc/sticon: fix reverse colors

2 years ago  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 18 Nov 2021 20:05:22 +0000 (12:05 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Selftest changes:

   - Cleanups for the perf test infrastructure and mapping hugepages

   - Avoid contention on mmap_sem when the guests start to run

   - Add event channel upcall support to xen_shinfo_test

  x86 changes:

   - Fixes for Xen emulation

   - Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache

   - Fixes for migration of 32-bit nested guests on 64-bit hypervisor

   - Compilation fixes

   - More SEV cleanups

  Generic:

   - Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS
     and num_online_cpus(). Most architectures were only using one of
     the two"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
  KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
  KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
  KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
  KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
  KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
  KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
  KVM: x86: Assume a 64-bit hypercall for guests with protected state
  selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
  riscv: kvm: fix non-kernel-doc comment block
  KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()
  KVM: SEV: Drop a redundant setting of sev->asid during initialization
  KVM: SEV: WARN if SEV-ES is marked active but SEV is not
  KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()
  KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
  KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache
  KVM: nVMX: Use a gfn_to_hva_cache for vmptrld
  KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check
  KVM: x86/xen: Use sizeof_field() instead of open-coding it
  KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12
  KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
  ...

2 years ago  Merge tag 'docs-5.16-2' of git://git.lwn.net/linux
Linus Torvalds [Thu, 18 Nov 2021 19:01:06 +0000 (11:01 -0800)]
Merge tag 'docs-5.16-2' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "A handful of documentation fixes for 5.16"

* tag 'docs-5.16-2' of git://git.lwn.net/linux:
  Documentation/process: fix a cross reference
  Documentation: update vcpu-requests.rst reference
  docs: accounting: update delay-accounting.rst reference
  libbpf: update index.rst reference
  docs: filesystems: Fix grammatical error "with" to "which"
  doc/zh_CN: fix a translation error in management-style
  docs: ftrace: fix the wrong path of tracefs
  Documentation: arm: marvell: Fix link to armada_1000_pb.pdf document
  Documentation: arm: marvell: Put Armada XP section between Armada 370 and 375
  Documentation: arm: marvell: Add some links to homepage / product infos
  docs: Update Sphinx requirements

2 years ago  Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 18 Nov 2021 18:50:45 +0000 (10:50 -0800)]
Merge tag 'printk-for-5.16-fixup' of git://git./linux/kernel/git/printk/linux

Pull printk fixes from Petr Mladek:

 - Try to flush backtraces from other CPUs also on the local one. This
   was a regression caused by printk_safe buffers removal.

 - Remove header dependency warning.

* tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove printk.h inclusion in percpu.h
  printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces

2 years ago  ptp: ocp: Fix a couple NULL vs IS_ERR() checks
Dan Carpenter [Thu, 18 Nov 2021 11:22:11 +0000 (14:22 +0300)]
ptp: ocp: Fix a couple NULL vs IS_ERR() checks

The ptp_ocp_get_mem() function does not return NULL, it returns error
pointers.

Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
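
To show why a NULL check is the wrong test for such a function, here is a
userspace sketch of the error-pointer convention. The demo_* helpers are
hypothetical reimplementations that imitate ERR_PTR(), PTR_ERR() and IS_ERR();
demo_get_mem() merely plays the role of a lookup that reports failure through
an error pointer, and is not the real ptp_ocp_get_mem().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr lies in the top DEMO_MAX_ERRNO values of the address space. */
int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

static int demo_resource;

/* Hypothetical lookup: returns an error pointer on failure, never NULL. */
void *demo_get_mem(int ok)
{
	return ok ? (void *)&demo_resource : demo_err_ptr(-ENOMEM);
}

/* Returns 0 on success or a negative errno value. */
int demo_register(int ok)
{
	void *mem = demo_get_mem(ok);

	/* An `if (!mem)` test here would never fire on failure. */
	if (demo_is_err(mem))
		return (int)demo_ptr_err(mem);
	return 0;
}
```

The failing path returns a non-NULL pointer that encodes -ENOMEM, so only the
IS_ERR()-style check detects it.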
2 years ago  Merge branch 'lan78xx-napi'
David S. Miller [Thu, 18 Nov 2021 12:11:51 +0000 (12:11 +0000)]
Merge branch 'lan78xx-napi'

John Efstathiades says:

===================
lan78xx NAPI Performance Improvements

This patch set introduces a set of changes to the lan78xx driver
that were originally developed as part of an investigation into
the performance of TCP and UDP transfers on an Android system.
The changes increase the throughput of both UDP and TCP transfers
and reduce the overall CPU load.

These improvements are also seen on a standard Linux kernel. Typical
results are included at the end of this document.

The changes to the driver evolved over time. The patches presented
here attempt to organise the changes into coherent blocks that
affect logically connected parts of the driver. The patches do not
reflect the way in which the code evolved during the performance
investigation.

Each patch produces a working driver that has an incremental
improvement but patches 2, 3 and 6 should be considered a single
update.

The changes affect the following parts of the driver:

1. Deferred URB processing

The deferred URB processing that was originally done by a tasklet
is now done by a NAPI polling routine. The NAPI cycle has a fixed
work budget that controls how many received frames are passed to
the network stack.

Patch 6 introduces the NAPI polling but depends on preceding patches.

The new NAPI polling routine is also responsible for submitting
Rx and Tx URBs to the USB host controller.

Moving the URB processing to a NAPI-based system "smoothed"
incoming and outgoing data flows on the Android system under
investigation. However, taken in isolation, moving from a tasklet
approach to a NAPI approach made little or no difference to the
overall performance.

2. URB buffer management

The driver creates a pool of Tx and a pool of Rx URB buffers. Each
buffer is large enough to accommodate a packet with the maximum MTU
data. URBs are allocated from these pools as required.

Patch 2 introduces the new Tx buffer pool.
Patch 3 introduces the new Rx buffer pool.

3. Tx pending data

SKBs containing data to be transmitted are added to a queue. The
driver tracks free Tx URBs and the corresponding free Tx URB space.
When new Tx URBs are submitted, pending data is copied into the
URB buffer until the URB buffer is filled or there is no more
pending data. This maximises utilisation of the LAN78xx internal
USB and network frame buffers.

New Tx URBs are submitted to the USB host controller as part of the
NAPI polling cycle.

Patch 2 introduces these changes.

4. Rx URB completion

A new URB is no longer submitted as part of the URB completion
callback.
New URBs are submitted during the NAPI polling cycle.

Patch 3 introduces these changes.

5. Rx URB processing

Completed URBs are put on to a queue for processing (as is done in the
current driver). Network packets in completed URBs are copied from
the URB buffer in to dynamically allocated SKBs and passed to
the network stack.

The emptied URBs are resubmitted to the USB host controller.

Patch 3 introduces this change. Patch 6 updates the change to use
NAPI SKBs.

Each packet passed to the network stack is a single NAPI work item.
If the NAPI work budget is exhausted the remaining packets in the
URB are put onto an overflow queue that is processed at the start
of the next NAPI cycle.

Patch 6 introduces this change.

6. Driver-specific hard_header_len

The driver-specific hard_header_len adjustment was removed as it
broke generic receive offload (GRO) processing. Moreover, it was no
longer required due to the change in Tx pending data management (see
point 3. above).

Patch 5 introduces this change.

The modification has been tested on four different target machines:

Target           |    CPU     |   ARCH  | cores | kernel |  RAM  |
-----------------+------------+---------+-------+--------+-------|
Raspberry Pi 4B  | Cortex-A72 | aarch64 |   4   | 64-bit |  2 GB |
Nitrogen8M SBC   | Cortex-A53 | aarch64 |   4   | 64-bit |  2 GB |
Compaq Presario  | Pentium D  | i686    |   2   | 32-bit |  4 GB |
Dell T3620       | Core i3    | x86_64  |  2+2  | 64-bit | 16 GB |

The targets, apart from the Compaq, each have an on-chip USB3 host
controller. A PCIe-based USB3 host controller card was added to the
Compaq to provide the necessary USB3 host interface.

The network throughput was measured using iperf3. The peer device was
a second Dell T3620 fitted with an Intel i210 network interface. The
target machine and the peer device were connected via a Netgear GS105
gigabit switch.

The CPU load was measured using mpstat running on the target machine.

The tables below summarise the throughput and CPU load improvements
achieved by the updated driver.

The bandwidth is the average bandwidth reported by iperf3 at the end
of a 60-second test.

The percentage idle figure is the average idle reported across all
CPU cores on the target machine for the duration of the test.

TCP Rx (target receiving, peer transmitting)

                 |   Standard Driver  |   NAPI Driver      |
Target           | Bandwidth | % Idle | Bandwidth | % Idle |
-----------------+-----------+--------+-----------+--------|
RPi4 Model B     |    941    |  74.9  |    941    |  91.5  |
Nitrogen8M       |    941    |  76.2  |    941    |  92.7  |
Compaq Presario  |    941    |  44.5  |    941    |  82.1  |
Dell T3620       |    941    |  88.9  |    941    |  98.3  |

TCP Tx (target transmitting, peer receiving)

                 |   Standard Driver  |   NAPI Driver      |
Target           | Bandwidth | % Idle | Bandwidth | % Idle |
-----------------+-----------+--------+-----------+--------|
RPi4 Model B     |    683    |  80.1  |    942    |  97.6  |
Nitrogen8M       |    942    |  97.8  |    942    |  97.3  |
Compaq Presario  |    939    |  80.0  |    942    |  91.2  |
Dell T3620       |    942    |  95.3  |    942    |  97.6  |

UDP Rx (target receiving, peer transmitting)

                 |   Standard Driver  |   NAPI Driver      |
Target           | Bandwidth | % Idle | Bandwidth | % Idle |
-----------------+-----------+--------+-----------+--------|
RPi4 Model B     |     -     |    -   | 958 (0%)  |  76.2  |
Nitrogen8M       | 690 (25%) |  57.7  | 937 (0%)  |  68.5  |
Compaq Presario  | 958 (0%)  |  50.2  | 958 (0%)  |  61.6  |
Dell T3620       | 958 (0%)  |  89.6  | 958 (0%)  |  85.3  |

The figure in brackets is the percentage packet loss.

UDP Tx (target transmitting, peer receiving)

                 |   Standard Driver  |   NAPI Driver      |
Target           | Bandwidth | % Idle | Bandwidth | % Idle |
-----------------+-----------+--------+-----------+--------|
RPi4 Model B     |    370    |  75.0  |    886    |  78.9  |
Nitrogen8M       |    710    |  75.0  |    958    |  85.3  |
Compaq Presario  |    958    |  65.5  |    958    |  76.6  |
Dell T3620       |    958    |  97.0  |    958    |  97.3  |
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolan78xx: Introduce NAPI polling support
John Efstathiades [Thu, 18 Nov 2021 11:01:39 +0000 (11:01 +0000)]
lan78xx: Introduce NAPI polling support

This patch introduces a NAPI-style approach for processing completed
Rx URBs that contributes to improving driver throughput and reducing
CPU load.

Packets in completed URBs are copied to NAPI SKBs and passed to the
network stack for processing. Each frame passed to the stack is one
work item in the NAPI budget.

If the NAPI budget is consumed and frames remain, they are added to
an overflow queue that is processed at the start of the next NAPI
polling cycle.
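
The budget and overflow handling described above can be sketched in
userspace as follows. This is an illustration only: frames are modelled as
integers in a small FIFO, and frame_queue, queue_push(), queue_pop() and
poll_cycle() are hypothetical names, not functions from the lan78xx driver.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 64

/* Simple FIFO of frame identifiers, standing in for an SKB queue. */
struct frame_queue {
	int items[QUEUE_CAP];
	size_t head, count;
};

static void queue_push(struct frame_queue *q, int frame)
{
	assert(q->count < QUEUE_CAP);
	q->items[(q->head + q->count) % QUEUE_CAP] = frame;
	q->count++;
}

static int queue_pop(struct frame_queue *q)
{
	int frame;

	assert(q->count > 0);
	frame = q->items[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->count--;
	return frame;
}

/*
 * One polling cycle: deliver deferred frames first, then new frames,
 * stopping at the budget. New frames that do not fit in the budget are
 * moved to the overflow queue for the next cycle.
 * Returns the number of frames delivered (the work done).
 */
static int poll_cycle(struct frame_queue *overflow, struct frame_queue *rx,
		      struct frame_queue *delivered, int budget)
{
	int work = 0;

	while (work < budget && overflow->count > 0) {
		queue_push(delivered, queue_pop(overflow));
		work++;
	}
	while (rx->count > 0) {
		int frame = queue_pop(rx);

		if (work < budget) {
			queue_push(delivered, frame);
			work++;
		} else {
			queue_push(overflow, frame);
		}
	}
	return work;
}
```

Draining the overflow queue before touching new frames keeps delivery in
arrival order across polling cycles.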

The NAPI handler is also responsible for copying pending Tx data to
Tx URBs and submitting them to the USB host controller for
transmission.

Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolan78xx: Remove hardware-specific header update
John Efstathiades [Thu, 18 Nov 2021 11:01:38 +0000 (11:01 +0000)]
lan78xx: Remove hardware-specific header update

Remove hardware-specific header length adjustment as it is no longer
required. It also breaks generic receive offload (GRO) processing of
received TCP frames that results in a TCP ACK being sent for each
received frame.

Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolan78xx: Re-order rx_submit() to remove forward declaration
John Efstathiades [Thu, 18 Nov 2021 11:01:37 +0000 (11:01 +0000)]
lan78xx: Re-order rx_submit() to remove forward declaration

Move position of rx_submit() to remove forward declaration of
rx_complete() which is now no longer required.

Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolan78xx: Introduce Rx URB processing improvements
John Efstathiades [Thu, 18 Nov 2021 11:01:36 +0000 (11:01 +0000)]
lan78xx: Introduce Rx URB processing improvements

This patch introduces a new approach to allocating and managing
Rx URBs that contributes to improving driver throughput and reducing
CPU load.

A pool of Rx URBs is created during driver instantiation. All the
URBs are initially submitted to the USB host controller for
processing.

The default URB buffer size is different for each USB bus speed.
The chosen sizes provide good USB utilisation with little impact on
overall packet latency.

Completed URBs are processed in the driver bottom half. The URB
buffer contents are copied to a dynamically allocated SKB, which is
then passed to the network stack. The URB is then re-submitted to
the USB host controller.

NOTE: the call to skb_copy() in rx_process() that copies the URB
contents to a new SKB is a temporary change to make this patch work
in its own right. This call will be removed when the NAPI processing
is introduced by patch 6 in this patch set.

Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolan78xx: Introduce Tx URB processing improvements
John Efstathiades [Thu, 18 Nov 2021 11:01:35 +0000 (11:01 +0000)]
lan78xx: Introduce Tx URB processing improvements

This patch introduces a new approach to allocating and managing
Tx URBs that contributes to improving driver throughput and reducing
CPU load.

A pool of Tx URBs is created during driver instantiation. A URB is
allocated from the pool when there is data to transmit. The URB is
released back to the pool when the data has been transmitted by the
device.

The default URB buffer size is different for each USB bus speed.
The chosen sizes provide good USB utilisation with little impact on
overall packet latency.

SKBs to be transmitted are added to a pending queue for processing.
The driver tracks the available Tx URB buffer space and copies as
much pending data as possible into each free URB. Each full URB
is then submitted to the USB host controller for transmission.
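
A minimal sketch of the packing step, assuming whole packets are copied and
that every packet fits into an empty buffer. struct packet and
fill_tx_buffer() are hypothetical names; the real driver works on SKBs and
also adds per-frame command words, which are left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct packet {
	const unsigned char *data;
	size_t len;
};

/*
 * Copy whole pending packets into buf until the next one would not fit
 * or nothing is pending. *next is the index of the first pending packet
 * and is advanced past each packet copied. Returns the bytes used.
 */
static size_t fill_tx_buffer(unsigned char *buf, size_t buf_size,
			     const struct packet *pending, size_t n_pending,
			     size_t *next)
{
	size_t used = 0;

	while (*next < n_pending) {
		const struct packet *p = &pending[*next];

		if (p->len > buf_size - used)
			break;	/* leave it for the next free buffer */
		memcpy(buf + used, p->data, p->len);
		used += p->len;
		(*next)++;
	}
	return used;
}
```

Calling fill_tx_buffer() once per free buffer, with the same next index,
walks the pending queue in order and leaves no packet split across buffers.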

Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolan78xx: Fix memory allocation bug
John Efstathiades [Thu, 18 Nov 2021 11:01:34 +0000 (11:01 +0000)]
lan78xx: Fix memory allocation bug

Fix memory allocation that fails to check for NULL return.

Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'dsa-felix-psfp'
David S. Miller [Thu, 18 Nov 2021 12:07:24 +0000 (12:07 +0000)]
Merge branch 'dsa-felix-psfp'

Xiaoliang Yang says:

====================
net: dsa: felix: psfp support on vsc9959

VSC9959 hardware supports Per-Stream Filtering and Policing (PSFP).
This patch series adds PSFP support to the tc flower offload of the ocelot
driver. Use chain 30000 to distinguish PSFP from VCAP blocks. Add gate
and police settings to support PSFP in the VSC9959 driver.

v6->v7 changes:
 - Add a patch to restrict psfp rules on ingress port.
 - Using stats.drops to show the packet count discarded by the rule.

v5->v6 changes:
 - Modify ocelot_mact_lookup() parameters.
 - Use parameters ssid and sfid instead of streamdata in
   ocelot_mact_learn_streamdata() function.
 - Serialize STREAMDATA and MAC table write.

v4->v5 changes:
 - Add MAC table lock patch, and move stream data write in
   ocelot_mact_learn_streamdata().
 - Add two sections of VCAP policers to Seville platform.

v3->v4 changes:
 - Introduce vsc9959_psfp_sfi_table_get() function in patch where it is
   used to fix compile warning.

v2->v3 changes:
 - Reorder first two patches. Export struct ocelot_mact_entry, then add
   ocelot_mact_lookup() and ocelot_mact_write() functions.
 - Add PSFP list to struct ocelot, and init it by using
   ocelot->ops->psfp_init().

v1->v2 changes:
 - Use tc flower offload of ocelot driver to support PSFP add and delete.
 - Add PSFP tables add/del functions in felix_vsc9959.c.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: felix: restrict psfp rules on ingress port
Xiaoliang Yang [Thu, 18 Nov 2021 10:12:04 +0000 (18:12 +0800)]
net: dsa: felix: restrict psfp rules on ingress port

PSFP rules take effect on the streams from any port of the VSC9959 switch.
This patch uses the ingress port to restrict a rule to that port only.

Each stream can match only two ingress source ports in VSC9959. Streams
from the lowest port get the configuration of the SFID pointed to by the
MAC table lookup, and streams from the highest port get the configuration
of (SFID+1). This patch defines the PSFP rule on the highest port as a
dummy rule, which means that it does not modify the MAC table.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: felix: use vcap policer to set flow meter for psfp
Xiaoliang Yang [Thu, 18 Nov 2021 10:12:03 +0000 (18:12 +0800)]
net: dsa: felix: use vcap policer to set flow meter for psfp

This patch adds a police action that sets up the flow meter table defined
in IEEE 802.1Qci. Flow metering is a two-rate, two-bucket, three-color
marker for policing frames; only one rate and one bucket are enabled in
this patch.

Flow metering shares the same policer pool with VCAP policers, so the PSFP
policer calls ocelot_vcap_policer_add() and ocelot_vcap_policer_del() to
set the flow meter policer.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mscc: ocelot: use index to set vcap policer
Xiaoliang Yang [Thu, 18 Nov 2021 10:12:02 +0000 (18:12 +0800)]
net: mscc: ocelot: use index to set vcap policer

Policers were previously assigned automatically from the policer pool,
from the highest index to the lowest. But the police action of tc flower
now uses an index to set a police entry. This patch uses the police index
to set vcap policers, so that one policer can be shared by multiple rules.
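
One way to picture index-based sharing is a reference count per pool entry.
The sketch below is an assumption for illustration, not the ocelot code:
the pool size, the names policer_get()/policer_put(), and the choice to
reject a second user with a different rate are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define POLICER_POOL_SIZE 8	/* hypothetical pool size */

struct policer {
	unsigned int refcount;	/* number of rules using this entry */
	unsigned int rate_kbps;
};

static struct policer pool[POLICER_POOL_SIZE];

/* Attach a rule to the policer at a caller-chosen index. */
static int policer_get(unsigned int index, unsigned int rate_kbps)
{
	if (index >= POLICER_POOL_SIZE)
		return -EINVAL;
	if (pool[index].refcount == 0)
		pool[index].rate_kbps = rate_kbps;	/* first user programs it */
	else if (pool[index].rate_kbps != rate_kbps)
		return -EEXIST;	/* index already in use with other settings */
	pool[index].refcount++;
	return 0;
}

/* Detach a rule; the entry is free again once the last user is gone. */
static int policer_put(unsigned int index)
{
	if (index >= POLICER_POOL_SIZE || pool[index].refcount == 0)
		return -ENOENT;
	pool[index].refcount--;
	return 0;
}
```

Two rules that name the same index share one entry, and the entry is only
released after both have called policer_put().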

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: felix: add stream gate settings for psfp
Xiaoliang Yang [Thu, 18 Nov 2021 10:12:01 +0000 (18:12 +0800)]
net: dsa: felix: add stream gate settings for psfp

This patch adds stream gate settings for PSFP. Use SGI table to store
stream gate entries. Disable the gate entry when it is not used by any
stream.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: felix: support psfp filter on vsc9959
Xiaoliang Yang [Thu, 18 Nov 2021 10:12:00 +0000 (18:12 +0800)]
net: dsa: felix: support psfp filter on vsc9959

VSC9959 supports Per-Stream Filtering and Policing(PSFP) that complies
with the IEEE 802.1Qci standard. The stream is identified by Null stream
identification(DMAC and VLAN ID) defined in IEEE802.1CB.

For PSFP, four tables need to be set up: stream table, stream filter
table, stream gate table, and flow meter table. Identify the stream by
parsing the tc flower keys and add it to the stream table. The stream
filter table is automatically maintained, and its index is determined by
SGID(flow gate index) and FMID(flow meter index).

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mscc: ocelot: add gate and police action offload to PSFP
Xiaoliang Yang [Thu, 18 Nov 2021 10:11:59 +0000 (18:11 +0800)]
net: mscc: ocelot: add gate and police action offload to PSFP

PSFP supports gate and police actions. This patch adds the gate and police
actions to the flower action parsing and checks the chain ID to determine
which block to offload to. It also adds psfp callback functions to add,
delete and update gate and police entries in the PSFP tables if the
hardware supports it.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mscc: ocelot: set vcap IS2 chain to goto PSFP chain
Xiaoliang Yang [Thu, 18 Nov 2021 10:11:58 +0000 (18:11 +0800)]
net: mscc: ocelot: set vcap IS2 chain to goto PSFP chain

Some chips in the ocelot series, such as VSC9959, support Per-Stream
Filtering and Policing (PSFP), which is processed after the VCAP blocks.
We set this block on chain 30000 and set the vcap IS2 chain to goto the
PSFP chain if the hardware supports it.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mscc: ocelot: add MAC table stream learn and lookup operations
Xiaoliang Yang [Thu, 18 Nov 2021 10:11:57 +0000 (18:11 +0800)]
net: mscc: ocelot: add MAC table stream learn and lookup operations

ocelot_mact_learn_streamdata() can be used in VSC9959 to overwrite an
FDB entry with stream data. The stream data includes SFID and SSID which
can be used for PSFP and FRER set.

ocelot_mact_lookup() can be used to check whether the given {DMAC, VID} FDB
entry exists, and can also retrieve the DEST_IDX and entry type for
the FDB entry.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
Teng Qi [Thu, 18 Nov 2021 07:01:18 +0000 (15:01 +0800)]
net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()

The definition of macro MOTO_SROM_BUG is:
  #define MOTO_SROM_BUG    (lp->active == 8 && (get_unaligned_le32(
  dev->dev_addr) & 0x00ffffff) == 0x3e0008)

and the if statement
  if (MOTO_SROM_BUG) lp->active = 0;

using this macro indicates lp->active could be 8. If lp->active is 8 and
the second comparison of this macro is false, lp->active will remain 8 in:
  lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1);
  lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1);
  lp->phy[lp->active].mc  = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2;
  lp->phy[lp->active].mci = *p;

However, the length of array lp->phy is 8, so array overflows can occur.
To fix these possible array overflows, we first check lp->active and then
return -EINVAL if it is greater than or equal to ARRAY_SIZE(lp->phy) (i.e. 8).
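
The shape of such a check, as a userspace sketch. struct board, struct
phy_entry and record_phy() are simplified, hypothetical stand-ins for the
driver's private data and type3_infoblock(); ARRAY_SIZE is redefined here
because the kernel header is not available outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct phy_entry { int mci; };

struct board {
	struct phy_entry phy[8];
	unsigned int active;	/* index of the PHY being filled in */
};

/* Store a value for the active PHY, rejecting an out-of-range index. */
static int record_phy(struct board *lp, int mci)
{
	if (lp->active >= ARRAY_SIZE(lp->phy))
		return -EINVAL;
	lp->phy[lp->active].mci = mci;
	return 0;
}
```

Tying the bound to ARRAY_SIZE() rather than a literal 8 keeps the check
correct if the array is ever resized.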

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomctp/test: Update refcount checking in route fragment tests
Jeremy Kerr [Thu, 18 Nov 2021 06:57:23 +0000 (14:57 +0800)]
mctp/test: Update refcount checking in route fragment tests

In 99ce45d5e, we moved a route refcount decrement from
mctp_do_fragment_route into the caller. This invalidates the assumption
that the route test makes about refcount behaviour, so the route tests
fail.

This change fixes the test case to suit the new refcount behaviour.

Fixes: 99ce45d5e7db ("mctp: Implement extended addressing")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv6: ah6: use swap() to make code cleaner
Yao Jing [Thu, 18 Nov 2021 06:10:18 +0000 (06:10 +0000)]
ipv6: ah6: use swap() to make code cleaner

Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
open-coding it.
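
For readers outside the kernel tree, a userspace approximation of such a
macro looks roughly like this. It is a sketch relying on the GNU
__typeof__ extension (GCC or Clang), and sort_pair() is a hypothetical
caller, not code from ah6.c:

```c
#include <assert.h>

/* Userspace approximation of a type-generic swap() macro. */
#define swap(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Order two values so that *lo <= *hi. */
static void sort_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		swap(*lo, *hi);
}
```

The macro replaces the usual three-line temporary-variable exchange and
works for any assignable type, pointers included.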

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Yao Jing <yao.jing2@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
zhangyue [Thu, 18 Nov 2021 05:46:32 +0000 (13:46 +0800)]
net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound

In line 5001, if every id in the array 'lp->phy[8]' is non-zero, then 'k'
is 8 when the 'for' loop ends.

At that point, the access to the array 'lp->phy[8]' may be out of bounds.
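
The pattern can be reproduced in a few lines. In this sketch, N_PHY,
first_free_slot() and claim_slot() are hypothetical names, and the id
table is reduced to a plain int array:

```c
#include <assert.h>
#include <stddef.h>

#define N_PHY 8

/*
 * Find the first free slot (id == 0). When every slot is taken the loop
 * ends with k == N_PHY, which is one past the last valid index, so the
 * caller must check the result before indexing the array.
 */
static size_t first_free_slot(const int id[N_PHY])
{
	size_t k;

	for (k = 0; k < N_PHY && id[k] != 0; k++)
		;
	return k;
}

/* Returns the slot used, or -1 when the table is full. */
static int claim_slot(int id[N_PHY], int new_id)
{
	size_t k = first_free_slot(id);

	if (k >= N_PHY)
		return -1;
	id[k] = new_id;
	return (int)k;
}
```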

Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: add missing htmldocs for skb->ll_node and sk->defer_list
Eric Dumazet [Thu, 18 Nov 2021 01:57:29 +0000 (17:57 -0800)]
tcp: add missing htmldocs for skb->ll_node and sk->defer_list

Add missing entries to fix these "make htmldocs" warnings.

./include/linux/skbuff.h:953: warning: Function parameter or member 'll_node' not described in 'sk_buff'
./include/net/sock.h:540: warning: Function parameter or member 'defer_list' not described in 'sock'

Fixes: f35f821935d8 ("tcp: defer skb freeing after socket lock is released")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
David S. Miller [Thu, 18 Nov 2021 11:49:52 +0000 (11:49 +0000)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
10GbE Intel Wired LAN Driver Updates 2021-11-17

Radoslaw Tyl says:

The change is a consequence of errors reported by the ixgbevf driver
while starting several virtual guests at the same time on an ESX host.
During this, the VF was not able to communicate correctly with the PF;
as a result it reported "PF still in reset state. Is the PF interface up?"
and then went into a locked state. The only option left was to reload
the VF driver on the guest OS.

The background of the problem is that the current PFU and VFU
semaphore locking mechanism between sender and receiver may cause
Mailbox memory (VFMBMEM) to be overwritten. In such a scenario the
receiver of the original message will read an invalid or corrupted
message, or one (or more) messages may be lost.

This change adds support for communication with the PF ESX
driver and does not contain changes or support for the ixgbe driver.
To maintain backward compatibility, the previous communication method
has been preserved in the form of LEGACY functions.

In the future there is a plan to add support for 1.5 mailbox API
communication to the ixgbe driver as well.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-
David S. Miller [Thu, 18 Nov 2021 11:48:33 +0000 (11:48 +0000)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-11-17

This series contains updates to i40e driver only.

Eryk adds accounting for VLAN header in packet size when VF port VLAN is
configured. He also fixes TC queue distribution when the user has changed
queue counts as well as for configuration of VF ADQ which caused dropped
packets.

Michal adds tracking for when a VSI is being released to prevent null
pointer dereference when managing filters.

Karen ensures PF successfully initiates VF requested reset which could
cause a call trace otherwise.

Jedrzej moves validation of channel queue value earlier to prevent
partial configuration when the value is invalid.

Grzegorz corrects the reported error when adding filter fails.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mdio: Replaced BUG_ON() with WARN()
Florian Fainelli [Wed, 17 Nov 2021 17:36:29 +0000 (09:36 -0800)]
net: mdio: Replaced BUG_ON() with WARN()

Killing the kernel because a certain MDIO bus object is not in the
desired state at various points in the registration or unregistration
paths is excessive and is not helping in troubleshooting or fixing
issues. Replace the BUG_ON() with WARN() and print out the MDIO bus name
to facilitate debugging.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv6: check return value of ipv6_skip_exthdr
Jordy Zomer [Wed, 17 Nov 2021 19:06:48 +0000 (20:06 +0100)]
ipv6: check return value of ipv6_skip_exthdr

The offset value is used in pointer math on skb->data.
Since ipv6_skip_exthdr may return -1 the pointer to uh and th
may not point to the actual udp and tcp headers and potentially
overwrite other stuff. This is why I think this should be checked.
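
A userspace sketch of the kind of check meant here. The toy packet format
(first byte stores the transport offset) and the names
find_transport_offset() and transport_header() are assumptions made for
illustration; they are not the kernel's ipv6_skip_exthdr() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical header walker: returns the offset of the transport
 * header, or -1 when the packet is malformed.
 */
static int find_transport_offset(const uint8_t *pkt, size_t len)
{
	if (len < 1 || pkt[0] >= len)
		return -1;
	return pkt[0];	/* toy format: first byte stores the offset */
}

/* Returns a pointer to the transport header, or NULL if unusable. */
static const uint8_t *transport_header(const uint8_t *pkt, size_t len,
				       size_t hdr_len)
{
	int offset = find_transport_offset(pkt, len);

	if (offset < 0)
		return NULL;	/* never do pointer math with a negative offset */
	if ((size_t)offset + hdr_len > len)
		return NULL;	/* header would run past the buffer */
	return pkt + offset;
}
```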

EDIT:  added {}'s, thanks Kees

Signed-off-by: Jordy Zomer <jordy@pwning.systems>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoe100: fix device suspend/resume
Jesse Brandeburg [Wed, 17 Nov 2021 20:59:52 +0000 (12:59 -0800)]
e100: fix device suspend/resume

As reported in [1], e100 was no longer working for suspend/resume
cycles. The previous commit mentioned in the fixes appears to have
broken things and this attempts to practice best known methods for
device power management and keep wake-up working while allowing
suspend/resume to work. To do this, I reorder a little bit of code
and fix the resume path to make sure the device is enabled.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=214933

Fixes: 69a74aef8a18 ("e100: use generic power management")
Cc: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Reported-by: Alexey Kuznetsov <axet@me.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Alexey Kuznetsov <axet@me.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'dpaa2-phylink'
David S. Miller [Thu, 18 Nov 2021 11:38:45 +0000 (11:38 +0000)]
Merge branch 'dpaa2-phylink'

Russell King says:

====================
net: dpaa2: phylink validate implementation updates

This series converts dpaa2 to fill in the supported_interfaces member
of phylink_config, cleans up the validate() implementation, and then
converts to phylink_generic_validate(). Previous behaviour should be
preserved.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dpaa2-mac: use phylink_generic_validate()
Russell King (Oracle) [Wed, 17 Nov 2021 17:24:13 +0000 (17:24 +0000)]
net: dpaa2-mac: use phylink_generic_validate()

DPAA2 has no special behaviour in its validation implementation, so can
be switched to phylink_generic_validate().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dpaa2-mac: remove interface checks in dpaa2_mac_validate()
Russell King (Oracle) [Wed, 17 Nov 2021 17:24:07 +0000 (17:24 +0000)]
net: dpaa2-mac: remove interface checks in dpaa2_mac_validate()

As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode, nor handle
PHY_INTERFACE_MODE_NA in the validation function. Remove these to
simplify the implementation.
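
The idea can be sketched with a plain bitmap. Everything below is a
hypothetical userspace model (enum iface_mode, struct mac_config,
mark_supported(), mode_allowed()), not the phylink API: the driver marks
the modes its MAC supports once, and the core rejects any other mode
before the driver's validate hook would run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of interface modes, for illustration only. */
enum iface_mode { MODE_NA, MODE_MII, MODE_RGMII, MODE_SGMII, MODE_MAX };

struct mac_config {
	unsigned long supported;	/* one bit per enum iface_mode */
};

/* Driver side: declare a supported mode at setup time. */
static void mark_supported(struct mac_config *cfg, enum iface_mode mode)
{
	cfg->supported |= 1UL << mode;
}

/* Core side: modes outside the bitmap never reach the driver. */
static bool mode_allowed(const struct mac_config *cfg, enum iface_mode mode)
{
	return mode < MODE_MAX && (cfg->supported & (1UL << mode)) != 0;
}
```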

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dpaa2-mac: populate supported_interfaces member
Russell King [Wed, 17 Nov 2021 17:24:02 +0000 (17:24 +0000)]
net: dpaa2-mac: populate supported_interfaces member

Populate the phy interface mode bitmap for the Freescale DPAA2 driver
with interface modes supported by the MAC.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'ag71xx-phylink'
David S. Miller [Thu, 18 Nov 2021 11:36:48 +0000 (11:36 +0000)]
Merge branch 'ag71xx-phylink'

Russell King says:

====================
net: ag71xx: phylink validate implementation updates

This series converts ag71xx to fill in the supported_interfaces member
of phylink_config, cleans up the validate() implementation, and then
converts to phylink_generic_validate().

The question over the port linkmode restriction has been answered by
Oleksij - there is no reason for this restriction, so we can go the
whole hog with this conversion. Thanks!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ag71xx: use phylink_generic_validate()
Russell King (Oracle) [Wed, 17 Nov 2021 16:46:31 +0000 (16:46 +0000)]
net: ag71xx: use phylink_generic_validate()

ag71xx apparently only supports MII port type, which makes it different
from other implementations. However, Oleksij says there is no special
reason for this.

Convert the driver to use phylink_generic_validate(), which will allow
all ethtool port linkmodes instead of only MII, giving the driver
consistent behaviour with other drivers.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ag71xx: remove interface checks in ag71xx_mac_validate()
Russell King (Oracle) [Wed, 17 Nov 2021 16:46:25 +0000 (16:46 +0000)]
net: ag71xx: remove interface checks in ag71xx_mac_validate()

As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode, nor handle
PHY_INTERFACE_MODE_NA in the validation function. Remove these to
simplify the implementation.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ag71xx: populate supported_interfaces member
Russell King [Wed, 17 Nov 2021 16:46:20 +0000 (16:46 +0000)]
net: ag71xx: populate supported_interfaces member

Populate the phy_interface_t bitmap for the Atheros ag71xx driver with
interface modes supported by the MAC.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodevlink: Don't throw an error if flash notification sent before devlink visible
Leon Romanovsky [Wed, 17 Nov 2021 14:49:09 +0000 (16:49 +0200)]
devlink: Don't throw an error if flash notification sent before devlink visible

The mlxsw driver calls various devlink flash routines even before
users can get any access to the devlink instance itself. For example,
mlxsw_core_fw_rev_validate() is one such function.

__mlxsw_core_bus_device_register
 -> mlxsw_core_fw_rev_validate
  -> mlxsw_core_fw_flash
   -> mlxfw_firmware_flash
    -> mlxfw_status_notify
     -> devlink_flash_update_status_notify
      -> __devlink_flash_update_notify
       -> WARN_ON(...)

This causes the WARN_ON to trigger a warning about devlink not being registered.

Fixes: cf530217408e ("devlink: Notify users when objects are accessible")
Reported-by: Danielle Ratson <danieller@nvidia.com>
Tested-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: stmmac: dwmac-qcom-ethqos: add platform level clocks management
Bhupesh Sharma [Wed, 17 Nov 2021 11:05:38 +0000 (16:35 +0530)]
net: stmmac: dwmac-qcom-ethqos: add platform level clocks management

Split clocks settings from init callback into clks_config callback,
which could support platform level clock management.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agopage_pool: Revert "page_pool: disable dma mapping support..."
Yunsheng Lin [Wed, 17 Nov 2021 07:56:52 +0000 (15:56 +0800)]
page_pool: Revert "page_pool: disable dma mapping support..."

This reverts commit d00e60ee54b12de945b8493cf18c1ada9e422514.

As reported by Guillaume in [1]:
Enabling LPAE always enables CONFIG_ARCH_DMA_ADDR_T_64BIT
in 32-bit systems, which breaks the bootup process when an
ethernet driver is using page pool with the PP_FLAG_DMA_MAP flag.
We were hoping there were no active consumers on such systems
when we removed the dma mapping support, but LPAE seems to be
a common feature for 32-bit systems, so revert it.

1. https://www.spinics.net/lists/netdev/msg779890.html

Fixes: d00e60ee54b1 ("page_pool: disable dma mapping support for 32-bit arch with 64-bit DMA")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Tested-by: "kernelci.org bot" <bot@kernelci.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge...
Teng Qi [Wed, 17 Nov 2021 03:44:53 +0000 (11:44 +0800)]
ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()

The if statement:
  if (port >= DSAF_GE_NUM)
        return;

limits the value of port to less than DSAF_GE_NUM (i.e., 8).
However, if the value of port is 6 or 7, an array overflow could occur:
  port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off;

because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6).

To fix this possible array overflow, we first check port and if it is
greater than or equal to DSAF_MAX_PORT_NUM, the function returns.
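
The range mismatch described above can be illustrated with a small, self-contained sketch. The struct layouts are simplified stand-ins rather than the driver's real definitions, and port_reset_offset() is a hypothetical helper; only the constant names and their values (8 and 6) are taken from the text above.

```c
#define DSAF_GE_NUM        8   /* value quoted in the commit message */
#define DSAF_MAX_PORT_NUM  6   /* length of mac_cb[], per the commit message */

/* Simplified stand-ins for the driver structures (assumption). */
struct mac_cb_sketch {
	unsigned int port_rst_off;
};

struct dsaf_dev_sketch {
	struct mac_cb_sketch *mac_cb[DSAF_MAX_PORT_NUM];
};

/*
 * Hypothetical helper: returns 0 and stores the reset offset on success,
 * or -1 if port cannot safely be used to index mac_cb[].
 */
static int port_reset_offset(const struct dsaf_dev_sketch *dev,
			     unsigned int port, unsigned int *off)
{
	/* Checking against DSAF_GE_NUM alone would let 6 and 7 through. */
	if (port >= DSAF_GE_NUM || port >= DSAF_MAX_PORT_NUM)
		return -1;
	if (!dev->mac_cb[port])
		return -1;

	*off = dev->mac_cb[port]->port_rst_off;
	return 0;
}
```

With this check in place, ports 6 and 7 are rejected before mac_cb[] is indexed, while ports 0 to 5 behave as before.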

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Teng Qi <starmiku1207184332@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years ago Merge branch 'rework/printk_safe-removal' into for-linus
Petr Mladek [Thu, 18 Nov 2021 09:03:47 +0000 (10:03 +0100)]
Merge branch 'rework/printk_safe-removal' into for-linus

2 years ago parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig
Helge Deller [Wed, 17 Nov 2021 14:48:45 +0000 (15:48 +0100)]
parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig

Signed-off-by: Helge Deller <deller@gmx.de>
2 years ago Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
Helge Deller [Wed, 17 Nov 2021 10:05:07 +0000 (11:05 +0100)]
Revert "parisc: Reduce sigreturn trampoline to 3 instructions"

This reverts commit e4f2006f1287e7ea17660490569cff323772dac4.

This patch shows problems with signal handling. Revert it for now.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15
2 years ago parisc: Wrap assembler related defines inside __ASSEMBLY__
Helge Deller [Tue, 16 Nov 2021 12:12:21 +0000 (13:12 +0100)]
parisc: Wrap assembler related defines inside __ASSEMBLY__

Building allmodconfig shows errors in the gpu/drm/msm snapdragon drivers,
because a COND() define is used there which conflicts with the COND() for
PA-RISC assembly.  Although the snapdragon driver isn't relevant for parisc, it
is nevertheless compiled when CONFIG_COMPILE_TEST is defined.

Move the COND() define and other PA-RISC mnemonics inside the #ifdef
__ASSEMBLY__ part to avoid this conflict.
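
A minimal sketch of the guard pattern, assuming a header that is shared between C and assembly sources. The body given to the assembler-side COND() is a placeholder, not the real PA-RISC definition, and the driver-side macro and driver_cond() are hypothetical.

```c
/*
 * Assembler-only helpers. The kernel build defines __ASSEMBLY__ when it
 * preprocesses .S files, so C translation units never see this COND().
 */
#ifdef __ASSEMBLY__
#define COND(x) x	/* placeholder body, not the real parisc definition */
#endif

/*
 * Hypothetical driver-side macro with the same name but a different
 * meaning. Without the guard above, this would be a conflicting
 * redefinition in every C file that includes both headers.
 */
#define COND(mask, val) (((mask) & (val)) != 0)

static int driver_cond(unsigned int mask, unsigned int val)
{
	return COND(mask, val);
}
```

Compiled as C, only the two-argument COND() exists, so the driver builds cleanly even though it shares a header with the assembly code.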

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
2 years ago parisc: Wire up futex_waitv
Helge Deller [Tue, 16 Nov 2021 12:11:26 +0000 (13:11 +0100)]
parisc: Wire up futex_waitv

Signed-off-by: Helge Deller <deller@gmx.de>
2 years ago parisc: Include stringify.h to avoid build error in crypto/api.c
Helge Deller [Mon, 15 Nov 2021 16:34:50 +0000 (17:34 +0100)]
parisc: Include stringify.h to avoid build error in crypto/api.c

Include stringify.h to avoid this build error:
 arch/parisc/include/asm/jump_label.h: error: expected ':' before '__stringify'
 arch/parisc/include/asm/jump_label.h: error: label 'l_yes' defined but not used [-Werror=unused-label]
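
The first error is presumably what the compiler reports when __stringify is not defined as a macro and therefore appears as a bare identifier inside an asm statement. The sketch below shows the two-level stringification idea in standard C; STRINGIFY, STRINGIFY_1 and ASM_WORD_SKETCH are illustrative names, not the kernel's.

```c
/* Two-level stringification, the same idea as the kernel's __stringify(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Hypothetical macro standing in for an assembler directive. */
#define ASM_WORD_SKETCH .word

/* The argument is macro-expanded first, then turned into a string. */
static const char *const insn_text = STRINGIFY(ASM_WORD_SKETCH) " 0";

/* A single level stringifies the macro name itself, without expansion. */
static const char *const raw_text = STRINGIFY_1(ASM_WORD_SKETCH);
```

The extra level of indirection is what lets a macro's value, rather than its name, end up in the generated assembly text.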

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
2 years ago KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:43 +0000 (17:34 +0100)]
KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS

It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
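
The capping itself amounts to taking a minimum. The helper below is a hypothetical, architecture-neutral sketch of that rule, not the code of any particular architecture.

```c
/*
 * Hypothetical helper: the value reported for KVM_CAP_NR_VCPUS
 * (recommended) must never exceed the one reported for
 * KVM_CAP_MAX_VCPUS (max_possible).
 */
static unsigned int recommended_vcpus(unsigned int recommended,
				      unsigned int max_possible)
{
	return recommended < max_possible ? recommended : max_possible;
}
```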

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-7-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:42 +0000 (17:34 +0100)]
KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()

KVM_CAP_NR_VCPUS is a legacy advisory value which on other architectures
returns num_online_cpus() capped by KVM_CAP_NR_VCPUS or something else
(ppc and arm64 are special cases). On s390, KVM_CAP_NR_VCPUS returns
the same as KVM_CAP_MAX_VCPUS and this may turn out to be bad
'advice'. Switch s390 to returning capped num_online_cpus() too.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Message-Id: <20211116163443.88707-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:41 +0000 (17:34 +0100)]
KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS

It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Message-Id: <20211116163443.88707-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:40 +0000 (17:34 +0100)]
KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS

It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:39 +0000 (17:34 +0100)]
KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS

It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:38 +0000 (17:34 +0100)]
KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()

Generally, it doesn't make sense to return the recommended maximum number
of vCPUs which exceeds the maximum possible number of vCPUs.

Note: ARM64 is special as the value returned by KVM_CAP_MAX_VCPUS differs
depending on whether it is a system-wide ioctl or a per-VM one. Previously,
KVM_CAP_NR_VCPUS didn't have this difference and it seems preferable to
keep the status quo. Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
which is what gets returned by system-wide KVM_CAP_MAX_VCPUS.
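
From userspace, both values are read with the KVM_CHECK_EXTENSION ioctl. The wrapper below is a minimal sketch; query_kvm_cap() is a hypothetical name, and whether a per-VM query is available depends on the kernel advertising KVM_CAP_CHECK_EXTENSION_VM.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Query a capability on an already-open KVM file descriptor.
 * fd may be the /dev/kvm descriptor (system-wide answer) or a VM
 * descriptor returned by KVM_CREATE_VM (per-VM answer).
 * Returns the capability value, or -1 on error with errno set.
 */
static int query_kvm_cap(int fd, long cap)
{
	return ioctl(fd, KVM_CHECK_EXTENSION, cap);
}
```

Calling it with KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS on the /dev/kvm descriptor, and then again on a VM descriptor, is one way to observe the system-wide versus per-VM difference described above.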

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-2-vkuznets@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago KVM: x86: Assume a 64-bit hypercall for guests with protected state
Tom Lendacky [Mon, 24 May 2021 17:48:57 +0000 (12:48 -0500)]
KVM: x86: Assume a 64-bit hypercall for guests with protected state

When processing a hypercall for a guest with protected state, currently
SEV-ES guests, the guest CS segment register can't be checked to
determine if the guest is in 64-bit mode. For an SEV-ES guest, it is
expected that communication between the guest and the hypervisor is
performed through shared memory using the GHCB. In order to use the GHCB,
the guest must have been in long mode, otherwise writes by the guest to the
GHCB would be encrypted and could not be read by the
hypervisor.

Create a new helper function, is_64_bit_hypercall(), that assumes the
guest is in 64-bit mode when the guest has protected state, and returns
true, otherwise invoking is_64_bit_mode() to determine the mode. Update
the hypercall related routines to use is_64_bit_hypercall() instead of
is_64_bit_mode().
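
The helper's logic can be sketched as follows. struct vcpu_sketch and its fields are simplified stand-ins for the real vCPU state, and is_64_bit_mode() is reduced here to reading a flag; only the two function names come from the text above.

```c
#include <stdbool.h>

/* Simplified stand-in for the vCPU state (assumption). */
struct vcpu_sketch {
	bool guest_state_protected;	/* e.g. an SEV-ES guest */
	bool cs_long_mode;		/* what a CS check would report */
};

/* Stand-in for the existing helper that inspects the guest CS register. */
static bool is_64_bit_mode(const struct vcpu_sketch *vcpu)
{
	return vcpu->cs_long_mode;
}

static bool is_64_bit_hypercall(const struct vcpu_sketch *vcpu)
{
	/* Protected state: CS cannot be inspected, so assume 64-bit. */
	return vcpu->guest_state_protected || is_64_bit_mode(vcpu);
}
```

Because of short-circuit evaluation, is_64_bit_mode() is never reached for a guest with protected state.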

Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurrences of calls to
this helper function for a guest running with protected state.

Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e0b20c770c9d0d1403f23d83e785385104211f74.1621878537.git.thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
Arnaldo Carvalho de Melo [Tue, 16 Nov 2021 15:03:25 +0000 (12:03 -0300)]
selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore

  $ git status
  nothing to commit, working tree clean
  $
  $ make -C tools/testing/selftests/kvm/ > /dev/null 2>&1
  $ git status

  Untracked files:
    (use "git add <file>..." to include in what will be committed)
   tools/testing/selftests/kvm/x86_64/sev_migrate_tests

  nothing added to commit but untracked files present (use "git add" to track)
  $

Fixes: 6a58150859fdec76 ("selftest: KVM: Add intra host migration tests")
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Message-Id: <YZPIPfvYgRDCZi/w@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years ago riscv: kvm: fix non-kernel-doc comment block
Randy Dunlap [Sun, 7 Nov 2021 03:47:06 +0000 (20:47 -0700)]
riscv: kvm: fix non-kernel-doc comment block

Don't use "/**" to begin a comment block for a non-kernel-doc comment.

Prevents this docs build warning:

vcpu_sbi.c:3: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Copyright (c) 2019 Western Digital Corporation or its affiliates.
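
For contrast, a short sketch of the two comment styles. sketch_is_even() is a hypothetical function that exists only to carry a kernel-doc comment.

```c
#include <stdbool.h>

/*
 * Plain comment: file headers, copyright notices and implementation
 * notes start with a single asterisk, so the kernel-doc tooling
 * ignores them.
 */

/**
 * sketch_is_even() - Test whether a value is even.
 * @value: Value to test.
 *
 * Return: true if @value is even, false otherwise.
 */
static bool sketch_is_even(unsigned int value)
{
	return (value & 1u) == 0;
}
```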

Fixes: dea8ee31a039 ("RISC-V: KVM: Add SBI v0.1 support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Anup Patel <anup.patel@wdc.com>
Cc: kvm@vger.kernel.org
Cc: kvm-riscv@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Message-Id: <20211107034706.30672-1-rdunlap@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>