platform/kernel/linux-rpi.git
3 years agoselftests/bpf: Fix running of XDP bonding tests
Jussi Maki [Wed, 11 Aug 2021 12:36:27 +0000 (12:36 +0000)]
selftests/bpf: Fix running of XDP bonding tests

An "innocent" cleanup in the last version of the XDP bonding patchset moved
the "test__start_subtest" calls to the test main function, but I forgot to
reverse the condition, which lead to all tests being skipped. Fix it.

Fixes: 6aab1c81b98a ("selftests/bpf: Add tests for XDP bonding")
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210811123627.20223-1-joamaki@gmail.com
3 years agonet: in_irq() cleanup
Changbin Du [Fri, 13 Aug 2021 14:57:49 +0000 (22:57 +0800)]
net: in_irq() cleanup

Replace the obsolete and ambiguos macro in_irq() with new
macro in_hardirq().

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Link: https://lore.kernel.org/r/20210813145749.86512-1-changbin.du@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet, bonding: Disallow vlan+srcmac with XDP
Jussi Maki [Thu, 12 Aug 2021 14:52:41 +0000 (14:52 +0000)]
net, bonding: Disallow vlan+srcmac with XDP

The new vlan+srcmac xmit policy is not implementable with XDP since
in many cases the 802.1Q payload is not present in the packet. This
can be for example due to hardware offload or in the case of veth
due to use of skbuffs internally.

This also fixes the NULL deref with the vlan+srcmac xmit policy
reported by Jonathan Toppins by additionally checking the skb
pointer.

Fixes: a815bde56b15 ("net, bonding: Refactor bond_xmit_hash for use with xdp_buff")
Reported-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Link: https://lore.kernel.org/r/20210812145241.12449-1-joamaki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoaf_unix: fix holding spinlock in oob handling
Rao Shoaib [Wed, 11 Aug 2021 22:06:52 +0000 (15:06 -0700)]
af_unix: fix holding spinlock in oob handling

syzkaller found that OOB code was holding spinlock
while calling a function in which it could sleep.

Reported-by: syzbot+8760ca6c1ee783ac4abd@syzkaller.appspotmail.com
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Link: https://lore.kernel.org/r/20210811220652.567434-1-Rao.Shoaib@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 13 Aug 2021 13:41:22 +0000 (06:41 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
  9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp")
  9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins")
  099fdeda659d ("bnxt_en: Event handler for PPS events")

kernel/bpf/helpers.c
include/linux/bpf-cgroup.h
  a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()")
  c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current")

drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
  5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray")
  2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer")

MAINTAINERS
  7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo")
  7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 13 Aug 2021 02:24:03 +0000 (16:24 -1000)]
Merge tag 'net-5.14-rc6' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from netfilter, bpf, can and
  ieee802154.

  The size of this is pretty normal, but we got more fixes for 5.14
  changes this week than last week. Nothing major but the trend is the
  opposite of what we like. We'll see how the next week goes..

  Current release - regressions:

   - r8169: fix ASPM-related link-up regressions

   - bridge: fix flags interpretation for extern learn fdb entries

   - phy: micrel: fix link detection on ksz87xx switch

   - Revert "tipc: Return the correct errno code"

   - ptp: fix possible memory leak caused by invalid cast

  Current release - new code bugs:

   - bpf: add missing bpf_read_[un]lock_trace() for syscall program

   - bpf: fix potentially incorrect results with bpf_get_local_storage()

   - page_pool: mask the page->signature before the checking, avoid dma
     mapping leaks

   - netfilter: nfnetlink_hook: 5 fixes to information in netlink dumps

   - bnxt_en: fix firmware interface issues with PTP

   - mlx5: Bridge, fix ageing time

  Previous releases - regressions:

   - linkwatch: fix failure to restore device state across
     suspend/resume

   - bareudp: fix invalid read beyond skb's linear data

  Previous releases - always broken:

   - bpf: fix integer overflow involving bucket_size

   - ppp: fix issues when desired interface name is specified via
     netlink

   - wwan: mhi_wwan_ctrl: fix possible deadlock

   - dsa: microchip: ksz8795: fix number of VLAN related bugs

   - dsa: drivers: fix broken backpressure in .port_fdb_dump

   - dsa: qca: ar9331: make proper initial port defaults

  Misc:

   - bpf: add lockdown check for probe_write_user helper

   - netfilter: conntrack: remove offload_pickup sysctl before 5.14 is
     out

   - netfilter: conntrack: collect all entries in one cycle,
     heuristically slow down garbage collection scans on idle systems to
     prevent frequent wake ups"

* tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  vsock/virtio: avoid potential deadlock when vsock device remove
  wwan: core: Avoid returning NULL from wwan_create_dev()
  net: dsa: sja1105: unregister the MDIO buses during teardown
  Revert "tipc: Return the correct errno code"
  net: mscc: Fix non-GPL export of regmap APIs
  net: igmp: increase size of mr_ifc_count
  MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers
  tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
  net: pcs: xpcs: fix error handling on failed to allocate memory
  net: linkwatch: fix failure to restore device state across suspend/resume
  net: bridge: fix memleak in br_add_if()
  net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge
  net: bridge: fix flags interpretation for extern learn fdb entries
  net: dsa: sja1105: fix broken backpressure in .port_fdb_dump
  net: dsa: lantiq: fix broken backpressure in .port_fdb_dump
  net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
  net: dsa: hellcreek: fix broken backpressure in .port_fdb_dump
  bpf, core: Fix kernel-doc notation
  net: igmp: fix data-race in igmp_ifc_timer_expire()
  net: Fix memory leak in ieee802154_raw_deliver
  ...

3 years agoMerge tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 13 Aug 2021 02:16:01 +0000 (16:16 -1000)]
Merge tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A patch to avoid a soft lockup in ceph_check_delayed_caps() from Luis
  and a reference handling fix from Jeff that should address some memory
  corruption reports in the snaprealm area.

  Both marked for stable"

* tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client:
  ceph: take snap_empty_lock atomically with snaprealm refcount change
  ceph: reduce contention in ceph_check_delayed_caps()

3 years agoMerge tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 13 Aug 2021 02:09:25 +0000 (16:09 -1000)]
Merge tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Another week, another set of pretty regular fixes, nothing really
  stands out too much.

  amdgpu:
   - Yellow carp update
   - RAS EEPROM fixes
   - BACO/BOCO fixes
   - Fix a memory leak in an error path
   - Freesync fix
   - VCN harvesting fix
   - Display fixes

  i915:
   - GVT fix for Windows VM hang.
   - Display fix of 12 BPC bits for display 12 and newer.
   - Don't try to access some media register for fused off domains.
   - Fix kerneldoc build warnings.

  mediatek:
   - Fix dpi bridge bug.
   - Fix cursor plane no update.

  meson:
   - Fix colors when booting with HDR"

* tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm:
  drm/doc/rfc: drop lmem uapi section
  drm/i915: Only access SFC_DONE when media domain is not fused off
  drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg
  drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work
  drm/amd/display: Remove invalid assert for ODM + MPC case
  drm/amd/pm: bug fix for the runtime pm BACO
  drm/amdgpu: handle VCN instances when harvesting (v2)
  drm/meson: fix colour distortion from HDR set during vendor u-boot
  drm/i915/gvt: Fix cached atomics setting for Windows VM
  drm/amdgpu: Add preferred mode in modeset when freesync video mode's enabled.
  drm/amd/pm: Fix a memory leak in an error handling path in 'vangogh_tables_init()'
  drm/amdgpu: don't enable baco on boco platforms in runpm
  drm/amdgpu: set RAS EEPROM address from VBIOS
  drm/amd/pm: update smu v13.0.1 firmware header
  drm/mediatek: Fix cursor plane no update
  drm/mediatek: mtk-dpi: Set out_fmt from config if not the last bridge
  drm/mediatek: dpi: Fix NULL dereference in mtk_dpi_bridge_atomic_check

3 years agodt-bindings: net: qcom,ipa: make imem interconnect optional
Alex Elder [Wed, 11 Aug 2021 14:18:02 +0000 (09:18 -0500)]
dt-bindings: net: qcom,ipa: make imem interconnect optional

On some newer SoCs, the interconnect between IPA and SoC internal
memory (imem) is not used.  Update the binding to indicate that
having just the memory and config interconnects is another allowed
configuration.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20210811141802.2635424-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: always inline ipa_aggr_granularity_val()
Alex Elder [Wed, 11 Aug 2021 13:59:48 +0000 (08:59 -0500)]
net: ipa: always inline ipa_aggr_granularity_val()

It isn't required, but all callers of ipa_aggr_granularity_val()
pass a constant value (IPA_AGGR_GRANULARITY) as the usec argument.
Two of those callers are in ipa_validate_build(), with the result
being passed to BUILD_BUG_ON().

Evidently the "sparc64-linux-gcc" compiler (at least) doesn't always
inline ipa_aggr_granularity_val(), so the result of the function is
not constant at compile time, and that leads to build errors.

Define the function with the __always_inline attribute to avoid the
errors.  We can see by inspection that the value passed is never
zero, so we can just remove its WARN_ON() call.

Fixes: 5bc5588466a1f ("net: ipa: use WARN_ON() rather than assertions")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20210811135948.2634264-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'drm-misc-fixes-2021-08-12' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Thu, 12 Aug 2021 20:37:31 +0000 (06:37 +1000)]
Merge tag 'drm-misc-fixes-2021-08-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * meson: Fix colors when booting with HDR

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YRTb+qUuBYWjJDVg@linux-uq9g.fritz.box
3 years agoMerge tag 'drm-intel-fixes-2021-08-12' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 12 Aug 2021 20:29:12 +0000 (06:29 +1000)]
Merge tag 'drm-intel-fixes-2021-08-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- GVT fix for Windows VM hang.
- Display fix of 12 BPC bits for display 12 and newer.
- Don't try to access some media register for fused off domains.
- Fix kerneldoc build warnings.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YRU/hnQ1sNr+j37x@intel.com
3 years agoMerge tag 'ieee802154-for-davem-2021-08-12' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Thu, 12 Aug 2021 18:50:16 +0000 (11:50 -0700)]
Merge tag 'ieee802154-for-davem-2021-08-12' of git://git./linux/kernel/git/sschmidt/wpan

Stefan Schmidt says:

====================
ieee802154 for net 2021-08-12

Mostly fixes coming from bot reports. Dongliang Mu tackled some syzkaller
reports in hwsim again and Takeshi Misawa a memory leak  in  ieee802154 raw.

* tag 'ieee802154-for-davem-2021-08-12' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
  net: Fix memory leak in ieee802154_raw_deliver
  ieee802154: hwsim: fix GPF in hwsim_new_edge_nl
  ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi
====================

Link: https://lore.kernel.org/r/20210812183912.1663996-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agovsock/virtio: avoid potential deadlock when vsock device remove
Longpeng(Mike) [Thu, 12 Aug 2021 05:30:56 +0000 (13:30 +0800)]
vsock/virtio: avoid potential deadlock when vsock device remove

There's a potential deadlock case when remove the vsock device or
process the RESET event:

  vsock_for_each_connected_socket:
      spin_lock_bh(&vsock_table_lock) ----------- (1)
      ...
          virtio_vsock_reset_sock:
              lock_sock(sk) --------------------- (2)
      ...
      spin_unlock_bh(&vsock_table_lock)

lock_sock() may do initiative schedule when the 'sk' is owned by
other thread at the same time, we would receivce a warning message
that "scheduling while atomic".

Even worse, if the next task (selected by the scheduler) try to
release a 'sk', it need to request vsock_table_lock and the deadlock
occur, cause the system into softlockup state.
  Call trace:
   queued_spin_lock_slowpath
   vsock_remove_bound
   vsock_remove_sock
   virtio_transport_release
   __vsock_release
   vsock_release
   __sock_release
   sock_close
   __fput
   ____fput

So we should not require sk_lock in this case, just like the behavior
in vhost_vsock or vmci.

Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko")
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210812053056.1699-1-longpeng2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm...
Linus Torvalds [Thu, 12 Aug 2021 17:20:16 +0000 (07:20 -1000)]
Merge branch 'for-v5.14' of git://git./linux/kernel/git/ebiederm/user-namespace

Pull ucounts fix from Eric Biederman:
 "This fixes the ucount sysctls on big endian architectures.

  The counts were expanded to be longs instead of ints, and the sysctl
  code was overlooked, so only the low 32bit were being processed. On
  litte endian just processing the low 32bits is fine, but on 64bit big
  endian processing just the low 32bits results in the high order bits
  instead of the low order bits being processed and nothing works
  proper.

  This change took a little bit to mature as we have the SYSCTL_ZERO,
  and SYSCTL_INT_MAX macros that are only usable for sysctls operating
  on ints, but unfortunately are not obviously broken. Which resulted in
  the versions of this change working on big endian and not on little
  endian, because the int SYSCTL_ZERO when extended 64bit wound up being
  0x100000000. So we only allowed values greater than 0x100000000 and
  less than 0faff. Which unfortunately broken everything that tried to
  set the sysctls. (First reported with the windows subsystem for
  linux).

  I have tested this on x86_64 64bit after first reproducing the
  problems with the earlier version of this change, and then verifying
  the problems do not exist when we use appropriate long min and max
  values for extra1 and extra2"

* 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: add missing data type changes

3 years agoMerge tag 'sound-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 12 Aug 2021 17:06:40 +0000 (07:06 -1000)]
Merge tag 'sound-5.14-rc6' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This seems to be a usual bump in the middle, containing lots of
  pending ASoC fixes:

   - Yet another PCM mmap regression fix

   - Fix for ASoC DAPM prefix handling

   - Various cs42l42 codec fixes

   - PCM buffer reference fixes in a few ASoC drivers

   - Fixes for ASoC SOF, AMD, tlv320, WM

   - HD-audio quirks"

* tag 'sound-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits)
  ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC
  ALSA: pcm: Fix mmap breakage without explicit buffer setup
  ALSA: hda: Add quirk for ASUS Flow x13
  ASoC: cs42l42: Fix mono playback
  ASoC: cs42l42: Constrain sample rate to prevent illegal SCLK
  ASoC: cs42l42: Fix LRCLK frame start edge
  ASoC: cs42l42: PLL must be running when changing MCLK_SRC_SEL
  ASoC: cs42l42: Remove duplicate control for WNF filter frequency
  ASoC: cs42l42: Fix inversion of ADC Notch Switch control
  ASoC: SOF: Intel: hda-ipc: fix reply size checking
  ASoC: SOF: Intel: Kconfig: fix SoundWire dependencies
  ASoC: amd: Fix reference to PCM buffer address
  ASoC: nau8824: Fix open coded prefix handling
  ASoC: kirkwood: Fix reference to PCM buffer address
  ASoC: uniphier: Fix reference to PCM buffer address
  ASoC: xilinx: Fix reference to PCM buffer address
  ASoC: intel: atom: Fix reference to PCM buffer address
  ASoC: cs42l42: Fix bclk calculation for mono
  ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J
  ASoC: cs42l42: Correct definition of ADC Volume control
  ...

3 years agowwan: core: Avoid returning NULL from wwan_create_dev()
Andy Shevchenko [Wed, 11 Aug 2021 12:48:45 +0000 (15:48 +0300)]
wwan: core: Avoid returning NULL from wwan_create_dev()

Make wwan_create_dev() to return either valid or error pointer,
In some cases it may return NULL. Prevent this by converting
it to the respective error pointer.

Fixes: 9a44c1cc6388 ("net: Add a WWAN subsystem")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/20210811124845.10955-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'mlx5-updates-2021-08-11' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Thu, 12 Aug 2021 11:45:41 +0000 (12:45 +0100)]
Merge tag 'mlx5-updates-2021-08-11' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 updates 2021-08-11

This series provides misc updates to mlx5.
For more information please see tag log below.

Please pull and let me know if there is any problem.

mlx5-updates-2021-08-11

Misc. cleanup for mlx5.

1) Typos and use of netdev_warn()
2) smatch cleanup
3) Minor fix to inner TTC table creation
4) Dynamic capability cache allocation
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dsa-cross-chip-notifiers'
David S. Miller [Thu, 12 Aug 2021 10:46:21 +0000 (11:46 +0100)]
Merge branch 'dsa-cross-chip-notifiers'

Vladimir Oltean says:

====================
Improvements to the DSA tag_8021q cross-chip notifiers

This series improves cross-chip notifier error messages and addresses a
benign error message seen during reboot on a system with disjoint DSA
trees.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: tag_8021q: don't broadcast during setup/teardown
Vladimir Oltean [Wed, 11 Aug 2021 13:46:06 +0000 (16:46 +0300)]
net: dsa: tag_8021q: don't broadcast during setup/teardown

Currently, on my board with multiple sja1105 switches in disjoint trees
described in commit f66a6a69f97a ("net: dsa: permit cross-chip bridging
between all trees in the system"), rebooting the board triggers the
following benign warnings:

[   12.345566] sja1105 spi2.0: port 0 failed to notify tag_8021q VLAN 1088 deletion: -ENOENT
[   12.353804] sja1105 spi2.0: port 0 failed to notify tag_8021q VLAN 2112 deletion: -ENOENT
[   12.362019] sja1105 spi2.0: port 1 failed to notify tag_8021q VLAN 1089 deletion: -ENOENT
[   12.370246] sja1105 spi2.0: port 1 failed to notify tag_8021q VLAN 2113 deletion: -ENOENT
[   12.378466] sja1105 spi2.0: port 2 failed to notify tag_8021q VLAN 1090 deletion: -ENOENT
[   12.386683] sja1105 spi2.0: port 2 failed to notify tag_8021q VLAN 2114 deletion: -ENOENT

Basically switch 1 calls dsa_tag_8021q_unregister, and switch 1's TX and
RX VLANs cannot be found on switch 2's CPU port.

But why would switch 2 even attempt to delete switch 1's TX and RX
tag_8021q VLANs from its CPU port? Well, because we use dsa_broadcast,
and it is supposed that it had added those VLANs in the first place
(because in dsa_port_tag_8021q_vlan_match, all CPU ports match
regardless of their tree index or switch index).

The two trees probe asynchronously, and when switch 1 probed, it called
dsa_broadcast which did not notify the tree of switch 2, because that
didn't probe yet. But during unbind, switch 2's tree _is_ probed, so it
_is_ notified of the deletion.

Before jumping to introduce a synchronization mechanism between the
probing across disjoint switch trees, let's take a step back and see
whether we _need_ to do that in the first place.

The RX and TX VLANs of switch 1 would be needed on switch 2's CPU port
only if switch 1 and 2 were part of a cross-chip bridge. And
dsa_tag_8021q_bridge_join takes care precisely of that (but if probing
was synchronous, the bridge_join would just end up bumping the VLANs'
refcount, because they are already installed by the setup path).

Since by the time the ports are bridged, all DSA trees are already set
up, and we don't need the tag_8021q VLANs of one switch installed on the
other switches during probe time, the answer is that we don't need to
fix the synchronization issue.

So make the setup and teardown code paths call dsa_port_notify, which
notifies only the local tree, and the bridge code paths call
dsa_broadcast, which let the other trees know as well.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: print more information when a cross-chip notifier fails
Vladimir Oltean [Wed, 11 Aug 2021 13:46:05 +0000 (16:46 +0300)]
net: dsa: print more information when a cross-chip notifier fails

Currently this error message does not say a lot:

[   32.693498] DSA: failed to notify tag_8021q VLAN deletion: -ENOENT
[   32.699725] DSA: failed to notify tag_8021q VLAN deletion: -ENOENT
[   32.705931] DSA: failed to notify tag_8021q VLAN deletion: -ENOENT
[   32.712139] DSA: failed to notify tag_8021q VLAN deletion: -ENOENT
[   32.718347] DSA: failed to notify tag_8021q VLAN deletion: -ENOENT
[   32.724554] DSA: failed to notify tag_8021q VLAN deletion: -ENOENT

but in this form, it is immediately obvious (at least to me) what the
problem is, even without further looking at the code:

[   12.345566] sja1105 spi2.0: port 0 failed to notify tag_8021q VLAN 1088 deletion: -ENOENT
[   12.353804] sja1105 spi2.0: port 0 failed to notify tag_8021q VLAN 2112 deletion: -ENOENT
[   12.362019] sja1105 spi2.0: port 1 failed to notify tag_8021q VLAN 1089 deletion: -ENOENT
[   12.370246] sja1105 spi2.0: port 1 failed to notify tag_8021q VLAN 2113 deletion: -ENOENT
[   12.378466] sja1105 spi2.0: port 2 failed to notify tag_8021q VLAN 1090 deletion: -ENOENT
[   12.386683] sja1105 spi2.0: port 2 failed to notify tag_8021q VLAN 2114 deletion: -ENOENT

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodrm/doc/rfc: drop lmem uapi section
Daniel Vetter [Tue, 10 Aug 2021 14:27:48 +0000 (16:27 +0200)]
drm/doc/rfc: drop lmem uapi section

We still have quite a bit more work to do with overall reworking of
the ttm-based dg1 code, but the uapi stuff is now finalized with the
latest pull. So remove that.

This also fixes kerneldoc build warnings because we've included the
same headers in two places, resulting in sphinx complaining about
duplicated symbols. This regression has been created when we moved the
uapi definitions to the real include/uapi/ folder in 727ecd99a4c9
("drm/doc/rfc: drop the i915_gem_lmem.h header")

v2: Fix a few references that I missed, the htmldocs build took
forever.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by Stephen Rothwell <sfr@canb.auug.org.au> (v1)
References: https://lore.kernel.org/dri-devel/20210603193242.1ce99344@canb.auug.org.au/
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 727ecd99a4c9 ("drm/doc/rfc: drop the i915_gem_lmem.h header")
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810142748.1983271-1-daniel.vetter@ffwll.ch
(cherry picked from commit dae2d28832968751f7731336b560a4a84a197b76)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agodrm/i915: Only access SFC_DONE when media domain is not fused off
Matt Roper [Fri, 6 Aug 2021 17:41:30 +0000 (10:41 -0700)]
drm/i915: Only access SFC_DONE when media domain is not fused off

The SFC_DONE register lives within the corresponding VD0/VD2/VD4/VD6
forcewake domain and is not accessible if the vdbox in that domain is
fused off and the forcewake is not initialized.

This mistake went unnoticed because until recently we were using the
wrong register offset for the SFC_DONE register; once the register
offset was corrected, we started hitting errors like

  <4> [544.989065] i915 0000:cc:00.0: Uninitialized forcewake domain(s) 0x80 accessed at 0x1ce000

on parts with fused-off vdbox engines.

Fixes: e50dbdbfd9fb ("drm/i915/tgl: Add SFC instdone to error state")
Fixes: 9c9c6d0ab08a ("drm/i915: Correct SFC_DONE register offset")
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210806174130.1058960-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit c5589bb5dccb0c5cb74910da93663f489589f3ce)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Changed Fixes tag to match the cherry-picked 82929a2140eb]

3 years agowwan: core: Unshadow error code returned by ida_alloc_range()
Andy Shevchenko [Wed, 11 Aug 2021 13:39:32 +0000 (16:39 +0300)]
wwan: core: Unshadow error code returned by ida_alloc_range()

ida_alloc_range() may return other than -ENOMEM error code.
Unshadow it in the wwan_create_port().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodrm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg
Ankit Nautiyal [Wed, 11 Aug 2021 05:18:57 +0000 (10:48 +0530)]
drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg

Till DISPLAY12 the PIPE_MISC bits 5-7 are used to set the
Dithering BPC, with valid values of 6, 8, 10 BPC.
For ADLP+ these bits are used to set the PORT OUTPUT BPC, with valid
values of: 6, 8, 10, 12 BPC, and need to be programmed whether
dithering is enabled or not.

This patch:
-corrects the bits 5-7 for PIPE MISC register for 12 BPC.
-renames the bits and mask to have generic names for these bits for
dithering bpc and port output bpc.

v3: Added a note for MIPI DSI which uses the PIPE_MISC for readout
for pipe_bpp. (Uma Shankar)

v2: Added 'display' to the subject and fixes tag. (Uma Shankar)

Fixes: 756f85cffef2 ("drm/i915/bdw: Broadwell has PIPEMISC")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> (v1)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210811051857.109723-1-ankit.k.nautiyal@intel.com
(cherry picked from commit 70418a68713c13da3f36c388087d0220b456a430)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 years agonet: dsa: sja1105: unregister the MDIO buses during teardown
Vladimir Oltean [Wed, 11 Aug 2021 11:59:45 +0000 (14:59 +0300)]
net: dsa: sja1105: unregister the MDIO buses during teardown

The call to sja1105_mdiobus_unregister is present in the error path but
absent from the main driver unbind path.

Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: mt7530: fix VLAN traffic leaks again
DENG Qingfang [Wed, 11 Aug 2021 09:50:43 +0000 (17:50 +0800)]
net: dsa: mt7530: fix VLAN traffic leaks again

When a port leaves a VLAN-aware bridge, the current code does not clear
other ports' matrix field bit. If the bridge is later set to VLAN-unaware
mode, traffic in the bridge may leak to that port.

Remove the VLAN filtering check in mt7530_port_bridge_leave.

Fixes: 474a2ddaa192 ("net: dsa: mt7530: fix VLAN traffic leaks")
Fixes: 83163f7dca56 ("net: dsa: mediatek: add VLAN support for MT7530")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: nxp-tja11xx: log critical health state
Oleksij Rempel [Wed, 11 Aug 2021 06:37:12 +0000 (08:37 +0200)]
net: phy: nxp-tja11xx: log critical health state

TJA1102 provides interrupt notification for the critical health states
like overtemperature and undervoltage.

The overtemperature bit is set if package temperature is beyond 155C°.
This functionality was tested by heating the package up to 200C°

The undervoltage bit is set if supply voltage drops beyond some critical
threshold. Currently not tested.

In a typical use case, both of this events should be logged and stored
(or send to some remote system) for further investigations.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'pktgen-imix'
David S. Miller [Thu, 12 Aug 2021 09:50:33 +0000 (10:50 +0100)]
Merge branch 'pktgen-imix'

Nick Richardson says:

====================
pktgen: Add IMIX mode

Adds internet mix (IMIX) mode to pktgen. Internet mix is
included in many user-space network perf testing tools. It allows
for the user to specify a distribution of discrete packet sizes to be
generated. This type of test is common among vendors when perf testing
their devices.
link: https://datatracker.ietf.org/doc/html/rfc2544#section-9.1]
This allows users to get a
more complete picture of how their device will perform in the
real-world.

This feature adds a command that allows users to specify an imix
distribution in the following format:
  imix_weights size_1,weight_1 size_2,weight_2 ... size_n,weight_n

The distribution of packets with size_i will be
(weight_i / total_weights) where
total_weights = weight_1 + weight_2 + ... + weight_n

For example:
  imix_weights 40,7 576,4 1500,1

The pkt_size "40" will account for 7 / (7 + 4 + 1) = ~58% of the total
packets sent.

This patch was tested with the following:
1. imix_weights = 40,7 576,4 1500,1
2. imix_weights = 0,7 576,4 1500,1
  - Packet size of 0 is resized to the minimum, 42
3. imix_weights = 40,7 576,4 1500,1 count = 0
  - Zero count.
  - Runs until user stops pktgen.
Invalid Configurations
1. clone_skb = 200 imix_weights = 40,7 576,4 1500,1
    - Returns error code -524 (-ENOTSUPP) when setting imix_weights
2. len(imix_weights) > MAX_IMIX_ENTRIES
    - Returns -7 (-E2BIG)

This patch is split into three parts, each provide different aspects of
required functionality:
  1. Parse internet mix input.
  2. Add IMIX Distribution representation.
  3. Process and output IMIX results.

Changes in v2:
* Remove __ prefix outside of uAPI.
* Use seq_puts instead of seq_printf where necessary.
* Reorder variable declaration.
* Return -EINVAL instead of -ENOTSUPP when using IMIX with clone_skb > 0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopktgen: Add output for imix results
Nick Richardson [Tue, 10 Aug 2021 19:01:55 +0000 (19:01 +0000)]
pktgen: Add output for imix results

The bps for imix mode is calculated by:
sum(imix_entry.size) / time_elapsed

The actual counts of each imix_entry are displayed under the
"Current:" section of the interface output in the following format:
imix_size_counts: size_1,count_1 size_2,count_2 ... size_n,count_n

Example (count = 200000):
imix_weights: 256,1 859,3 205,2
imix_size_counts: 256,32082 859,99796 205,68122
Result: OK: 17992362(c17964678+d27684) usec, 200000 (859byte,0frags)
  11115pps 47Mb/sec (47977140bps) errors: 0

Summary of changes:
Calculate bps based on imix counters when in IMIX mode.
Add output for IMIX counters.

Signed-off-by: Nick Richardson <richardsonnick@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopktgen: Add imix distribution bins
Nick Richardson [Tue, 10 Aug 2021 19:01:54 +0000 (19:01 +0000)]
pktgen: Add imix distribution bins

In order to represent the distribution of imix packet sizes, a
pre-computed data structure is used. It features 100 (IMIX_PRECISION)
"bins". Contiguous ranges of these bins represent the respective
packet size of each imix entry. This is done to avoid the overhead of
selecting the correct imix packet size based on the corresponding weights.

Example:
imix_weights 40,7 576,4 1500,1
total_weight = 7 + 4 + 1 = 12

pkt_size 40 occurs 7/total_weight = 58% of the time
pkt_size 576 occurs 4/total_weight = 33% of the time
pkt_size 1500 occurs 1/total_weight = 9% of the time

We generate a random number between 0-100 and select the corresponding
packet size based on the specified weights.
Eg. random number = 358723895 % 100 = 65
Selects the packet size corresponding to index:65 in the pre-computed
imix_distribution array.
An example of the  pre-computed array is below:

The imix_distribution will look like the following:
0        ->  0 (index of imix_entry.size == 40)
1        ->  0 (index of imix_entry.size == 40)
2        ->  0 (index of imix_entry.size == 40)
[...]    ->  0 (index of imix_entry.size == 40)
57       ->  0 (index of imix_entry.size == 40)
58       ->  1 (index of imix_entry.size == 576)
[...]    ->  1 (index of imix_entry.size == 576)
90       ->  1 (index of imix_entry.size == 576)
91       ->  2 (index of imix_entry.size == 1500)
[...]    ->  2 (index of imix_entry.size == 1500)
99       ->  2 (index of imix_entry.size == 1500)

Create and use "bin" representation of the imix distribution.

Signed-off-by: Nick Richardson <richardsonnick@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopktgen: Parse internet mix (imix) input
Nick Richardson [Tue, 10 Aug 2021 19:01:53 +0000 (19:01 +0000)]
pktgen: Parse internet mix (imix) input

Adds "imix_weights" command for specifying internet mix distribution.

The command is in this format:
"imix_weights size_1,weight_1 size_2,weight_2 ... size_n,weight_n"
where the probability that packet size_i is picked is:
weight_i / (weight_1 + weight_2 + .. + weight_n)

The user may provide up to 100 imix entries (size_i,weight_i) in this
command.

The user specified imix entries will be displayed in the "Params"
section of the interface output.

Values for clone_skb > 0 is not supported in IMIX mode.

Summary of changes:
Add flag for enabling internet mix mode.
Add command (imix_weights) for internet mix input.
Return -ENOTSUPP when clone_skb > 0 in IMIX mode.
Display imix_weights in Params.
Create data structures to store imix entries and distribution.

Signed-off-by: Nick Richardson <richardsonnick@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoRevert "tipc: Return the correct errno code"
Hoang Le [Wed, 11 Aug 2021 01:22:09 +0000 (08:22 +0700)]
Revert "tipc: Return the correct errno code"

This reverts commit 0efea3c649f0 because of:
- The returning -ENOBUF error is fine on socket buffer allocation.
- There is side effect in the calling path
tipc_node_xmit()->tipc_link_xmit() when checking error code returning.

Fixes: 0efea3c649f0 ("tipc: Return the correct errno code")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: Fix non-GPL export of regmap APIs
Mark Brown [Tue, 10 Aug 2021 12:37:48 +0000 (13:37 +0100)]
net: mscc: Fix non-GPL export of regmap APIs

The ocelot driver makes use of regmap, wrapping it with driver specific
operations that are thin wrappers around the core regmap APIs. These are
exported with EXPORT_SYMBOL, dropping the _GPL from the core regmap
exports which is frowned upon. Add _GPL suffixes to at least the APIs that
are doing register I/O.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 12 Aug 2021 06:00:55 +0000 (20:00 -1000)]
Merge tag 'orphans-v5.14-rc6' of git://git./linux/kernel/git/kees/linux

Pull orphan section linker fix from Kees Cook:

 - Handle changes to Clang's Sanitizer section layout (Nathan
   Chancellor)

* tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  vmlinux.lds.h: Handle clang's module.{c,d}tor sections

3 years agoMerge tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 12 Aug 2021 05:56:10 +0000 (19:56 -1000)]
Merge tag 'seccomp-v5.14-rc6' of git://git./linux/kernel/git/kees/linux

Pull seccomp fixes from Kees Cook:

 - Fix typo in user notification documentation (Rodrigo Campos)

 - Fix userspace counter report when using TSYNC (Hsuan-Chi Kuo, Wiktor
   Garbacz)

* tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Fix setting loaded filter count during TSYNC
  Documentation: seccomp: Fix typo in user notification

3 years agoMerge tag 'amd-drm-fixes-5.14-2021-08-11' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 12 Aug 2021 03:38:12 +0000 (13:38 +1000)]
Merge tag 'amd-drm-fixes-5.14-2021-08-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.14-2021-08-11:

amdgpu:
- Yellow carp update
- RAS EEPROM fixes
- BACO/BOCO fixes
- Fix a memory leak in an error path
- Freesync fix
- VCN harvesting fix
- Display fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812022153.4005-1-alexander.deucher@amd.com
3 years agonet: bridge: vlan: fix global vlan option range dumping
Nikolay Aleksandrov [Tue, 10 Aug 2021 09:21:39 +0000 (12:21 +0300)]
net: bridge: vlan: fix global vlan option range dumping

When global vlan options are equal sequentially we compress them in a
range to save space and reduce processing time. In order to have the
proper range end id we need to update range_end if the options are equal
otherwise we get ranges with the same end vlan id as the start.

Fixes: 743a53d9636a ("net: bridge: vlan: add support for dumping global vlan options")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210810092139.11700-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomctp: Specify route types, require rtm_type in RTM_*ROUTE messages
Jeremy Kerr [Tue, 10 Aug 2021 02:38:34 +0000 (10:38 +0800)]
mctp: Specify route types, require rtm_type in RTM_*ROUTE messages

This change adds a 'type' attribute to routes, which can be parsed from
a RTM_NEWROUTE message. This will help to distinguish local vs. peer
routes in a future change.

This means userspace will need to set a correct rtm_type in RTM_NEWROUTE
and RTM_DELROUTE messages; we currently only accept RTN_UNICAST.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20210810023834.2231088-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: igmp: increase size of mr_ifc_count
Eric Dumazet [Wed, 11 Aug 2021 19:57:15 +0000 (12:57 -0700)]
net: igmp: increase size of mr_ifc_count

Some arches support cmpxchg() on 4-byte and 8-byte only.
Increase mr_ifc_count width to 32bit to fix this problem.

Fixes: 4a2b285e7e10 ("net: igmp: fix data-race in igmp_ifc_timer_expire()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210811195715.3684218-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: hns3: add support for triggering reset by ethtool
Yufeng Mo [Tue, 10 Aug 2021 13:28:48 +0000 (21:28 +0800)]
net: hns3: add support for triggering reset by ethtool

Currently, four reset types are supported for the HNS3 ethernet
driver: IMP reset, global reset, function reset, and FLR. Only
FLR can now be triggered by the user. To restore the device when
an exception occurs, add support for triggering reset by ethtool.

Run the "ethtool --reset DEVNAME mgmt | all | dedicated" to
trigger the IMP | global | function reset manually.

In addition, VF can only trigger function reset.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/1628602128-15640-1-git-send-email-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMAINTAINERS: switch to my OMP email for Renesas Ethernet drivers
Sergey Shtylyov [Tue, 10 Aug 2021 20:17:12 +0000 (23:17 +0300)]
MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers

I'm still going to continue looking after the Renesas Ethernet drivers and
device tree bindings. Now my new employer, Open Mobile Platform (OMP), will
pay for all my upstream work. Let's switch to my OMP email for the reviews.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/9c212711-a0d7-39cd-7840-ff7abf938da1@omp.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
Neal Cardwell [Wed, 11 Aug 2021 02:40:56 +0000 (22:40 -0400)]
tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets

Currently if BBR congestion control is initialized after more than 2B
packets have been delivered, depending on the phase of the
tp->delivered counter the tracking of BBR round trips can get stuck.

The bug arises because if tp->delivered is between 2^31 and 2^32 at
the time the BBR congestion control module is initialized, then the
initialization of bbr->next_rtt_delivered to 0 will cause the logic to
believe that the end of the round trip is still billions of packets in
the future. More specifically, the following check will fail
repeatedly:

  !before(rs->prior_delivered, bbr->next_rtt_delivered)

and thus the connection will take up to 2B packets delivered before
that check will pass and the connection will set:

  bbr->round_start = 1;

This could cause many mechanisms in BBR to fail to trigger, for
example bbr_check_full_bw_reached() would likely never exit STARTUP.

This bug is 5 years old and has not been observed, and as a practical
matter this would likely rarely trigger, since it would require
transferring at least 2B packets, or likely more than 3 terabytes of
data, before switching congestion control algorithms to BBR.

This patch is a stable candidate for kernels as far back as v4.9,
when tcp_bbr.c was added.

Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Kevin Yang <yyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210811024056.235161-1-ncardwell@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'bonding-cleanup-header-file-and-error-msgs'
Jakub Kicinski [Wed, 11 Aug 2021 21:57:33 +0000 (14:57 -0700)]
Merge branch 'bonding-cleanup-header-file-and-error-msgs'

Jonathan Toppins says:

====================
bonding: cleanup header file and error msgs

Two small patches removing unreferenced symbols and unifying error
messages across netlink and printk.
====================

Link: https://lore.kernel.org/r/cover.1628650079.git.jtoppins@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobonding: combine netlink and console error messages
Jonathan Toppins [Wed, 11 Aug 2021 02:53:31 +0000 (22:53 -0400)]
bonding: combine netlink and console error messages

There seems to be no reason to have different error messages between
netlink and printk. It also cleans up the function slightly.

Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobonding: remove extraneous definitions from bonding.h
Jonathan Toppins [Wed, 11 Aug 2021 02:53:30 +0000 (22:53 -0400)]
bonding: remove extraneous definitions from bonding.h

All of the symbols either only exist in bond_options.c or nowhere at
all. These symbols were verified to not exist in the code base by
using `git grep` and their removal was verified by compiling bonding.ko.

Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: pcs: xpcs: fix error handling on failed to allocate memory
Wong Vee Khee [Tue, 10 Aug 2021 08:58:12 +0000 (16:58 +0800)]
net: pcs: xpcs: fix error handling on failed to allocate memory

Drivers such as sja1105 and stmmac that call xpcs_create() expects an
error returned by the pcs-xpcs module, but this was not the case on
failed to allocate memory.

Fixed this by returning an -ENOMEM instead of a NULL pointer.

Fixes: 3ad1d171548e ("net: dsa: sja1105: migrate to xpcs for SGMII")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210810085812.1808466-1-vee.khee.wong@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: mscc: Fix non-GPL export of regmap APIs
Mark Brown [Tue, 10 Aug 2021 12:37:48 +0000 (13:37 +0100)]
net: mscc: Fix non-GPL export of regmap APIs

The ocelot driver makes use of regmap, wrapping it with driver specific
operations that are thin wrappers around the core regmap APIs. These are
exported with EXPORT_SYMBOL, dropping the _GPL from the core regmap
exports which is frowned upon. Add _GPL suffixes to at least the APIs that
are doing register I/O.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210810123748.47871-1-broonie@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: linkwatch: fix failure to restore device state across suspend/resume
Willy Tarreau [Mon, 9 Aug 2021 16:06:28 +0000 (18:06 +0200)]
net: linkwatch: fix failure to restore device state across suspend/resume

After migrating my laptop from 4.19-LTS to 5.4-LTS a while ago I noticed
that my Ethernet port to which a bond and a VLAN interface are attached
appeared to remain up after resuming from suspend with the cable unplugged
(and that problem still persists with 5.10-LTS).

It happens that the following happens:

  - the network driver (e1000e here) prepares to suspend, calls e1000e_down()
    which calls netif_carrier_off() to signal that the link is going down.
  - netif_carrier_off() adds a link_watch event to the list of events for
    this device
  - the device is completely stopped.
  - the machine suspends
  - the cable is unplugged and the machine brought to another location
  - the machine is resumed
  - the queued linkwatch events are processed for the device
  - the device doesn't yet have the __LINK_STATE_PRESENT bit and its events
    are silently dropped
  - the device is resumed with its link down
  - the upper VLAN and bond interfaces are never notified that the link had
    been turned down and remain up
  - the only way to provoke a change is to physically connect the machine
    to a port and possibly unplug it.

The state after resume looks like this:
  $ ip -br li | egrep 'bond|eth'
  bond0            UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP>
  eth0             DOWN           e8:6a:64:64:64:64 <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP>
  eth0.2@eth0      UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>

Placing an explicit call to netdev_state_change() either in the suspend
or the resume code in the NIC driver worked around this but the solution
is not satisfying.

The issue in fact really is in link_watch that loses events while it
ought not to. It happens that the test for the device being present was
added by commit 124eee3f6955 ("net: linkwatch: add check for netdevice
being present to linkwatch_do_dev") in 4.20 to avoid an access to
devices that are not present.

Instead of dropping events, this patch proceeds slightly differently by
postponing their handling so that they happen after the device is fully
resumed.

Fixes: 124eee3f6955 ("net: linkwatch: add check for netdevice being present to linkwatch_do_dev")
Link: https://lists.openwall.net/netdev/2018/03/15/62
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20210809160628.22623-1-w@1wt.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agovmlinux.lds.h: Handle clang's module.{c,d}tor sections
Nathan Chancellor [Sat, 31 Jul 2021 02:31:08 +0000 (19:31 -0700)]
vmlinux.lds.h: Handle clang's module.{c,d}tor sections

A recent change in LLVM causes module_{c,d}tor sections to appear when
CONFIG_K{A,C}SAN are enabled, which results in orphan section warnings
because these are not handled anywhere:

ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_ctor) is being placed in '.text.asan.module_ctor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_dtor) is being placed in '.text.asan.module_dtor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.tsan.module_ctor) is being placed in '.text.tsan.module_ctor'

Fangrui explains: "the function asan.module_ctor has the SHF_GNU_RETAIN
flag, so it is in a separate section even with -fno-function-sections
(default)".

Place them in the TEXT_TEXT section so that these technologies continue
to work with the newer compiler versions. All of the KASAN and KCSAN
KUnit tests continue to pass after this change.

Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1432
Link: https://github.com/llvm/llvm-project/commit/7b789562244ee941b7bf2cefeb3fc08a59a01865
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Fangrui Song <maskray@google.com>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210731023107.1932981-1-nathan@kernel.org
3 years agoseccomp: Fix setting loaded filter count during TSYNC
Hsuan-Chi Kuo [Thu, 4 Mar 2021 23:37:08 +0000 (17:37 -0600)]
seccomp: Fix setting loaded filter count during TSYNC

The desired behavior is to set the caller's filter count to thread's.
This value is reported via /proc, so this fixes the inaccurate count
exposed to userspace; it is not used for reference counting, etc.

Signed-off-by: Hsuan-Chi Kuo <hsuanchikuo@gmail.com>
Link: https://lore.kernel.org/r/20210304233708.420597-1-hsuanchikuo@gmail.com
Co-developed-by: Wiktor Garbacz <wiktorg@google.com>
Signed-off-by: Wiktor Garbacz <wiktorg@google.com>
Link: https://lore.kernel.org/lkml/20210810125158.329849-1-wiktorg@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status")

3 years agonet/mlx5e: Make use of netdev_warn()
Cai Huoqing [Tue, 10 Aug 2021 02:08:22 +0000 (10:08 +0800)]
net/mlx5e: Make use of netdev_warn()

to replace printk(KERN_WARNING ...) with netdev_warn() kindly

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Fix variable type to match 64bit
Eran Ben Elisha [Tue, 10 Aug 2021 18:15:05 +0000 (21:15 +0300)]
net/mlx5: Fix variable type to match 64bit

Fix the following smatch warning:
wait_func_handle_exec_timeout() warn: should '1 << ent->idx' be a 64 bit type?

Use 1ULL, to have a 64 bit type variable.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Initialize numa node for all core devices
Parav Pandit [Wed, 16 Jun 2021 19:23:23 +0000 (22:23 +0300)]
net/mlx5: Initialize numa node for all core devices

Subsequent patches make use of numa node affinity for memory
allocations. Initialize it for PCI PF, VF and SF devices.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Allocate individual capability
Parav Pandit [Tue, 13 Jul 2021 11:17:03 +0000 (14:17 +0300)]
net/mlx5: Allocate individual capability

Currently mlx5_core_dev contains array of capabilities. It contains 19
valid capabilities of the device, 2 reserved entries and 12 holes.
Due to this for 14 unused entries, mlx5_core_dev allocates 14 * 8K = 112K
bytes of memory which is never used. Due to this mlx5_core_dev structure
size is 270Kbytes odd. This allocation further aligns to next power of 2
to 512Kbytes.

By skipping non-existent entries,
(a) 112Kbyte is saved,
(b) mlx5_core_dev reduces to 8KB with alignment
(c) 350KB saved in alignment

In future individual capability allocation can be used to skip its
allocation when such capability is disabled at the device level. This
patch prepares mlx5_core_dev to hold capability using a pointer instead
of inline array.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Reorganize current and maximal capabilities to be per-type
Parav Pandit [Tue, 13 Jul 2021 09:36:05 +0000 (12:36 +0300)]
net/mlx5: Reorganize current and maximal capabilities to be per-type

In the current code, the current and maximal capabilities are
maintained in separate arrays which are both per type. In order to
allow the creation of such a basic structure as a dynamically
allocated array, we move curr and max fields to a unified
structure so that specific capabilities can be allocated as one unit.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, use recent sysfs api
Parav Pandit [Tue, 18 May 2021 05:50:04 +0000 (08:50 +0300)]
net/mlx5: SF, use recent sysfs api

Use sysfs_emit() which is aware of PAGE_SIZE buffer.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Refcount mlx5_irq with integer
Shay Drory [Tue, 22 Jun 2021 11:20:16 +0000 (14:20 +0300)]
net/mlx5: Refcount mlx5_irq with integer

Currently, all access to mlx5 IRQs are done undere a lock. Hance, there
isn't a reason to have kref in struct mlx5_irq.
Switch it to integer.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Change SF missing dedicated MSI-X err message to dbg
Shay Drory [Tue, 29 Jun 2021 11:47:30 +0000 (14:47 +0300)]
net/mlx5: Change SF missing dedicated MSI-X err message to dbg

When MSI-X vectors allocated are not enough for SFs to have dedicated,
MSI-X, kernel log buffer has too many entries.
Hence only enable such log with debug level.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Align mlx5_irq structure
Shay Drory [Wed, 16 Jun 2021 15:58:26 +0000 (18:58 +0300)]
net/mlx5: Align mlx5_irq structure

mlx5_irq structure have holes due to incorrect position of fields in it.
Make them naturally align.

pahole output after alignment:
struct mlx5_irq {
        struct atomic_notifier_head nh;                  /*     0    72 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        cpumask_var_t              mask;                 /*    72     8 */
        char                       name[32];             /*    80    32 */
        struct mlx5_irq_pool *     pool;                 /*   112     8 */
        struct kref                kref;                 /*   120     4 */
        u32                        index;                /*   124     4 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        int                        irqn;                 /*   128     4 */

        /* size: 136, cachelines: 3, members: 7 */
        /* padding: 4 */
        /* last cacheline: 8 bytes */

};

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Delete impossible dev->state checks
Leon Romanovsky [Sun, 1 Aug 2021 08:37:57 +0000 (11:37 +0300)]
net/mlx5: Delete impossible dev->state checks

New mlx5_core device structure is allocated through devlink_alloc
with\ kzalloc and that ensures that all fields are equal to zero
and it includes ->state too.

That means that checks of that field in the mlx5_init_one() is
completely redundant, because that function is called only once
in the begging of mlx5_core_dev lifetime.

PCI:
 .probe()
  -> probe_one()
   -> mlx5_init_one()

The recovery flow can't run at that time or before it, because relevant
work initialized later in mlx5_init_once().

Such initialization flow ensures that dev->state can't be
MLX5_DEVICE_STATE_UNINITIALIZED at all, so remove such impossible
checks.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Fix inner TTC table creation
Maor Gottlieb [Mon, 9 Aug 2021 12:12:45 +0000 (15:12 +0300)]
net/mlx5: Fix inner TTC table creation

Fix typo of the cited commit that calls to mlx5_create_ttc_table, instead
of mlx5_create_inner_ttc_table.

Fixes: f4b45940e9b9 ("net/mlx5: Embed mlx5_ttc_table")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Fix typo in comments
Cai Huoqing [Fri, 30 Jul 2021 03:03:00 +0000 (11:03 +0800)]
net/mlx5: Fix typo in comments

Fix typo:
*vectores  ==> vectors
*realeased  ==> released
*erros  ==> errors
*namepsace  ==> namespace
*trafic  ==> traffic
*proccessed  ==> processed
*retore  ==> restore
*Currenlty  ==> Currently
*crated  ==> created
*chane  ==> change
*cannnot  ==> cannot
*usuallly  ==> usually
*failes  ==> fails
*importent  ==> important
*reenabled  ==> re-enabled
*alocation  ==> allocation
*recived  ==> received
*tanslation  ==> translation

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agoMerge branch 'dsa-tagger-helpers'
David S. Miller [Wed, 11 Aug 2021 13:44:59 +0000 (14:44 +0100)]
Merge branch 'dsa-tagger-helpers'

Vladimir Oltean says:

====================
DSA tagger helpers

The goal of this series is to minimize the use of memmove and skb->data
in the DSA tagging protocol drivers. Unfiltered access to this level of
information is not very friendly to drive-by contributors, and sometimes
is also not the easiest to review.

For starters, I have converted the most common form of DSA tagging
protocols: the DSA headers which are placed where the EtherType is.

The helper functions introduced by this series are:
- dsa_alloc_etype_header
- dsa_strip_etype_header
- dsa_etype_header_pos_rx
- dsa_etype_header_pos_tx

This series is just a resend as non-RFC of v1.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: create a helper for locating EtherType DSA headers on TX
Vladimir Oltean [Tue, 10 Aug 2021 13:13:56 +0000 (16:13 +0300)]
net: dsa: create a helper for locating EtherType DSA headers on TX

Create a similar helper for locating the offset to the DSA header
relative to skb->data, and make the existing EtherType header taggers to
use it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: create a helper for locating EtherType DSA headers on RX
Vladimir Oltean [Tue, 10 Aug 2021 13:13:55 +0000 (16:13 +0300)]
net: dsa: create a helper for locating EtherType DSA headers on RX

It seems that protocol tagging driver writers are always surprised about
the formula they use to reach their EtherType header on RX, which
becomes apparent from the fact that there are comments in multiple
drivers that mention the same information.

Create a helper that returns a void pointer to skb->data - 2, as well as
centralize the explanation why that is the case.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: create a helper which allocates space for EtherType DSA headers
Vladimir Oltean [Tue, 10 Aug 2021 13:13:54 +0000 (16:13 +0300)]
net: dsa: create a helper which allocates space for EtherType DSA headers

Hide away the memmove used by DSA EtherType header taggers to shift the
MAC SA and DA to the left to make room for the header, after they've
called skb_push(). The call to skb_push() is still left explicit in
drivers, to be symmetric with dsa_strip_etype_header, and because not
all callers can be refactored to do it (for example, brcm_tag_xmit_ll
has common code for a pre-Ethernet DSA tag and an EtherType DSA tag).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: create a helper that strips EtherType DSA headers on RX
Vladimir Oltean [Tue, 10 Aug 2021 13:13:53 +0000 (16:13 +0300)]
net: dsa: create a helper that strips EtherType DSA headers on RX

All header taggers open-code a memmove that is fairly not all that
obvious, and we can hide the details behind a helper function, since the
only thing specific to the driver is the length of the header tag.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'devlink-aux-devices'
David S. Miller [Wed, 11 Aug 2021 13:34:22 +0000 (14:34 +0100)]
Merge branch 'devlink-aux-devices'

Parav Pandit says:

====================
devlink: Control auxiliary devices

Currently, for mlx5 multi-function device, a user is not able to control
which functionality to enable/disable. For example, each PCI
PF, VF, SF function by default has netdevice, RDMA and vdpa-net
devices always enabled.

Hence, enable user to control which device functionality to enable/disable.

This is achieved by using existing devlink params [1] to
enable/disable eth, rdma and vdpa net functionality control knob.

For example user interested in only vdpa device function: performs,

$ devlink dev param set pci/0000:06:00.0 name enable_rdma value false \
                   cmode driverinit
$ devlink dev param set pci/0000:06:00.0 name enable_eth value false \
                   cmode driverinit
$ devlink dev param set pci/0000:06:00.0 name enable_vnet value true \
                   cmode driverinit

$ devlink dev reload pci/0000:06:00.0

Reload command honors parameters set, initializes the device that user
has composed using devlink dev params and resources.
Devices before reload:

            mlx5_core.sf.4
         (subfunction device)
                  /\
                 /| \
                / |  \
               /  |   \
 mlx5_core.eth.4  |  mlx5_core.rdma.4
(SF eth aux dev)  |  (SF rdma aux dev)
    |             |        |
    |             |        |
 enp6s0f0s88      |      mlx5_0
 (SF netdev)      |  (SF rdma device)
                  |
         mlx5_core.vnet.4
         (SF vnet aux dev)
                 |
                 |
        auxiliary/mlx5_core.sf.4
        (vdpa net mgmt device)

Above example reconfigures the device with only VDPA functionality.
Devices after reload:

            mlx5_core.sf.4
         (subfunction device)
                  /\
                 /  \
                /    \
               /      \
 mlx5_core.vnet.4     no eth, no rdma aux devices
 (SF vnet aux dev)

Above parameters enable user to compose the device as needed based
on the use case.

Since devlink params are done on the devlink instance, these
knobs are uniformly usable for PCI PF, VF and SF devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/mlx5: Support enable_vnet devlink dev param
Parav Pandit [Tue, 10 Aug 2021 13:24:24 +0000 (16:24 +0300)]
net/mlx5: Support enable_vnet devlink dev param

Enable user to disable VDPA net auxiliary device so that when it is not
required, user can disable it.

For example,

$ devlink dev param set pci/0000:06:00.0 \
              name enable_vnet value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device
mlx5_core.vnet.2 for the VDPA net functionality.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/mlx5: Support enable_rdma devlink dev param
Parav Pandit [Tue, 10 Aug 2021 13:24:23 +0000 (16:24 +0300)]
net/mlx5: Support enable_rdma devlink dev param

Enable user to disable RDMA auxiliary device so that when it is not
required, user can disable it.

For example,

$ devlink dev param set pci/0000:06:00.0 \
              name enable_rdma value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device
mlx5_core.rdma.2 for the RDMA functionality.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/mlx5: Support enable_eth devlink dev param
Parav Pandit [Tue, 10 Aug 2021 13:24:22 +0000 (16:24 +0300)]
net/mlx5: Support enable_eth devlink dev param

Enable user to disable Ethernet auxiliary device so that when it is not
required, user can disable it.

For example,

$ devlink dev param set pci/0000:06:00.0 \
              name enable_eth value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create mlx5_core.eth.2 auxiliary
device for the Ethernet functionality.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/mlx5: Fix unpublish devlink parameters
Parav Pandit [Tue, 10 Aug 2021 13:24:21 +0000 (16:24 +0300)]
net/mlx5: Fix unpublish devlink parameters

Cleanup routine missed to unpublish the parameters. Add it.

Fixes: e890acd5ff18 ("net/mlx5: Add devlink flow_steering_mode parameter")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Add APIs to publish, unpublish individual parameter
Parav Pandit [Tue, 10 Aug 2021 13:24:20 +0000 (16:24 +0300)]
devlink: Add APIs to publish, unpublish individual parameter

Enable drivers to publish/unpublish individual parameter.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Add API to register and unregister single parameter
Parav Pandit [Tue, 10 Aug 2021 13:24:19 +0000 (16:24 +0300)]
devlink: Add API to register and unregister single parameter

Currently device configuration parameters can be registered as an array.
Due to this a constant array must be registered. A single driver
supporting multiple devices each with different device capabilities end
up registering all parameters even if it doesn't support it.

One possible workaround a driver can do is, it registers multiple single
entry arrays to overcome such limitation.

Better is to provide a API that enables driver to register/unregister a
single parameter. This also further helps in two ways.
(1) to reduce the memory of devlink_param_entry by avoiding in registering
parameters which are not supported by the device.
(2) avoid generating multiple parameter add, delete, publish, unpublish,
init value notifications for such unsupported parameters

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Create a helper function for one parameter registration
Parav Pandit [Tue, 10 Aug 2021 13:24:18 +0000 (16:24 +0300)]
devlink: Create a helper function for one parameter registration

Create and use a helper function for one parameter registration.
Subsequent patch also will reuse this for driver facing routine to
register a single parameter.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Add new "enable_vnet" generic device param
Parav Pandit [Tue, 10 Aug 2021 13:24:17 +0000 (16:24 +0300)]
devlink: Add new "enable_vnet" generic device param

Add new device generic parameter to enable/disable creation of
VDPA net auxiliary device and associated device functionality
in the devlink instance.

User who prefers to disable such functionality can disable it using below
example.

$ devlink dev param set pci/0000:06:00.0 \
              name enable_vnet value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device for the
VDPA net functionality.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Add new "enable_rdma" generic device param
Parav Pandit [Tue, 10 Aug 2021 13:24:16 +0000 (16:24 +0300)]
devlink: Add new "enable_rdma" generic device param

Add new device generic parameter to enable/disable creation of
RDMA auxiliary device and associated device functionality
in the devlink instance.

User who prefers to disable such functionality can disable it using below
example.

$ devlink dev param set pci/0000:06:00.0 \
              name enable_rdma value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device for the
RDMA functionality.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Add new "enable_eth" generic device param
Parav Pandit [Tue, 10 Aug 2021 13:24:15 +0000 (16:24 +0300)]
devlink: Add new "enable_eth" generic device param

Add new device generic parameter to enable/disable creation of
Ethernet auxiliary device and associated device functionality
in the devlink instance.

User who prefers to disable such functionality can disable it using below
example.

$ devlink dev param set pci/0000:06:00.0 \
              name enable_eth value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0

At this point devlink instance do not create auxiliary device for the
Ethernet functionality.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'bridge-global-mcast'
David S. Miller [Wed, 11 Aug 2021 12:34:41 +0000 (13:34 +0100)]
Merge branch 'bridge-global-mcast'

Nikolay Aleksandrov says:

====================
net: bridge: vlan: add global mcast options

This is the first follow-up set after the support for per-vlan multicast
contexts which extends global vlan options to support bridge's multicast
config per-vlan, it enables user-space to change and dump the already
existing bridge vlan multicast context options. The global option patches
(01 - 09 and 12-13) follow a similar pattern of changing current mcast
functions to take multicast context instead of a port/bridge directly.
Option equality checks have been added for dumping vlan range compression.
The last 2 patches extend the mcast router dump support so it can be
re-used when dumping vlan config.

patches 01 - 09: add support for various mcast options
patches 10 - 11: prepare for per-vlan querier control
patches 12 - 13: add support for querier control and router control
patches 14 - 15: add support for dumping per-vlan router ports

Next patch-sets:
 - per-port/vlan router option config
 - iproute2 support for all new vlan options
 - selftests
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: use br_rports_fill_info() to export mcast router ports
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:33 +0000 (18:29 +0300)]
net: bridge: vlan: use br_rports_fill_info() to export mcast router ports

Embed the standard multicast router port export by br_rports_fill_info()
into a new global vlan attribute BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS.
In order to have the same format for the global bridge mcast context and
the per-vlan mcast context we need a double-nesting:
 - BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS
   - MDBA_ROUTER

Currently we don't compare router lists, if any router port exists in
the bridge mcast contexts we consider their option sets as different and
export them separately.

In addition we export the router port vlan id when dumping similar to
the router port notification format.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: mcast: use the proper multicast context when dumping router ports
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:32 +0000 (18:29 +0300)]
net: bridge: mcast: use the proper multicast context when dumping router ports

When we are dumping the router ports of a vlan mcast context we need to
use the bridge/vlan and port/vlan's multicast contexts to check if
IPv4/IPv6 router port is present and later to dump the vlan id.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast router global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:31 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast router global option

Add support to change and retrieve global vlan multicast router state
which is used for the bridge itself. We just need to pass multicast context
to br_multicast_set_router instead of bridge device and the rest of the
logic remains the same.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast querier global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:30 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast querier global option

Add support to change and retrieve global vlan multicast querier state.
We just need to pass multicast context to br_multicast_set_querier
instead of bridge device and the rest of the logic remains the same.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: mcast: querier and query state affect only current context type
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:29 +0000 (18:29 +0300)]
net: bridge: mcast: querier and query state affect only current context type

It is a minor optimization and better behaviour to make sure querier and
query sending routines affect only the matching multicast context
depending if vlan snooping is enabled (vlan ctx vs bridge ctx).
It also avoids sending unnecessary extra query packets.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: mcast: move querier state to the multicast context
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:28 +0000 (18:29 +0300)]
net: bridge: mcast: move querier state to the multicast context

We need to have the querier state per multicast context in order to have
per-vlan control, so remove the internal option bit and move it to the
multicast context. Also annotate the lockless reads of the new variable.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast startup query interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:27 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast startup query interval global option

Add support to change and retrieve global vlan multicast startup query
interval option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast query response interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:26 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast query response interval global option

Add support to change and retrieve global vlan multicast query response
interval option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast query interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:25 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast query interval global option

Add support to change and retrieve global vlan multicast query interval
option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast querier interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:24 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast querier interval global option

Add support to change and retrieve global vlan multicast querier interval
option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast membership interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:23 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast membership interval global option

Add support to change and retrieve global vlan multicast membership
interval option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast last member interval global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:22 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast last member interval global option

Add support to change and retrieve global vlan multicast last member
interval option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast startup query count global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:21 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast startup query count global option

Add support to change and retrieve global vlan multicast startup query
count option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast last member count global option
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:20 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast last member count global option

Add support to change and retrieve global vlan multicast last member
count option.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: vlan: add support for mcast igmp/mld version global options
Nikolay Aleksandrov [Tue, 10 Aug 2021 15:29:19 +0000 (18:29 +0300)]
net: bridge: vlan: add support for mcast igmp/mld version global options

Add support to change and retrieve global vlan IGMP/MLD versions.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ipa-runtime-pm'
David S. Miller [Wed, 11 Aug 2021 12:31:56 +0000 (13:31 +0100)]
Merge branch 'ipa-runtime-pm'

Alex Elder says:

====================
net: ipa: use runtime PM reference counting

This series does further rework of the IPA clock code so that we
rely on some of the core runtime power management code (including
its referencing counting) instead.

The first patch makes ipa_clock_get() act like pm_runtime_get_sync().

The second patch makes system suspend occur regardless of the
current reference count value, which is again more like how the
runtime PM core code behaves.

The third patch creates functions to encapsulate all hardware
suspend and resume activity.  The fourth uses those functions as
the ->runtime_suspend and ->runtime_resume power callbacks.  With
that in place, ipa_clock_get() and ipa_clock_put() are changed to
use runtime PM get and put functions when needed.

The fifth patch eliminates an extra clock reference previously used
to control system suspend.  The sixth eliminates the "IPA clock"
reference count and mutex.

The final patch replaces the one call to ipa_clock_get_additional()
with a call to pm_runtime_get_if_active(), making the former
unnecessary.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: kill ipa_clock_get_additional()
Alex Elder [Tue, 10 Aug 2021 19:27:04 +0000 (14:27 -0500)]
net: ipa: kill ipa_clock_get_additional()

Now that ipa_clock_get_additional() is a trivial wrapper around
pm_runtime_get_if_active(), just open-code it in its only caller
and delete the function.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: kill IPA clock reference count
Alex Elder [Tue, 10 Aug 2021 19:27:03 +0000 (14:27 -0500)]
net: ipa: kill IPA clock reference count

The runtime power management core code maintains a usage count.  This
count mirrors the IPA clock reference count, and there's no need to
maintain both.  So get rid of the IPA clock reference count and just
rely on the runtime PM usage count to determine when the hardware
should be suspended or resumed.

Use pm_runtime_get_if_active() in ipa_clock_get_additional().  We
care whether power is active, regardless of whether it's in use, so
pass true for its ign_usage_count argument.

The IPA clock mutex is just used to make enabling/disabling the
clock and updating the reference count occur atomically.  Without
the reference count, there's no need for the mutex, so get rid of
that too.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: get rid of extra clock reference
Alex Elder [Tue, 10 Aug 2021 19:27:02 +0000 (14:27 -0500)]
net: ipa: get rid of extra clock reference

Suspending the IPA hardware is now managed by the runtime PM core
code.  The ->runtime_idle callback returns a non-zero value, so it
will never suspend except when forced.  As a result, there's no need
to take an extra "do not suspend" clock reference.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: use runtime PM core
Alex Elder [Tue, 10 Aug 2021 19:27:01 +0000 (14:27 -0500)]
net: ipa: use runtime PM core

Use the runtime power management core to cause hardware suspend and
resume to occur.  Enable it in ipa_clock_init() (without autosuspend),
and disable it in ipa_clock_exit().

Use ipa_runtime_suspend() as the ->runtime_suspend power operation,
and arrange for it to be called by having ipa_clock_get() call
pm_runtime_get_sync() when the first clock reference is taken.
Similarly, use ipa_runtime_resume() as the ->runtime_resume power
operation, and pm_runtime_put() when the last IPA clock reference
is dropped.

Introduce ipa_runtime_idle() as the ->runtime_idle power operation,
and have it return a non-zero value; this way suspend will never
occur except when forced.

Use pm_runtime_force_suspend() and pm_runtime_force_resume() as the
system suspend and resume callbacks, and remove ipa_suspend() and
ipa_resume().

Store a pointer to the device structure passed to ipa_clock_init(),
so it can be used by ipa_clock_exit() to disable runtime power
management.

For now we preserve IPA clock reference counting.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>