Edward Cree [Mon, 7 Aug 2017 14:29:34 +0000 (15:29 +0100)]
selftests/bpf: add tests for subtraction & negative numbers
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:29:11 +0000 (15:29 +0100)]
selftests/bpf: don't try to access past MAX_PACKET_OFF in test_verifier
A number of selftests fell foul of the changed MAX_PACKET_OFF handling.
For instance, "direct packet access: test2" was potentially reading four
bytes from pkt + 0xffff, which could take it past the verifier's limit,
causing the program to be rejected (checks against pkt_end didn't give
us any reg->range).
Increase the shifts by one so that R2 is now mask 0x7fff instead of
mask 0xffff.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:28:45 +0000 (15:28 +0100)]
selftests/bpf: add test for bogus operations on pointers
Tests non-add/sub operations (AND, LSH) on pointers decaying them to
unknown scalars.
Also tests that a pkt_ptr add which could potentially overflow is rejected
(find_good_pkt_pointers ignores it and doesn't give us any reg->range).
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:28:00 +0000 (15:28 +0100)]
selftests/bpf: add a test to test_align
New test adds 14 to the unknown value before adding to the packet pointer,
meaning there's no 'fixed offset' field and instead we add into the
var_off, yielding a '4n+2' value.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:27:34 +0000 (15:27 +0100)]
selftests/bpf: rewrite test_align
Expectations have changed, as has the format of the logged state.
To make the tests easier to read, add a line-matching framework so that
each match need only quote the register it cares about. (Multiple
matches may refer to the same line, but matches must be listed in
order of increasing line.)
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:27:12 +0000 (15:27 +0100)]
selftests/bpf: change test_verifier expectations
Some of the verifier's error messages have changed, and some constructs
that previously couldn't be verified are now accepted.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:26:56 +0000 (15:26 +0100)]
bpf/verifier: more concise register state logs for constant var_off
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:26:36 +0000 (15:26 +0100)]
bpf/verifier: track signed and unsigned min/max values
Allows us to, sometimes, combine information from a signed check of one
bound and an unsigned check of the other.
We now track the full range of possible values, rather than restricting
ourselves to [0, 1<<30) and considering anything beyond that as
unknown. While this is probably not necessary, it makes the code more
straightforward and symmetrical between signed and unsigned bounds.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 7 Aug 2017 14:26:19 +0000 (15:26 +0100)]
bpf/verifier: rework value tracking
Unifies adjusted and unadjusted register value types (e.g. FRAME_POINTER is
now just a PTR_TO_STACK with zero offset).
Tracks value alignment by means of tracking known & unknown bits. This
also replaces the 'reg->imm' (leading zero bits) calculations for (what
were) UNKNOWN_VALUEs.
If pointer leaks are allowed, and adjust_ptr_min_max_vals returns -EACCES,
treat the pointer as an unknown scalar and try again, because we might be
able to conclude something about the result (e.g. pointer & 0x40 is either
0 or 0x40).
Verifier hooks in the netronome/nfp driver were changed to match the new
data structures.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Edwards [Wed, 28 Jun 2017 15:22:25 +0000 (09:22 -0600)]
igb: expose mailbox unlock method
Add a mailbox unlock method to e1000_mbx_operations, which will be used
to unlock the PF/VF mailbox by the PF.
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Edwards [Wed, 28 Jun 2017 15:22:24 +0000 (09:22 -0600)]
igb: add argument names to mailbox op function declarations
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Arvind Yadav [Tue, 8 Aug 2017 15:58:41 +0000 (21:28 +0530)]
net: usb: rtl8150: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:58:05 +0000 (21:28 +0530)]
net: usb: r8152: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:58:04 +0000 (21:28 +0530)]
net: usb: kaweth: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:58:03 +0000 (21:28 +0530)]
net: usb: ipheth: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:58:02 +0000 (21:28 +0530)]
net: usb: cdc-phonet: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:58:01 +0000 (21:28 +0530)]
net: usb: catc: constify usb_device_id and fix space before '[' error
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Fix checkpatch.pl error:
ERROR: space prohibited before open square bracket '['.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:56:45 +0000 (21:26 +0530)]
net: irda: stir4200: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:56:44 +0000 (21:26 +0530)]
net: irda: mcs7780: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:56:43 +0000 (21:26 +0530)]
net: irda: ksdazzle: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:56:42 +0000 (21:26 +0530)]
net: irda: ks959: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:56:41 +0000 (21:26 +0530)]
net: irda: kingsun: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Tue, 8 Aug 2017 15:56:40 +0000 (21:26 +0530)]
net: irda: irda-usb: constify usb_device_id
usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Corinna Vinschen [Fri, 23 Jun 2017 12:26:30 +0000 (14:26 +0200)]
igb: Remove incorrect "unexpected SYS WRAP" log message
TSAUXC.DisableSystime is never set, so SYSTIM runs into a SYS WRAP
every 1100 secs on 80580/i350/i354 (40 bit SYSTIM) and every 35000
secs on 80576 (45 bit SYSTIM).
This wrap event sets the TSICR.SysWrap bit unconditionally.
However, checking TSIM at interrupt time shows that this event does not
actually cause the interrupt. Rather, it's just bycatch while the
actual interrupt is caused by, for instance, TSICR.TXTS.
The conclusion is that the SYS WRAP is actually expected, so the
"unexpected SYS WRAP" message is entirely bogus and just helps to
confuse users. Drop it.
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Gustavo A R Silva [Tue, 20 Jun 2017 21:22:34 +0000 (16:22 -0500)]
e1000e: add check on e1e_wphy() return value
Check return value from call to e1e_wphy(). This value is being
checked during previous calls to function e1e_wphy() and it seems
a check was missing here.
Addresses-Coverity-ID: 1226905
Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cliff Spradlin [Mon, 19 Jun 2017 20:30:43 +0000 (13:30 -0700)]
igb: protect TX timestamping from API misuse
HW timestamping can only be requested for a packet if the NIC is first
setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the igb
driver still allowed TX packets to request HW timestamping. In this
situation, the _IGB_PTP_TX_IN_PROGRESS flag was set and would never
clear. This prevented any future HW timestamping requests to succeed.
Fix this by checking that the NIC is configured for HW TX timestamping
before accepting a HW TX timestamping request.
Signed-off-by: Cliff Spradlin <cspradlin@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Gangfeng Huang [Sat, 27 May 2017 01:17:53 +0000 (09:17 +0800)]
igb: Fix error of RX network flow classification
After add an ethertype filter, if user change the adapter speed several
times, the error "ethtool -N: etype filters are all used" is reported by
igb driver.
In older patch, function igb_nfc_filter_exit() and igb_nfc_filter_restore()
is not paried. igb_nfc_filter_restore() exist in igb_up(), but function
igb_nfc_filter_exit() is exist in __igb_close(). In the process of speed
changing, only igb_nfc_filter_restore() is called, it will take a position
of ethertype bitmap.
Reproduce steps:
Step 1: Add a etype filter by ethtool
$ethtool -N eth0 flow-type ether proto 0x88F8 action 1
Step 2: Change the adapter speed to 100M/full duplex
$ethtool -s eth0 speed 100 duplex full
Step 3: Change the adapter speed to 1000M/full duplex
ethtool -s eth0 speed 1000 duplex full
Repeat step2 and step3, then dmesg the system log, you can find the error
message, add new ethtype filter is also failed.
This fixing is move igb_nfc_filter_exit() from __igb_close() to igb_down()
to make igb_nfc_filter_restore()/igb_nfc_filter_exit() is paired.
Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Linus Torvalds [Tue, 8 Aug 2017 18:42:33 +0000 (11:42 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Third set of -rc fixes for 4.13 cycle
- small set of miscellanous fixes
- a reasonably sizable set of IPoIB fixes that deal with multiple
long standing issues"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hns: checking for IS_ERR() instead of NULL
RDMA/mlx5: Fix existence check for extended address vector
IB/uverbs: Fix device cleanup
RDMA/uverbs: Prevent leak of reserved field
IB/core: Fix race condition in resolving IP to MAC
IB/ipoib: Notify on modify QP failure only when relevant
Revert "IB/core: Allow QP state transition from reset to error"
IB/ipoib: Remove double pointer assigning
IB/ipoib: Clean error paths in add port
IB/ipoib: Add get statistics support to SRIOV VF
IB/ipoib: Add multicast packets statistics
IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization
IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
IB/ipoib: Make sure no in-flight joins while leaving that mcast
IB/ipoib: Use cancel_delayed_work_sync when needed
IB/ipoib: Fix race between light events and interface restart
Joe Perches [Sun, 6 Aug 2017 01:45:49 +0000 (18:45 -0700)]
parse-maintainers: Move matching sections from MAINTAINERS
Allow any number of command line arguments to match either the
section header or the section contents and create new files.
Create MAINTAINERS.new and SECTION.new.
This allows scripting of the movement of various sections from
MAINTAINERS.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Sun, 6 Aug 2017 01:45:48 +0000 (18:45 -0700)]
parse-maintainers: Use perl hash references and specific filenames
Instead of reading STDIN and writing STDOUT, use specific filenames of
MAINTAINERS and MAINTAINERS.new.
Use hash references instead of global hash %hash so future modifications
can read and write specific hashes to split up MAINTAINERS into multiple
files using a script.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Sun, 6 Aug 2017 01:45:47 +0000 (18:45 -0700)]
parse-maintainers: Add section pattern sorting
Section [A-Z]: patterns are not currently in any required sorting order.
Add a specific sorting sequence to MAINTAINERS entries.
Sort F: and X: patterns in alphabetic order.
The preferred section ordering is:
SECTION HEADER
M: Maintainers
R: Reviewers
P: Named persons without email addresses
L: Mailing list addresses
S: Status of this section (Supported, Maintained, Orphan, etc...)
W: Any relevant URLs
T: Source code control type (git, quilt, etc)
Q: Patchwork patch acceptance queue site
B: Bug tracking URIs
C: Chat URIs
F: Files with wildcard patterns (alphabetic ordered)
X: Excluded files with wildcard patterns (alphabetic ordered)
N: Files with regex patterns
K: Keyword regexes in source code for maintainership identification
Miscellaneous perl neatening:
- Rename %map to %hash, map has a different meaning in perl
- Avoid using \& and local variables for function indirection
- Use return for a little c like clarity
- Use c-like function call style instead of &function
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Sat, 5 Aug 2017 04:45:48 +0000 (21:45 -0700)]
get_maintainer: Prepare for separate MAINTAINERS files
Allow for MAINTAINERS to become a directory and if it is,
read all the files in the directory for maintained sections.
Optionally look for all files named MAINTAINERS in directories
excluding the .git directory by using --find-maintainer-files.
This optional feature adds ~.3 seconds of CPU on an Intel
i5-6200 with an SSD.
Miscellanea:
- Create a read_maintainer_file subroutine from the existing code
- Test only the existence of MAINTAINERS, not whether it's a file
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Wed, 2 Aug 2017 17:57:45 +0000 (10:57 -0700)]
MAINTAINERS: openbmc mailing list is moderated
The openbmc mailing list is moderated for non-subscribers.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sedat Dilek [Tue, 25 Jul 2017 12:53:42 +0000 (14:53 +0200)]
MAINTAINERS: greybus: Fix typo s/LOOBACK/LOOPBACK
Fixes:
f47e07bc5f1a5c48 ("Fix up MAINTAINERS file problems")
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 8 Aug 2017 16:38:41 +0000 (09:38 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small fixes, one re-fix of a previous fix and five patches sorting
out hotplug in the bnx2X class of drivers. The latter is rather
involved, but necessary because these drivers have started dropping
lockdep recursion warnings on the hotplug lock because of its
conversion to a percpu rwsem"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sg: only check for dxfer_len greater than 256M
scsi: aacraid: reading out of bounds
scsi: qedf: Limit number of CQs
scsi: bnx2i: Simplify cpu hotplug code
scsi: bnx2fc: Simplify CPU hotplug code
scsi: bnx2i: Prevent recursive cpuhotplug locking
scsi: bnx2fc: Prevent recursive cpuhotplug locking
scsi: bnx2fc: Plug CPU hotplug race
Helge Deller [Tue, 8 Aug 2017 16:28:41 +0000 (18:28 +0200)]
random: fix warning message on ia64 and parisc
Fix the warning message on the parisc and IA64 architectures to show the
correct function name of the caller by using %pS instead of %pF. The
message is printed with the value of _RET_IP_ which calls
__builtin_return_address(0) and as such returns the IP address caller
instead of pointer to a function descriptor of the caller.
The effect of this patch is visible on the parisc and ia64 architectures
only since those are the ones which use function descriptors while on
all others %pS and %pF will behave the same.
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes:
eecabf567422 ("random: suppress spammy warnings about unseeded randomness")
Fixes:
d06bfd1989fe ("random: warn when kernel uses unseeded randomness")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 8 Aug 2017 01:58:10 +0000 (18:58 -0700)]
Merge tag 'xtensa-
20170807' of git://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa fixes from Max Filippov:
- use asm-generic instances of asm/param.h and asm/device.h instead of
exact copies in arch/xtensa/include/asm;
- fix build error for xtensa cores with aliasing WT cache: define cache
flushing functions and copy_{to,from}_user_page;
- add missing EXPORT_SYMBOLs for clear_user_highpage, copy_user_highpage,
flush_dcache_page, local_flush_cache_range, local_flush_cache_page,
csum_partial and csum_partial_copy_generic.
* tag 'xtensa-
20170807' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: mm/cache: add missing EXPORT_SYMBOLs
xtensa: don't limit csum_partial export by CONFIG_NET
xtensa: fix cache aliasing handling code for WT cache
xtensa: remove wrapper header for asm/param.h
xtensa: remove wrapper header for asm/device.h
Linus Torvalds [Tue, 8 Aug 2017 01:40:18 +0000 (18:40 -0700)]
Merge tag 'for-linus-
20170807' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
"I missed getting these out for rc4, but here are some MTD fixes.
Just NAND fixes (in both the core handling, and a few drivers). Notes
stolen from Boris:
Core fixes:
- fix data interface setup for ONFI NANDs that do not support the SET
FEATURES command
- fix a kernel doc header
- fix potential integer overflow when retrieving timing information
from the parameter page
- fix wrong OOB layout for small page NANDs
Driver fixes:
- fix potential division-by-zero bug
- fix backward compat with old atmel-nand DT bindings
- fix ->setup_data_interface() in the atmel NAND driver"
* tag 'for-linus-
20170807' of git://git.infradead.org/linux-mtd:
mtd: nand: atmel: Fix EDO mode check
mtd: nand: Declare tBERS, tR and tPROG as u64 to avoid integer overflow
mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES
mtd: nand: Fix a docs build warning
mtd: nand: sunxi: fix potential divide-by-zero error
nand: fix wrong default oob layout for small pages using soft ecc
mtd: nand: atmel: Fix DT backward compatibility in pmecc.c
Linus Torvalds [Tue, 8 Aug 2017 01:16:22 +0000 (18:16 -0700)]
Merge tag 'xfs-4.13-fixes-3' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"I have a couple more bug fixes for you today:
- fix memory leak when issuing discard
- fix propagation of the dax inode flag"
* tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Fix per-inode DAX flag inheritance
xfs: Fix leak of discard bio
David Ahern [Mon, 7 Aug 2017 17:08:10 +0000 (10:08 -0700)]
net: vrf: Add extack messages for newlink failures
Add extack error messages for failure paths creating vrf devices. Once
extack support is added to iproute2, we go from the unhelpful:
$ ip li add foobar type vrf
RTNETLINK answers: Invalid argument
to:
$ ip li add foobar type vrf
Error: VRF table id is missing
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Sun, 6 Aug 2017 22:00:17 +0000 (00:00 +0200)]
qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
We allocate 'p_info->mfw_mb_cur' and 'p_info->mfw_mb_shadow' but we check
'p_info->mfw_mb_addr' instead of 'p_info->mfw_mb_cur'.
'p_info->mfw_mb_addr' is never 0, because it is initiliazed a few lines
above in 'qed_load_mcp_offsets()'.
Update the test and check the result of the 2 'kzalloc()' instead.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhumika Goyal [Sun, 6 Aug 2017 17:09:06 +0000 (22:39 +0530)]
isdn: kcapi: make capi_version const
Declare this structure as const as it is only used during a copy
operation.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 21:48:49 +0000 (14:48 -0700)]
Merge branch 'Update-DSAs-FDB-API-and-perform-switchdev-cleanup'
Arkadi Sharshevsky says:
====================
Update DSA's FDB API and perform switchdev cleanup
The patchset adds support for configuring static FDB entries via the
switchdev notification chain. The current method for FDB configuration
uses the switchdev's bridge bypass implementation. In order to support
this legacy way and to perform the switchdev cleanup, the implementation
is moved inside DSA.
The DSA drivers cannot sync the software bridge with hardware learned
entries and use the switchdev's implementation of bypass FDB dumping.
Because they are the only ones using this functionality, the fdb_dump
implementation is moved from switchdev code into DSA.
Finally after this changes a major cleanup in switchdev can be done.
Please see individual patches for patch specific change logs.
v1->v2
- Split MDB/vlan dump removal into core/driver removal.
v2->v3
- The self implementation for FDB add/del is moved inside DSA.
====================
Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:51 +0000 (16:15 +0300)]
net: switchdev: Remove bridge bypass support from switchdev
Currently the bridge port flags, vlans, FDBs and MDBs can be offloaded
through the bridge code, making the switchdev's SELF bridge bypass
implementation to be redundant. This implies several changes:
- No need for dump infra in switchdev, DSA's special case is handled
privately.
- Remove obj_dump from switchdev_ops.
- FDBs are removed from obj_add/del routines, due to the fact that they
are offloaded through the bridge notification chain.
- The switchdev_port_bridge_xx() and switchdev_port_fdb_xx() functions
can be removed.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:50 +0000 (16:15 +0300)]
net: bridge: Remove FDB deletion through switchdev object
At this point no driver supports FDB add/del through switchdev object
but rather via notification chain, thus, it is removed.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:49 +0000 (16:15 +0300)]
net: dsa: Move FDB dump implementation inside DSA
>From all switchdev devices only DSA requires special FDB dump. This is due
to lack of ability for syncing the hardware learned FDBs with the bridge.
Due to this it is removed from switchdev and moved inside DSA.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:48 +0000 (16:15 +0300)]
net: dsa: Remove redundant MDB dump support
Currently the MDB HW database is synced with the bridge's one, thus,
There is no need to support special dump functionality.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:47 +0000 (16:15 +0300)]
net: dsa: Remove support for MDB dump from DSA's drivers
This is done as a preparation before removing support for MDB dump from
DSA core. The MDBs are synced with the bridge and thus there is no
need for special dump operation support.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:46 +0000 (16:15 +0300)]
net: dsa: Remove support for bypass bridge port attributes/vlan set
The bridge port attributes/vlan for DSA devices should be set only
from bridge code. Furthermore, The vlans are synced totally with the
bridge so there is no need for special dump support.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:45 +0000 (16:15 +0300)]
net: dsa: Remove support for vlan dump from DSA's drivers
This is done as a preparation before removing support for vlan dump from
DSA core. The vlans are synced with the bridge and thus there is no
need for special dump operation support.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:44 +0000 (16:15 +0300)]
net: dsa: Add support for querying supported bridge flags
The DSA drivers do not support bridge flags offload. Yet, this attribute
should be added in order for the bridge to fail when one tries set a
flag on the port, as explained in commit
dc0ecabd6231 ("net: switchdev:
Add support for querying supported bridge flags by hardware").
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:43 +0000 (16:15 +0300)]
net: dsa: Move FDB add/del implementation inside DSA
Currently DSA uses switchdev's implementation of FDB add/del ndos. This
patch moves the implementation inside DSA in order to support the legacy
way for static FDB configuration.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:42 +0000 (16:15 +0300)]
net: dsa: Add support for learning FDB through notification
Add support for learning FDB through notification. The driver defers
the hardware update via ordered work queue. In case of a successful
FDB add a notification is sent back to bridge.
In case of hw FDB del failure the static FDB will be deleted from
the bridge, thus, the interface is moved to down state in order to
indicate inconsistent situation.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:41 +0000 (16:15 +0300)]
net: dsa: Remove switchdev dependency from DSA switch notifier chain
Currently, the switchdev objects are embedded inside the DSA notifier
info. This patch removes this dependency. This is done as a preparation
stage before adding support for learning FDB through the switchdev
notification chain.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:40 +0000 (16:15 +0300)]
net: dsa: Remove prepare phase for FDB
The prepare phase for FDB add is unneeded because most of DSA devices
can have failures during bus transactions (SPI, I2C, etc.), thus, the
prepare phase cannot guarantee success of the commit stage.
The support for learning FDB through notification chain, which will be
introduced in the following patches, will provide the ability to notify
back the bridge about successful offload.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:39 +0000 (16:15 +0300)]
net: dsa: Change DSA slave FDB API to be switchdev independent
In order to support FDB add/del to be on a notifier chain the slave
API need to be changed to be switchdev independent.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhumika Goyal [Sun, 6 Aug 2017 08:51:45 +0000 (14:21 +0530)]
hamradio: baycom: make hdlcdrv_ops const
Make hdlcdrv_ops structures const as they are only passed to
hdlcdrv_register function. The corresponding argument is of type const,
so make the structures const.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sun, 6 Aug 2017 08:19:07 +0000 (10:19 +0200)]
xfrm: check that cached bundle is still valid
Quoting Ilan Tayari:
1. Set up a host-to-host IPSec tunnel (or transport, doesn't matter)
2. Ping over IPSec, or do something to populate the pcpu cache
3. Join a MC group, then leave MC group
4. Try to ping again using same CPU as before -> traffic
doesn't egress the machine at all
Ilan debugged the problem down to the fact that one of the path dsts
devices point to lo due to earlier dst_dev_put().
In this case, dst is marked as DEAD and we cannot reuse the bundle.
The cache only asserted that the requested policy and that of the cached
bundle match, but its not enough - also verify the path is still valid.
Fixes:
ec30d78c14a813 ("xfrm: add xdst pcpu cache")
Reported-by: Ayham Masood <ayhamm@mellanox.com>
Tested-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 21:24:21 +0000 (14:24 -0700)]
Merge branch 'net-dsa-remove-useless-arguments'
Vivien Didelot says:
====================
net: dsa: remove useless arguments
Several DSA core setup functions take many arguments, mostly because of
the legacy code. This patch series removes the useless args of these
functions, where either the dsa_switch or dsa_port argument is enough.
Changes in v2:
- ds->dev is already assigned by dsa_switch_alloc
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 5 Aug 2017 20:20:19 +0000 (16:20 -0400)]
net: dsa: remove useless args of dsa_slave_create
dsa_slave_create currently takes 4 arguments while it only needs the
related dsa_port and its name. Remove all other arguments.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 5 Aug 2017 20:20:18 +0000 (16:20 -0400)]
net: dsa: remove useless args of dsa_cpu_dsa_setup
dsa_cpu_dsa_setup currently takes 4 arguments but they are all available
from the dsa_port argument. Remove all others.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 5 Aug 2017 20:20:17 +0000 (16:20 -0400)]
net: dsa: remove useless argument in legacy setup
dsa_switch_alloc() already assigns ds-dev, which can be used in
dsa_switch_setup_one and dsa_cpu_dsa_setups instead of requiring an
additional struct device argument.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Sat, 5 Aug 2017 13:46:35 +0000 (14:46 +0100)]
net: hns3: fix spelling mistake: "capabilty" -> "capability"
Trivial fix to spelling mistake in dev_err error message and also
split overly long line to avoid a checkpatch warning.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 21:18:01 +0000 (14:18 -0700)]
Merge branch 'Refactor-lan9303_xxx_packet_processing'
Egil Hjelmeland says:
====================
Refactor lan9303_xxx_packet_processing
This series is purely non functional.
It changes the lan9303_enable_packet_processing,
lan9303_disable_packet_processing() to pass port number (0,1,2) as
parameter instead of port offset. This aligns them with
other functions in the module, and makes it possible to simplify the code.
The lan9303_enable_packet_processing, lan9303_disable_packet_processing
functions operate on port. Therefore rename the functions to reflect that
as well.
Reviewer pointed out lan9303_get_ethtool_stats would be better off with
the use of a lan9303_read_switch_port(). So that was added to the series.
Changes v3 -> v4:
- Whitespace adjustments.
Changes v2 -> v3:
- Patch 1: Removed the change in lan9303_get_ethtool_stats
- Added patch 4: rename lan9303_xxx_packet_processing
- Added patch 5: refactor lan9303_get_ethtool_stats
Changes v1 -> v2:
- introduced lan9303_write_switch_port() in first patch
- inserted LAN9303_NUM_PORTS patch
- Use LAN9303_NUM_PORTS in last patch. Plus whitespace change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Sat, 5 Aug 2017 11:05:50 +0000 (13:05 +0200)]
net: dsa: lan9303: refactor lan9303_get_ethtool_stats
In lan9303_get_ethtool_stats: Get rid of 0x400 constant magic
by using new lan9303_read_switch_reg() inside loop.
Reduced scope of two variables.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Sat, 5 Aug 2017 11:05:49 +0000 (13:05 +0200)]
net: dsa: lan9303: Rename lan9303_xxx_packet_processing()
The lan9303_enable_packet_processing, lan9303_disable_packet_processing
functions operate on port, so the names should reflect that.
And to align with lan9303_disable_processing(), rename:
lan9303_enable_packet_processing -> lan9303_enable_processing_port
lan9303_disable_packet_processing -> lan9303_disable_processing_port
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Sat, 5 Aug 2017 11:05:48 +0000 (13:05 +0200)]
net: dsa: lan9303: Simplify lan9303_xxx_packet_processing() usage
Simplify usage of lan9303_enable_packet_processing,
lan9303_disable_packet_processing()
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Sat, 5 Aug 2017 11:05:47 +0000 (13:05 +0200)]
net: dsa: lan9303: define LAN9303_NUM_PORTS 3
Will be used instead of '3' in upcomming patches.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Sat, 5 Aug 2017 11:05:46 +0000 (13:05 +0200)]
net: dsa: lan9303: Change lan9303_xxx_packet_processing() port param.
lan9303_enable_packet_processing, lan9303_disable_packet_processing()
Pass port number (0,1,2) as parameter instead of port offset.
Because other functions in the module pass port numbers.
And to enable simplifications in following patch.
Introduce lan9303_write_switch_port().
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 21:16:22 +0000 (14:16 -0700)]
Merge branch 'ipv6-sr-add-support-for-advanced-local-segment-processing'
David Lebrun says:
====================
ipv6: sr: add support for advanced local segment processing
v2: use EXPORT_SYMBOL_GPL
The current implementation of IPv6 SR supports SRH insertion/encapsulation
and basic segment endpoint behavior (i.e., processing of an SRH contained in
a packet whose active segment (IPv6 DA) is routed to the local node). This
behavior simply consists of updating the DA to the next segment and forwarding
the packet accordingly. This processing is realised for all such packets,
regardless of the active segment.
The most recent specifications of IPv6 SR [1] [2] extend the SRH processing
features as follows. Each segment endpoint defines a MyLocalSID table.
This table maps segments to operations to perform. For each ingress IPv6
packet whose DA is part of a given prefix, the segment endpoint looks
up the active segment (i.e., the IPv6 DA) in the MyLocalSID table and
applies the corresponding operation. Such specifications enable to specify
arbitrary operations besides the basic SRH processing and allow for a more
fine-grained classification.
This patch series implements those extended specifications by leveraging
a new type of lightweight tunnel, seg6local. The MyLocalSID table is
simply an arbitrary routing table (using CONFIG_IPV6_MULTIPLE_TABLES). The
following commands would assign the prefix fc00::/64 to the MyLocalSID
table, map the segment fc00::42 to the regular SRH processing function
(named "End"), and drop all packets received with an undefined active
segment:
ip -6 rule add fc00::/64 lookup 100
ip -6 route add fc00::42 encap seg6local action End dev eth0 table 100
ip -6 route add blackhole default table 100
As another example, the following command would assign the segment
fc00::1234 to the regular SRH processing function, except that the
processed packet must be forwarded to the next-hop fc42::1 (this operation
is named "End.X"):
ip -6 route add fc00::1234 encap seg6local action End.X nh6 fc42::1 dev eth0 table 100
Those two basic operations (End and End.X) are defined in [1]. A more
extensive list of advanced operations is defined in [2].
The first two patches of the series are preliminary work that remove an
assumption about initial SRH format, and export the two functions used to
insert and encapsulate an SRH onto packets. The third patch defines the
new seg6local lightweight tunnel and implement the core functions. The
fourth patch implements the operations needed to handle the newly defined
rtnetlink attributes. The fifth patch implements a few SRH processing
operations, including End and End.X.
[1] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-07
[2] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Sat, 5 Aug 2017 10:39:48 +0000 (12:39 +0200)]
ipv6: sr: implement several seg6local actions
This patch implements the following seg6local actions.
- SEG6_LOCAL_ACTION_END: regular SRH processing. The DA of the packet
is updated to the next segment and forwarded accordingly.
- SEG6_LOCAL_ACTION_END_X: same as above, except that the packet is
forwarded to the specified IPv6 next-hop.
- SEG6_LOCAL_ACTION_END_DX6: decapsulate the packet and forward to
inner IPv6 packet to the specified IPv6 next-hop.
- SEG6_LOCAL_ACTION_END_B6: insert the specified SRH directly after
the IPv6 header of the packet.
- SEG6_LOCAL_ACTION_END_B6_ENCAP: encapsulate the packet within
an outer IPv6 header, containing the specified SRH.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Sat, 5 Aug 2017 10:38:27 +0000 (12:38 +0200)]
ipv6: sr: add rtnetlink functions for seg6local action parameters
This patch adds the necessary functions to parse, fill, and compare
seg6local rtnetlink attributes, for all defined action parameters.
- The SRH parameter defines an SRH to be inserted or encapsulated.
- The TABLE parameter defines the table to use for the route lookup of
the next segment or the inner decapsulated packet.
- The NH4 parameter defines the IPv4 next-hop for an inner decapsulated
IPv4 packet.
- The NH6 parameter defines the IPv6 next-hop for the next segment or
for an inner decapsulated IPv6 packet
- The IIF parameter defines an ingress interface index.
- The OIF parameter defines an egress interface index.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Sat, 5 Aug 2017 10:38:26 +0000 (12:38 +0200)]
ipv6: sr: define core operations for seg6local lightweight tunnel
This patch implements a new type of lightweight tunnel named seg6local.
A seg6local lwt is defined by a type of action and a set of parameters.
The action represents the operation to perform on the packets matching the
lwt's route, and is not necessarily an encapsulation. The set of parameters
are arguments for the processing function.
Each action is defined in a struct seg6_action_desc within
seg6_action_table[]. This structure contains the action, mandatory
attributes, the processing function, and a static headroom size required by
the action. The mandatory attributes are encoded as a bitmask field. The
static headroom is set to a non-zero value when the processing function
always add a constant number of bytes to the skb (e.g. the header size for
encapsulations).
To facilitate rtnetlink-related operations such as parsing, fill_encap,
and cmp_encap, each type of action parameter is associated to three
function pointers, in seg6_action_params[].
All actions defined in seg6_local.h are detailed in [1].
[1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Sat, 5 Aug 2017 10:38:25 +0000 (12:38 +0200)]
ipv6: sr: export SRH insertion functions
This patch exports the seg6_do_srh_encap() and seg6_do_srh_inline()
functions. It also removes the CONFIG_IPV6_SEG6_INLINE knob
that enabled the compilation of seg6_do_srh_inline(). This function
is now built-in.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Sat, 5 Aug 2017 10:38:24 +0000 (12:38 +0200)]
ipv6: sr: allow SRH insertion with arbitrary segments_left value
The seg6_validate_srh() function only allows SRHs whose active segment is
the first segment of the path. However, an application may insert an SRH
whose active segment is not the first one. Such an application might be
for example an SR-aware Virtual Network Function.
This patch enables to insert SRHs with an arbitrary active segment.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Sat, 5 Aug 2017 05:02:19 +0000 (22:02 -0700)]
bpf: devmap fix mutex in rcu critical section
Originally we used a mutex to protect concurrent devmap update
and delete operations from racing with netdev unregister notifier
callbacks.
The notifier hook is needed because we increment the netdev ref
count when a dev is added to the devmap. This ensures the netdev
reference is valid in the datapath. However, we don't want to block
unregister events, hence the initial mutex and notifier handler.
The concern was in the notifier hook we search the map for dev
entries that hold a refcnt on the net device being torn down. But,
in order to do this we require two steps,
(i) dereference the netdev: dev = rcu_dereference(map[i])
(ii) test ifindex: dev->ifindex == removing_ifindex
and then finally we can swap in the NULL dev in the map via an
xchg operation,
xchg(map[i], NULL)
The danger here is a concurrent update could run a different
xchg op concurrently leading us to replace the new dev with a
NULL dev incorrectly.
CPU 1 CPU 2
notifier hook bpf devmap update
dev = rcu_dereference(map[i])
dev = rcu_dereference(map[i])
xchg(map[i]), new_dev);
rcu_call(dev,...)
xchg(map[i], NULL)
The above flow would create the incorrect state with the dev
reference in the update path being lost. To resolve this the
original code used a mutex around the above block. However,
updates, deletes, and lookups occur inside rcu critical sections
so we can't use a mutex in this context safely.
Fortunately, by writing slightly better code we can avoid the
mutex altogether. If CPU 1 in the above example uses a cmpxchg
and _only_ replaces the dev reference in the map when it is in
fact the expected dev the race is removed completely. The two
cases being illustrated here, first the race condition,
CPU 1 CPU 2
notifier hook bpf devmap update
dev = rcu_dereference(map[i])
dev = rcu_dereference(map[i])
xchg(map[i]), new_dev);
rcu_call(dev,...)
odev = cmpxchg(map[i], dev, NULL)
Now we can test the cmpxchg return value, detect odev != dev and
abort. Or in the good case,
CPU 1 CPU 2
notifier hook bpf devmap update
dev = rcu_dereference(map[i])
odev = cmpxchg(map[i], dev, NULL)
[...]
Now 'odev == dev' and we can do proper cleanup.
And viola the original race we tried to solve with a mutex is
corrected and the trace noted by Sasha below is resolved due
to removal of the mutex.
Note: When walking the devmap and removing dev references as needed
we depend on the core to fail any calls to dev_get_by_index() using
the ifindex of the device being removed. This way we do not race with
the user while searching the devmap.
Additionally, the mutex was also protecting list add/del/read on
the list of maps in-use. This patch converts this to an RCU list
and spinlock implementation. This protects the list from concurrent
alloc/free operations. The notifier hook walks this list so it uses
RCU read semantics.
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 1, irqs_disabled(): 0, pid: 16315, name: syz-executor1
1 lock held by syz-executor1/16315:
#0: (rcu_read_lock){......}, at: [<
ffffffff8c363bc2>] map_delete_elem kernel/bpf/syscall.c:577 [inline]
#0: (rcu_read_lock){......}, at: [<
ffffffff8c363bc2>] SYSC_bpf kernel/bpf/syscall.c:1427 [inline]
#0: (rcu_read_lock){......}, at: [<
ffffffff8c363bc2>] SyS_bpf+0x1d32/0x4ba0 kernel/bpf/syscall.c:1388
Fixes:
2ddf71e23cc2 ("net: add notifier hooks for devmap bpf map")
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 21:12:18 +0000 (14:12 -0700)]
Merge branch 'net_sched-clean-up-filter-handle'
Cong Wang says:
====================
net_sched: clean up filter handle
This patchset sits in my local branch for a long time, it is time to
send it out. It cleans up the ambiguous use of 'unsigned long fh',
please see each of them for details.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sat, 5 Aug 2017 04:31:43 +0000 (21:31 -0700)]
net_sched: use void pointer for filter handle
Now we use 'unsigned long fh' as a pointer in every place,
it is safe to convert it to a void pointer now. This gets
rid of many casts to pointer.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sat, 5 Aug 2017 04:31:42 +0000 (21:31 -0700)]
net_sched: refactor notification code for RTM_DELTFILTER
It is confusing to use 'unsigned long fh' as both a handle
and a pointer, especially commit
9ee7837449b3
("net sched filters: fix notification of filter delete with proper handle").
This patch introduces tfilter_del_notify() so that we can
pass it as a pointer as before, and we don't need to check
RTM_DELTFILTER in tcf_fill_node() any more.
This prepares for the next patch.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Sat, 5 Aug 2017 01:19:18 +0000 (18:19 -0700)]
lwtunnel: replace EXPORT_SYMBOL with EXPORT_SYMBOL_GPL
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 21:09:48 +0000 (14:09 -0700)]
Merge branch 'bpf-add-support-for-sys-enter-exit-tracepoints'
Yonghong Song says:
====================
bpf: add support for sys_{enter|exit}_* tracepoints
Currently, bpf programs cannot be attached to sys_enter_* and sys_exit_*
style tracepoints. The main reason is that syscalls/sys_enter_* and syscalls/sys_exit_*
tracepoints are treated differently from other tracepoints and there
is no bpf hook to it.
This patch set adds bpf support for these syscalls tracepoints and also
adds a test case for it.
Changelogs:
v3 -> v4:
- Check the legality of ctx offset access for syscall tracepoint as well.
trace_event_get_offsets will return correct max offset for each
specific syscall tracepoint.
- Use variable length array to avoid hardcode 6 as the maximum
arguments beyond syscall_nr.
v2 -> v3:
- Fix a build issue
v1 -> v2:
- Do not use TRACE_EVENT_FL_CAP_ANY to identify syscall tracepoint.
Instead use trace_event_call->class.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Fri, 4 Aug 2017 23:00:10 +0000 (16:00 -0700)]
bpf: add a test case for syscalls/sys_{enter|exit}_* tracepoints
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Fri, 4 Aug 2017 23:00:09 +0000 (16:00 -0700)]
bpf: add support for sys_enter_* and sys_exit_* tracepoints
Currently, bpf programs cannot be attached to sys_enter_* and sys_exit_*
style tracepoints. The iovisor/bcc issue #748
(https://github.com/iovisor/bcc/issues/748) documents this issue.
For example, if you try to attach a bpf program to tracepoints
syscalls/sys_enter_newfstat, you will get the following error:
# ./tools/trace.py t:syscalls:sys_enter_newfstat
Ioctl(PERF_EVENT_IOC_SET_BPF): Invalid argument
Failed to attach BPF to tracepoint
The main reason is that syscalls/sys_enter_* and syscalls/sys_exit_*
tracepoints are treated differently from other tracepoints and there
is no bpf hook to it.
This patch adds bpf support for these syscalls tracepoints by
. permitting bpf attachment in ioctl PERF_EVENT_IOC_SET_BPF
. calling bpf programs in perf_syscall_enter and perf_syscall_exit
The legality of bpf program ctx access is also checked.
Function trace_event_get_offsets returns correct max offset for each
specific syscall tracepoint, which is compared against the maximum offset
access in bpf program.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 4 Aug 2017 21:43:43 +0000 (00:43 +0300)]
of_mdio: use of_property_read_u32_array()
The "fixed-link" prop support predated of_property_read_u32_array(), so
basically had to open-code it. Using the modern API saves 24 bytes of the
object code (ARM gcc 4.8.5); the only behavior change would be that the
prop length check is now less strict (however the strict pre-check done
in of_phy_is_fixed_link() is left intact anyway)...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Allen [Mon, 7 Aug 2017 20:42:30 +0000 (15:42 -0500)]
ibmvnic: Report rx buffer return codes as netdev_dbg
Reporting any return code for a receive buffer as an "rx error" only
produces alarming noise and the only values that have been observed to be
used in this field are not error conditions. Change this to a netdev_dbg
with a more descriptive message.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 18:39:22 +0000 (11:39 -0700)]
Merge branch 'net-l3mdev-Support-for-sockets-bound-to-enslaved-device'
David Ahern says:
====================
net: l3mdev: Support for sockets bound to enslaved device
A missing piece to the VRF puzzle is the ability to bind sockets to
devices enslaved to a VRF. This patch set adds the enslaved device
index, sdif, to IPv4 and IPv6 socket lookups. The end result for users
is the following scope options for services:
1. "global" services - sockets not bound to any device
Allows 1 service to work across all network interfaces with
connected sockets bound to the VRF the connection originates
(Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and
net.ipv4.udp_l3mdev_accept=1 for UDP)
2. "VRF" local services - sockets bound to a VRF
Sockets work across all network interfaces enslaved to a VRF but
are limited to just the one VRF.
3. "device" services - sockets bound to a specific network interface
Service works only through the one specific interface.
v3
- convert __inet_lookup_established in dccp_v4_err; missed in v2
v2
- remove sk_lookup struct and add sdif as an argument to existing
functions
Changes since RFC:
- no significant logic changes; mainly whitespace cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:22 +0000 (08:44 -0700)]
net: ipv6: add second dif to raw socket lookups
Add a second device index, sdif, to raw socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:21 +0000 (08:44 -0700)]
net: ipv6: add second dif to inet6 socket lookups
Add a second device index, sdif, to inet6 socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
ingress index is obtained from IPCB using inet_sdif and after tcp_v4_rcv
tcp_v4_sdif is used.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:20 +0000 (08:44 -0700)]
net: ipv6: add second dif to udp socket lookups
Add a second device index, sdif, to udp socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
Early demux lookups are handled in the next patch as part of INET_MATCH
changes.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:19 +0000 (08:44 -0700)]
net: ipv4: add second dif to multicast source filter
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:18 +0000 (08:44 -0700)]
net: ipv4: add second dif to raw socket lookups
Add a second device index, sdif, to raw socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:17 +0000 (08:44 -0700)]
net: ipv4: add second dif to inet socket lookups
Add a second device index, sdif, to inet socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
ingress index is obtained from IPCB using inet_sdif and after the cb move
in tcp_v4_rcv the tcp_v4_sdif helper is used.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 7 Aug 2017 15:44:16 +0000 (08:44 -0700)]
net: ipv4: add second dif to udp socket lookups
Add a second device index, sdif, to udp socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.
Early demux lookups are handled in the next patch as part of INET_MATCH
changes.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 18:34:41 +0000 (11:34 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2017-08-07' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.14
The first wireless-drivers-next pull request for 4.14. I'm submitting
this unusally late in the cycle as my vacation postponed this. But
even if this is late there's not still that much new features, mostly
cleanup or fixes.
Major changes:
ath10k
* preparation for wcn3990 support
iwlwifi
* Reorganization of the code into separate directories continues
qtnfmac
* regulatory support updates
* add get_channel, dump_survey and channel_switch cfg80211 handlers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Volkov [Mon, 7 Aug 2017 12:54:14 +0000 (15:54 +0300)]
hysdn: fix to a race condition in put_log_buffer
The synchronization type that was used earlier to guard the loop that
deletes unused log buffers may lead to a situation that prevents any
thread from going through the loop.
The patch deletes previously used synchronization mechanism and moves
the loop under the spin_lock so the similar cases won't be feasible in
the future.
Found by by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Anton Volkov <avolkov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Mon, 7 Aug 2017 11:28:39 +0000 (13:28 +0200)]
s390/qeth: fix L3 next-hop in xmit qeth hdr
On L3, the qeth_hdr struct needs to be filled with the next-hop
IP address.
The current code accesses rtable->rt_gateway without checking that
rtable is a valid address. The accidental access to a lowcore area
results in a random next-hop address in the qeth_hdr.
rtable (or more precisely, skb_dst(skb)) can be NULL in rare cases
(for instance together with AF_PACKET sockets).
This patch adds the missing NULL-ptr checks.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Fixes:
87e7597b5a3 qeth: Move away from using neighbour entries in qeth_l3_fill_header()
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 7 Aug 2017 10:41:53 +0000 (12:41 +0200)]
hns3: fix unused function warning
Without CONFIG_PCI_IOV, we get a harmless warning about an
unused function:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:2273:13: error: 'hclge_disable_sriov' defined but not used [-Werror=unused-function]
The #ifdefs in this driver are obviously wrong, so this just
removes them and uses an IS_ENABLED() check that does the same
thing correctly in a more readable way.
Fixes:
46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Aug 2017 17:42:09 +0000 (10:42 -0700)]
Merge tag 'mlx5-shared-2017-08-07' of git://git./linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-shared-2017-08-07
This series includes some mlx5 updates for both net-next and rdma trees.
From Saeed,
Core driver updates to allow selectively building the driver with
or without some large driver components, such as
- E-Switch (Ethernet SRIOV support).
- Multi-Physical Function Switch (MPFs) support.
For that we split E-Switch and MPFs functionalities into separate files.
From Erez,
Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration
is taking place and until it completes.
From Rabie,
Increase the maximum supported flow counters.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Ledford [Mon, 7 Aug 2017 17:30:40 +0000 (13:30 -0400)]
Merge tag 'rdma-rc-2017-07-26' of git://git./linux/kernel/git/leon/linux-rdma into leon-ipoib
IPoIB fixes for 4.13
The patchset provides various fixes for IPoIB. It is combination of
fixes to various issues discovered during verification along with
static checkers cleanup patches.
Most of the patches are from pre-git era and hence lack of Fixes lines.
There is one exception in this IPoIB group - addition of patch revert:
Revert "IB/core: Allow QP state transition from reset to error", but
it followed by proper fix to the annoying print, so I thought it is
appropriate to include it.
Signed-off-by: Doug Ledford <dledford@redhat.com>
David S. Miller [Mon, 7 Aug 2017 17:10:19 +0000 (10:10 -0700)]
Merge branch 'asix-Improve-robustness'
Dean Jenkins says:
====================
asix: Improve robustness
Please consider taking these patches to improve the robustness of the ASIX USB
to Ethernet driver.
Failures prompting an ASIX driver code review
=============================================
On an ARM i.MX6 embedded platform some strange one-off and two-off failures were
observed in and around the ASIX USB to Ethernet driver. This was observed on a
highly modified kernel 3.14 with the ASIX driver containing back-ported changes
from kernel.org up to kernel 4.8 approximately.
a) A one-off failure in asix_rx_fixup_internal():
There was an occurrence of an attempt to write off the end of the netdev buffer
which was trapped by skb_over_panic() in skb_put().
[20030.846440] skbuff: skb_over_panic: text:
7f2271c0 len:120 put:60 head:
8366ecc0 data:
8366ed02 tail:0x8366ed7a end:0x8366ed40 dev:eth0
[20030.863007] Kernel BUG at
8044ce38 [verbose debug info unavailable]
[20031.215345] Backtrace:
[20031.217884] [<
8044cde0>] (skb_panic) from [<
8044d50c>] (skb_put+0x50/0x5c)
[20031.227408] [<
8044d4bc>] (skb_put) from [<
7f2271c0>] (asix_rx_fixup_internal+0x1c4/0x23c [asix])
[20031.242024] [<
7f226ffc>] (asix_rx_fixup_internal [asix]) from [<
7f22724c>] (asix_rx_fixup_common+0x14/0x18 [asix])
[20031.260309] [<
7f227238>] (asix_rx_fixup_common [asix]) from [<
7f21f7d4>] (usbnet_bh+0x74/0x224 [usbnet])
[20031.269879] [<
7f21f760>] (usbnet_bh [usbnet]) from [<
8002f834>] (call_timer_fn+0xa4/0x1f0)
[20031.283961] [<
8002f790>] (call_timer_fn) from [<
80030834>] (run_timer_softirq+0x230/0x2a8)
[20031.302782] [<
80030604>] (run_timer_softirq) from [<
80028780>] (__do_softirq+0x15c/0x37c)
[20031.321511] [<
80028624>] (__do_softirq) from [<
80028c38>] (irq_exit+0x8c/0xe8)
[20031.339298] [<
80028bac>] (irq_exit) from [<
8000e9c8>] (handle_IRQ+0x8c/0xc8)
[20031.350038] [<
8000e93c>] (handle_IRQ) from [<
800085c8>] (gic_handle_irq+0xb8/0xf8)
[20031.365528] [<
80008510>] (gic_handle_irq) from [<
8050de80>] (__irq_svc+0x40/0x70)
Analysis of the logic of the ASIX driver (containing backported changes from
kernel.org up to kernel 4.8 approximately) suggested that the software could not
trigger skb_over_panic(). The analysis of the kernel BUG() crash information
suggested that the netdev buffer was written with 2 minimal 60 octet length
Ethernet frames (ASIX hardware drops the 4 octet FCS field) and the 2nd Ethernet
frame attempted to write off the end of the netdev buffer.
Note that the netdev buffer should only contain 1 Ethernet frame so if an
attempt to write 2 Ethernet frames into the buffer is made then that is wrong.
However, the logic of the asix_rx_fixup_internal() only allows 1 Ethernet frame
to be written into the netdev buffer.
Potentially this failure was due to memory corruption because it was only seen
once.
b) Two-off failures in the NAPI layer's backlog queue:
There were 2 crashes in the NAPI layer's backlog queue presumably after
asix_rx_fixup_internal() called usbnet_skb_return().
[24097.273945] Unable to handle kernel NULL pointer dereference at virtual address
00000004
[24097.398944] PC is at process_backlog+0x80/0x16c
[24097.569466] Backtrace:
[24097.572007] [<
8045ad98>] (process_backlog) from [<
8045b64c>] (net_rx_action+0xcc/0x248)
[24097.591631] [<
8045b580>] (net_rx_action) from [<
80028780>] (__do_softirq+0x15c/0x37c)
[24097.610022] [<
80028624>] (__do_softirq) from [<
800289cc>] (run_ksoftirqd+0x2c/0x84)
and
[ 1059.828452] Unable to handle kernel NULL pointer dereference at virtual address
00000000
[ 1059.953715] PC is at process_backlog+0x84/0x16c
[ 1060.140896] Backtrace:
[ 1060.143434] [<
8045ad98>] (process_backlog) from [<
8045b64c>] (net_rx_action+0xcc/0x248)
[ 1060.163075] [<
8045b580>] (net_rx_action) from [<
80028780>] (__do_softirq+0x15c/0x37c)
[ 1060.181474] [<
80028624>] (__do_softirq) from [<
80028c38>] (irq_exit+0x8c/0xe8)
[ 1060.199256] [<
80028bac>] (irq_exit) from [<
8000e9c8>] (handle_IRQ+0x8c/0xc8)
[ 1060.210006] [<
8000e93c>] (handle_IRQ) from [<
800085c8>] (gic_handle_irq+0xb8/0xf8)
[ 1060.225492] [<
80008510>] (gic_handle_irq) from [<
8050de80>] (__irq_svc+0x40/0x70)
The embedded board was only using an ASIX USB to Ethernet adaptor eth0.
Analysis suggested that the doubly-linked list pointers of the backlog queue had
been corrupted because one of the link pointers was NULL.
Potentially this failure was due to memory corruption because it was only seen
twice.
Results of the ASIX driver code review
======================================
During the code review some weaknesses were observed in the ASIX driver and the
following patches have been created to improve the robustness.
Brief overview of the patches
-----------------------------
1. asix: Add rx->ax_skb = NULL after usbnet_skb_return()
The current ASIX driver sends the received Ethernet frame to the NAPI layer of
the network stack via the call to usbnet_skb_return() in
asix_rx_fixup_internal() but retains the rx->ax_skb pointer to the netdev
buffer. The driver no longer needs the rx->ax_skb pointer at this point because
the NAPI layer now has the Ethernet frame.
This means that asix_rx_fixup_internal() must not use rx->ax_skb after the call
to usbnet_skb_return() because it could corrupt the handling of the Ethernet
frame within the network layer.
Therefore, to remove the risk of erroneous usage of rx->ax_skb, set rx->ax_skb
to NULL after the call to usbnet_skb_return(). This avoids potential erroneous
freeing of rx->ax_skb and erroneous writing to the netdev buffer. If the
software now somehow inappropriately reused rx->ax_skb, then a NULL pointer
dereference of rx->ax_skb would occur which makes investigation easier.
2. asix: Ensure asix_rx_fixup_info members are all reset
This patch creates reset_asix_rx_fixup_info() to allow all the
asix_rx_fixup_info structure members to be consistently reset to initial
conditions.
Call reset_asix_rx_fixup_info() upon each detectable error condition so that the
next URB is processed from a known state.
Otherwise, there is a risk that some members of the asix_rx_fixup_info structure
may be incorrect after an error occurred so potentially leading to a
malfunction.
3. asix: Fix small memory leak in ax88772_unbind()
This patch creates asix_rx_fixup_common_free() to allow the rx->ax_skb to be
freed when necessary.
asix_rx_fixup_common_free() is called from ax88772_unbind() before the parent
private data structure is freed.
Without this patch, there is a risk of a small netdev buffer memory leak each
time ax88772_unbind() is called during the reception of an Ethernet frame that
spans across 2 URBs.
Testing
=======
The patches have been sanity tested on a 64-bit Linux laptop running kernel
4.13-rc2 with the 3 patches applied on top.
The ASIX USB to Adaptor used for testing was (output of lsusb):
ID 0b95:772b ASIX Electronics Corp. AX88772B
Test #1
-------
The test ran a flood ping test script which slowly incremented the ICMP Echo
Request's payload from 0 to 5000 octets. This eventually causes IPv4
fragmentation to occur which causes Ethernet frames to be sent very close to
each other so increases the probability that an Ethernet frame will span 2 URBs.
The test showed that all pings were successful. The test took about 15 minutes
to complete.
Test #2
-------
A script was run on the laptop to periodically run ifdown and ifup every second
so that the ASIX USB to Adaptor was up for 1 second and down for 1 second.
From a Linux PC connected to the laptop, the following ping command was used
ping -f -s 5000 <ip address of laptop>
The large ICMP payload causes IPv4 fragmentation resulting in multiple
Ethernet frames per original IP packet.
Kernel debug within the ASIX driver was enabled to see whether any ASIX errors
were generated. The test was run for about 24 hours and no ASIX errors were
seen.
Patches
=======
The 3 patches have been rebased off the net-next repo master branch with HEAD
fbbeefd net: fec: Allow reception of frames bigger than 1522 bytes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>