platform/kernel/linux-starfive.git
4 years agostaging: bcm2835-camera: insert emty line after declaration
Houssem KADI [Sat, 9 May 2020 18:08:49 +0000 (20:08 +0200)]
staging: bcm2835-camera: insert emty line after declaration

Missing empty line after variable declaration

Signed-off-by: Houssem KADI <kadi.houssem.eddine@gmail.com>
Link: https://lore.kernel.org/r/20200509180849.GA30426@houssem-MS-7808
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: vt6656: remove difs / sifs adjustments.
Malcolm Priestley [Tue, 5 May 2020 21:23:39 +0000 (22:23 +0100)]
staging: vt6656: remove difs / sifs adjustments.

Now mac89211 is doing frame timing in rxtx these vendor adjustments need
to be removed.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/034e445c-b245-52c4-c855-431b9783bcff@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: vt6656: vnt_get_rtscts_rsvtime_le replace with rts/cts duration.
Malcolm Priestley [Tue, 5 May 2020 21:19:45 +0000 (22:19 +0100)]
staging: vt6656: vnt_get_rtscts_rsvtime_le replace with rts/cts duration.

rsvtime is the time needed in firmware to process the received
frame time in firmware so they can be the same as vnt_get_rts_duration
or vnt_get_cts_duration where appropriate.

The rts_rrv_time are now all the same timing in vnt_rxtx_rts.

So vnt_get_rtscts_rsvtime_le and and vnt_get_frame_time are no longer
required.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/4c0fe356-7e08-bf66-58b7-5ab683ba9536@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: vt6656: Split RTS and CTS Duration functions
Malcolm Priestley [Tue, 5 May 2020 21:17:26 +0000 (22:17 +0100)]
staging: vt6656: Split RTS and CTS Duration functions

split vnt_get_rtscts_duration_le into vnt_get_rts_duration and
vnt_get_cts_duration.

The duration's are all the same in vnt_rxtx_rts_g_head.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/d2983161-7935-48ce-c0ca-a26ebafa3997@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: vt6656: vnt_get_rtscts_duration_le use ieee80211_ctstoself_duration
Malcolm Priestley [Tue, 5 May 2020 21:15:12 +0000 (22:15 +0100)]
staging: vt6656: vnt_get_rtscts_duration_le use ieee80211_ctstoself_duration

use the mac80211 ieee80211_ctstoself_duration for CTS to self frames.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/f12b3d71-eb61-340b-e473-83509d9bc38a@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: vt6656: vnt_rxtx_rsvtime_le16 to use ieee80211_generic_frame_duration.
Malcolm Priestley [Tue, 5 May 2020 21:12:04 +0000 (22:12 +0100)]
staging: vt6656: vnt_rxtx_rsvtime_le16 to use ieee80211_generic_frame_duration.

ieee80211_generic_frame_duration is the mac80211 equivalent to
vnt_get_rsvtime use this to get our frame time.

There is a change where there is rrv_time_a and rrv_time_b
the frame duration is always the same so both are equal.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/acff7fcc-0add-652b-7d07-22001b641257@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: vt6656: vnt_get_rtscts_duration_le use ieee80211_rts_duration
Malcolm Priestley [Tue, 5 May 2020 21:13:54 +0000 (22:13 +0100)]
staging: vt6656: vnt_get_rtscts_duration_le use ieee80211_rts_duration

use the mac80211 ieee80211_rts_duration for RTS frames.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Link: https://lore.kernel.org/r/377a4cc3-cfe3-91aa-cf71-1063f311426a@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: most: usb: sanity check channel before using it as an index into arrays
Colin Ian King [Thu, 7 May 2020 15:06:52 +0000 (16:06 +0100)]
staging: most: usb: sanity check channel before using it as an index into arrays

Currently channel is being sanity checked after it has been used as
an index into some arrays. Fix this by moving the sanity check of
channel before the arrays are indexed with it.

Addresses-Coverity: ("Negative array index read")
Fixes: 59ed0480b950 ("Staging: most: replace pr_*() functions by dev_*()")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20200507150652.52238-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: most: usb: add PM functions
Christian Gromm [Tue, 5 May 2020 12:14:52 +0000 (14:14 +0200)]
staging: most: usb: add PM functions

This patch adds the implementation of the PM functions resume and suspend.

Signed-off-by: Christian Gromm <christian.gromm@microchip.com>
Link: https://lore.kernel.org/r/1588680892-9413-1-git-send-email-christian.gromm@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: update TODO
Jérôme Pouiller [Tue, 12 May 2020 15:04:14 +0000 (17:04 +0200)]
staging: wfx: update TODO

Update the TODO list associated to the wfx driver with the last
progresses.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-18-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix endianness of the field 'channel_number'
Jérôme Pouiller [Tue, 12 May 2020 15:04:13 +0000 (17:04 +0200)]
staging: wfx: fix endianness of the field 'channel_number'

The field 'channel_number' from the structs hif_ind_rx and hif_req_start
is a __le32. Sparse complains this field is not always correctly
accessed:

    drivers/staging/wfx/data_rx.c:95:55: warning: incorrect type in argument 1 (different base types)
    drivers/staging/wfx/data_rx.c:95:55:    expected int chan
    drivers/staging/wfx/data_rx.c:95:55:    got restricted __le16 const [usertype] channel_number

However, the value of channel_number cannot be greater than 14 (this
device only support 2.4Ghz band). So, we only have to access to the
least significant byte. It is finally easier to declare it as an array
of bytes and only access to the first one.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-17-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix endianness of the field 'num_tx_confs'
Jérôme Pouiller [Tue, 12 May 2020 15:04:12 +0000 (17:04 +0200)]
staging: wfx: fix endianness of the field 'num_tx_confs'

The field 'num_tx_confs' from the struct hif_cnf_multi_transmit is a
__le32. Sparse complains this field is not always correctly accessed:

    drivers/staging/wfx/hif_rx.c:82:9: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/hif_rx.c:87:29: warning: restricted __le32 degrades to integer

However, the value of num_tx_confs cannot be greater than 15. So, we
only have to access to the least significant byte. It is finally easier
to declare it as an array of bytes and only access to the first one.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-16-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix endianness of the field 'status'
Jérôme Pouiller [Tue, 12 May 2020 15:04:11 +0000 (17:04 +0200)]
staging: wfx: fix endianness of the field 'status'

The field 'status' appears in most of structs returned by the hardware.
This field is encoded as little endian. Sparse complains this field is
not always correctly accessed:

    drivers/staging/wfx/data_rx.c:53:16: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/data_rx.c:84:16: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/data_tx.c:526:24: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/data_tx.c:569:23: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/hif_rx.c:128:33: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/./traces.h:401:1: warning: restricted __le32 degrades to integer
    drivers/staging/wfx/./traces.h:401:1: warning: restricted __le32 degrades to integer

In most of cases, this field is only compared with HIF_STATUS values.
Finally, it is more convenient to solve the problem by defining the
HIF_STATUS values directly in little endian.

It is also the right time to make some clean up in the HIF_STATUS names.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-15-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix access to le32 attribute 'len'
Jérôme Pouiller [Tue, 12 May 2020 15:04:10 +0000 (17:04 +0200)]
staging: wfx: fix access to le32 attribute 'len'

Sparse complains about the accesses to the field 'len' from struct hif_msg:

    drivers/staging/wfx/bh.c:88:32: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/bh.c:88:32: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/bh.c:93:32: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/bh.c:93:32: warning: cast to restricted __le16
    drivers/staging/wfx/bh.c:93:32: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/bh.c:121:25: warning: incorrect type in argument 2 (different base types)
    drivers/staging/wfx/bh.c:121:25:    expected unsigned int len
    drivers/staging/wfx/bh.c:121:25:    got restricted __le16 [usertype] len
    drivers/staging/wfx/hif_rx.c:27:22: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/hif_rx.c:347:39: warning: incorrect type in argument 7 (different base types)
    drivers/staging/wfx/hif_rx.c:347:39:    expected unsigned int [usertype] len
    drivers/staging/wfx/hif_rx.c:347:39:    got restricted __le16 const [usertype] len
    drivers/staging/wfx/hif_rx.c:365:39: warning: incorrect type in argument 7 (different base types)
    drivers/staging/wfx/hif_rx.c:365:39:    expected unsigned int [usertype] len
    drivers/staging/wfx/hif_rx.c:365:39:    got restricted __le16 const [usertype] len
    drivers/staging/wfx/./traces.h:195:1: warning: incorrect type in assignment (different base types)
    drivers/staging/wfx/./traces.h:195:1:    expected int msg_len
    drivers/staging/wfx/./traces.h:195:1:    got restricted __le16 const [usertype] len
    drivers/staging/wfx/./traces.h:195:1: warning: incorrect type in assignment (different base types)
    drivers/staging/wfx/./traces.h:195:1:    expected int msg_len
    drivers/staging/wfx/./traces.h:195:1:    got restricted __le16 const [usertype] len
    drivers/staging/wfx/debug.c:319:20: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/secure_link.c:85:27: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/secure_link.c:85:27: warning: restricted __le16 degrades to integer

Indeed, the attribute len is little-endian. We have to take to the
endianness when we access it.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-14-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix endianness of the struct hif_ind_startup
Jérôme Pouiller [Tue, 12 May 2020 15:04:09 +0000 (17:04 +0200)]
staging: wfx: fix endianness of the struct hif_ind_startup

The struct hif_ind_startup is received from the hardware. So it is
declared as little endian. However, it is also stored in the main driver
structure and used on different places in the driver. Sparse complains
about that:

    drivers/staging/wfx/data_tx.c:388:43: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/bh.c:199:9: warning: restricted __le16 degrades to integer
    drivers/staging/wfx/bh.c:221:62: warning: restricted __le16 degrades to integer

In order to make Sparse happy and to keep access from the driver easy,
this patch declare hif_ind_startup with native endianness.

On reception of this struct, this patch takes care to do byte-swap and
keep Sparse happy.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-13-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: declare the field 'packet_id' with native byte order
Jérôme Pouiller [Tue, 12 May 2020 15:04:08 +0000 (17:04 +0200)]
staging: wfx: declare the field 'packet_id' with native byte order

The field packet_id is not interpreted by the device. It is only used as
identifier for the device answer. So it is not necessary to declare it
little endian. It fixes some warnings raised by Sparse without
complexifying the code.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-12-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix access to le32 attribute 'indication_type'
Jérôme Pouiller [Tue, 12 May 2020 15:04:07 +0000 (17:04 +0200)]
staging: wfx: fix access to le32 attribute 'indication_type'

The attribute indication_type is little-endian. We have to take to the
endianness when we access it.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-11-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix access to le32 attribute 'event_id'
Jérôme Pouiller [Tue, 12 May 2020 15:04:06 +0000 (17:04 +0200)]
staging: wfx: fix access to le32 attribute 'event_id'

The attribute event_id is little-endian. We have to take to the
endianness when we access it.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-10-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix access to le32 attribute 'ps_mode_error'
Jérôme Pouiller [Tue, 12 May 2020 15:04:05 +0000 (17:04 +0200)]
staging: wfx: fix access to le32 attribute 'ps_mode_error'

The attribute ps_mode_error is little-endian. We have to take to the
endianness when we access it.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-9-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix endianness of hif_req_read_mib fields
Jérôme Pouiller [Tue, 12 May 2020 15:04:04 +0000 (17:04 +0200)]
staging: wfx: fix endianness of hif_req_read_mib fields

The structs hif_{req,cnf}_read_mib contain only little endian values.
Thus, it is necessary to fix byte ordering before to use them.
Especially, sparse detected wrong accesses to fields mib_id and length.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-8-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix endianness of fields media_delay and tx_queue_delay
Jérôme Pouiller [Tue, 12 May 2020 15:04:03 +0000 (17:04 +0200)]
staging: wfx: fix endianness of fields media_delay and tx_queue_delay

The struct hif_cnf_tx contains only little endian values. Thus, it is
necessary to fix byte ordering before to use them. Especially, sparse
detected wrong access to fields media_delay and tx_queue_delay.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-7-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix output of rx_stats on big endian hosts
Jérôme Pouiller [Tue, 12 May 2020 15:04:02 +0000 (17:04 +0200)]
staging: wfx: fix output of rx_stats on big endian hosts

The struct hif_rx_stats contains only little endian values. Thus, it is
necessary to fix byte ordering before to use them.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-6-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix wrong bytes order
Jérôme Pouiller [Tue, 12 May 2020 15:04:01 +0000 (17:04 +0200)]
staging: wfx: fix wrong bytes order

The field wakeup_period_max from struct hif_mib_beacon_wake_up_period is
a u8. So, assigning it a __le16 produces a nasty bug on big-endian
architectures.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-5-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix cast operator
Jérôme Pouiller [Tue, 12 May 2020 15:04:00 +0000 (17:04 +0200)]
staging: wfx: fix cast operator

Sparse detects that le16_to_cpup() expects a __le16 * as argument.

Change the cast operator to be compliant with sparse.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-4-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: take advantage of le32_to_cpup()
Jérôme Pouiller [Tue, 12 May 2020 15:03:59 +0000 (17:03 +0200)]
staging: wfx: take advantage of le32_to_cpup()

le32_to_cpu(*x) can be advantageously converted in le32_to_cpup(x).

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-3-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix use of cpu_to_le32 instead of le32_to_cpu
Jérôme Pouiller [Tue, 12 May 2020 15:03:58 +0000 (17:03 +0200)]
staging: wfx: fix use of cpu_to_le32 instead of le32_to_cpu

Sparse detected that le32_to_cpu should be used instead of cpu_to_le32.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200512150414.267198-2-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: use kernel types instead of c99 ones
Jérôme Pouiller [Tue, 5 May 2020 12:37:57 +0000 (14:37 +0200)]
staging: wfx: use kernel types instead of c99 ones

The kernel coding style promotes the use of kernel types (u8, u16, u32,
etc...) instead of the C99 ones.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-16-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: remove spaces after cast operator
Jérôme Pouiller [Tue, 5 May 2020 12:37:56 +0000 (14:37 +0200)]
staging: wfx: remove spaces after cast operator

The kernel coding style expects no space after cast operator. This patch
make the wfx driver compliant with this rule.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-15-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix alignements of function prototypes
Jérôme Pouiller [Tue, 5 May 2020 12:37:55 +0000 (14:37 +0200)]
staging: wfx: fix alignements of function prototypes

Some function prototypes were not correctly aligned and/or exceed 80
columns.

In some other cases, the prototypes were written on more lines than
necessary.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-14-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: remove useless header inclusions
Jérôme Pouiller [Tue, 5 May 2020 12:37:54 +0000 (14:37 +0200)]
staging: wfx: remove useless header inclusions

In order to keep the compilation times reasonable, we try to only
include the necessary headers (especially header included from other
headers).

This patch clean up unnecessary headers inclusions.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-13-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: prefer ARRAY_SIZE instead of a magic number
Jérôme Pouiller [Tue, 5 May 2020 12:37:53 +0000 (14:37 +0200)]
staging: wfx: prefer ARRAY_SIZE instead of a magic number

When possible, we prefer to use the macro ARRAY_SIZE rather than hard
coding the number of elements.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-12-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix missing 'static' keyword
Jérôme Pouiller [Tue, 5 May 2020 12:37:52 +0000 (14:37 +0200)]
staging: wfx: fix missing 'static' keyword

Sparse tool noticed that wfx_enable_beacon() is never used outside of
sta.c. Therefore, it can be declared static.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-11-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix missing 'static' statement
Jérôme Pouiller [Tue, 5 May 2020 12:37:51 +0000 (14:37 +0200)]
staging: wfx: fix missing 'static' statement

The function get_firmware() is only used from fwio.c. It can be declared
static.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-10-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: poll IRQ during init
Jérôme Pouiller [Tue, 5 May 2020 12:37:50 +0000 (14:37 +0200)]
staging: wfx: poll IRQ during init

When the chip starts in SDIO mode, the external IRQ (aka Out-Of-Band
IRQ) cannot be used before to configure it. Therefore, the first
exchanges with the chip have to be done without the OOB IRQ.

This patch allow to poll the data until the OOB IRQ is correctly setup.
In order to keep the code simpler, this patch also poll data even if OOB
IRQ is not used.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-9-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: introduce a way to poll IRQ
Jérôme Pouiller [Tue, 5 May 2020 12:37:49 +0000 (14:37 +0200)]
staging: wfx: introduce a way to poll IRQ

It is possible to check if an IRQ is ending by polling the control
register. This function must used with care: if an IRQ fires while the
host reads control register, the IRQ can be lost. However, it could be
useful in some cases.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-8-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: use threaded IRQ with SPI
Jérôme Pouiller [Tue, 5 May 2020 12:37:48 +0000 (14:37 +0200)]
staging: wfx: use threaded IRQ with SPI

Currently, the SPI implementation use a workqueue to acknowledge IRQ
while the SDIO-OOB implementation use a threaded IRQ.

The threaded also offers the advantage to allow level triggered IRQs.

Uniformize the code and use threaded IRQ in both case. Therefore, prefer
level triggered IRQs if the user does not specify it in the DT.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-7-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: repair external IRQ for SDIO
Jérôme Pouiller [Tue, 5 May 2020 12:37:47 +0000 (14:37 +0200)]
staging: wfx: repair external IRQ for SDIO

When used over SDIO bus, device is able to use an external line to
signal IRQs (also called Out-Of-Band IRQ). The current code have several
problems:
  1. The ISR cannot directly acknowledge IRQ since access to the bus is
     not atomic. This patch use a threaded IRQ to solve that issue.
  2. On certain platforms, it is necessary to keep SDIO interruption
     enabled (with register SDIO_CCCR_IENx) (this part has inspired from
     the brcmfmac driver).

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-6-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: drop useless check
Jérôme Pouiller [Tue, 5 May 2020 12:37:46 +0000 (14:37 +0200)]
staging: wfx: drop useless check

Currently, the ISR check if bus->core is not NULL. But, it is a useless
check. bus->core is initialiased before to request IRQ and it is not
assigned to NULL when it is released.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-5-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: fix double free
Jérôme Pouiller [Tue, 5 May 2020 12:37:45 +0000 (14:37 +0200)]
staging: wfx: fix double free

In case of error in wfx_probe(), wdev->hw is freed. Since an error
occurred, wfx_free_common() is called, then wdev->hw is freed again.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Reviewed-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Fixes: 4033714d6cbe ("staging: wfx: fix init/remove vs IRQ race")
Link: https://lore.kernel.org/r/20200505123757.39506-4-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: reduce timeout for chip initial start up
Jérôme Pouiller [Tue, 5 May 2020 12:37:44 +0000 (14:37 +0200)]
staging: wfx: reduce timeout for chip initial start up

The device take a few hundreds of milliseconds to start. However, the
current code wait up to 10 second for the chip. We can safely reduce
this value to 1 second. Thanks to that change, it is no more necessary
to use an interruptible timeout.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-3-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agostaging: wfx: add support for hardware revision 2 and further
Jérôme Pouiller [Tue, 5 May 2020 12:37:43 +0000 (14:37 +0200)]
staging: wfx: add support for hardware revision 2 and further

Currently, the driver explicitly exclude support for chip with version
number it does not know. However, it unlikely that any futur hardware
change would break the driver. Therefore, we prefer to invert the test
and only exclude the versions we know the driver does not support.

Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200505123757.39506-2-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoMerge 5.7-rc5 into staging-next
Greg Kroah-Hartman [Mon, 11 May 2020 06:57:22 +0000 (08:57 +0200)]
Merge 5.7-rc5 into staging-next

We need the staging fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoLinux 5.7-rc5
Linus Torvalds [Sun, 10 May 2020 22:16:58 +0000 (15:16 -0700)]
Linux 5.7-rc5

4 years agoMerge tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 10 May 2020 18:59:53 +0000 (11:59 -0700)]
Merge tag 'x86-urgent-2020-05-10' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Ensure that direct mapping alias is always flushed when changing
     page attributes. The optimization for small ranges failed to do so
     when the virtual address was in the vmalloc or module space.

   - Unbreak the trace event registration for syscalls without arguments
     caused by the refactoring of the SYSCALL_DEFINE0() macro.

   - Move the printk in the TSC deadline timer code to a place where it
     is guaranteed to only be called once during boot and cannot be
     rearmed by clearing warn_once after boot. If it's invoked post boot
     then lockdep rightfully complains about a potential deadlock as the
     calling context is different.

   - A series of fixes for objtool and the ORC unwinder addressing
     variety of small issues:

       - Stack offset tracking for indirect CFAs in objtool ignored
         subsequent pushs and pops

       - Repair the unwind hints in the register clearing entry ASM code

       - Make the unwinding in the low level exit to usermode code stop
         after switching to the trampoline stack. The unwind hint is no
         longer valid and the ORC unwinder emits a warning as it can't
         find the registers anymore.

       - Fix unwind hints in switch_to_asm() and rewind_stack_do_exit()
         which caused objtool to generate bogus ORC data.

       - Prevent unwinder warnings when dumping the stack of a
         non-current task as there is no way to be sure about the
         validity because the dumped stack can be a moving target.

       - Make the ORC unwinder behave the same way as the frame pointer
         unwinder when dumping an inactive tasks stack and do not skip
         the first frame.

       - Prevent ORC unwinding before ORC data has been initialized

       - Immediately terminate unwinding when a unknown ORC entry type
         is found.

       - Prevent premature stop of the unwinder caused by IRET frames.

       - Fix another infinite loop in objtool caused by a negative
         offset which was not catched.

       - Address a few build warnings in the ORC unwinder and add
         missing static/ro_after_init annotations"

* tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind/orc: Move ORC sorting variables under !CONFIG_MODULES
  x86/apic: Move TSC deadline timer debug printk
  ftrace/x86: Fix trace event registration for syscalls without arguments
  x86/mm/cpa: Flush direct map alias during cpa
  objtool: Fix infinite loop in for_offset_range()
  x86/unwind/orc: Fix premature unwind stoppage due to IRET frames
  x86/unwind/orc: Fix error path for bad ORC entry type
  x86/unwind/orc: Prevent unwinding before ORC initialization
  x86/unwind/orc: Don't skip the first frame for inactive tasks
  x86/unwind: Prevent false warnings for non-current tasks
  x86/unwind/orc: Convert global variables to static
  x86/entry/64: Fix unwind hints in rewind_stack_do_exit()
  x86/entry/64: Fix unwind hints in __switch_to_asm()
  x86/entry/64: Fix unwind hints in kernel exit path
  x86/entry/64: Fix unwind hints in register clearing code
  objtool: Fix stack offset tracking for indirect CFAs

4 years agoMerge tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 10 May 2020 18:42:14 +0000 (11:42 -0700)]
Merge tag 'objtool-urgent-2020-05-10' of git://git./linux/kernel/git/tip/tip

Pull objtool fix from Thomas Gleixner:
 "A single fix for objtool to prevent an infinite loop in the
  jump table search which can be triggered when building the
  kernel with '-ffunction-sections'"

* tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix infinite loop in find_jump_table()

4 years agoMerge tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 10 May 2020 18:39:31 +0000 (11:39 -0700)]
Merge tag 'locking-urgent-2020-05-10' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Thomas Gleixner:
 "A single fix for the fallout of the recent futex uacess rework.

  With those changes GCC9 fails to analyze arch_futex_atomic_op_inuser()
  correctly and emits a 'maybe unitialized' warning. While we usually
  ignore compiler stupidity the conditional store is pointless anyway
  because the correct case has to store. For the fault case the extra
  store does no harm"

* tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM: futex: Address build warning

4 years agoMerge tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 10 May 2020 18:26:23 +0000 (11:26 -0700)]
Merge tag 'iommu-fixes-v5.7-rc4' of git://git./linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Race condition fixes for the AMD IOMMU driver.

   These are five patches fixing two race conditions around
   increase_address_space(). The first race condition was around the
   non-atomic update of the domain page-table root pointer and the
   variable containing the page-table depth (called mode). This is fixed
   now be merging page-table root and mode into one 64-bit field which
   is read/written atomically.

   The second race condition was around updating the page-table root
   pointer and making it public before the hardware caches were flushed.
   This could cause addresses to be mapped and returned to drivers which
   are not reachable by IOMMU hardware yet, causing IO page-faults. This
   is fixed too by adding the necessary flushes before a new page-table
   root is published.

   Related to the race condition fixes these patches also add a missing
   domain_flush_complete() barrier to update_domain() and a fix to bail
   out of the loop which tries to increase the address space when the
   call to increase_address_space() fails.

   Qian was able to trigger the race conditions under high load and
   memory pressure within a few days of testing. He confirmed that he
   has seen no issues anymore with the fixes included here.

 - Fix for a list-handling bug in the VirtIO IOMMU driver.

* tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/virtio: Reverse arguments to list_add
  iommu/amd: Do not flush Device Table in iommu_map_page()
  iommu/amd: Update Device Table in increase_address_space()
  iommu/amd: Call domain_flush_complete() in update_domain()
  iommu/amd: Do not loop forever when trying to increase address space
  iommu/amd: Fix race in increase_address_space()/fetch_pte()

4 years agoMerge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 10 May 2020 18:16:07 +0000 (11:16 -0700)]
Merge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - a small series fixing a use-after-free of bdi name (Christoph,Yufen)

 - NVMe fix for a regression with the smaller CQ update (Alexey)

 - NVMe fix for a hang at namespace scanning error recovery (Sagi)

 - fix race with blk-iocost iocg->abs_vdebt updates (Tejun)

* tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block:
  nvme: fix possible hang when ns scanning fails during error recovery
  nvme-pci: fix "slimmer CQ head update"
  bdi: add a ->dev_name field to struct backing_dev_info
  bdi: use bdi_dev_name() to get device name
  bdi: move bdi_dev_name out of line
  vboxsf: don't use the source name in the bdi name
  iocost: protect iocg->abs_vdebt with iocg->waitq.lock

4 years agogcc-10: mark more functions __init to avoid section mismatch warnings
Linus Torvalds [Sun, 10 May 2020 00:50:03 +0000 (17:50 -0700)]
gcc-10: mark more functions __init to avoid section mismatch warnings

It seems that for whatever reason, gcc-10 ends up not inlining a couple
of functions that used to be inlined before.  Even if they only have one
single callsite - it looks like gcc may have decided that the code was
unlikely, and not worth inlining.

The code generation difference is harmless, but caused a few new section
mismatch errors, since the (now no longer inlined) function wasn't in
the __init section, but called other init functions:

   Section mismatch in reference from the function kexec_free_initrd() to the function .init.text:free_initrd_mem()
   Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memremap()
   Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memunmap()

So add the appropriate __init annotation to make modpost not complain.
In both cases there were trivially just a single callsite from another
__init function.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 9 May 2020 23:24:16 +0000 (16:24 -0700)]
Merge tag 'riscv-for-linus-5.7-rc5' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A smattering of fixes and cleanups:

   - Dead code removal.

   - Exporting riscv_cpuid_to_hartid_mask for modules.

   - Per-CPU tracking of ISA features.

   - Setting max_pfn correctly when probing memory.

   - Adding a note to the VDSO so glibc can check the kernel's version
     without a uname().

   - A fix to force the bootloader to initialize the boot spin tables,
     which still get used as a fallback when SBI-0.1 is enabled"

* tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Remove unused code from STRICT_KERNEL_RWX
  riscv: force __cpu_up_ variables to put in data section
  riscv: add Linux note to vdso
  riscv: set max_pfn to the PFN of the last page
  RISC-V: Remove N-extension related defines
  RISC-V: Add bitmap reprensenting ISA features common across CPUs
  RISC-V: Export riscv_cpuid_to_hartid_mask() API

4 years agogcc-10: avoid shadowing standard library 'free()' in crypto
Linus Torvalds [Sat, 9 May 2020 22:58:04 +0000 (15:58 -0700)]
gcc-10: avoid shadowing standard library 'free()' in crypto

gcc-10 has started warning about conflicting types for a few new
built-in functions, particularly 'free()'.

This results in warnings like:

   crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch]

because the crypto layer had its local freeing functions called
'free()'.

Gcc-10 is in the wrong here, since that function is marked 'static', and
thus there is no chance of confusion with any standard library function
namespace.

But the simplest thing to do is to just use a different name here, and
avoid this gcc mis-feature.

[ Side note: gcc knowing about 'free()' is in itself not the
  mis-feature: the semantics of 'free()' are special enough that a
  compiler can validly do special things when seeing it.

  So the mis-feature here is that gcc thinks that 'free()' is some
  restricted name, and you can't shadow it as a local static function.

  Making the special 'free()' semantics be a function attribute rather
  than tied to the name would be the much better model ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agogcc-10: disable 'restrict' warning for now
Linus Torvalds [Sat, 9 May 2020 22:45:21 +0000 (15:45 -0700)]
gcc-10: disable 'restrict' warning for now

gcc-10 now warns about passing aliasing pointers to functions that take
restricted pointers.

That's actually a great warning, and if we ever start using 'restrict'
in the kernel, it might be quite useful.  But right now we don't, and it
turns out that the only thing this warns about is an idiom where we have
declared a few functions to be "printf-like" (which seems to make gcc
pick up the restricted pointer thing), and then we print to the same
buffer that we also use as an input.

And people do that as an odd concatenation pattern, with code like this:

    #define sysfs_show_gen_prop(buffer, fmt, ...) \
        snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__)

where we have 'buffer' as both the destination of the final result, and
as the initial argument.

Yes, it's a bit questionable.  And outside of the kernel, people do have
standard declarations like

    int snprintf( char *restrict buffer, size_t bufsz,
                  const char *restrict format, ... );

where that output buffer is marked as a restrict pointer that cannot
alias with any other arguments.

But in the context of the kernel, that 'use snprintf() to concatenate to
the end result' does work, and the pattern shows up in multiple places.
And we have not marked our own version of snprintf() as taking restrict
pointers, so the warning is incorrect for now, and gcc picks it up on
its own.

If we do start using 'restrict' in the kernel (and it might be a good
idea if people find places where it matters), we'll need to figure out
how to avoid this issue for snprintf and friends.  But in the meantime,
this warning is not useful.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agogcc-10: disable 'stringop-overflow' warning for now
Linus Torvalds [Sat, 9 May 2020 22:40:52 +0000 (15:40 -0700)]
gcc-10: disable 'stringop-overflow' warning for now

This is the final array bounds warning removal for gcc-10 for now.

Again, the warning is good, and we should re-enable all these warnings
when we have converted all the legacy array declaration cases to
flexible arrays. But in the meantime, it's just noise.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agonvme: fix possible hang when ns scanning fails during error recovery
Sagi Grimberg [Wed, 6 May 2020 22:44:02 +0000 (15:44 -0700)]
nvme: fix possible hang when ns scanning fails during error recovery

When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da2434301).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns identify descriptor list optional, we continue
to compare ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agonvme-pci: fix "slimmer CQ head update"
Alexey Dobriyan [Thu, 7 May 2020 20:07:04 +0000 (23:07 +0300)]
nvme-pci: fix "slimmer CQ head update"

Pre-incrementing ->cq_head can't be done in memory because OOB value
can be observed by another context.

This devalues space savings compared to original code :-\

$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-32 (-32)
Function                                     old     new   delta
nvme_poll_irqdisable                         464     456      -8
nvme_poll                                    455     447      -8
nvme_irq                                     388     380      -8
nvme_dev_disable                             955     947      -8

But the code is minimal now: one read for head, one read for q_depth,
one increment, one comparison, single instruction phase bit update and
one write for new head.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: John Garry <john.garry@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Fixes: e2a366a4b0feaeb ("nvme-pci: slimmer CQ head update")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobdi: add a ->dev_name field to struct backing_dev_info
Christoph Hellwig [Mon, 4 May 2020 12:47:56 +0000 (14:47 +0200)]
bdi: add a ->dev_name field to struct backing_dev_info

Cache a copy of the name for the life time of the backing_dev_info
structure so that we can reference it even after unregistering.

Fixes: 68f23b89067f ("memcg: fix a crash in wb_workfn when a device disappears")
Reported-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobdi: use bdi_dev_name() to get device name
Yufen Yu [Mon, 4 May 2020 12:47:55 +0000 (14:47 +0200)]
bdi: use bdi_dev_name() to get device name

Use the common interface bdi_dev_name() to get device name.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Add missing <linux/backing-dev.h> include BFQ

Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agogcc-10: disable 'array-bounds' warning for now
Linus Torvalds [Sat, 9 May 2020 21:52:44 +0000 (14:52 -0700)]
gcc-10: disable 'array-bounds' warning for now

This is another fine warning, related to the 'zero-length-bounds' one,
but hitting the same historical code in the kernel.

Because C didn't historically support flexible array members, we have
code that instead uses a one-sized array, the same way we have cases of
zero-sized arrays.

The one-sized arrays come from either not wanting to use the gcc
zero-sized array extension, or from a slight convenience-feature, where
particularly for strings, the size of the structure now includes the
allocation for the final NUL character.

So with a "char name[1];" at the end of a structure, you can do things
like

       v = my_malloc(sizeof(struct vendor) + strlen(name));

and avoid the "+1" for the terminator.

Yes, the modern way to do that is with a flexible array, and using
'offsetof()' instead of 'sizeof()', and adding the "+1" by hand.  That
also technically gets the size "more correct" in that it avoids any
alignment (and thus padding) issues, but this is another long-term
cleanup thing that will not happen for 5.7.

So disable the warning for now, even though it's potentially quite
useful.  Having a slew of warnings that then hide more urgent new issues
is not an improvement.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agogcc-10: disable 'zero-length-bounds' warning for now
Linus Torvalds [Sat, 9 May 2020 21:30:29 +0000 (14:30 -0700)]
gcc-10: disable 'zero-length-bounds' warning for now

This is a fine warning, but we still have a number of zero-length arrays
in the kernel that come from the traditional gcc extension.  Yes, they
are getting converted to flexible arrays, but in the meantime the gcc-10
warning about zero-length bounds is very verbose, and is hiding other
issues.

I missed one actual build failure because it was hidden among hundreds
of lines of warning.  Thankfully I caught it on the second go before
pushing things out, but it convinced me that I really need to disable
the new warnings for now.

We'll hopefully be all done with our conversion to flexible arrays in
the not too distant future, and we can then re-enable this warning.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoStop the ad-hoc games with -Wno-maybe-initialized
Linus Torvalds [Sat, 9 May 2020 20:57:10 +0000 (13:57 -0700)]
Stop the ad-hoc games with -Wno-maybe-initialized

We have some rather random rules about when we accept the
"maybe-initialized" warnings, and when we don't.

For example, we consider it unreliable for gcc versions < 4.9, but also
if -O3 is enabled, or if optimizing for size.  And then various kernel
config options disabled it, because they know that they trigger that
warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).

And now gcc-10 seems to be introducing a lot of those warnings too, so
it falls under the same heading as 4.9 did.

At the same time, we have a very straightforward way to _enable_ that
warning when wanted: use "W=2" to enable more warnings.

So stop playing these ad-hoc games, and just disable that warning by
default, with the known and straight-forward "if you want to work on the
extra compiler warnings, use W=123".

Would it be great to have code that is always so obvious that it never
confuses the compiler whether a variable is used initialized or not?
Yes, it would.  In a perfect world, the compilers would be smarter, and
our source code would be simpler.

That's currently not the world we live in, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 9 May 2020 19:02:09 +0000 (12:02 -0700)]
Merge tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - Fix finish_wait() balancing in file cancelation (Xiaoguang)

 - Ensure early cleanup of resources in ring map failure (Xiaoguang)

 - Ensure IORING_OP_SLICE does the right file mode checks (Pavel)

 - Remove file opening from openat/openat2/statx, it's not needed and
   messes with O_PATH

* tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block:
  io_uring: don't use 'fd' for openat/openat2/statx
  splice: move f_mode checks to do_{splice,tee}()
  io_uring: handle -EFAULT properly in io_uring_setup()
  io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()

4 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Fri, 8 May 2020 17:36:56 +0000 (10:36 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four minor fixes, all in drivers (qla2xxx, ibmvfc, ibmvscsi)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ibmvscsi: Fix WARN_ON during event pool release
  scsi: ibmvfc: Don't send implicit logouts prior to NPIV login
  scsi: qla2xxx: Delete all sessions before unregister local nvme port
  scsi: qla2xxx: Fix hang when issuing nvme disconnect-all in NPIV

4 years agoMerge tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 8 May 2020 17:27:00 +0000 (10:27 -0700)]
Merge tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Fixes for an endianness handling bug that prevented mounts on
  big-endian arches, a spammy log message and a couple error paths.

  Also included a MAINTAINERS update"

* tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client:
  ceph: demote quotarealm lookup warning to a debug message
  MAINTAINERS: remove myself as ceph co-maintainer
  ceph: fix double unlock in handle_cap_export()
  ceph: fix special error code in ceph_try_get_caps()
  ceph: fix endianness bug when handling MDS session feature bits

4 years agoceph: demote quotarealm lookup warning to a debug message
Luis Henriques [Tue, 5 May 2020 12:59:02 +0000 (13:59 +0100)]
ceph: demote quotarealm lookup warning to a debug message

A misconfigured cephx can easily result in having the kernel client
flooding the logs with:

  ceph: Can't lookup inode 1 (err: -13)

Change this message to debug level.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/44546
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
4 years agoMerge tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 8 May 2020 16:11:53 +0000 (09:11 -0700)]
Merge tag 'char-misc-5.7-rc5' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 5.7-rc5 that resolve a number of
  minor reported issues:

   - mhi bus driver fixes found as people actually use the code

   - phy driver fixes and compat string additions

   - most driver fix due to link order changing when the core moved out
     of staging

   - mei driver fix

   - interconnect build warning fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  bus: mhi: core: Fix channel device name conflict
  bus: mhi: core: Fix typo in comment
  bus: mhi: core: Offload register accesses to the controller
  bus: mhi: core: Remove link_status() callback
  bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails
  bus: mhi: Fix parsing of mhi_flags
  mei: me: disable mei interface on LBG servers.
  phy: qualcomm: usb-hs-28nm: Prepare clocks in init
  MAINTAINERS: Add Vinod Koul as Generic PHY co-maintainer
  interconnect: qcom: Move the static keyword to the front of declaration
  most: core: use function subsys_initcall()
  bus: mhi: core: Fix a NULL vs IS_ERR check in mhi_create_devices()
  phy: qcom-qusb2: Re add "qcom,sdm845-qusb2-phy" compat string
  phy: tegra: Select USB_COMMON for usb_get_maximum_speed()

4 years agoMerge tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 May 2020 16:06:34 +0000 (09:06 -0700)]
Merge tag 'driver-core-5.7-rc5' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are a number of small driver core fixes for 5.7-rc5 to resolve a
  bunch of reported issues with the current tree.

  Biggest here are the reverts and patches from John Stultz to resolve a
  bunch of deferred probe regressions we have been seeing in 5.7-rc
  right now.

  Along with those are some other smaller fixes:

   - coredump crash fix

   - devlink fix for when permissive mode was enabled

   - amba and platform device dma_parms fixes

   - component error silenced for when deferred probe happens

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  regulator: Revert "Use driver_deferred_probe_timeout for regulator_init_complete_work"
  driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires
  driver core: Use dev_warn() instead of dev_WARN() for deferred_probe_timeout warnings
  driver core: Revert default driver_deferred_probe_timeout value to 0
  component: Silence bind error on -EPROBE_DEFER
  driver core: Fix handling of fw_devlink=permissive
  coredump: fix crash when umh is disabled
  amba: Initialize dma_parms for amba devices
  driver core: platform: Initialize dma_parms for platform devices

4 years agoMerge tag 'staging-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 8 May 2020 16:03:49 +0000 (09:03 -0700)]
Merge tag 'staging-5.7-rc5' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are three small driver fixes for 5.7-rc5.

  Two of these are documentation fixes:

   - MAINTAINERS update due to removed driver

   - removing Wolfram from the ks7010 driver TODO file

  The other patch is a real fix:

   - fix gasket driver to proper check the return value of a call

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: gasket: Check the return value of gasket_get_bar_index()
  staging: ks7010: remove me from CC list
  MAINTAINERS: remove entry after hp100 driver removal

4 years agoMerge tag 'tty-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Fri, 8 May 2020 15:56:16 +0000 (08:56 -0700)]
Merge tag 'tty-5.7-rc5' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are three small TTY/Serial/VT fixes for 5.7-rc5:

   - revert for the bcm63xx driver "fix" that was incorrect

   - vt unicode console bugfix

   - xilinx_uartps console driver fix

  All of these have been in linux next with no reported issues"

* tag 'tty-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: xilinx_uartps: Fix missing id assignment to the console
  vt: fix unicode console freeing with a common interface
  Revert "tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart"

4 years agoMerge tag 'usb-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 8 May 2020 15:54:00 +0000 (08:54 -0700)]
Merge tag 'usb-5.7-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 5.7-rc5 to resolve some reported
  issues:

   - syzbot found problems fixed

   - usbfs dma mapping fix

   - typec bugfixs

   - chipidea bugfix

   - usb4/thunderbolt fix

   - new device ids/quirks

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: chipidea: msm: Ensure proper controller reset using role switch API
  usb: typec: mux: intel: Handle alt mode HPD_HIGH
  usb: usbfs: correct kernel->user page attribute mismatch
  usb: typec: intel_pmc_mux: Fix the property names
  USB: core: Fix misleading driver bug report
  USB: serial: qcserial: Add DW5816e support
  USB: uas: add quirk for LaCie 2Big Quadra
  thunderbolt: Check return value of tb_sw_read() in usb4_switch_op()
  USB: serial: garmin_gps: add sanity checking for data length

4 years agoMerge tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 8 May 2020 15:49:34 +0000 (08:49 -0700)]
Merge tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Another pretty normal week. I didn't get any i915 fixes yet, so next
  week I'd expect double the usual i915, but otherwise a bunch of amdgpu
  and some scattered other fixes.

  hdcp:
   - fix HDCP regression

  amdgpu:
   - Runtime PM fixes
   - DC fix for PPC
   - Misc DC fixes

  virtio:
   - fix context ordering issue

  sun4i:
   - old gcc warning fix

  ingenic-drm:
   - missing module support"

* tag 'drm-fixes-2020-05-08' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Prevent dpcd reads with passive dongles
  drm/amd/display: fix counter in wait_for_no_pipes_pending
  drm/amd/display: Update DCN2.1 DV Code Revision
  drm: Fix HDCP failures when SRM fw is missing
  sun6i: dsi: fix gcc-4.8
  drm: ingenic-drm: add MODULE_DEVICE_TABLE
  drm/virtio: create context before RESOURCE_CREATE_2D in 3D mode
  drm/amd/display: work around fp code being emitted outside of DC_FP_START/END
  drm/amdgpu/dc: Use WARN_ON_ONCE for ASSERT
  drm/amdgpu: drop redundant cg/pg ungate on runpm enter
  drm/amdgpu: move kfd suspend after ip_suspend_phase1

4 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 8 May 2020 15:41:09 +0000 (08:41 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "14 fixes and one selftest to verify the ipc fixes herein"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: limit boost_watermark on small zones
  ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST
  mm/vmscan: remove unnecessary argument description of isolate_lru_pages()
  epoll: atomically remove wait entry on wake up
  kselftests: introduce new epoll60 testcase for catching lost wakeups
  percpu: make pcpu_alloc() aware of current gfp context
  mm/slub: fix incorrect interpretation of s->offset
  scripts/gdb: repair rb_first() and rb_last()
  eventpoll: fix missing wakeup for ovflist in ep_poll_callback
  arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()
  scripts/decodecode: fix trapping instruction formatting
  kernel/kcov.c: fix typos in kcov_remote_start documentation
  mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
  mm, memcg: fix error return value of mem_cgroup_css_alloc()
  ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()

4 years agoiommu/virtio: Reverse arguments to list_add
Julia Lawall [Tue, 5 May 2020 18:47:47 +0000 (20:47 +0200)]
iommu/virtio: Reverse arguments to list_add

Elsewhere in the file, there is a list_for_each_entry with
&vdev->resv_regions as the second argument, suggesting that
&vdev->resv_regions is the list head.  So exchange the
arguments on the list_add call to put the list head in the
second argument.

Fixes: 2a5a31487445 ("iommu/virtio: Add probe request")
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/1588704467-13431-1-git-send-email-Julia.Lawall@inria.fr
Signed-off-by: Joerg Roedel <jroedel@suse.de>
4 years agoMerge tag 'drm-misc-fixes-2020-05-07' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 8 May 2020 05:02:49 +0000 (15:02 +1000)]
Merge tag 'drm-misc-fixes-2020-05-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

A few minor fixes for an ordering issue in virtio, an (old) gcc warning
in sun4i, a probe issue in ingenic-drm and a regression in the HDCP
support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507160130.id64niqgf5wsha4u@gilmour.lan
4 years agoMerge tag 'amd-drm-fixes-5.7-2020-05-06' of git://people.freedesktop.org/~agd5f/linux...
Dave Airlie [Fri, 8 May 2020 03:31:38 +0000 (13:31 +1000)]
Merge tag 'amd-drm-fixes-5.7-2020-05-06' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

amd-drm-fixes-5.7-2020-05-06:

amdgpu:
- Runtime PM fixes
- DC fix for PPC
- Misc DC fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506212257.3893-1-alexander.deucher@amd.com
4 years agoMerge branch 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Fri, 8 May 2020 02:43:13 +0000 (19:43 -0700)]
Merge branch 'for-v5.7' of git://git./linux/kernel/git/jmorris/linux-security

Pull security subsystem fix from James Morris:
 "Fix the default value of fs_context_parse_param hook"

* 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: Fix the default value of fs_context_parse_param hook

4 years agomm: limit boost_watermark on small zones
Henry Willard [Fri, 8 May 2020 01:36:27 +0000 (18:36 -0700)]
mm: limit boost_watermark on small zones

Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an
external fragmentation event occurs") adds a boost_watermark() function
which increases the min watermark in a zone by at least
pageblock_nr_pages or the number of pages in a page block.

On Arm64, with 64K pages and 512M huge pages, this is 8192 pages or
512M.  It does this regardless of the number of managed pages managed in
the zone or the likelihood of success.

This can put the zone immediately under water in terms of allocating
pages from the zone, and can cause a small machine to fail immediately
due to OoM.  Unlike set_recommended_min_free_kbytes(), which
substantially increases min_free_kbytes and is tied to THP,
boost_watermark() can be called even if THP is not active.

The problem is most likely to appear on architectures such as Arm64
where pageblock_nr_pages is very large.

It is desirable to run the kdump capture kernel in as small a space as
possible to avoid wasting memory.  In some architectures, such as Arm64,
there are restrictions on where the capture kernel can run, and
therefore, the space available.  A capture kernel running in 768M can
fail due to OoM immediately after boost_watermark() sets the min in zone
DMA32, where most of the memory is, to 512M.  It fails even though there
is over 500M of free memory.  With boost_watermark() suppressed, the
capture kernel can run successfully in 448M.

This patch limits boost_watermark() to boosting a zone's min watermark
only when there are enough pages that the boost will produce positive
results.  In this case that is estimated to be four times as many pages
as pageblock_nr_pages.

Mel said:

: There is no harm in marking it stable.  Clearly it does not happen very
: often but it's not impossible.  32-bit x86 is a lot less common now
: which would previously have been vulnerable to triggering this easily.
: ppc64 has a larger base page size but typically only has one zone.
: arm64 is likely the most vulnerable, particularly when CMA is
: configured with a small movable zone.

Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1588294148-6586-1-git-send-email-henry.willard@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST
Kees Cook [Fri, 8 May 2020 01:36:23 +0000 (18:36 -0700)]
ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST

The documentation for UBSAN_ALIGNMENT already mentions that it should
not be used on all*config builds (and for efficient-unaligned-access
architectures), so just refactor the Kconfig to correctly implement this
so randconfigs will stop creating insane images that freak out objtool
under CONFIG_UBSAN_TRAP (due to the false positives producing functions
that never return, etc).

Link: http://lkml.kernel.org/r/202005011433.C42EA3E2D@keescook
Fixes: 0887a7ebc977 ("ubsan: add trap instrumentation option")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/linux-next/202004231224.D6B3B650@keescook/
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/vmscan: remove unnecessary argument description of isolate_lru_pages()
Qiwu Chen [Fri, 8 May 2020 01:36:20 +0000 (18:36 -0700)]
mm/vmscan: remove unnecessary argument description of isolate_lru_pages()

Since commit a9e7c39fa9fd9 ("mm/vmscan.c: remove 7th argument of
isolate_lru_pages()"), the explanation of 'mode' argument has been
unnecessary.  Let's remove it.

Signed-off-by: Qiwu Chen <chenqiwu@xiaomi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200501090346.2894-1-chenqiwu@xiaomi.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoepoll: atomically remove wait entry on wake up
Roman Penyaev [Fri, 8 May 2020 01:36:16 +0000 (18:36 -0700)]
epoll: atomically remove wait entry on wake up

This patch does two things:

 - fixes a lost wakeup introduced by commit 339ddb53d373 ("fs/epoll:
   remove unnecessary wakeups of nested epoll")

 - improves performance for events delivery.

The description of the problem is the following: if N (>1) threads are
waiting on ep->wq for new events and M (>1) events come, it is quite
likely that >1 wakeups hit the same wait queue entry, because there is
quite a big window between __add_wait_queue_exclusive() and the
following __remove_wait_queue() calls in ep_poll() function.

This can lead to lost wakeups, because thread, which was woken up, can
handle not all the events in ->rdllist.  (in better words the problem is
described here: https://lkml.org/lkml/2019/10/7/905)

The idea of the current patch is to use init_wait() instead of
init_waitqueue_entry().

Internally init_wait() sets autoremove_wake_function as a callback,
which removes the wait entry atomically (under the wq locks) from the
list, thus the next coming wakeup hits the next wait entry in the wait
queue, thus preventing lost wakeups.

Problem is very well reproduced by the epoll60 test case [1].

Wait entry removal on wakeup has also performance benefits, because
there is no need to take a ep->lock and remove wait entry from the queue
after the successful wakeup.  Here is the timing output of the epoll60
test case:

  With explicit wakeup from ep_scan_ready_list() (the state of the
  code prior 339ddb53d373):

    real    0m6.970s
    user    0m49.786s
    sys     0m0.113s

 After this patch:

   real    0m5.220s
   user    0m36.879s
   sys     0m0.019s

The other testcase is the stress-epoll [2], where one thread consumes
all the events and other threads produce many events:

  With explicit wakeup from ep_scan_ready_list() (the state of the
  code prior 339ddb53d373):

    threads  events/ms  run-time ms
          8       5427         1474
         16       6163         2596
         32       6824         4689
         64       7060         9064
        128       6991        18309

 After this patch:

    threads  events/ms  run-time ms
          8       5598         1429
         16       7073         2262
         32       7502         4265
         64       7640         8376
        128       7634        16767

 (number of "events/ms" represents event bandwidth, thus higher is
  better; number of "run-time ms" represents overall time spent
  doing the benchmark, thus lower is better)

[1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
[2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c

Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Heiher <r@hev.cc>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200430130326.1368509-2-rpenyaev@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agokselftests: introduce new epoll60 testcase for catching lost wakeups
Roman Penyaev [Fri, 8 May 2020 01:36:13 +0000 (18:36 -0700)]
kselftests: introduce new epoll60 testcase for catching lost wakeups

This test case catches lost wake up introduced by commit 339ddb53d373
("fs/epoll: remove unnecessary wakeups of nested epoll")

The test is simple: we have 10 threads and 10 event fds.  Each thread
can harvest only 1 event.  1 producer fires all 10 events at once and
waits that all 10 events will be observed by 10 threads.

In case of lost wakeup epoll_wait() will timeout and 0 will be returned.

Test case catches two sort of problems: forgotten wakeup on event, which
hits the ->ovflist list, this problem was fixed by:

  5a2513239750 ("eventpoll: fix missing wakeup for ovflist in ep_poll_callback")

the other problem is when several sequential events hit the same waiting
thread, thus other waiters get no wakeups.  Problem is fixed in the
following patch.

Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Heiher <r@hev.cc>
Cc: Jason Baron <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20200430130326.1368509-1-rpenyaev@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agopercpu: make pcpu_alloc() aware of current gfp context
Filipe Manana [Fri, 8 May 2020 01:36:10 +0000 (18:36 -0700)]
percpu: make pcpu_alloc() aware of current gfp context

Since 5.7-rc1, on btrfs we have a percpu counter initialization for
which we always pass a GFP_KERNEL gfp_t argument (this happens since
commit 2992df73268f78 ("btrfs: Implement DREW lock")).

That is safe in some contextes but not on others where allowing fs
reclaim could lead to a deadlock because we are either holding some
btrfs lock needed for a transaction commit or holding a btrfs
transaction handle open.  Because of that we surround the call to the
function that initializes the percpu counter with a NOFS context using
memalloc_nofs_save() (this is done at btrfs_init_fs_root()).

However it turns out that this is not enough to prevent a possible
deadlock because percpu_alloc() determines if it is in an atomic context
by looking exclusively at the gfp flags passed to it (GFP_KERNEL in this
case) and it is not aware that a NOFS context is set.

Because percpu_alloc() thinks it is in a non atomic context it locks the
pcpu_alloc_mutex.  This can result in a btrfs deadlock when
pcpu_balance_workfn() is running, has acquired that mutex and is waiting
for reclaim, while the btrfs task that called percpu_counter_init() (and
therefore percpu_alloc()) is holding either the btrfs commit_root
semaphore or a transaction handle (done fs/btrfs/backref.c:
iterate_extent_inodes()), which prevents reclaim from finishing as an
attempt to commit the current btrfs transaction will deadlock.

Lockdep reports this issue with the following trace:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.6.0-rc7-btrfs-next-77 #1 Not tainted
  ------------------------------------------------------
  kswapd0/91 is trying to acquire lock:
  ffff8938a3b3fdc8 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs]

  but task is already holding lock:
  ffffffffb4f0dbc0 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #4 (fs_reclaim){+.+.}:
         fs_reclaim_acquire.part.0+0x25/0x30
         __kmalloc+0x5f/0x3a0
         pcpu_create_chunk+0x19/0x230
         pcpu_balance_workfn+0x56a/0x680
         process_one_work+0x235/0x5f0
         worker_thread+0x50/0x3b0
         kthread+0x120/0x140
         ret_from_fork+0x3a/0x50

  -> #3 (pcpu_alloc_mutex){+.+.}:
         __mutex_lock+0xa9/0xaf0
         pcpu_alloc+0x480/0x7c0
         __percpu_counter_init+0x50/0xd0
         btrfs_drew_lock_init+0x22/0x70 [btrfs]
         btrfs_get_fs_root+0x29c/0x5c0 [btrfs]
         resolve_indirect_refs+0x120/0xa30 [btrfs]
         find_parent_nodes+0x50b/0xf30 [btrfs]
         btrfs_find_all_leafs+0x60/0xb0 [btrfs]
         iterate_extent_inodes+0x139/0x2f0 [btrfs]
         iterate_inodes_from_logical+0xa1/0xe0 [btrfs]
         btrfs_ioctl_logical_to_ino+0xb4/0x190 [btrfs]
         btrfs_ioctl+0x165a/0x3130 [btrfs]
         ksys_ioctl+0x87/0xc0
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x5c/0x260
         entry_SYSCALL_64_after_hwframe+0x49/0xbe

  -> #2 (&fs_info->commit_root_sem){++++}:
         down_write+0x38/0x70
         btrfs_cache_block_group+0x2ec/0x500 [btrfs]
         find_free_extent+0xc6a/0x1600 [btrfs]
         btrfs_reserve_extent+0x9b/0x180 [btrfs]
         btrfs_alloc_tree_block+0xc1/0x350 [btrfs]
         alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
         __btrfs_cow_block+0x122/0x5a0 [btrfs]
         btrfs_cow_block+0x106/0x240 [btrfs]
         commit_cowonly_roots+0x55/0x310 [btrfs]
         btrfs_commit_transaction+0x509/0xb20 [btrfs]
         sync_filesystem+0x74/0x90
         generic_shutdown_super+0x22/0x100
         kill_anon_super+0x14/0x30
         btrfs_kill_super+0x12/0x20 [btrfs]
         deactivate_locked_super+0x31/0x70
         cleanup_mnt+0x100/0x160
         task_work_run+0x93/0xc0
         exit_to_usermode_loop+0xf9/0x100
         do_syscall_64+0x20d/0x260
         entry_SYSCALL_64_after_hwframe+0x49/0xbe

  -> #1 (&space_info->groups_sem){++++}:
         down_read+0x3c/0x140
         find_free_extent+0xef6/0x1600 [btrfs]
         btrfs_reserve_extent+0x9b/0x180 [btrfs]
         btrfs_alloc_tree_block+0xc1/0x350 [btrfs]
         alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
         __btrfs_cow_block+0x122/0x5a0 [btrfs]
         btrfs_cow_block+0x106/0x240 [btrfs]
         btrfs_search_slot+0x50c/0xd60 [btrfs]
         btrfs_lookup_inode+0x3a/0xc0 [btrfs]
         __btrfs_update_delayed_inode+0x90/0x280 [btrfs]
         __btrfs_commit_inode_delayed_items+0x81f/0x870 [btrfs]
         __btrfs_run_delayed_items+0x8e/0x180 [btrfs]
         btrfs_commit_transaction+0x31b/0xb20 [btrfs]
         iterate_supers+0x87/0xf0
         ksys_sync+0x60/0xb0
         __ia32_sys_sync+0xa/0x10
         do_syscall_64+0x5c/0x260
         entry_SYSCALL_64_after_hwframe+0x49/0xbe

  -> #0 (&delayed_node->mutex){+.+.}:
         __lock_acquire+0xef0/0x1c80
         lock_acquire+0xa2/0x1d0
         __mutex_lock+0xa9/0xaf0
         __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs]
         btrfs_evict_inode+0x40d/0x560 [btrfs]
         evict+0xd9/0x1c0
         dispose_list+0x48/0x70
         prune_icache_sb+0x54/0x80
         super_cache_scan+0x124/0x1a0
         do_shrink_slab+0x176/0x440
         shrink_slab+0x23a/0x2c0
         shrink_node+0x188/0x6e0
         balance_pgdat+0x31d/0x7f0
         kswapd+0x238/0x550
         kthread+0x120/0x140
         ret_from_fork+0x3a/0x50

  other info that might help us debug this:

  Chain exists of:
    &delayed_node->mutex --> pcpu_alloc_mutex --> fs_reclaim

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(fs_reclaim);
                                 lock(pcpu_alloc_mutex);
                                 lock(fs_reclaim);
    lock(&delayed_node->mutex);

   *** DEADLOCK ***

  3 locks held by kswapd0/91:
   #0: (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
   #1: (shrinker_rwsem){++++}, at: shrink_slab+0x12f/0x2c0
   #2: (&type->s_umount_key#43){++++}, at: trylock_super+0x16/0x50

  stack backtrace:
  CPU: 1 PID: 91 Comm: kswapd0 Not tainted 5.6.0-rc7-btrfs-next-77 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8f/0xd0
   check_noncircular+0x170/0x190
   __lock_acquire+0xef0/0x1c80
   lock_acquire+0xa2/0x1d0
   __mutex_lock+0xa9/0xaf0
   __btrfs_release_delayed_node.part.0+0x3f/0x320 [btrfs]
   btrfs_evict_inode+0x40d/0x560 [btrfs]
   evict+0xd9/0x1c0
   dispose_list+0x48/0x70
   prune_icache_sb+0x54/0x80
   super_cache_scan+0x124/0x1a0
   do_shrink_slab+0x176/0x440
   shrink_slab+0x23a/0x2c0
   shrink_node+0x188/0x6e0
   balance_pgdat+0x31d/0x7f0
   kswapd+0x238/0x550
   kthread+0x120/0x140
   ret_from_fork+0x3a/0x50

This could be fixed by making btrfs pass GFP_NOFS instead of GFP_KERNEL
to percpu_counter_init() in contextes where it is not reclaim safe,
however that type of approach is discouraged since
memalloc_[nofs|noio]_save() were introduced.  Therefore this change
makes pcpu_alloc() look up into an existing nofs/noio context before
deciding whether it is in an atomic context or not.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Link: http://lkml.kernel.org/r/20200430164356.15543-1-fdmanana@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/slub: fix incorrect interpretation of s->offset
Waiman Long [Fri, 8 May 2020 01:36:06 +0000 (18:36 -0700)]
mm/slub: fix incorrect interpretation of s->offset

In a couple of places in the slub memory allocator, the code uses
"s->offset" as a check to see if the free pointer is put right after the
object.  That check is no longer true with commit 3202fa62fb43 ("slub:
relocate freelist pointer to middle of object").

As a result, echoing "1" into the validate sysfs file, e.g.  of dentry,
may cause a bunch of "Freepointer corrupt" error reports like the
following to appear with the system in panic afterwards.

  =============================================================================
  BUG dentry(666:pmcd.service) (Tainted: G    B): Freepointer corrupt
  -----------------------------------------------------------------------------

To fix it, use the check "s->offset == s->inuse" in the new helper
function freeptr_outside_object() instead.  Also add another helper
function get_info_end() to return the end of info block (inuse + free
pointer if not overlapping with object).

Fixes: 3202fa62fb43 ("slub: relocate freelist pointer to middle of object")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Vitaly Nikolenko <vnik@duasynt.com>
Cc: Silvio Cesare <silvio.cesare@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Markus Elfring <Markus.Elfring@web.de>
Cc: Changbin Du <changbin.du@gmail.com>
Link: http://lkml.kernel.org/r/20200429135328.26976-1-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoscripts/gdb: repair rb_first() and rb_last()
Aymeric Agon-Rambosson [Fri, 8 May 2020 01:36:03 +0000 (18:36 -0700)]
scripts/gdb: repair rb_first() and rb_last()

The current implementations of the rb_first() and rb_last() gdb
functions have a variable that references itself in its instanciation,
which causes the function to throw an error if a specific condition on
the argument is met.  The original author rather intended to reference
the argument and made a typo.  Referring the argument instead makes the
function work as intended.

Signed-off-by: Aymeric Agon-Rambosson <aymeric.agon@yandex.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
Cc: Jackie Liu <liuyun01@kylinos.cn>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/r/20200427051029.354840-1-aymeric.agon@yandex.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoeventpoll: fix missing wakeup for ovflist in ep_poll_callback
Khazhismel Kumykov [Fri, 8 May 2020 01:35:59 +0000 (18:35 -0700)]
eventpoll: fix missing wakeup for ovflist in ep_poll_callback

In the event that we add to ovflist, before commit 339ddb53d373
("fs/epoll: remove unnecessary wakeups of nested epoll") we would be
woken up by ep_scan_ready_list, and did no wakeup in ep_poll_callback.

With that wakeup removed, if we add to ovflist here, we may never wake
up.  Rather than adding back the ep_scan_ready_list wakeup - which was
resulting in unnecessary wakeups, trigger a wake-up in ep_poll_callback.

We noticed that one of our workloads was missing wakeups starting with
339ddb53d373 and upon manual inspection, this wakeup seemed missing to me.
With this patch added, we no longer see missing wakeups.  I haven't yet
tried to make a small reproducer, but the existing kselftests in
filesystem/epoll passed for me with this patch.

[khazhy@google.com: use if/elif instead of goto + cleanup suggested by Roman]
Link: http://lkml.kernel.org/r/20200424190039.192373-1-khazhy@google.com
Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: Heiher <r@hev.cc>
Cc: Jason Baron <jbaron@akamai.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200424025057.118641-1-khazhy@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoarch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()
Janakarajan Natarajan [Fri, 8 May 2020 01:35:56 +0000 (18:35 -0700)]
arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()

When trying to lock read-only pages, sev_pin_memory() fails because
FOLL_WRITE is used as the flag for get_user_pages_fast().

Commit 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a
write 'bool'") updated the get_user_pages_fast() call sites to use
flags, but incorrectly updated the call in sev_pin_memory().  As the
original coding of this call was correct, revert the change made by that
commit.

Fixes: 73b0140bf0fe ("mm/gup: change GUP fast to use flags rather than a write 'bool'")
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Link: http://lkml.kernel.org/r/20200423152419.87202-1-Janakarajan.Natarajan@amd.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoscripts/decodecode: fix trapping instruction formatting
Ivan Delalande [Fri, 8 May 2020 01:35:53 +0000 (18:35 -0700)]
scripts/decodecode: fix trapping instruction formatting

If the trapping instruction contains a ':', for a memory access through
segment registers for example, the sed substitution will insert the '*'
marker in the middle of the instruction instead of the line address:

2b:   65 48 0f c7 0f          cmpxchg16b %gs:*(%rdi)          <-- trapping instruction

I started to think I had forgotten some quirk of the assembly syntax
before noticing that it was actually coming from the script.  Fix it to
add the address marker at the right place for these instructions:

28:   49 8b 06                mov    (%r14),%rax
2b:*  65 48 0f c7 0f          cmpxchg16b %gs:(%rdi)           <-- trapping instruction
30:   0f 94 c0                sete   %al

Fixes: 18ff44b189e2 ("scripts/decodecode: make faulting insn ptr more robust")
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agokernel/kcov.c: fix typos in kcov_remote_start documentation
Maciej Grochowski [Fri, 8 May 2020 01:35:49 +0000 (18:35 -0700)]
kernel/kcov.c: fix typos in kcov_remote_start documentation

Signed-off-by: Maciej Grochowski <maciej.grochowski@pm.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Link: http://lkml.kernel.org/r/20200420030259.31674-1-maciek.grochowski@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
David Hildenbrand [Fri, 8 May 2020 01:35:46 +0000 (18:35 -0700)]
mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()

Without CONFIG_PREEMPT, it can happen that we get soft lockups detected,
e.g., while booting up.

  watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4
  Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
  RIP: __pageblock_pfn_to_page+0x134/0x1c0
  Call Trace:
   set_zone_contiguous+0x56/0x70
   page_alloc_init_late+0x166/0x176
   kernel_init_freeable+0xfa/0x255
   kernel_init+0xa/0x106
   ret_from_fork+0x35/0x40

The issue becomes visible when having a lot of memory (e.g., 4TB)
assigned to a single NUMA node - a system that can easily be created
using QEMU.  Inside VMs on a hypervisor with quite some memory
overcommit, this is fairly easy to trigger.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm, memcg: fix error return value of mem_cgroup_css_alloc()
Yafang Shao [Fri, 8 May 2020 01:35:43 +0000 (18:35 -0700)]
mm, memcg: fix error return value of mem_cgroup_css_alloc()

When I run my memcg testcase which creates lots of memcgs, I found
there're unexpected out of memory logs while there're still enough
available free memory.  The error log is

  mkdir: cannot create directory 'foo.65533': Cannot allocate memory

The reason is when we try to create more than MEM_CGROUP_ID_MAX memcgs,
an -ENOMEM errno will be set by mem_cgroup_css_alloc(), but the right
errno should be -ENOSPC "No space left on device", which is an
appropriate errno for userspace's failed mkdir.

As the errno really misled me, we should make it right.  After this
patch, the error log will be

  mkdir: cannot create directory 'foo.65533': No space left on device

[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
[akpm@linux-foundation.org: s/EBUSY/ENOSPC/, per Michal]
Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Link: http://lkml.kernel.org/r/20200407063621.GA18914@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1586192163-20099-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoipc/mqueue.c: change __do_notify() to bypass check_kill_permission()
Oleg Nesterov [Fri, 8 May 2020 01:35:39 +0000 (18:35 -0700)]
ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()

Commit cc731525f26a ("signal: Remove kernel interal si_code magic")
changed the value of SI_FROMUSER(SI_MESGQ), this means that mq_notify() no
longer works if the sender doesn't have rights to send a signal.

Change __do_notify() to use do_send_sig_info() instead of kill_pid_info()
to avoid check_kill_permission().

This needs the additional notify.sigev_signo != 0 check, shouldn't we
change do_mq_notify() to deny sigev_signo == 0 ?

Test-case:

#include <signal.h>
#include <mqueue.h>
#include <unistd.h>
#include <sys/wait.h>
#include <assert.h>

static int notified;

static void sigh(int sig)
{
notified = 1;
}

int main(void)
{
signal(SIGIO, sigh);

int fd = mq_open("/mq", O_RDWR|O_CREAT, 0666, NULL);
assert(fd >= 0);

struct sigevent se = {
.sigev_notify = SIGEV_SIGNAL,
.sigev_signo = SIGIO,
};
assert(mq_notify(fd, &se) == 0);

if (!fork()) {
assert(setuid(1) == 0);
mq_send(fd, "",1,0);
return 0;
}

wait(NULL);
mq_unlink("/mq");
assert(notified);
return 0;
}

[manfred@colorfullife.com: 1) Add self_exec_id evaluation so that the implementation matches do_notify_parent 2) use PIDTYPE_TGID everywhere]
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Reported-by: Yoji <yoji.fujihar.min@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Markus Elfring <elfring@users.sourceforge.net>
Cc: <1vier1@web.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/e2a782e4-eab9-4f5c-c749-c07a8f7a4e66@colorfullife.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Thu, 7 May 2020 22:27:11 +0000 (15:27 -0700)]
Merge tag 'trace-v5.7-rc3' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM
   enabled

 - Fix allocation leaks in bootconfig tool

 - Fix a double initialization of a variable

 - Fix API bootconfig usage from kprobe boot time events

 - Reject NULL location for kprobes

 - Fix crash caused by preempt delay module not cleaning up kthread
   correctly

 - Add vmalloc_sync_mappings() to prevent x86_64 page faults from
   recursively faulting from tracing page faults

 - Fix comment in gpu/trace kerneldoc header

 - Fix documentation of how to create a trace event class

 - Make the local tracing_snapshot_instance_cond() function static

* tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tools/bootconfig: Fix resource leak in apply_xbc()
  tracing: Make tracing_snapshot_instance_cond() static
  tracing: Fix doc mistakes in trace sample
  gpu/trace: Minor comment updates for gpu_mem_total tracepoint
  tracing: Add a vmalloc_sync_mappings() for safe measure
  tracing: Wait for preempt irq delay thread to finish
  tracing/kprobes: Reject new event if loc is NULL
  tracing/boottime: Fix kprobe event API usage
  tracing/kprobes: Fix a double initialization typo
  bootconfig: Fix to remove bootconfig data from initrd while boot

4 years agoMerge tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 7 May 2020 22:22:08 +0000 (15:22 -0700)]
Merge tag 'linux-kselftest-5.7-rc5' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "ftrace test fixes and a fix to kvm Makefile for relocatable
  native/cross builds and installs"

* tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: fix kvm relocatable native/cross builds and installs
  selftests/ftrace: Make XFAIL green color
  ftrace/selftest: make unresolved cases cause failure if --fail-unresolved set
  ftrace/selftests: workaround cgroup RT scheduling issues

4 years agoio_uring: don't use 'fd' for openat/openat2/statx
Jens Axboe [Thu, 7 May 2020 20:56:15 +0000 (14:56 -0600)]
io_uring: don't use 'fd' for openat/openat2/statx

We currently make some guesses as when to open this fd, but in reality
we have no business (or need) to do so at all. In fact, it makes certain
things fail, like O_PATH.

Remove the fd lookup from these opcodes, we're just passing the 'fd' to
generic helpers anyway. With that, we can also remove the special casing
of fd values in io_req_needs_file(), and the 'fd_non_neg' check that
we have. And we can ensure that we only read sqe->fd once.

This fixes O_PATH usage with openat/openat2, and ditto statx path side
oddities.

Cc: stable@vger.kernel.org: # v5.6
Reported-by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agotools/bootconfig: Fix resource leak in apply_xbc()
Yunfeng Ye [Thu, 7 May 2020 09:23:36 +0000 (17:23 +0800)]
tools/bootconfig: Fix resource leak in apply_xbc()

Fix the @data and @fd allocations that are leaked in the error path of
apply_xbc().

Link: http://lkml.kernel.org/r/583a49c9-c27a-931d-e6c2-6f63a4b18bea@huawei.com
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agotracing: Make tracing_snapshot_instance_cond() static
Zou Wei [Thu, 23 Apr 2020 04:08:25 +0000 (12:08 +0800)]
tracing: Make tracing_snapshot_instance_cond() static

Fix the following sparse warning:

kernel/trace/trace.c:950:6: warning: symbol 'tracing_snapshot_instance_cond'
was not declared. Should it be static?

Link: http://lkml.kernel.org/r/1587614905-48692-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agotracing: Fix doc mistakes in trace sample
Wei Yang [Tue, 28 Apr 2020 21:49:59 +0000 (21:49 +0000)]
tracing: Fix doc mistakes in trace sample

As the example below shows, DECLARE_EVENT_CLASS() is used instead of
DEFINE_EVENT_CLASS().

Link: http://lkml.kernel.org/r/20200428214959.11259-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agogpu/trace: Minor comment updates for gpu_mem_total tracepoint
Yiwei Zhang [Tue, 28 Apr 2020 22:08:25 +0000 (15:08 -0700)]
gpu/trace: Minor comment updates for gpu_mem_total tracepoint

This change updates the improper comment for the 'size' attribute in the
tracepoint definition. Most gfx drivers pre-fault in physical pages
instead of making virtual allocations. So we drop the 'Virtual' keyword
here and leave this to the implementations.

Link: http://lkml.kernel.org/r/20200428220825.169606-1-zzyiwei@google.com
Signed-off-by: Yiwei Zhang <zzyiwei@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agotracing: Add a vmalloc_sync_mappings() for safe measure
Steven Rostedt (VMware) [Wed, 6 May 2020 14:36:18 +0000 (10:36 -0400)]
tracing: Add a vmalloc_sync_mappings() for safe measure

x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu
areas can be complex, to say the least. Mappings may happen at boot up, and
if nothing synchronizes the page tables, those page mappings may not be
synced till they are used. This causes issues for anything that might touch
one of those mappings in the path of the page fault handler. When one of
those unmapped mappings is touched in the page fault handler, it will cause
another page fault, which in turn will cause a page fault, and leave us in
a loop of page faults.

Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split
vmalloc_sync_all() into vmalloc_sync_unmappings() and
vmalloc_sync_mappings(), as on system exit, it did not need to do a full
sync on x86_64 (although it still needed to be done on x86_32). By chance,
the vmalloc_sync_all() would synchronize the page mappings done at boot up
and prevent the per cpu area from being a problem for tracing in the page
fault handler. But when that synchronization in the exit of a task became a
nop, it caused the problem to appear.

Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home
Cc: stable@vger.kernel.org
Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code")
Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agotracing: Wait for preempt irq delay thread to finish
Steven Rostedt (VMware) [Wed, 6 May 2020 14:20:10 +0000 (10:20 -0400)]
tracing: Wait for preempt irq delay thread to finish

Running on a slower machine, it is possible that the preempt delay kernel
thread may still be executing if the module was immediately removed after
added, and this can cause the kernel to crash as the kernel thread might be
executing after its code has been removed.

There's no reason that the caller of the code shouldn't just wait for the
delay thread to finish, as the thread can also be created by a trigger in
the sysfs code, which also has the same issues.

Link: http://lore.kernel.org/r/5EA2B0C8.2080706@cn.fujitsu.com
Cc: stable@vger.kernel.org
Fixes: 793937236d1ee ("lib: Add module for testing preemptoff/irqsoff latency tracers")
Reported-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 7 May 2020 16:55:58 +0000 (09:55 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Avoid potential NULL dereference in huge_pte_alloc() on pmd_alloc()
  failure"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hugetlb: avoid potential NULL dereference