mlxsw: core_thermal: Use static trip points for transceiver modules
authorIdo Schimmel <idosch@nvidia.com>
Fri, 31 Mar 2023 14:17:30 +0000 (16:17 +0200)
committerDavid S. Miller <davem@davemloft.net>
Sun, 2 Apr 2023 12:42:30 +0000 (13:42 +0100)
commit5601ef91fba8efea483a7ca3bd6c42a581e4daa1
tree75590caaf20bf6e669ec1770ea3c2d11e4e06b1d
parentdd2d6604407da5b1b260faee409cd601fe914ce9
mlxsw: core_thermal: Use static trip points for transceiver modules

The driver registers a thermal zone for each transceiver module and
tries to set the trip point temperatures according to the thresholds
read from the transceiver. If a threshold cannot be read or if a
transceiver is unplugged, the trip point temperature is set to zero,
which means that it is disabled as far as the thermal subsystem is
concerned.

A recent change in the thermal core made it so that such trip points are
no longer marked as disabled, which lead the thermal subsystem to
incorrectly set the associated cooling devices to the their maximum
state [1]. A fix to restore this behavior was merged in commit
f1b80a3878b2 ("thermal: core: Restore behavior regarding invalid trip
points"). However, the thermal maintainer suggested to not rely on this
behavior and instead always register a valid array of trip points [2].

Therefore, create a static array of trip points with sane defaults
(suggested by Vadim) and register it with the thermal zone of each
transceiver module. User space can choose to override these defaults
using the thermal zone sysfs interface since these files are writeable.

Before:

 $ cat /sys/class/thermal/thermal_zone11/type
 mlxsw-module11
 $ cat /sys/class/thermal/thermal_zone11/trip_point_*_temp
 65000
 75000
 80000

After:

 $ cat /sys/class/thermal/thermal_zone11/type
 mlxsw-module11
 $ cat /sys/class/thermal/thermal_zone11/trip_point_*_temp
 55000
 65000
 80000

Also tested by reverting commit f1b80a3878b2 ("thermal: core: Restore
behavior regarding invalid trip points") and making sure that the
associated cooling devices are not set to their maximum state.

[1] https://lore.kernel.org/linux-pm/ZA3CFNhU4AbtsP4G@shredder/
[2] https://lore.kernel.org/linux-pm/f78e6b70-a963-c0ca-a4b2-0d4c6aeef1fb@linaro.org/

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/mellanox/mlxsw/core_thermal.c