devlink: make health report on unregistered instance warn just once
author Jakub Kicinski <kuba@kernel.org>
Wed, 31 May 2023 01:55:23 +0000 (18:55 -0700)
committer Jakub Kicinski <kuba@kernel.org>
Thu, 1 Jun 2023 05:34:22 +0000 (22:34 -0700)
Devlink health is involved in error recovery. Machines in bad
state tend to be fairly unreliable, and occasionally get stuck
in error loops. Even with a reasonable grace period devlink health
may get a thousand reports in an hour.

In case of reporting on an unregistered devlink instance
the subsequent reports don't add much value. Switch to
WARN_ON_ONCE() to avoid flooding dmesg and fleet monitoring
dashboards.
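
To illustrate the difference in behaviour, here is a minimal userspace sketch, not the kernel implementation. The names SKETCH_WARN_ON, SKETCH_WARN_ON_ONCE, sketch_report and the two helper functions are hypothetical stand-ins invented for this example. Unlike the kernel macros, these do not evaluate to the condition's value. The sketch assumes only that the "once" variant keeps one flag per call site.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts warnings emitted so far, so callers can observe the difference. */
static int warnings_emitted;

/* Hypothetical reporter standing in for a warning splat in dmesg. */
static void sketch_report(const char *cond)
{
    warnings_emitted++;
    fprintf(stderr, "WARNING: %s\n", cond);
}

/* Warns every time cond is true (analogous to WARN_ON()). */
#define SKETCH_WARN_ON(cond)                \
    do {                                    \
        if (cond)                           \
            sketch_report(#cond);           \
    } while (0)

/* Warns at most once per call site (analogous to WARN_ON_ONCE()). */
#define SKETCH_WARN_ON_ONCE(cond)           \
    do {                                    \
        static bool warned_;                \
        if ((cond) && !warned_) {           \
            warned_ = true;                 \
            sketch_report(#cond);           \
        }                                   \
    } while (0)

/* Simulates n health reports on an unregistered instance; returns warnings emitted. */
static int reports_with_warn_on(int n)
{
    bool registered = false;
    int before = warnings_emitted;

    for (int i = 0; i < n; i++)
        SKETCH_WARN_ON(!registered);
    return warnings_emitted - before;
}

/* Same simulation, but the call site stays silent after its first warning. */
static int reports_with_warn_on_once(int n)
{
    bool registered = false;
    int before = warnings_emitted;

    for (int i = 0; i < n; i++)
        SKETCH_WARN_ON_ONCE(!registered);
    return warnings_emitted - before;
}
```

In this sketch, a thousand reports through reports_with_warn_on() emit a thousand warnings, while the same number through reports_with_warn_on_once() emit a single one, and later calls emit none because the per-call-site flag persists.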

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230531015523.48961-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/devlink/health.c

index 0839706d5741aa6bd61e907b8565076166516c7c..194340a8bb8632521ed5784506b7a310791469e5 100644
@@ -480,7 +480,7 @@ static void devlink_recover_notify(struct devlink_health_reporter *reporter,
        int err;
 
        WARN_ON(cmd != DEVLINK_CMD_HEALTH_REPORTER_RECOVER);
-       WARN_ON(!xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED));
+       ASSERT_DEVLINK_REGISTERED(devlink);
 
        msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
        if (!msg)