drm/i915: Remove WARN_ONCE for failing to pm_runtime_if_in_use
authorChris Wilson <chris@chris-wilson.co.uk>
Mon, 19 Feb 2018 12:50:46 +0000 (12:50 +0000)
committerChris Wilson <chris@chris-wilson.co.uk>
Mon, 19 Feb 2018 15:06:23 +0000 (15:06 +0000)
As the driver is called to handle circumstances beyond it's control, we
cannot assume that the pm_runtime core is happy to see us. For example,
if we are called from shrink_slab to free up our pages during suspend,
rpm may be disabled and pm_runtime_if_in_use() decides to fail with
-EINVAL rather than simply say no. This is expected to happen, so don't
warn.  For example,

[  217.429228] Suspending console(s) (use no_console_suspend to debug)
[  217.557179] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  217.559399] sd 0:0:0:0: [sda] Stopping disk
[  218.661567] i915 0000:00:02.0: Resetting chip after gpu hang
[  219.523879] ------------[ cut here ]------------
[  219.524474] pm_runtime_get_if_in_use() failed: -22
[  219.524817] WARNING: CPU: 1 PID: 14 at drivers/gpu/drm/i915/intel_runtime_pm.c:3351 intel_runtime_pm_get_if_in_use+0xe3/0x150 [i915]
[  219.524836] Modules linked in: vgem i915 snd_hda_codec_realtek snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec r8169 lpc_ich snd_hwdep mii snd_hda_core snd_pcm prime_numbers
[  219.525054] CPU: 1 PID: 14 Comm: cpuhp/1 Tainted: G     U           4.16.0-rc1-g740f57c54ecf-kasan_6+ #1
[  219.525070] Hardware name:  /D510MO, BIOS MOPNV10J.86A.0311.2010.0802.2346 08/02/2010
[  219.525294] RIP: 0010:intel_runtime_pm_get_if_in_use+0xe3/0x150 [i915]
[  219.525313] RSP: 0018:ffff880018f5edf8 EFLAGS: 00010286
[  219.525344] RAX: dffffc0000000008 RBX: ffff880007fc0000 RCX: 0000000000000000
[  219.525361] RDX: 0000000000000001 RSI: ffffffff850609c0 RDI: ffffffff872992a0
[  219.525377] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[  219.525394] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880007fc0000
[  219.525411] R13: ffff880018f5f0f8 R14: ffff880007fc8de8 R15: ffff880018f5f0f0
[  219.525429] FS:  0000000000000000(0000) GS:ffff880019c80000(0000) knlGS:0000000000000000
[  219.525446] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  219.525463] CR2: 0000564df7897e86 CR3: 0000000000d7c000 CR4: 00000000000006e0
[  219.525478] Call Trace:
[  219.525734]  i915_gem_shrink+0x841/0xb50 [i915]
[  219.525802]  ? debug_check_no_locks_freed+0x2a0/0x2a0
[  219.525842]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  219.526083]  ? i915_gem_shrinker_count+0x2f0/0x2f0 [i915]
[  219.526131]  ? lock_acquire+0x138/0x3c0
[  219.526157]  ? lock_acquire+0x138/0x3c0
[  219.526391]  ? shrinker_lock+0x49/0x210 [i915]
[  219.526465]  ? mutex_trylock+0x15c/0x1a0
[  219.526694]  ? shrinker_lock+0x49/0x210 [i915]
[  219.526969]  ? i915_gem_shrinker_scan+0xc4/0x320 [i915]
[  219.527200]  i915_gem_shrinker_scan+0xc4/0x320 [i915]
[  219.527448]  ? i915_gem_shrinker_vmap+0x3a0/0x3a0 [i915]
[  219.527533]  shrink_slab.part.18+0x2d0/0x8d0
[  219.527613]  ? unregister_shrinker+0x1f0/0x1f0
[  219.527668]  ? mem_cgroup_iter+0x37d/0xc50
[  219.527728]  shrink_node+0x882/0xbe0
[  219.527847]  ? shrink_node_memcg+0x11c0/0x11c0
[  219.527882]  ? mark_held_locks+0xa8/0xf0
[  219.527931]  ? trace_hardirqs_on_caller+0x33f/0x590
[  219.527961]  ? ktime_get+0xad/0x140
[  219.528015]  do_try_to_free_pages+0x2d3/0xd70
[  219.528099]  ? allow_direct_reclaim.part.23+0x1d0/0x1d0
[  219.528132]  ? shrink_node+0xbe0/0xbe0
[  219.528213]  try_to_free_pages+0x1cd/0x570
[  219.528257]  ? do_try_to_free_pages+0xd70/0xd70
[  219.528355]  __alloc_pages_nodemask+0xadf/0x2110
[  219.528423]  ? unwind_next_frame+0x870/0x1970
[  219.528465]  ? deref_stack_reg+0x97/0xc0
[  219.528503]  ? gfp_pfmemalloc_allowed+0x150/0x150
[  219.528539]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.528588]  ? unwind_next_frame+0x138/0x1970
[  219.528619]  ? kthread+0x30a/0x3d0
[  219.528677]  ? __read_once_size_nocheck.constprop.4+0x10/0x10
[  219.528698]  ? deref_stack_reg+0xc0/0xc0
[  219.528762]  ? __save_stack_trace+0x6e/0xd0
[  219.528822]  depot_save_stack+0x3bc/0x430
[  219.528870]  kasan_kmalloc+0x142/0x170
[  219.528912]  ? __kmalloc+0xf7/0x340
[  219.528935]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.528957]  ? partition_sched_domains+0x4d4/0x840
[  219.528978]  ? sched_cpu_deactivate+0x11b/0x150
[  219.529001]  ? cpuhp_invoke_callback+0x160/0x15f0
[  219.529023]  ? cpuhp_thread_fun+0x35e/0x710
[  219.529044]  ? smpboot_thread_fn+0x50a/0x7f0
[  219.529065]  ? kthread+0x30a/0x3d0
[  219.529086]  ? ret_from_fork+0x24/0x50
[  219.529141]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.529169]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.529198]  ? set_track+0x87/0x100
[  219.529225]  ? init_object+0x6e/0x80
[  219.529275]  ? ___slab_alloc.constprop.36+0x232/0x3e0
[  219.529303]  ? ___slab_alloc.constprop.36+0x232/0x3e0
[  219.529325]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.529410]  ? mark_held_locks+0xa8/0xf0
[  219.529453]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.529479]  ? trace_hardirqs_on_caller+0x33f/0x590
[  219.529532]  __kmalloc+0xf7/0x340
[  219.529557]  ? register_sched_domain_sysctl+0x23a/0x1b90
[  219.529604]  register_sched_domain_sysctl+0x23a/0x1b90
[  219.529684]  ? sched_debug_show+0x20/0x20
[  219.529713]  ? debug_object_activate+0x530/0x530
[  219.529771]  ? rcu_lockdep_current_cpu_online+0xdc/0x130
[  219.529802]  ? partition_sched_domains+0x4ae/0x840
[  219.529825]  ? rcu_read_lock_sched_held+0x10f/0x130
[  219.529875]  partition_sched_domains+0x4d4/0x840
[  219.529955]  ? sched_init_domains+0x110/0x110
[  219.529981]  ? __wait_rcu_gp+0x24f/0x390
[  219.530054]  sched_cpu_deactivate+0x11b/0x150
[  219.530086]  ? sched_cpu_activate+0x1e0/0x1e0
[  219.530112]  ? __call_rcu.constprop.53+0x680/0x680
[  219.530132]  ? call_rcu_bh+0x10/0x10
[  219.530166]  ? debug_check_no_locks_freed+0x2a0/0x2a0
[  219.530201]  ? trace_raw_output_rcu_utilization+0xa0/0xa0
[  219.530267]  ? trace_raw_output_rcu_utilization+0xa0/0xa0
[  219.530337]  ? rcu_lockdep_current_cpu_online+0xdc/0x130
[  219.530370]  ? sched_cpu_activate+0x1e0/0x1e0
[  219.530397]  cpuhp_invoke_callback+0x160/0x15f0
[  219.530424]  ? lock_acquire+0x138/0x3c0
[  219.530445]  ? lock_acquire+0x138/0x3c0
[  219.530471]  ? cpuhp_thread_fun+0xaf/0x710
[  219.530507]  ? pci_mmcfg_check_reserved+0x100/0x100
[  219.530565]  cpuhp_thread_fun+0x35e/0x710
[  219.530618]  ? cpuhp_complete_idle_dead+0x10/0x10
[  219.530639]  smpboot_thread_fn+0x50a/0x7f0
[  219.530678]  ? sort_range+0x20/0x20
[  219.530709]  ? __kthread_parkme+0xba/0x1f0
[  219.530739]  ? schedule+0x84/0x1a0
[  219.530768]  ? __kthread_parkme+0xbf/0x1f0
[  219.530805]  ? sort_range+0x20/0x20
[  219.530831]  kthread+0x30a/0x3d0
[  219.530859]  ? _kthread_create_on_node+0xb0/0xb0
[  219.530900]  ret_from_fork+0x24/0x50
[  219.530999] Code: 01 00 00 00 85 c0 74 4a 89 e8 5b 5d c3 80 3d 48 37 4e 00 00 75 f2 89 c6 48 c7 c7 40 f0 61 c0 c6 05 36 37 4e 00 01 e8 ed 2a e1 c2 <0f> ff eb d9 80 3d 3f 37 4e 00 00 75 94 48 c7 c7 60 e8 61 c0 c6
[  219.531880] ---[ end trace 18ec0139488ea0c8 ]---
[  219.607967] IRQ 16: no longer affine to CPU1
[  219.670291] IRQ 24: no longer affine to CPU2
[  219.701489] IRQ 8: no longer affine to CPU3
[  219.701529] IRQ 9: no longer affine to CPU3
[  219.701582] IRQ 18: no longer affine to CPU3
[  219.701640] IRQ 25: no longer affine to CPU3
[  219.743857]  cache: parent cpu1 should not be sleeping
[  219.784549]  cache: parent cpu2 should not be sleeping
[  219.816041]  cache: parent cpu3 should not be sleeping

v2: Add Returns: information to intel_runtime_pm_get_if_in_use() kerneldoc.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180219125046.19363-1-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/intel_runtime_pm.c

index 16790f2..b7924fe 100644 (file)
@@ -3328,18 +3328,19 @@ void intel_runtime_pm_get(struct drm_i915_private *dev_priv)
  * @dev_priv: i915 device instance
  *
  * This function grabs a device-level runtime pm reference if the device is
- * already in use and ensures that it is powered up.
+ * already in use and ensures that it is powered up. It is illegal to try
+ * and access the HW should intel_runtime_pm_get_if_in_use() report failure.
  *
  * Any runtime pm reference obtained by this function must have a symmetric
  * call to intel_runtime_pm_put() to release the reference again.
+ *
+ * Returns: True if the wakeref was acquired, or False otherwise.
  */
 bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
 {
-       struct pci_dev *pdev = dev_priv->drm.pdev;
-       struct device *kdev = &pdev->dev;
-
        if (IS_ENABLED(CONFIG_PM)) {
-               int ret = pm_runtime_get_if_in_use(kdev);
+               struct pci_dev *pdev = dev_priv->drm.pdev;
+               struct device *kdev = &pdev->dev;
 
                /*
                 * In cases runtime PM is disabled by the RPM core and we get
@@ -3347,9 +3348,7 @@ bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
                 * function, since the power state is undefined. This applies
                 * atm to the late/early system suspend/resume handlers.
                 */
-               WARN_ONCE(ret < 0,
-                         "pm_runtime_get_if_in_use() failed: %d\n", ret);
-               if (ret <= 0)
+               if (pm_runtime_get_if_in_use(kdev) <= 0)
                        return false;
        }