iommu/amd: Fix schedule-while-atomic BUG in initialization code
authorJoerg Roedel <jroedel@suse.de>
Wed, 26 Jul 2017 12:17:55 +0000 (14:17 +0200)
committerJoerg Roedel <jroedel@suse.de>
Wed, 26 Jul 2017 13:39:14 +0000 (15:39 +0200)
The register_syscore_ops() function takes a mutex and might
sleep. In the IOMMU initialization code it is invoked during
irq-remapping setup already, where irqs are disabled.

This causes a schedule-while-atomic bug:

 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
 in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0
 no locks held by swapper/0/1.
 irq event stamp: 304
 hardirqs last  enabled at (303): [<ffffffff818a87b6>] _raw_spin_unlock_irqrestore+0x36/0x60
 hardirqs last disabled at (304): [<ffffffff8235d440>] enable_IR_x2apic+0x79/0x196
 softirqs last  enabled at (36): [<ffffffff818ae75f>] __do_softirq+0x35f/0x4ec
 softirqs last disabled at (31): [<ffffffff810c1955>] irq_exit+0x105/0x120
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2.1.el7a.test.x86_64.debug #1
 Hardware name:          PowerEdge C6145 /040N24, BIOS 3.5.0 10/28/2014
 Call Trace:
  dump_stack+0x85/0xca
  ___might_sleep+0x22a/0x260
  __might_sleep+0x4a/0x80
  __mutex_lock+0x58/0x960
  ? iommu_completion_wait.part.17+0xb5/0x160
  ? register_syscore_ops+0x1d/0x70
  ? iommu_flush_all_caches+0x120/0x150
  mutex_lock_nested+0x1b/0x20
  register_syscore_ops+0x1d/0x70
  state_next+0x119/0x910
  iommu_go_to_state+0x29/0x30
  amd_iommu_enable+0x13/0x23

Fix it by moving the register_syscore_ops() call to the next
initialization step, which runs with irqs enabled.

Reported-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Artem Savkov <asavkov@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2c0ae1720c09 ('iommu/amd: Convert iommu initialization to state machine')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
drivers/iommu/amd_iommu_init.c

index 5cc597b..3723037 100644 (file)
@@ -2440,11 +2440,11 @@ static int __init state_next(void)
                break;
        case IOMMU_ACPI_FINISHED:
                early_enable_iommus();
-               register_syscore_ops(&amd_iommu_syscore_ops);
                x86_platform.iommu_shutdown = disable_iommus;
                init_state = IOMMU_ENABLED;
                break;
        case IOMMU_ENABLED:
+               register_syscore_ops(&amd_iommu_syscore_ops);
                ret = amd_iommu_init_pci();
                init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
                enable_iommus_v2();