platform/kernel/linux-rpi3.git
7 years agodrm/amd/powerplay: get raven current sclk and mclk (v2)
Evan Quan [Tue, 26 Sep 2017 03:49:28 +0000 (11:49 +0800)]
drm/amd/powerplay: get raven current sclk and mclk (v2)

v2: squash in rebase fix (Tom)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: get raven max/min gfx clocks (v2)
Evan Quan [Tue, 26 Sep 2017 03:43:35 +0000 (11:43 +0800)]
drm/amd/powerplay: get raven max/min gfx clocks (v2)

v2: squash in rebase fix (Tom)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: added new raven ppsmc messages
Evan Quan [Tue, 26 Sep 2017 03:37:34 +0000 (11:37 +0800)]
drm/amd/powerplay: added new raven ppsmc messages

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: fixed wrong return value on error (v2)
Evan Quan [Tue, 26 Sep 2017 03:35:30 +0000 (11:35 +0800)]
drm/amd/powerplay: fixed wrong return value on error (v2)

v2: squash in typo fix (Tom)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Fixed a potential circular lock
ozeng [Wed, 27 Sep 2017 21:53:12 +0000 (17:53 -0400)]
drm/amdgpu: Fixed a potential circular lock

The dead circular lock senario captured is as followed.
The idea of the fix is moving read_user_wptr outside of
acquire_queue...release_queue critical section

[   63.477482] WARNING: possible circular locking dependency detected
[   63.484091] 4.12.0-kfd-ozeng #3 Not tainted
[   63.488531] ------------------------------------------------------
[   63.495146] HelloWorldLoop/2526 is trying to acquire lock:
[   63.501011]  (&mm->mmap_sem){++++++}, at: [<ffffffff911898ce>] __might_fault+0x3e/0x90
[   63.509472]
               but task is already holding lock:
[   63.515716]  (&adev->srbm_mutex){+.+...}, at: [<ffffffffc0484feb>] lock_srbm+0x2b/0x50 [amdgpu]
[   63.525099]
               which lock already depends on the new lock.

[   63.533841]
               the existing dependency chain (in reverse order) is:
[   63.541839]
               -> #2 (&adev->srbm_mutex){+.+...}:
[   63.548178]        lock_acquire+0x6d/0x90
[   63.552461]        __mutex_lock+0x70/0x8c0
[   63.556826]        mutex_lock_nested+0x16/0x20
[   63.561603]        gfx_v8_0_kiq_resume+0x1039/0x14a0 [amdgpu]
[   63.567817]        gfx_v8_0_hw_init+0x204d/0x2210 [amdgpu]
[   63.573675]        amdgpu_device_init+0xdea/0x1790 [amdgpu]
[   63.579640]        amdgpu_driver_load_kms+0x63/0x220 [amdgpu]
[   63.585743]        drm_dev_register+0x145/0x1e0
[   63.590605]        amdgpu_pci_probe+0x11e/0x160 [amdgpu]
[   63.596266]        local_pci_probe+0x40/0xa0
[   63.600803]        pci_device_probe+0x134/0x150
[   63.605650]        driver_probe_device+0x2a1/0x460
[   63.610785]        __driver_attach+0xdc/0xe0
[   63.615321]        bus_for_each_dev+0x5f/0x90
[   63.619984]        driver_attach+0x19/0x20
[   63.624337]        bus_add_driver+0x40/0x270
[   63.628908]        driver_register+0x5b/0xe0
[   63.633446]        __pci_register_driver+0x5b/0x60
[   63.638586]        rtsx_pci_switch_output_voltage+0x1d/0x20 [rtsx_pci]
[   63.645564]        do_one_initcall+0x4c/0x1b0
[   63.650205]        do_init_module+0x56/0x1ea
[   63.654767]        load_module+0x208c/0x27d0
[   63.659335]        SYSC_finit_module+0x96/0xd0
[   63.664058]        SyS_finit_module+0x9/0x10
[   63.668629]        entry_SYSCALL_64_fastpath+0x1f/0xbe
[   63.674088]
               -> #1 (reservation_ww_class_mutex){+.+.+.}:
[   63.681257]        lock_acquire+0x6d/0x90
[   63.685551]        __ww_mutex_lock.constprop.11+0x8c/0xed0
[   63.691426]        ww_mutex_lock+0x67/0x70
[   63.695802]        amdgpu_verify_access+0x6d/0x100 [amdgpu]
[   63.701743]        ttm_bo_mmap+0x8e/0x100 [ttm]
[   63.706615]        amdgpu_bo_mmap+0xd/0x60 [amdgpu]
[   63.711814]        amdgpu_mmap+0x35/0x40 [amdgpu]
[   63.716904]        mmap_region+0x3b5/0x5a0
[   63.721255]        do_mmap+0x400/0x4d0
[   63.725260]        vm_mmap_pgoff+0xb0/0xf0
[   63.729625]        SyS_mmap_pgoff+0x19e/0x260
[   63.734292]        SyS_mmap+0x1d/0x20
[   63.738199]        entry_SYSCALL_64_fastpath+0x1f/0xbe
[   63.743681]
               -> #0 (&mm->mmap_sem){++++++}:
[   63.749641]        __lock_acquire+0x1401/0x1420
[   63.754491]        lock_acquire+0x6d/0x90
[   63.758750]        __might_fault+0x6b/0x90
[   63.763176]        kgd_hqd_load+0x24f/0x270 [amdgpu]
[   63.768432]        load_mqd+0x4b/0x50 [amdkfd]
[   63.773192]        create_queue_nocpsch+0x535/0x620 [amdkfd]
[   63.779237]        pqm_create_queue+0x34d/0x4f0 [amdkfd]
[   63.784835]        kfd_ioctl_create_queue+0x282/0x670 [amdkfd]
[   63.790973]        kfd_ioctl+0x310/0x4d0 [amdkfd]
[   63.795944]        do_vfs_ioctl+0x90/0x6e0
[   63.800268]        SyS_ioctl+0x74/0x80
[   63.804207]        entry_SYSCALL_64_fastpath+0x1f/0xbe
[   63.809607]
               other info that might help us debug this:

[   63.818026] Chain exists of:
                 &mm->mmap_sem --> reservation_ww_class_mutex --> &adev->srbm_mutex

[   63.830382]  Possible unsafe locking scenario:

[   63.836605]        CPU0                    CPU1
[   63.841364]        ----                    ----
[   63.846123]   lock(&adev->srbm_mutex);
[   63.850061]                                lock(reservation_ww_class_mutex);
[   63.857475]                                lock(&adev->srbm_mutex);
[   63.864084]   lock(&mm->mmap_sem);
[   63.867657]
                *** DEADLOCK ***

[   63.873884] 3 locks held by HelloWorldLoop/2526:
[   63.878739]  #0:  (&process->mutex){+.+.+.}, at: [<ffffffffc06e1a9a>] kfd_ioctl_create_queue+0x24a/0x670 [amdkfd]
[   63.889543]  #1:  (&dqm->lock){+.+...}, at: [<ffffffffc06eedeb>] create_queue_nocpsch+0x3b/0x620 [amdkfd]
[   63.899684]  #2:  (&adev->srbm_mutex){+.+...}, at: [<ffffffffc0484feb>] lock_srbm+0x2b/0x50 [amdgpu]
[   63.909500]
               stack backtrace:
[   63.914187] CPU: 3 PID: 2526 Comm: HelloWorldLoop Not tainted 4.12.0-kfd-ozeng #3
[   63.922184] Hardware name: AMD Carrizo/Gardenia, BIOS WGA5819N_Weekly_15_08_1 08/19/2015
[   63.930865] Call Trace:
[   63.933464]  dump_stack+0x85/0xc9
[   63.936999]  print_circular_bug+0x1f9/0x207
[   63.941442]  __lock_acquire+0x1401/0x1420
[   63.945745]  ? lock_srbm+0x2b/0x50 [amdgpu]
[   63.950185]  lock_acquire+0x6d/0x90
[   63.953885]  ? __might_fault+0x3e/0x90
[   63.957899]  __might_fault+0x6b/0x90
[   63.961699]  ? __might_fault+0x3e/0x90
[   63.965755]  kgd_hqd_load+0x24f/0x270 [amdgpu]
[   63.970577]  load_mqd+0x4b/0x50 [amdkfd]
[   63.974745]  create_queue_nocpsch+0x535/0x620 [amdkfd]
[   63.980242]  pqm_create_queue+0x34d/0x4f0 [amdkfd]
[   63.985320]  kfd_ioctl_create_queue+0x282/0x670 [amdkfd]
[   63.991021]  kfd_ioctl+0x310/0x4d0 [amdkfd]
[   63.995499]  ? kfd_ioctl_destroy_queue+0x70/0x70 [amdkfd]
[   64.001234]  do_vfs_ioctl+0x90/0x6e0
[   64.005065]  ? up_read+0x1a/0x40
[   64.008496]  SyS_ioctl+0x74/0x80
[   64.011955]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[   64.016863] RIP: 0033:0x7f4b3bd35f07
[   64.020696] RSP: 002b:00007ffe7689ec38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   64.028786] RAX: ffffffffffffffda RBX: 00000000002a2000 RCX: 00007f4b3bd35f07
[   64.036414] RDX: 00007ffe7689ecb0 RSI: 00000000c0584b02 RDI: 0000000000000005
[   64.044045] RBP: 00007f4a3212d000 R08: 00007f4b3c919000 R09: 0000000000080000
[   64.051674] R10: 00007f4b376b64b8 R11: 0000000000000246 R12: 00007f4a3212d000
[   64.059324] R13: 0000000000000015 R14: 0000000000000064 R15: 00007ffe7689ef50

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/radeon: make functions alloc_pasid and free_pasid static
Colin Ian King [Thu, 28 Sep 2017 13:46:17 +0000 (14:46 +0100)]
drm/radeon: make functions alloc_pasid and free_pasid static

The functions alloc_pasid  and free_pasid are local to the
source and do not need to be in global scope, so make them static.

Cleans up sparse warnings:
warning: symbol 'alloc_pasid' was not declared. Should it be static?
warning: symbol 'free_pasid' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_file
Marek Olšák [Tue, 12 Sep 2017 20:42:14 +0000 (22:42 +0200)]
drm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_file

for being able to convert an amdgpu fence into one of the handles.
Mesa will use this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/syncobj: add a new helper drm_syncobj_get_fd
Marek Olšák [Tue, 12 Sep 2017 20:42:13 +0000 (22:42 +0200)]
drm/syncobj: add a new helper drm_syncobj_get_fd

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/syncobj: extract two helpers from drm_syncobj_create
Marek Olšák [Tue, 12 Sep 2017 20:42:12 +0000 (22:42 +0200)]
drm/syncobj: extract two helpers from drm_syncobj_create

For amdgpu.

drm_syncobj_create is renamed to drm_syncobj_create_as_handle, and new
helpers drm_syncobj_create and drm_syncobj_get_handle are added.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete flag PP_VALID
Rex Zhu [Thu, 28 Sep 2017 08:12:51 +0000 (16:12 +0800)]
drm/amd/powerplay: delete flag PP_VALID

don't need to check pp_valid, all pp
export functions are moved to ip_funcs
and pp_funcs. so just need to check the
function point.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: move set_clockgating_by_smu to pp func table
Rex Zhu [Tue, 26 Sep 2017 05:39:38 +0000 (13:39 +0800)]
drm/amd/powerplay: move set_clockgating_by_smu to pp func table

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: tidy up ret checks in amd_powerplay.c (v3)
Rex Zhu [Fri, 29 Sep 2017 06:36:15 +0000 (14:36 +0800)]
drm/amd/powerplay: tidy up ret checks in amd_powerplay.c (v3)

v2: squash in regression fix (Rex)
v3: Squash in regression fix (Rex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: refine code in amd_powerplay.c (v2)
Rex Zhu [Fri, 29 Sep 2017 05:57:54 +0000 (13:57 +0800)]
drm/amd/powerplay: refine code in amd_powerplay.c (v2)

1. use flag PP_DPM_DISABLED within powerplay
   notify amdgpu dpm state by cgs interface.
2. delete redundant virtualization check in
   powerplay

v2: squash in fix for hwmgr_init (Rex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: rewrite fiji pwr virus upload code.
Dave Airlie [Fri, 29 Sep 2017 02:30:23 +0000 (12:30 +1000)]
amdgpu/pp: rewrite fiji pwr virus upload code.

Along the same lines as rewriting the polaris code, this rewrites
the fiji code, and reduces the driver size by ~40k.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: rewrite polaris pwrvirus upload code.
Dave Airlie [Fri, 29 Sep 2017 02:15:46 +0000 (12:15 +1000)]
amdgpu/pp: rewrite polaris pwrvirus upload code.

This reduces the pwrvirus table size by 30k, by moving the
sequences of writes to the data register into blocks.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/nbio: use constant nbio_hdp_flush_reg structs.
Dave Airlie [Fri, 29 Sep 2017 00:47:43 +0000 (10:47 +1000)]
amdgpu/nbio: use constant nbio_hdp_flush_reg structs.

This removes the init path as well, since the init path
just did some constant init of some structs.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/soc15: make the pcie index/data registers constant.
Dave Airlie [Fri, 29 Sep 2017 00:08:01 +0000 (10:08 +1000)]
amdgpu/soc15: make the pcie index/data registers constant.

These don't seem to change at runtime, and the initialisers
are constant data. This could be improved by not selecting
the apu/non-apu path on each pcie read/write access.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: constify soft_dummy_pp_table.
Dave Airlie [Fri, 29 Sep 2017 00:39:30 +0000 (10:39 +1000)]
amdgpu/pp: constify soft_dummy_pp_table.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: use array_size to size the pwrvirus tables.
Dave Airlie [Fri, 29 Sep 2017 01:12:30 +0000 (11:12 +1000)]
amdgpu/pp: use array_size to size the pwrvirus tables.

This avoids fragile hardcoding of array size.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgfx/gfx: don't use static objects for ce/de meta. (v2)
Dave Airlie [Fri, 29 Sep 2017 00:12:53 +0000 (10:12 +1000)]
amdgfx/gfx: don't use static objects for ce/de meta. (v2)

This isn't safe if we have multiple GPUs plugged in, since
there is only one copy of this struct in the bss, just allocate
on stack, it's 40/108 bytes which should be safe.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Add a new flag for SR-IOV to share memory between PF & VF
Horace Chen [Wed, 27 Sep 2017 08:13:17 +0000 (16:13 +0800)]
drm/amdgpu: Add a new flag for SR-IOV to share memory between PF & VF

Add ATOM_VRAM_BLOCK_SRIOV_MSG_SHARE_RESERVATION to identify whether
driver need to reserve VRAM for SR-IOV shared memory.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: export new interfaces in amd_pm_funcs
Rex Zhu [Fri, 22 Sep 2017 06:38:45 +0000 (14:38 +0800)]
drm/amd/powerplay: export new interfaces in amd_pm_funcs

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: add comments in struct amd_pm_funcs define
Rex Zhu [Fri, 22 Sep 2017 06:33:24 +0000 (14:33 +0800)]
drm/amdgpu: add comments in struct amd_pm_funcs define

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: change dmesg log level in powerplay
Rex Zhu [Mon, 25 Sep 2017 13:18:19 +0000 (21:18 +0800)]
drm/amd/powerplay: change dmesg log level in powerplay

Use pr_debug to prevent spamming unimportant dmesg.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: create powerplay by cgs interface
Rex Zhu [Mon, 25 Sep 2017 12:46:37 +0000 (20:46 +0800)]
drm/amdgpu: create powerplay by cgs interface

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: add cgs interface to register pp handle
Rex Zhu [Mon, 25 Sep 2017 12:45:52 +0000 (20:45 +0800)]
drm/amdgpu: add cgs interface to register pp handle

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: delete pp_enable in adev
Rex Zhu [Mon, 25 Sep 2017 10:51:50 +0000 (18:51 +0800)]
drm/amdgpu: delete pp_enable in adev

amdgpu not care powerplay or dpm is enabled.
just check ip functions and pp functions

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: delete dead code about fw load check
Rex Zhu [Mon, 25 Sep 2017 09:50:13 +0000 (17:50 +0800)]
drm/amdgpu: delete dead code about fw load check

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: fix memory leak in powerplay
Rex Zhu [Mon, 25 Sep 2017 09:34:00 +0000 (17:34 +0800)]
drm/amd/powerplay: fix memory leak in powerplay

cgs device not free.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: move amdgpu_ucode_init_bo to amdgpu_device.c
Rex Zhu [Fri, 22 Sep 2017 10:03:59 +0000 (18:03 +0800)]
drm/amdgpu: move amdgpu_ucode_init_bo to amdgpu_device.c

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: move common pm sysfs code to amdgpu_device.c
Rex Zhu [Fri, 22 Sep 2017 09:47:27 +0000 (17:47 +0800)]
drm/amdgpu: move common pm sysfs code to amdgpu_device.c

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Handle GPUVM fault storms
Felix Kuehling [Thu, 21 Sep 2017 20:26:41 +0000 (16:26 -0400)]
drm/amdgpu: Handle GPUVM fault storms

When many wavefronts cause VM faults at the same time, it can
overwhelm the interrupt handler and cause IH ring overflows before
the driver can notify or kill the faulting application.

As a workaround I'm introducing limited per-VM fault credit. After
that number of VM faults have occurred, further VM faults are
filtered out at the prescreen stage of processing.

This depends on the PASID in the interrupt packet, so it currently
only works for KFD contexts.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/chash: Fix typo
Felix Kuehling [Wed, 20 Sep 2017 15:38:19 +0000 (11:38 -0400)]
drm/amd/chash: Fix typo

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: fix spelling mistake: "dividable" -> "divisible"
Colin Ian King [Thu, 28 Sep 2017 10:35:52 +0000 (11:35 +0100)]
drm/amd/powerplay: fix spelling mistake: "dividable" -> "divisible"

Trivial fix to spelling mistakes in pr_err error message and ASSERT
messages.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: reduce size of vega10_fuses_default
Dave Airlie [Thu, 28 Sep 2017 06:34:44 +0000 (16:34 +1000)]
amdgpu/pp: reduce size of vega10_fuses_default

I've no idea why this is like this, why store 64-bit fields
as a string, and then parse the strings, this is just over
engineered.

Reduce the size of the amdgpu.o by 80k.

   text    data     bss     dec     hex filename
1331332   17982    1008 1350322  149ab2 amdgpu.o
1244668   17982    1008 1263658  13482a amdgpu.o

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucer <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: use designated initialiser for thermal_irq_src.
Dave Airlie [Thu, 28 Sep 2017 06:12:28 +0000 (16:12 +1000)]
drm/amdgpu: use designated initialiser for thermal_irq_src.

This fixes the 0-day build warning.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: slim down the pwr virus tables.
Dave Airlie [Thu, 28 Sep 2017 04:48:38 +0000 (14:48 +1000)]
amdgpu/pp: slim down the pwr virus tables.

This is what I'd call slightly overengineered, we waste 40k on
storing a value that is write or end, when we could just use the
register value to denote end.

Remove the virus command parameter, and save
   text    data     bss     dec     hex filename
1412724   17982    1008 1431714  15d8a2 ../drm-next-build/drivers/gpu/drm/amd/amdgpu/amdgpu.o
1331332   17982    1008 1350322  149ab2 ../drm-next-build/drivers/gpu/drm/amd/amdgpu/amdgpu.o

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: move amdgpu_fuses_default into static const.
Dave Airlie [Wed, 27 Sep 2017 23:36:55 +0000 (09:36 +1000)]
amdgpu/pp: move amdgpu_fuses_default into static const.

There is no reason that this gets passed back into the function
from outside the file, just reference the table directly.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: move PhwVega10_Magic to static const.
Dave Airlie [Wed, 27 Sep 2017 23:32:49 +0000 (09:32 +1000)]
amdgpu/pp: move PhwVega10_Magic to static const.

This isn't used outside this file.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/pp: remove ci_smc/smumgr split.
Dave Airlie [Wed, 27 Sep 2017 23:01:47 +0000 (09:01 +1000)]
amdgpu/pp: remove ci_smc/smumgr split.

This split serves no purpose, and we can make a bunch of functions
static now.

There are lots of cases of this sort of split in the powerplay code,
please start cleaning them up. Ideally the function table is in the
same file as all the implementations used in it.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/vega10: static constify channel_number
Dave Airlie [Wed, 27 Sep 2017 20:51:21 +0000 (06:51 +1000)]
drm/amdgpu/vega10: static constify channel_number

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/pp: constify some powerplay tables
Dave Airlie [Wed, 27 Sep 2017 20:50:54 +0000 (06:50 +1000)]
drm/amdgpu/pp: constify some powerplay tables

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu/powerplay: constify large struct
Dave Airlie [Wed, 27 Sep 2017 20:44:36 +0000 (06:44 +1000)]
amdgpu/powerplay: constify large struct

This moves this from being global data to global rodata, I'm
sure it would be easy to move it to being local data.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoamdgpu: don't ask about CHASH just default it for now.
Dave Airlie [Wed, 27 Sep 2017 22:37:34 +0000 (08:37 +1000)]
amdgpu: don't ask about CHASH just default it for now.

If we bump this up a level, we can ask about it, for now,
just default to what amdgpu does.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: map compute rings by least recently used pipe
Andres Rodriguez [Tue, 26 Sep 2017 21:43:14 +0000 (17:43 -0400)]
drm/amdgpu: map compute rings by least recently used pipe

This patch provides a guarantee that the first n queues allocated by
an application will be on different pipes. Where n is the number of
pipes available from the hardware.

This helps avoid ring aliasing which can result in work executing in
time-sliced mode instead of truly parallel mode.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: add option for force enable multipipe policy for compute
Andres Rodriguez [Tue, 26 Sep 2017 16:22:46 +0000 (12:22 -0400)]
drm/amdgpu: add option for force enable multipipe policy for compute

Useful for testing the effects of multipipe compute without recompiling.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: use multipipe compute policy on non PL11 asics
Andres Rodriguez [Tue, 26 Sep 2017 16:22:45 +0000 (12:22 -0400)]
drm/amdgpu: use multipipe compute policy on non PL11 asics

A performance regression for OpenCL tests on Polaris11 had this feature
disabled for all asics.

Instead, disable it selectively on the affected asics.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: fix vf error handling
Alex Deucher [Thu, 28 Sep 2017 13:47:32 +0000 (09:47 -0400)]
drm/amdgpu: fix vf error handling

The error handling for virtual functions assumed a single
vf per VM and didn't properly account for bare metal.  Make
the error arrays per device and add locking.

Reviewed-by: Gavin Wan <gavin.wan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: clarify license in amdgpu_trace_points.c
Alex Deucher [Tue, 19 Sep 2017 14:20:27 +0000 (10:20 -0400)]
drm/amdgpu: clarify license in amdgpu_trace_points.c

It was not clear.  The rest of the driver is MIT/X11.

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Add gem_prime_mmap support
Samuel Li [Tue, 22 Aug 2017 19:25:33 +0000 (15:25 -0400)]
drm/amdgpu: Add gem_prime_mmap support

v2: drop hdp invalidate/flush.
v3: honor pgoff during prime mmap. Add a barrier after cpu access.
v4: drop begin/end_cpu_access() for now, revisit later.

Signed-off-by: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete dead code in smumgr
Rex Zhu [Wed, 20 Sep 2017 09:34:15 +0000 (17:34 +0800)]
drm/amd/powerplay: delete dead code in smumgr

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_FIELD_MASK
Rex Zhu [Wed, 20 Sep 2017 09:33:55 +0000 (17:33 +0800)]
drm/amd/powerplay: delete SMUM_FIELD_MASK

repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:31:07 +0000 (17:31 +0800)]
drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD

repeated defining in hwmgr.h
use PHM_WAIT_INDIRECT_FIELD instand.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_READ_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:26:03 +0000 (17:26 +0800)]
drm/amd/powerplay: delete SMUM_READ_FIELD

repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_SET_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:24:58 +0000 (17:24 +0800)]
drm/amd/powerplay: delete SMUM_SET_FIELD

repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:21:25 +0000 (17:21 +0800)]
drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD

repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:19:58 +0000 (17:19 +0800)]
drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD

repeated defining in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMUM_WRITE_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:18:16 +0000 (17:18 +0800)]
drm/amd/powerplay: delete SMUM_WRITE_FIELD

the macro is as same as PHM_WRITE_FIELD

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD
Rex Zhu [Wed, 20 Sep 2017 09:17:08 +0000 (17:17 +0800)]
drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD

the macro is as same as PHM_WRITE_INDIRECT_FIELD

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: move macros to hwmgr.h
Rex Zhu [Wed, 20 Sep 2017 09:29:23 +0000 (17:29 +0800)]
drm/amd/powerplay: move macros to hwmgr.h

the macro is not relevant to SMU,
so rename SMU_WAIT_FIELD_UNEQUAL to
PHM_WAIT_FIELD_UNEQUAL and move to hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h
Rex Zhu [Wed, 20 Sep 2017 09:04:33 +0000 (17:04 +0800)]
drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h

the macro is not relevant to SMU, so move to hwmgr.h
and rename to PHM_WAIT_VFPF_INDIRECT_FIELD

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h
Rex Zhu [Wed, 20 Sep 2017 09:00:50 +0000 (17:00 +0800)]
drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h

the macro is not relevant to SMU, so move to hwmgr.h
and rename to PHM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h
Rex Zhu [Wed, 20 Sep 2017 11:28:29 +0000 (19:28 +0800)]
drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h

the macro is not relevent to SMU, so move to hwmgr.h
and rename to PHM_WAIT_INDIRECT_FIELD_UNEQUAL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: add new helper functions in hwmgr.h
Rex Zhu [Wed, 20 Sep 2017 11:22:01 +0000 (19:22 +0800)]
drm/amd/powerplay: add new helper functions in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair
Rex Zhu [Wed, 20 Sep 2017 08:49:29 +0000 (16:49 +0800)]
drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair

in VFPF macros to support virtualization

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: refine powerplay code.
Rex Zhu [Tue, 26 Sep 2017 17:28:27 +0000 (13:28 -0400)]
drm/amd/powerplay: refine powerplay code.

delete struct smumgr, put smu backend function table
in struct hwmgr

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: delete dead code in hwmgr.h
Rex Zhu [Wed, 20 Sep 2017 07:41:33 +0000 (15:41 +0800)]
drm/amd/powerplay: delete dead code in hwmgr.h

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: refine interface in struct pp_smumgr_func
Rex Zhu [Wed, 20 Sep 2017 03:22:56 +0000 (11:22 +0800)]
drm/amd/powerplay: refine interface in struct pp_smumgr_func

unify to use struct hwmgr as function parameter in
smumgr.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: simplify pinning into visible VRAM
Christian König [Mon, 11 Sep 2017 15:29:26 +0000 (17:29 +0200)]
drm/amdgpu: simplify pinning into visible VRAM

Just set the CPU access required flag when we pin it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:fix firmware memoryleak(v2)
Monk Liu [Tue, 19 Sep 2017 08:09:53 +0000 (16:09 +0800)]
drm/amdgpu:fix firmware memoryleak(v2)

this fix memory leak due to request_firmware after driver
unloaded

v2:
release gmc firmware for gmc6/7/8 as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:fix uvd ring fini routine(v2)
Monk Liu [Fri, 15 Sep 2017 08:43:01 +0000 (16:43 +0800)]
drm/amdgpu:fix uvd ring fini routine(v2)

fix missing finish uvd enc_ring.
v2:
since the adev pointer check in already in ring_fini
so drop the check outsider

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:alloc KIQ MQD in VRAM(v2)
Monk Liu [Thu, 21 Sep 2017 07:10:06 +0000 (15:10 +0800)]
drm/amdgpu/sriov:alloc KIQ MQD in VRAM(v2)

this way after KIQ MQD released in drv unloading, CPC
can still let KIQ access this MQD thus RLCV SAVE_VF
will not fail

v2:
always use VRAM domain for KIQ MQD no matter BM or SRIOV

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:unmap KCQ in gfx hw_fini(v2)
Monk Liu [Thu, 21 Sep 2017 06:59:40 +0000 (14:59 +0800)]
drm/amdgpu:unmap KCQ in gfx hw_fini(v2)

v2:
move kcq_disable out of SRIOV, make it genearal

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:halt when vm fault
Monk Liu [Tue, 4 Jul 2017 08:40:58 +0000 (16:40 +0800)]
drm/amdgpu:halt when vm fault

only with this way we can debug the VMC page fault issue

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Add copy_pte_num_dw member in amdgpu_vm_pte_funcs
Yong Zhao [Tue, 19 Sep 2017 16:58:15 +0000 (12:58 -0400)]
drm/amdgpu: Add copy_pte_num_dw member in amdgpu_vm_pte_funcs

Use it to replace the hard coded value in amdgpu_vm_bo_update_mapping().

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Fix a bug in amdgpu_fill_buffer()
Yong Zhao [Fri, 15 Sep 2017 22:20:37 +0000 (18:20 -0400)]
drm/amdgpu: Fix a bug in amdgpu_fill_buffer()

When max_bytes is not 8 bytes aligned and bo size is larger than
max_bytes, the last 8 bytes in a ttm node may be left unchanged.
For example, on pre SDMA 4.0, max_bytes = 0x1fffff, and the bo size
is 0x200000, the problem will happen.

In order to fix the problem, we separately store the max nums of
PTEs/PDEs a single operation can set in amdgpu_vm_pte_funcs
structure, rather than inferring it from bytes limit of SDMA
constant fill, i.e. fill_max_bytes.

Together with the fix, we replace the hard code value "10" in
amdgpu_vm_bo_update_mapping() with the corresponding values from
structure amdgpu_vm_pte_funcs.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Correct bytes limit for SDMA 3.0 copy and fill
Yong Zhao [Mon, 18 Sep 2017 18:25:31 +0000 (14:25 -0400)]
drm/amdgpu: Correct bytes limit for SDMA 3.0 copy and fill

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: use 2MB fragment size for GFX6,7 and 8
Christian König [Mon, 18 Sep 2017 12:32:38 +0000 (14:32 +0200)]
drm/amdgpu: use 2MB fragment size for GFX6,7 and 8

Use 2MB fragment size by default for older hardware generations as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: John Bridgman <john.bridgman@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: Fix driver reloading failure
Xiangliang.Yu [Thu, 21 Sep 2017 02:19:49 +0000 (10:19 +0800)]
drm/amdgpu: Fix driver reloading failure

SRIOV doesn't implement PMC capability of PCIe, so it can't update
power state by reading PMC register.

Currently, amdgpu driver doesn't disable pci device when removing
driver, the enable_cnt of pci device will not be decrease to 0.
When reloading driver, pci_enable_device will do nothing as
enable_cnt is not zero. And power state will not be updated as PMC
is not support.
So current_state of pci device is not D0 state and pci_enable_msi
return fail.

Add pci_disable_device when remmoving driver to fix the issue.

Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: refine phm_register_thermal_interrupt interface
Rex Zhu [Thu, 21 Sep 2017 02:34:48 +0000 (10:34 +0800)]
drm/amd/powerplay: refine phm_register_thermal_interrupt interface

currently, not all asics implement this callback function
so not return error to avoid powerplay initialize failed
in those asices

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/amdgpu: add vega10/raven mmhub/athub golden settings
Evan Quan [Wed, 20 Sep 2017 08:25:40 +0000 (16:25 +0800)]
drm/amd/amdgpu: add vega10/raven mmhub/athub golden settings

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: change alert temperature range
Eric Huang [Tue, 19 Sep 2017 17:32:10 +0000 (13:32 -0400)]
drm/amd/powerplay: change alert temperature range

Change to more meaningful range that triggers thermal
interrupts.

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: implement register thermal interrupt for Vega10
Eric Huang [Fri, 15 Sep 2017 20:43:38 +0000 (16:43 -0400)]
drm/amd/powerplay: implement register thermal interrupt for Vega10

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/powerplay: add register thermal interrupt in hwmgr_hw_init
Eric Huang [Fri, 15 Sep 2017 20:38:49 +0000 (16:38 -0400)]
drm/amd/powerplay: add register thermal interrupt in hwmgr_hw_init

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu: add cgs query info of pci bus devfn
Eric Huang [Fri, 15 Sep 2017 20:33:38 +0000 (16:33 -0400)]
drm/amdgpu: add cgs query info of pci bus devfn

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/amdgpu: Partial revert of iova debugfs
Tom St Denis [Tue, 19 Sep 2017 15:29:04 +0000 (11:29 -0400)]
drm/amd/amdgpu: Partial revert of iova debugfs

We discovered that on some devices even with iommu enabled
you can access all of system memory through the iommu translation.

Therefore, we revert the read method to the translation only service
and drop the write method completely.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christan König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/amgpu: update vega10 sdma golden setting
Evan Quan [Wed, 20 Sep 2017 02:55:44 +0000 (10:55 +0800)]
drm/amd/amgpu: update vega10 sdma golden setting

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amd/amgpu: update raven sdma golden setting
Evan Quan [Wed, 20 Sep 2017 02:53:39 +0000 (10:53 +0800)]
drm/amd/amgpu: update raven sdma golden setting

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:fix memory leak after gpu reset
Monk Liu [Fri, 15 Sep 2017 06:35:09 +0000 (14:35 +0800)]
drm/amdgpu/sriov:fix memory leak after gpu reset

GPU reset will require all hw doing hw_init thus
ucode_init_bo will be invoked again, which lead to
memory leak

skip the fw_buf allocation during sriov gpu reset to avoid
memory leak.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:make ctx_add_fence interruptible(v2)
Monk Liu [Fri, 15 Sep 2017 05:40:31 +0000 (13:40 +0800)]
drm/amdgpu:make ctx_add_fence interruptible(v2)

otherwise a gpu hang will make application couldn't be killed
under timedout=0 mode

v2:
Fix memoryleak job/job->s_fence issue
unlock mn
remove the ERROR msg after waiting being interrupted

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:init csb for gfxv9
Monk Liu [Fri, 15 Sep 2017 08:58:08 +0000 (16:58 +0800)]
drm/amdgpu/sriov:init csb for gfxv9

RLC need CSB registers initiated under SRIOV during world switch
otherwise the clear state buffer behav will not be recovered to
current VF scheme after switch back

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:increate mailbox polling timeout
Horace Chen [Wed, 28 Jun 2017 09:51:50 +0000 (17:51 +0800)]
drm/amdgpu/sriov:increate mailbox polling timeout

increase timeout to 12 seconds,because there may have multiple
FLR waiting for done, the waiting time of events may be long,
increase to 12s to reduce timeout failure.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:fix page fault issue of driver unload
Monk Liu [Fri, 15 Sep 2017 07:34:52 +0000 (15:34 +0800)]
drm/amdgpu/sriov:fix page fault issue of driver unload

bo_free on csa is too late to put in amdgpu_fini because that
time ttm is already finished,
Move it earlier to avoid the page fault.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:use formal register to trigger hdp invalidate
Monk Liu [Tue, 4 Jul 2017 07:43:38 +0000 (15:43 +0800)]
drm/amdgpu:use formal register to trigger hdp invalidate

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:hdp flush should be put it initialized
Monk Liu [Fri, 15 Sep 2017 07:03:24 +0000 (15:03 +0800)]
drm/amdgpu:hdp flush should be put it initialized

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:insert TMZ_BEGIN
Monk Liu [Fri, 9 Jun 2017 07:04:49 +0000 (15:04 +0800)]
drm/amdgpu:insert TMZ_BEGIN

FRAME_CONTROL(begin) is needed for vega10 due to ucode logic change,
it can fix some CTS random fail under gfx preemption enabled mode.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:don't load psp fw during gpu reset
Monk Liu [Fri, 15 Sep 2017 10:42:12 +0000 (18:42 +0800)]
drm/amdgpu/sriov:don't load psp fw during gpu reset

At least for SRIOV we found reload PSP fw during
gpu reset cause PSP hang.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:move in_reset to adev and rename
Monk Liu [Fri, 15 Sep 2017 10:57:12 +0000 (18:57 +0800)]
drm/amdgpu/sriov:move in_reset to adev and rename

currently in_reset is only used in sriov gpu reset, and it
will be used for other non-gfx hw component later, like
PSP, so move it from gfx to adev and rename to in_sriov_reset
make more sense.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu:no kiq in IH
Monk Liu [Thu, 14 Sep 2017 11:45:33 +0000 (19:45 +0800)]
drm/amdgpu:no kiq in IH

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agodrm/amdgpu/sriov:fix missing error handling
Monk Liu [Tue, 12 Sep 2017 06:33:29 +0000 (14:33 +0800)]
drm/amdgpu/sriov:fix missing error handling

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>