Eric Huang [Fri, 9 Jul 2021 18:59:20 +0000 (14:59 -0400)]
Revert "drm/amdkfd: Add heavy-weight TLB flush after unmapping"
This reverts commit
1098d658bef05e5fee634aab0b6a1fa590cfca24.
Reason for revert: it causes regressions on several Asics.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 9 Jul 2021 18:58:16 +0000 (14:58 -0400)]
Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update"
This reverts commit
075e8080c1a7571563171a07fa9ce47c4bc80044.
Reason for revert: the related commit is reverted.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 9 Jul 2021 18:57:11 +0000 (14:57 -0400)]
Revert "drm/amdkfd: Make TLB flush conditional on mapping"
This reverts commit
31f33243788dcbae8bd2819ed83923a73f7dfd30.
Reason for revert: it causes regressions on several Asics.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 9 Jul 2021 18:55:03 +0000 (14:55 -0400)]
Revert "drm/amdgpu: Fix warning of Function parameter or member not described"
This reverts commit
7a68d188d1c4a9d947369acaa19040a58baaaeda.
Reason for revert: the related commit is reverted.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 9 Jul 2021 18:51:34 +0000 (14:51 -0400)]
Revert "drm/amdkfd: Add memory sync before TLB flush on unmap"
This reverts commit
3be4dca197010d1328df8b11febc8c40491be498.
Reason for revert: it causes regressions on several Asics.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 9 Jul 2021 18:48:33 +0000 (14:48 -0400)]
Revert "drm/amdkfd: Only apply TLB flush optimization on ALdebaran"
This reverts commit
51627f03804173a64d23828bc9e4d8474451814f.
Reason for revert: it causes regression on Aldebaran.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chengming Gui [Fri, 9 Jul 2021 08:23:48 +0000 (16:23 +0800)]
drm/amd/pm: Fix BACO state setting for Beige_Goby
Correct BACO state setting for Beige_Goby
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Emily.Deng [Thu, 1 Oct 2020 04:41:50 +0000 (12:41 +0800)]
drm/amdgpu: Restore msix after FLR
After FLR, the msix will be cleared, so need to re-enable it.
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Wed, 7 Jul 2021 21:55:55 +0000 (17:55 -0400)]
drm/amdkfd: Allow CPU access for all VRAM BOs
The thunk needs to mmap all BOs for CPU access to allow the debugger to
access them. Invisible ones are mapped with PROT_NONE.
Fixes:
71df0368e9b6 ("drm/amdgpu: Implement mmap as GEM object function")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhan Liu [Thu, 8 Jul 2021 04:51:52 +0000 (00:51 -0400)]
drm/amdgpu/display - only update eDP's backlight level when necessary
[Why]
The original logic is to update eDP's backlight level
on every amdgpu dm atomic commit, which causes excessive
DMUB write. As a result, when playing game or moving window
around, DMUB timeout and system lagging are observed.
[How]
We only need to update eDP's backlight level when current level
doesn't match requested level.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
John Clements [Wed, 9 Jun 2021 08:32:37 +0000 (16:32 +0800)]
drm/amdgpu: initialize umc ras function
support umc ras function initialization for aldebaran
v2: squash in compile fix
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 7 Jul 2021 16:42:34 +0000 (12:42 -0400)]
drm/amdkfd: handle fault counters on invalid address
prange is NULL if vm fault retry on invalid address, for this case, can
not use prange to get pdd, use adev to get gpuidx and then get pdd
instead, then increase pdd vm fault counter.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Emily Deng [Tue, 12 May 2020 10:27:21 +0000 (18:27 +0800)]
drm/amdgpu: Correct the irq numbers for virtual crtc
The irq number should be decided by num_crtc, and the num_crtc could change
by parameter.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaomeng Hou [Wed, 7 Jul 2021 08:50:01 +0000 (16:50 +0800)]
drm/amd/display: update header file name
Update the register header file name.
Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaomeng Hou [Thu, 1 Jul 2021 06:09:18 +0000 (14:09 +0800)]
drm/amd/pm: drop smu_v13_0_1.c|h files for yellow carp
Since there's nothing special in smu implementation for yellow carp,
it's better to reuse the common smu_v13_0 interfaces and drop the
specific smu_v13_0_1.c|h files.
v2: remove the duplicate register offset and shift mask header files as
well.
Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 25 Jun 2021 07:57:40 +0000 (15:57 +0800)]
drm/amd/pm: bump DRIVER_IF_VERSION for Sienna Cichlid
To suppress the annoying warning about version mismatch.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Tue, 6 Jul 2021 04:27:56 +0000 (12:27 +0800)]
drm/amd/pm: update the gpu metrics data retrieving for Sienna Cichlid
Due to the structure layout change: "uint32_t ThrottlerStatus" -> "
uint8_t ThrottlingPercentage[THROTTLER_COUNT]".
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Tue, 6 Jul 2021 04:25:29 +0000 (12:25 +0800)]
drm/amd/pm: new SmuMetrics data structure for Sienna Cichlid
Due to the structure layout change: "uint32_t ThrottlerStatus" -> "
uint8_t ThrottlingPercentage[THROTTLER_COUNT]".
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Sat, 3 Jul 2021 09:46:20 +0000 (12:46 +0300)]
drm/amdgpu: return -EFAULT if copy_to_user() fails
If copy_to_user() fails then this should return -EFAULT instead of
-EINVAL.
Fixes:
c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Sat, 3 Jul 2021 09:45:39 +0000 (12:45 +0300)]
drm/amdgpu: unlock on error in amdgpu_ras_debugfs_table_read()
This error path needs to unlock before returning. While we're at it,
the correct error code from copy_to_user() failure is -EFAULT, not
-EINVAL.
Fixes:
c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Sat, 3 Jul 2021 09:44:57 +0000 (12:44 +0300)]
drm/amdgpu: Fix signedness bug in __amdgpu_eeprom_xfer()
The i2c_transfer() function returns negatives or else the number of
messages transferred. This code does not work because ARRAY_SIZE()
is type size_t and so that means negative values of "r" are type
promoted to high positive values which are greater than the ARRAY_SIZE().
Fix this by changing the < to != which works regardless of type
promotion.
Fixes:
746b584762e452 ("drm/amdgpu: Fixes to the AMDGPU EEPROM driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Sat, 3 Jul 2021 09:44:12 +0000 (12:44 +0300)]
drm/amdgpu: fix a signedness bug in __verify_ras_table_checksum()
If amdgpu_eeprom_read() returns a negative error code then the error
handling checks:
if (res < buf_size) {
The problem is that "buf_size" is a u32 so negative values are type
promoted to a high positive values and the condition is false. Fix
this by changing the type of "buf_size" to int.
Fixes:
63d4c081a556a1 ("drm/amdgpu: Optimize EEPROM RAS table I/O")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Tue, 22 Jun 2021 20:17:50 +0000 (16:17 -0400)]
drm/amd/display: increase max EDID size to 2k
[Why]
EDID CTS requires at least 2k (16 blocks) to be readable.
[How]
Increase EDID buffer size to 2k
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Mon, 21 Jun 2021 16:01:45 +0000 (12:01 -0400)]
drm/amd/display: Round KHz up when calculating clock requests
[Why]
When requesting clocks from SMU which takes MHz inputs, DC will round
down KHz when converting to MHz, thus potentially requesting too low a
clock value.
[How]
Round up (ceil) when converting KHz to MHz for clock requests to SMU.
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Mon, 21 Jun 2021 03:07:37 +0000 (23:07 -0400)]
drm/amd/display: 3.2.142
DC version 3.2.142 brings improvements in multiple areas. In summary, we
highlight:
- Freesync improvements
- Remove unnecessary assert
- Firmware release 0.0.72
- Improve the EDID manipulation and DML calculations
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Sun, 20 Jun 2021 02:53:05 +0000 (22:53 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.72
- Updated SCR definition for FW boot options for Separate DCN init
for DMUB FW loaded in VBL
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alvin Lee [Thu, 17 Jun 2021 23:35:50 +0000 (19:35 -0400)]
drm/amd/display: Adjust types and formatting for future development
Type adjustments and formatting fixes.
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dmytro Laktyushkin [Tue, 15 Jun 2021 19:11:31 +0000 (15:11 -0400)]
drm/amd/display: remove faulty assert
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wesley Chalmers [Wed, 16 Jun 2021 20:11:23 +0000 (16:11 -0400)]
Revert "drm/amd/display: Always write repeater mode regardless of LTTPR"
This reverts commit
2b7605d73b97e2fa28e0817242e66ca968d2a7cb
Some displays are not lighting up when put in LTTPR Transparent Mode
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Wed, 16 Jun 2021 21:11:12 +0000 (17:11 -0400)]
drm/amd/display: Fix updating infoframe for DCN3.1 eDP
[Why]
We're only treating TMDS as a valid target for infoframe updates which
results in PSR being unable to transition from state 4 to state 5.
[How]
Also allow infoframe updates for DCN3.1 - following how we handle
this path for earlier ASIC as well.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stylon Wang [Sat, 29 May 2021 06:19:20 +0000 (14:19 +0800)]
drm/amd/display: Add Freesync HDMI support to DM with DMUB
[Why]
Changes in DM needed to support Freesync HDMI on DMUB.
[How]
Change implementation to parse CEA blocks in case of DMUB-enabled ASICs.
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wang [Tue, 15 Jun 2021 21:22:57 +0000 (17:22 -0400)]
drm/amd/display: Add null checks
Added NULL checks before two problematic statements
Signed-off-by: Wang <anguwang@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chun-Liang Chang [Fri, 11 Jun 2021 03:05:20 +0000 (11:05 +0800)]
drm/amd/display: DMUB Outbound Interrupt Process-X86
[Why]
dmub would notify x86 response time violation by GPINT_DATAOUT
[How]
1. Use GPINT_DATAOUT to trigger x86 interrupt
2. Register GPINT_DATAOUT interrupt handler.
3. Trigger ACR while GPINT_DATAOUT occurred.
Signed-off-by: Chun-Liang Chang <Chun-Liang.Chang@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenjing Liu [Tue, 4 May 2021 20:39:08 +0000 (16:39 -0400)]
drm/amd/display: isolate link training setting override to its own function
There is a difference between our default behavior and override
behavior. For default behavior we need to decide link training settings
within specs' limitation and mandates.
For override behavior we do not need to follow all these requirements.
We are isolating override decision to its own function to maintain the
integrity of our specs compliant default behavior.
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: George Shen <George.Shen@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 2 Jul 2021 22:35:14 +0000 (18:35 -0400)]
drm/amdgpu: Return error if no RAS
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.
Also, if no pointers are set, don't count, since
we've no way of reporting the counts.
Also, give this function a kernel-doc.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: Tom Rix <trix@redhat.com>
Fixes:
a46751fbcde505 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jingwen Chen [Thu, 1 Jul 2021 02:19:17 +0000 (10:19 +0800)]
drm/amdgpu: SRIOV flr_work should take write_lock
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.
[How]
flr_work should take write_lock to avoid this case.
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 2 Jul 2021 23:58:42 +0000 (19:58 -0400)]
drm/amdgpu: The I2C IP doesn't support 0 writes/reads
The I2C IP doesn't support writes or reads of 0 bytes.
In order for a START/STOP transaction to take
place on the bus, the data written/read has to be
at least one byte.
That is, you cannot generate a write with 0 bytes,
just to get the ACK from a device, just so you can
probe that device if it is on the bus and so to
discover all devices on the bus--you'll have to
read at least one byte. Writes of 0 bytes generate
no START/STOP on this I2C IP--the bus is not
engaged at all.
Set the I2C_AQ_NO_ZERO_LEN to the existing I2C
quirk tables for Aldebaran, Arcturus, Navi10 and
Sienna Cichlid, and add a quirk table to the I2C
driver which drives the bus when the SMU
doesn't--for instance on Vega20.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 2 Jul 2021 23:58:42 +0000 (19:58 -0400)]
drm/amd/pm: Add I2C quirk table to Aldebaran
Add I2C quirk table to Aldebaran.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YuBiao Wang [Tue, 29 Jun 2021 03:21:25 +0000 (11:21 +0800)]
drm/amdgpu: Read clock counter via MMIO to reduce delay (v5)
[Why]
GPU timing counters are read via KIQ under sriov, which will introduce
a delay.
[How]
It could be directly read by MMIO.
v2: Add additional check to prevent carryover issue.
v3: Only check for carryover for once to prevent performance issue.
v4: Add comments of the rough frequency where carryover happens.
v5: Remove mutex and gfxoff ctrl unused with current timing registers.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.co>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Wed, 30 Jun 2021 21:58:12 +0000 (17:58 -0400)]
drm/amdkfd: Only apply TLB flush optimization on ALdebaran
It is based on reverting two patches back.
drm/amdkfd: Make TLB flush conditional on mapping
drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nirmoy Das [Fri, 2 Jul 2021 09:10:52 +0000 (11:10 +0200)]
drm/amdgpu: separate out vm pasid assignment
Use new helper function amdgpu_vm_set_pasid() to
assign vm pasid value. This also ensures that we don't free
a pasid from vm code as pasids are allocated somewhere else.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nirmoy Das [Mon, 28 Jun 2021 21:29:39 +0000 (23:29 +0200)]
drm/amdgpu: use xarray for storing pasid in vm
Replace idr with xarray as we actually need hash functionality.
Cleanup code related to vm pasid by adding helper function.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jiri Kosina [Thu, 24 Jun 2021 11:11:36 +0000 (13:11 +0200)]
drm/amdgpu: Avoid printing of stack contents on firmware load error
In case when psp_init_asd_microcode() fails to load ASD microcode file,
psp_v12_0_init_microcode() tries to print the firmware filename that
failed to load before bailing out.
This is wrong because:
- the firmware filename it would want it print is an incorrect one as
psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading
different filenames
- it tries to print fw_name, but that's not yet been initialized by that
time, so it prints random stack contents, e.g.
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2
amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode
amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff"
Fix that by bailing out immediately, instead of priting the bogus error
message.
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jiri Kosina [Thu, 24 Jun 2021 11:20:21 +0000 (13:20 +0200)]
drm/amdgpu: Fix resource leak on probe error path
This reverts commit
4192f7b5768912ceda82be2f83c87ea7181f9980.
It is not true (as stated in the reverted commit changelog) that we never
unmap the BAR on failure; it actually does happen properly on
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
amdgpu_device_fini() error path.
What's worse, this commit actually completely breaks resource freeing on
probe failure (like e.g. failure to load microcode), as
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
early, leaving all the resources that'd normally be freed in
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
to all sorts of oopses when someone tries to, for example, access the
sysfs and procfs resources which are still around while the driver is
gone.
Fixes:
4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lang Yu [Thu, 24 Jun 2021 04:05:16 +0000 (12:05 +0800)]
drm/amdgpu: show explicit name instead of id in psp_cmd_submit_buf
Use amdgpu_ucode_name to show ucode name and psp_gfx_cmd_name to
show psp_gfx_cmd name in psp_cmd_submit_buf.
v2: adjust function name
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lang Yu [Thu, 24 Jun 2021 03:51:46 +0000 (11:51 +0800)]
drm/amdgpu: add function to show psp_gfx_cmd name via id
Implement function psp_gfx_cmd_name to show cmd name
via cmd id.
v2: rename it to psp_gfx_cmd_name
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lang Yu [Thu, 24 Jun 2021 03:36:57 +0000 (11:36 +0800)]
drm/amdgpu: add function to show ucode name via id
Implement function amdgpu_ucode_name to show ucode name
via ucode id.
v2: rename it to amdgpu_ucode_name
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 29 Jun 2021 14:11:12 +0000 (10:11 -0400)]
drm/amdgpu: add license to umc_8_7_0_sh_mask.h
Was missing. Add it.
Fixes:
6b36fa6143f6ca ("drm/amdgpu: add umc v8_7_0 IP headers")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lukas Bulwahn [Mon, 28 Jun 2021 10:53:34 +0000 (12:53 +0200)]
drm/amdgpu: rectify line endings in umc v8_7_0 IP headers
Commit
6b36fa6143f6 ("drm/amdgpu: add umc v8_7_0 IP headers") adds the new
file ./drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_7_0_sh_mask.h with
DOS line endings, which is very uncommon for the kernel repository.
Rectify the line endings in this file with dos2unix.
Identified by a checkpatch evaluation on the whole kernel repository and
spot-checking for really unexpected checkpatch rule violations.
Reported-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Mon, 28 Jun 2021 16:22:13 +0000 (12:22 -0400)]
drm/amd/pm: Simplify managed I2C transfer of Aldebaran
Simplify Aldebaran managed I2C transfer function
to correctly play with the upper I2C layers.
This gets it in line with Navi10, Acturus, and
Sienna Cichlid.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 11 Jun 2021 06:15:49 +0000 (02:15 -0400)]
drm/amdgpu: Correctly disable the I2C IP block
On long transfers to the EEPROM device,
i.e. write, it is observed that the driver aborts
the transfer.
The reason for this is that the driver isn't
patient enough--the IC_STATUS register's contents
is 0x27, which is MST_ACTIVITY | TFE | TFNF |
ACTIVITY. That is, while the transmission FIFO is
empty, we, the I2C master device, are still
driving the bus.
Implement the correct procedure to disable
the block, as described in the DesignWare I2C
Databook, section 3.8.3 Disabling DW_apb_i2c on
page 56. Now there are no premature aborts on long
data transfers.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 4 Jun 2021 03:58:19 +0000 (23:58 -0400)]
drm/amdgpu: Use a single loop
In smu_v11_0_i2c_transmit() use a single loop to
transmit bytes, instead of two nested loops.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 30 Apr 2021 00:15:42 +0000 (20:15 -0400)]
drm/amdgpu: Fix koops when accessing RAS EEPROM
Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, an also when "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Jun 2021 15:19:14 +0000 (11:19 -0400)]
drm/amdgpu: fix 64 bit divide in eeprom code
pos is 64 bits.
Fixes:
c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Cc: luben.tuikov@amd.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 8 Apr 2021 15:34:26 +0000 (11:34 -0400)]
drm/amdgpu: RAS EEPROM table is now in debugfs
Add "ras_eeprom_size" file in debugfs, which
reports the maximum size allocated to the RAS
table in EEROM, as the number of bytes and the
number of records it could store. For instance,
$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
262144 bytes or 10921 records
$_
Add "ras_eeprom_table" file in debugfs, which
dumps the RAS table stored EEPROM, in a formatted
way. For instance,
$cat ras_eeprom_table
Signature Version FirstOffs Size Checksum
0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
Index Offset ErrType Bank/CU TimeStamp Offs/Addr MemChl MCUMCID RetiredPage
0 0x00014 ue 0x00 0x00000000607608DC 0x000000000000 0x00 0x00 0x000000000000
1 0x0002C ue 0x00 0x00000000607608DC 0x000000001000 0x00 0x00 0x000000000001
2 0x00044 ue 0x00 0x00000000607608DC 0x000000002000 0x00 0x00 0x000000000002
3 0x0005C ue 0x00 0x00000000607608DC 0x000000003000 0x00 0x00 0x000000000003
4 0x00074 ue 0x00 0x00000000607608DC 0x000000004000 0x00 0x00 0x000000000004
5 0x0008C ue 0x00 0x00000000607608DC 0x000000005000 0x00 0x00 0x000000000005
6 0x000A4 ue 0x00 0x00000000607608DC 0x000000006000 0x00 0x00 0x000000000006
7 0x000BC ue 0x00 0x00000000607608DC 0x000000007000 0x00 0x00 0x000000000007
8 0x000D4 ue 0x00 0x00000000607608DD 0x000000008000 0x00 0x00 0x000000000008
$_
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Xinhui Pan <xinhui.pan@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 6 Apr 2021 11:07:54 +0000 (07:07 -0400)]
drm/amdgpu: Optimize EEPROM RAS table I/O
Split functionality between read and write, which
simplifies the code and exposes areas of
optimization and more or less complexity, and take
advantage of that.
Read and write the table in one go; use a separate
stage to decode or encode the data, as opposed to
on the fly, which keeps the I2C bus busy. Use a
single read/write to read/write the table or at
most two if the number of records we're
reading/writing wraps around.
Check the check-sum of a table in EEPROM on init.
Update the checksum at the same time as when
updating the table header signature, when the
threshold was increased on boot.
Take advantage of arithmetic modulo 256, that is,
use a byte!, to greatly simplify checksum
arithmetic.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 16 Jun 2021 17:21:03 +0000 (13:21 -0400)]
drm/amdgpu: Get rid of test function
The code is now tested from userspace.
Remove already macroed out test function.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 16 Jun 2021 17:09:37 +0000 (13:09 -0400)]
drm/amdgpu: Some renames
Qualify with "ras_". Use kernel's own--don't
redefine your own.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 16 Jun 2021 15:26:07 +0000 (11:26 -0400)]
drm/amdgpu: Nerf buff
buff --> buf. Essentially buffer abbreviates to
buf, remove 1/2 of it, or just the iron part, as
opposed to just the Er,
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Sat, 27 Mar 2021 07:57:49 +0000 (03:57 -0400)]
drm/amdgpu: Use explicit cardinality for clarity
RAS_MAX_RECORD_NUM may mean the maximum record
number, as in the maximum house number on your
street, or it may mean the maximum number of
records, as in the count of records, which is also
a number. To make this distinction whether the
number is ordinal (index) or cardinal (count),
rename this macro to RAS_MAX_RECORD_COUNT.
This makes it easy to understand what it refers
to, especially when we compute quantities such as,
how many records do we have left in the table,
especially when there are so many other numbers,
quantities and numerical macros around.
Also rename the long,
amdgpu_ras_eeprom_get_record_max_length() to the
more succinct and clear,
amdgpu_ras_eeprom_max_record_count().
When computing the threshold, which also deals
with counts, i.e. "how many", use cardinal
"max_eeprom_records_count", than the quantitative
"max_eeprom_records_len".
Simplify the logic here and there, as well.
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Sat, 27 Mar 2021 00:13:10 +0000 (20:13 -0400)]
drm/amdgpu: Simplify RAS EEPROM checksum calculations
Rename update_table_header() to
write_table_header() as this function is actually
writing it to EEPROM.
Use kernel types; use u8 to carry around the
checksum, in order to take advantage of arithmetic
modulo 8-bits (256).
Tidy up to 80 columns.
When updating the checksum, just recalculate the
whole thing.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 26 Mar 2021 20:40:22 +0000 (16:40 -0400)]
drm/amdgpu: Fix amdgpu_ras_eeprom_init()
No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes the the lower layers.
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 11 Mar 2021 15:34:28 +0000 (10:34 -0500)]
drm/amdgpu: Return result fix in RAS
The low level EEPROM write method, doesn't return
1, but the number of bytes written. Thus do not
compare to 1, instead, compare to greater than 0
for success.
Other cleanup: if the lower layers returned
-errno, then return that, as opposed to
overwriting the error code with one-fits-all
-EINVAL. For instance, some return -EAGAIN.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 10 Mar 2021 19:02:04 +0000 (14:02 -0500)]
drm/amdgpu: Fix width of I2C address
The I2C address is kept as a 16-bit quantity in
the kernel. The I2C_TAR::I2C_TAR field is 10-bit
wide.
Fix the width of the I2C address for Vega20 from 8
bits to 16 bits to accommodate the full spectrum
of I2C address space.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 9 Mar 2021 22:29:22 +0000 (17:29 -0500)]
drm/amd/pm: Simplify managed I2C transfer functions
Now that we have an I2C quirk table for
SMU-managed I2C controllers, the I2C core does the
checks for us, so we don't need to do them, and so
simplify the managed I2C transfer functions.
Also, for Arcturus and Navi10, fix setting the
command type from "cmd->CmdConfig" to "cmd->Cmd".
The latter is what appears to be taking in
the enumeration I2C_CMD_... as an integer,
not a bit-flag.
For Sienna, the "Cmd" field seems to have been
eliminated, and command type and flags all live in
the "CmdConfig" field--this is left untouched.
Fix: Detect and add changing of direction
bit-flag, as this is necessary for the SMU to
detect the direction change in the 1-d array of
data it gets.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Sat, 20 Feb 2021 04:37:28 +0000 (23:37 -0500)]
drm/amd/pm: Extend the I2C quirk table
Extend the I2C quirk table for SMU access
controlled I2C adapters. Let the kernel I2C layer
check that the messages all have the same address,
and that their combined size doesn't exceed the
maximum size of a SMU software I2C request.
Suggested-by: Jean Delvare <jdelvare@suse.de>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 11 Mar 2021 22:12:32 +0000 (17:12 -0500)]
drm/amdgpu: EEPROM: add explicit read and write
Add explicit amdgpu_eeprom_read() and
amdgpu_eeprom_write() for clarity.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 11 Mar 2021 16:20:15 +0000 (11:20 -0500)]
drm/amdgpu: RAS xfer to read/write
Wrap amdgpu_ras_eeprom_xfer(..., bool write),
into amdgpu_ras_eeprom_read() and
amdgpu_ras_eeprom_write(), as that makes reading
and understanding the code clearer.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 5 Feb 2021 23:48:01 +0000 (18:48 -0500)]
drm/amdgpu: Rename misspelled function
Instead of fixing the spelling in
amdgpu_ras_eeprom_process_recods(),
rename it to,
amdgpu_ras_eeprom_xfer(),
to look similar to other I2C and protocol
transfer (read/write) functions.
Also to keep the column span to within reason by
using a shorter name.
Change the "num" function parameter from "int" to
"const u32" since it is the number of items
(records) to xfer, i.e. their count, which cannot
be a negative number.
Also swap the order of parameters, keeping the
pointer to records and their number next to each
other, while the direction now becomes the last
parameter.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 4 Feb 2021 19:15:01 +0000 (14:15 -0500)]
drm/amdgpu: RAS: EEPROM --> RAS
In amdgpu_ras_eeprom.c--the interface from RAS to
EEPROM, rename macros from EEPROM to RAS, to
indicate that the quantities and objects are RAS
specific, not EEPROM. We can decrease the RAS
table, or put it in different offset of EEPROM as
needed in the future.
Remove EEPROM_ADDRESS_SIZE macro definition, equal
to 2, from the file and calculations, as that
quantity is computed and added on the stack,
in the lower layer, amdgpu_eeprom_xfer().
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 3 Feb 2021 20:19:12 +0000 (15:19 -0500)]
drm/amdgpu: I2C class is HWMON
Set the auto-discoverable class of I2C bus to
HWMON. Remove SPD.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 2 Feb 2021 23:36:25 +0000 (18:36 -0500)]
drm/amdgpu: Fix wrap-around bugs in RAS
Fix the size of the EEPROM from 256000 bytes
to 262144 bytes (256 KiB).
Fix a couple or wrap around bugs. If a valid
value/address is 0 <= addr < size, the inverse of
this inequality (barring negative values which
make no sense here) is addr >= size. Fix this in
the RAS code.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 2 Feb 2021 16:26:13 +0000 (11:26 -0500)]
drm/amdgpu: RAS and FRU now use 19-bit I2C address
Convert RAS and FRU code to use the 19-bit I2C
memory address and remove all "slave_addr", as
this is now absolved into the 19-bit address.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: John Clements <john.clements@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Mon, 1 Feb 2021 18:38:54 +0000 (13:38 -0500)]
drm/amdgpu: I2C EEPROM full memory addressing
* "eeprom_addr" is now 32-bit wide.
* Remove "slave_addr" from the I2C EEPROM driver
interface. The I2C EEPROM Device Type Identifier
is fixed at 1010b, and the rest of the bits
of the Device Address Byte/Device Select Code,
are memory address bits, where the first three
of those bits are the hardware selection bits.
All this is now a 19-bit address and passed
as "eeprom_addr". This abstracts the I2C bus
for EEPROM devices for this I2C EEPROM driver.
Now clients only pass the 19-bit EEPROM memory
address, to the I2C EEPROM driver, as the 32-bit
"eeprom_addr", from which they want to read from
or write to.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 28 Jan 2021 19:24:09 +0000 (14:24 -0500)]
drm/amdgpu: EEPROM respects I2C quirks
Consult the i2c_adapter.quirks table for
the maximum read/write data length per bus
transaction. Do not exceed this transaction
limit.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 27 Jan 2021 22:09:52 +0000 (17:09 -0500)]
drm/amdgpu: Fixes to the AMDGPU EEPROM driver
* When reading from the EEPROM device, there is no
device limitation on the number of bytes
read--they're simply sequenced out. Thus, read
the whole data requested in one go.
* When writing to the EEPROM device, there is a
256-byte page limit to write to before having to
generate a STOP on the bus, as well as the
address written to mustn't cross over the page
boundary (it actually rolls over). Maximize the
data written to per bus acquisition.
* Return the number of bytes read/written, or -errno.
* Add kernel doc.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 26 Jan 2021 20:56:45 +0000 (15:56 -0500)]
drm/amdgpu: Fix Vega20 I2C to be agnostic (v2)
Teach Vega20 I2C to be agnostic. Allow addressing
different devices while the master holds the bus.
Set STOP as per the controller's specification.
v2: Qualify generating ReSTART before the 1st byte
of the message, when set by the caller, as
those functions are separated, as caught by
Andrey G.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Tue, 26 Jan 2021 17:48:10 +0000 (12:48 -0500)]
drm/amdgpu/pm: ADD I2C quirk adapter table
To be used by kernel clients of the adapter.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Tue, 26 Jan 2021 17:09:18 +0000 (12:09 -0500)]
drm/amd/pm: SMU I2C: Return number of messages processed
Fix from number of processed bytes to number of
processed I2C messages.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Sat, 23 Jan 2021 04:25:31 +0000 (23:25 -0500)]
drm/amdgpu: Send STOP for the last byte of msg only
Let's just ignore the I2C_M_STOP hint from upper
layer for SMU I2C code as there is no clean
mapping between single per I2C message STOP flag
at the kernel I2C layer and the SMU, per each byte
STOP flag. We will just by default set it at the
end of the SMU I2C message.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Sat, 23 Jan 2021 04:15:25 +0000 (23:15 -0500)]
drm/amdgpu: Drop i > 0 restriction for issuing RESTART
Drop i > 0 restriction for issuing RESTART.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Fri, 22 Jan 2021 21:29:56 +0000 (16:29 -0500)]
dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20)
Also generilize the code to accept and translate to
HW bits any I2C relvent flags both for read and write.
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Fri, 22 Jan 2021 16:19:48 +0000 (11:19 -0500)]
drm/amdgpu: Remember to wait 10ms for write buffer flush v2
EEPROM spec requests this.
v2: Only to be done for write data transactions.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Fri, 22 Jan 2021 04:56:32 +0000 (23:56 -0500)]
drm/amdgpu: only set restart on first cmd of the smu i2c transaction
Not sure how the firmware interprets these.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Aaron Rice [Fri, 22 Jan 2021 04:45:54 +0000 (23:45 -0500)]
drm/amdgpu: rework smu11 i2c for generic operation
Handle things besides EEPROMS.
Signed-off-by: Aaron Rice <wolf@lovehindpa.ws>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Fri, 22 Jan 2021 04:12:11 +0000 (23:12 -0500)]
drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses
Not sure that this really matters that much, but these could
have various other hwmon chips on them.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Fri, 22 Jan 2021 04:32:38 +0000 (23:32 -0500)]
drm/amdgpu: i2c subsystem uses 7 bit addresses
Convert from 8 bit to 7 bit.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 20:26:38 +0000 (15:26 -0500)]
drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2)
Use the new helper rather than doing i2c transfers directly.
v2: fix typo
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 20:13:05 +0000 (15:13 -0500)]
drm/amdgpu/ras: switch ras eeprom handling to use generic helper
Use the new helper rather than doing i2c transfers directly.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 19:41:27 +0000 (14:41 -0500)]
drm/amdgpu: add new helper for handling EEPROM i2c transfers
Encapsulates the i2c protocol handling so other parts of the
driver can just tell it the offset and size of data to write.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 19:17:34 +0000 (14:17 -0500)]
drm/amdgpu/pm: add smu i2c implementation for navi1x (v5)
And handle more than just EEPROMs.
v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion
v4: Fix &req --> req. (Luben T)
v5: squash in i2c channel fix
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 19:17:03 +0000 (14:17 -0500)]
drm/amdgpu/pm: rework i2c xfers on arcturus (v5)
Make it generic so we can support more than just EEPROMs.
v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion
v4: Fix &req --> req. (Luben T)
v5: squash in i2c channel fix
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 19:08:29 +0000 (14:08 -0500)]
drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v4)
Make it generic so we can support more than just EEPROMs.
v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion
v4: Fix &req --> req. (Luben T)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Alex Deucher [Thu, 21 Jan 2021 21:39:59 +0000 (16:39 -0500)]
drm/amdgpu: add a mutex for the smu11 i2c bus (v2)
So we lock software as well as hardware access to the bus.
v2: fix mutex handling.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Mukul Joshi [Tue, 29 Jun 2021 20:34:10 +0000 (16:34 -0400)]
drm/amdgpu: Conditionally reset SDMA RAS error counts
Reset SDMA RAS error counts during init only if persistent
EDC harvesting is not supported.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 10 Mar 2021 04:12:15 +0000 (22:12 -0600)]
drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
Each zone-device page holds a reference to the SVM BO that manages its
backing storage. This is necessary to correctly hold on to the BO in
case zone_device pages are shared with a child-process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 12 May 2021 16:02:33 +0000 (11:02 -0500)]
drm/amdkfd: add invalid pages debug at vram migration
This is for debug purposes only.
It conditionally generates partial migrations to test mixed
CPU/GPU memory domain pages in a prange easily.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 30 Jun 2021 19:09:10 +0000 (15:09 -0400)]
drm/amdkfd: skip migration for pages already in VRAM
Migration skipped for pages that are already in VRAM
domain. These could be the result of previous partial
migrations to SYS RAM, and prefetch back to VRAM.
Ex. Coherent pages in VRAM that were not written/invalidated after
a copy-on-write.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 12 May 2021 15:54:58 +0000 (10:54 -0500)]
drm/amdkfd: skip invalid pages during migrations
Invalid pages can be the result of pages that have been migrated
already due to copy-on-write procedure or pages that were never
migrated to VRAM in first place. This is not an issue anymore,
as pranges now support mixed memory domains (CPU/GPU).
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 5 May 2021 19:15:50 +0000 (14:15 -0500)]
drm/amdkfd: classify and map mixed svm range pages in GPU
[Why]
svm ranges can have mixed pages from device or system memory.
A good example is, after a prange has been allocated in VRAM and a
copy-on-write is triggered by a fork. This invalidates some pages
inside the prange. Endding up in mixed pages.
[How]
By classifying each page inside a prange, based on its type. Device or
system memory, during dma mapping call. If page corresponds
to VRAM domain, a flag is set to its dma_addr entry for each GPU.
Then, at the GPU page table mapping. All group of contiguous pages within
the same type are mapped with their proper pte flags.
v2:
Instead of using ttm_res to calculate vram pfns in the svm_range. It is now
done by setting the vram real physical address into drm_addr array.
This makes more flexible VRAM management, plus removes the need to have
a BO reference in the svm_range.
v3:
Remove mapping member from svm_range
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>