AMD Radeon Vega progressively slower over time

Hi, I'm looking for some troubleshooting suggestions. I have FreeBSD 14.2 running successfully on this little USFF AMD ThinkStation, with both Xorg (LXQt desktop) and Wayland/Sway working. However, the display rendering gets slower and slower over time, regardless of whether I'm running X or Wayland. CPU and system load are negligible, yet minimizing a window with Openbox, for example, takes a few seconds instead of being nearly instant, and the little "animation" leaves artifacts on the screen for a while. Scrolling in Firefox also becomes slower and slower. I had a Wayland/Sway session open for a few days, and eventually it slowed to such a crawl that it wouldn't even switch to a virtual terminal within the 10 minutes I waited. A reboot fixes the problem 100% of the time, and everything is snappy again. In summary: the desktop rendering becomes unusably slow after a couple of days, and a reboot fixes it.

I'm just looking for some hints/pointers, as I've never run into this type of problem before. I'm new to FreeBSD but have many years of Linux experience. It could be a hardware issue.

I do not currently have an xorg.conf file.

Code:
vgapci0@pci0:3:0:0:    class=0x030000 rev=0xd6 hdr=0x00 vendor=0x1002 device=0x15dd subvendor=0x17aa subdevice=0x3130
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series]'
    class      = display
    subclass   = VGA
    
    
$ kldstat | grep amdgpu
 4    1 0xffffffff83200000   6688e8 amdgpu.ko
11    1 0xffffffff830c9000     2220 amdgpu_raven_gpu_info_bin.ko
12    1 0xffffffff830cc000     64e0 amdgpu_raven_sdma_bin.ko
13    1 0xffffffff830d3000    2b2e0 amdgpu_raven_asd_bin.ko
14    1 0xffffffff830ff000     a3e0 amdgpu_raven_ta_bin.ko
15    1 0xffffffff8310a000     7560 amdgpu_raven_pfp_bin.ko
16    1 0xffffffff83112000     6560 amdgpu_raven_me_bin.ko
17    1 0xffffffff83119000     4560 amdgpu_raven_ce_bin.ko
18    1 0xffffffff8311e000     b9c8 amdgpu_raven_rlc_bin.ko
19    1 0xffffffff8312a000    437f0 amdgpu_raven_mec_bin.ko
20    1 0xffffffff8316e000    437f0 amdgpu_raven_mec2_bin.ko
21    1 0xffffffff83869000    5b4c0 amdgpu_raven_vcn_bin.ko
 
Most likely it's not a hardware issue. I experience the same with a "Lucienne" iGPU on a 5700U CPU.

When running amdgpu from graphics/drm-61-kmod or graphics/drm-515-kmod, Xorg becomes less and less responsive over time until it freezes temporarily or permanently. RAM is consumed to the maximum and more and more swap is used, which also slows the system's overall reaction time. Killing Xorg or rebooting recovers the system until the issue repeats.
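To check whether your machine shows the same pattern, you could log free memory and swap usage at intervals and see if free memory shrinks and swap grows as the desktop slows down. Below is a minimal sketch, assuming a stock FreeBSD system with sysctl(8) and swapinfo(8). The function names (pages_to_mib, swap_used_kib, log_sample) are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.
# Assumes FreeBSD sysctl OIDs vm.stats.vm.v_free_count and hw.pagesize,
# and the column layout of `swapinfo -k` (Device 1K-blocks Used Avail Capacity).

# Convert a page count ($1) to MiB, given the page size in bytes ($2).
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Read `swapinfo -k` output on stdin and print the used swap in KiB.
# Skips the header line and the "Total" line shown with multiple devices.
swap_used_kib() {
    awk 'NR > 1 && $1 != "Total" { used += $3 } END { print used + 0 }'
}

# Print one timestamped sample of free memory and used swap.
log_sample() {
    free_pages=$(sysctl -n vm.stats.vm.v_free_count)
    page_size=$(sysctl -n hw.pagesize)
    swap_kib=$(swapinfo -k | swap_used_kib)
    printf '%s free_mib=%s swap_used_kib=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(pages_to_mib "$free_pages" "$page_size")" \
        "$swap_kib"
}
```

For example, `while :; do log_sample >> "$HOME/mem.log"; sleep 300; done` would record a sample every five minutes (the log path is arbitrary). Note that the free page count alone does not show where the memory went, so this only confirms the trend, not the cause.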

It looks like these symptoms are related to the LinuxKPI; see Bug 277476, "graphics/drm-515-kmod: amdgpu periodic hangs due to phys contig allocations". The discontinued graphics/drm-510-kmod worked fine.

A patch addressing the problem has been committed to CURRENT (main) and to STABLE of the 14 branch (see the PR above). Unfortunately, there is no patch for 14.2-RELEASE.

Try 14-STABLE (14.3-PRERELEASE) now, or wait until 14.3-RELEASE is available in June this year.
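If you are unsure which branch your kernel comes from, the version string printed by `freebsd-version -k` tells you. A small sketch of how that string could be classified follows; kernel_branch is a hypothetical helper name, and it only reports the branch, not whether the specific fix from the PR is in your build.

```shell
#!/bin/sh
# Sketch only: kernel_branch is a hypothetical helper.
# Classifies a FreeBSD version string such as "14.2-RELEASE-p1",
# "14.3-PRERELEASE", "14.2-STABLE" or "15.0-CURRENT".
kernel_branch() {
    case "$1" in
        *-RELEASE*)              echo release ;;
        *-STABLE*|*-PRERELEASE*) echo stable ;;
        *-CURRENT*)              echo current ;;
        *)                       echo unknown ;;
    esac
}
```

On the affected machine, `kernel_branch "$(freebsd-version -k)"` would be expected to print `release` for a 14.2-RELEASE install. Remember that the drm kmod has to be rebuilt or reinstalled for the new kernel after switching branches.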
 