New desktop machine with Ryzen 2700: random hiccups/freezes and reboots even after microcode update

I built a new computer last week and am trying to run FreeBSD 12.0 on it. What makes my life difficult is that I'm experiencing random reboots and short freezes. They appear to be random, and a reboot is usually preceded by several of those hiccups/freezes.

In this context I call a freeze a short interruption of audio/video and mouse activity. They happen simultaneously: while audio is playing and the mouse is moving, both freeze at the same time. From time to time there is also a failure in Firefox and Chrome: tabs crash, not the entire browser. If I reload the crashed tab, the whole system reboots. Falkon appears to be OK so far for browsing.

The hardware configuration is as follows:

Mainboard: Asus ROG Crosshair VII Hero WiFi, latest bios installed (1201 from 2019/01/04)
CPU: AMD Ryzen 7 2700 65w TDP
Cooler: Thermalright ARO-m14 Grey
RAM: G.Skill Aegis DDR4 4x16GB 3000MHz CL16
GPU: AMD Radeon RX 580 8GB
SSD: 2x Samsung Evo 860 1TB (zfs mirror zpool)
PSU: Seasonic Prime 600 Titanium Fanless

The mainboard autoconfigures the RAM at 2133MHz and the CPU at 3200MHz. This appears to work, except for the aforementioned freezes/reboots. Bumping the RAM to 3000MHz also works (either manually or through the D.O.C.P setting), although it "feels" like the machine reboots more often at 3000MHz than at 2133MHz. The system boots using UEFI. The temperature is reported at 30ish degrees Celsius (I take it from amdtemp; I'm not sure if I need to transform the value somehow). Under load (e.g. a kernel compile) it doesn't go higher than 40 degrees, so I assume that the cooling is OK. The air streaming out of the case is also cool. I haven't run any tests with other operating systems or made any "burn-in" attempts.

uname -r
12.0-RELEASE-p3


I searched online and found some horror stories on reddit. Also I stumbled upon https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html where Konstantin Belousov explains a possible fix to comparable issues. I execute his script on every start of the machine. The relevant portion of dmesg looks like this:

CPU: AMD Ryzen 7 2700 Eight-Core Processor (3219.30-MHz K8-class CPU)
Origin="AuthenticAMD" Id=0x800f82 Family=0x17 Model=0x8 Stepping=2
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x1007<CLZERO,IRPerf,XSaveErPtr>
SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
TSC: P-state invariant, performance statistics


However this doesn't appear to solve my problems :-(. Any help is greatly appreciated. Also I'd like to learn more about why this may happen so that I can debug further on my own the next time :-)
 
I have a very similar system, but I do not experience any freezes, crashes or reboots. The system has been rock stable since I built it (4 months ago). Therefore I think there must be a difference between your system and mine.

Your description sounds a little bit like the RAM could be the culprit. Since you have four modules, have you tried running with just two of them? And if the problem persists, running with the other two? (Note that you have to fill the slots in a certain order, see the mainboard's manual for details.) That way you could rule out defective RAM as the cause of the problem. Also note that there are certain restrictions when all four DDR4 slots are filled with the maximum amount of memory supported (i.e. 4 × 16 GB). Apparently this does not work with arbitrary RAM chips. There is a detailed compatibility list online.

For comparison, here's mine:
Mainboard: Asus ROG Crosshair VII Hero X470 (no Wifi), not latest BIOS (2018/xx/xx, will have to check for exact date)
CPU: AMD Ryzen 7 2700, 65 W TDP (8 cores, 16 threads)
Cooler: AMD Wraith Spire (the one that came with the boxed CPU)
RAM: G.Skill Ripjaws V DDR4 kit, 2 × 16 GB, 3200 MHz CL16 (running at 2133 MHz)
GPU: Nvidia GeForce GT 1030 (2 GB), connected to a Samsung CF791 (34" curved UWQHD 3440 x 1440 @ 60 Hz, 110 DPI) via DisplayPort
SSD: Samsung SSD 970 PRO, 1 TB (NVMe M.2), 3.5 GB/s read, 2.7 GB/s write, 500,000 IOPS
HDD: HGST Ultrastar HE12, 12 TB, SATA-III, 7200 rpm
PSU: Don't know, some cheap no-name with a large, slow (thus quiet) 14 cm fan
I did have a fanless PSU once, but it caused problems under load (even though it shouldn't have, according to the specs).

Note that I did not engage any of the “overclocking” features of the mainboard. I'm not willing to sacrifice stability for a few percent of speed. The hardware is already very speedy with the default settings, so I'm happy with it as-is.
I searched online and found some horror stories on reddit.
Well … Reddit is not a very good source of information, especially not for FreeBSD. There are a lot of clueless people over there, I'm afraid.
Also I stumbled upon https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html where Konstantin Belousov explains a possible fix to comparable issues. I execute his script on every start of the machine.
That shouldn't be necessary. Konstantin (kib@) committed the changes to 12-current in July 2018, so they are already contained in 12-stable and 12-RELEASE.

The relevant portion of dmesg looks like this:
CPU: AMD Ryzen 7 2700 Eight-Core Processor (3219.30-MHz K8-class CPU)
[...]
Looks exactly the same on my system. Every single CPU feature is the same.

I'm sorry I can't really tell you what's wrong with your system. However, it might be worth checking the RAM (see above).
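
For gathering more evidence on hardware faults, here is a minimal sketch. It assumes that the FreeBSD kernel reports machine-check events in lines containing "MCA: Bank", and that such lines, if any were logged before the reboot, end up in dmesg or /var/log/messages. The function name count_mca is made up for illustration. A count of zero does not prove the hardware is fine, because a hard reset can lose the messages.

```sh
#!/bin/sh
# Count machine-check reports in kernel messages read from stdin.
# Assumption: each report contains the text "MCA: Bank".
count_mca() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'MCA: Bank' || true
}

# Usage on the affected machine:
#   dmesg -a | count_mca
#   count_mca < /var/log/messages
```

A non-zero count would point towards CPU, RAM or board trouble rather than a software problem.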
 
Hi olli@,

Note that I did not engage any of the “overclocking” features of the mainboard. I'm not willing to sacrifice stability for a few percent of speed. The hardware is already very speedy with the default settings, so I'm happy with it as-is.

I totally agree. I'm interested in stability and silence, not the 10% speed increase. I'm totally impressed by the hardware as is.

Well … Reddit is not a very good source of information, especially not for FreeBSD. There are a lot of clueless people over there, I'm afraid.

I think you might be onto something here. It appears to be a hardware configuration issue. I was preparing to try your RAM suggestion. Yesterday, just for kicks, I decided to clear the CMOS with the button on the back plate. Since the manual only says to press it, but not for how long, I pressed it twice briefly and once for 30 seconds. After that I disabled the AURA LEDs of the board (the setting is called "blackout" or something similar). I also selected the Standard D.O.C.P setting, which put the RAM at 3000 MHz. The reason I mention this: the mainboard didn't POST over DisplayPort when I first installed it, and I could only get it to cooperate using the HDMI port of the video card. Then I updated the BIOS to the latest version, which improved the POST behaviour, only for me to stumble upon the instability. I now remembered that I had never cleared the CMOS after updating the BIOS.

After clearing the CMOS, the system is running like a champ. From time to time there are audio freezes of 0.2 seconds or so, but the system doesn't restart and the browsers are stable. So it is something like a 99% improvement and perfectly usable. At least it appears so. I ran

stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 36000s --verbose

overnight, which yielded no problems whatsoever, and in the evening, while it was running, I watched an x264-encoded movie without any glitches. So it appears totally legit!

That shouldn't be necessary. Konstantin (kib@) committed the changes to 12-current in July 2018, so they are already contained in 12-stable and 12-RELEASE.

This explains why there was no change in the behavior of the system after I applied the fix. Thanks for clarifying.

I think that the problem is solved.

One more thing... what temperatures do your cores run at? Is it plausible that the temperature under load is only around +40 degrees C? This seems very low to me, and I'm happy about it. I measured the heatpipes with a laser thermometer and they are below 30 degrees, which makes the output of sysctl/amdtemp look plausible. The Thermalright cooler is specifically designed for Ryzen (the plate is absolutely flat, whereas cooler plates for Intel are apparently slightly convex), so I was wondering how the AMD Wraith Spire performs, to understand whether the reported temperatures are realistic.

Cheers
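
For comparing readings between machines, here is a minimal sketch. It assumes the amdtemp driver is loaded and that sysctl prints per-core lines of the form "dev.cpu.N.temperature: 38.5C", i.e. already in degrees Celsius. The function name max_cpu_temp is made up for illustration.

```sh
#!/bin/sh
# Print the highest per-core temperature found in sysctl-style lines on stdin.
# Assumption: lines look like "dev.cpu.3.temperature: 38.5C".
max_cpu_temp() {
    awk -F': ' '/^dev\.cpu\.[0-9]+\.temperature:/ {
        t = $2
        sub(/C$/, "", t)              # strip the trailing unit
        if (!seen || t + 0 > max) { max = t + 0; seen = 1 }
    }
    END { if (seen) printf "%.1f\n", max }'
}

# Usage on the machine itself:
#   sysctl dev.cpu | max_cpu_temp
```

Running it once at idle and once during a kernel compile gives two numbers that are easy to compare.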
 
Obviously I spoke too soon :-/

It crashed again twice. I'm trying only one memory slot now... will post an update afterwards...
 
So yesterday I ran the machine the whole day with only one RAM stick - coding, documents, browsing, ZFS tasks etc. No problem whatsoever. I marked the stick as "OK". Today I'm running it with two, placed according to the manual. One is the "OK" marked and the other is an "Unknown". I also run memtester 32G just to be sure. It appears that you had the right hunch and there is either a faulty stick amongst the four or the mainboard cannot take all 4 at once. I'll post updates for the sake of completeness and future reference.
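
For keeping track of which stick combination passed, here is a minimal sketch of a small logging wrapper. The function name run_and_log, the MEMLOG variable, the default log file name and the labels are all made up for illustration. It relies only on the wrapped command exiting with a non-zero status on failure, which memtester does.

```sh
#!/bin/sh
# run_and_log LABEL COMMAND [ARGS...]
# Runs COMMAND, appends its output to the log file, then appends a line
# "LABEL PASS" or "LABEL FAIL(exit status)".
# The log file is taken from $MEMLOG (hypothetical, default ./ramtest.log).
run_and_log() {
    label=$1
    shift
    log=${MEMLOG:-./ramtest.log}
    status=0
    "$@" >> "$log" 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        printf '%s PASS\n' "$label" >> "$log"
    else
        printf '%s FAIL(%s)\n' "$label" "$status" >> "$log"
    fi
}

# Usage (as root, so memtester can lock the memory; size and label are examples):
#   run_and_log "OK+Unknown1" memtester 24G 1
```

Testing somewhat less than the installed amount leaves room for the running system.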
 
This is the second day report. The system is super stable with two RAM sticks (32G). So obviously RAM was the culprit and the purpose of this thread is done. Thanks olli@ for the support.
 
Just in case someone stumbles upon this: I haven't been able to make FreeBSD 12 stable on a Ryzen 1700X. It's a server machine, and every once in a while, at random intervals of anywhere from 24 hours to a week, it hard locks. No panic, it just freezes. The very same hardware runs FreeBSD 11.2-RELEASE and DragonflyBSD fine for days without freezes or reboots. I have built two different machines that are alike only in their motherboards (ASRock X370 mini ITX) and CPUs. The lockup behaviour is the same for both. The latest microcode updates were installed as of April 2019.
 

I also have a Ryzen 1700X, and I had the same issues on Linux until I figured out what causes them.
It seems to be caused by the CPU cores going into power saving mode (C6): when the OS needs to wake them up, the entire system deadlocks and freezes completely.

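What solved my issue was setting the power supply option in my BIOS to "idle current" under the CPU tab (on many boards the option is labelled "Power Supply Idle Control", with the value "Typical Current Idle"). The location of the setting differs depending on the motherboard.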

So far I haven't installed FreeBSD on my main desktop yet (only my laptop), so I can't say it will work for certain, but it should work because as far as I understand this is an OS-agnostic issue with the CPU itself. Even on Windows AMD works around the issue by installing power profiles specific to Ryzen. I'll post an update once I get around to migrating my desktop to FreeBSD.

Here's the lengthy linux kernel bug report about this CPU issue: https://bugzilla.kernel.org/show_bug.cgi?id=196683

EDIT: I am now running FreeBSD on my desktop as well and it works without issues, so it seems the idle current setting works on FreeBSD as well.
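
For checking from within FreeBSD which idle states the OS actually uses, here is a minimal sketch. It assumes the ACPI CPU driver exposes dev.cpu.0.cx_supported in a form like "C1/1/1 C2/2/18"; the function name deepest_cx is made up for illustration. Note that the ACPI C-states FreeBSD sees do not necessarily map one-to-one to the core C6 state discussed here, and I have not verified that restricting them avoids the lockups. The BIOS setting above is what was actually tested.

```sh
#!/bin/sh
# Print the deepest C-state named in a cx_supported value read from stdin,
# e.g. "C1/1/1 C2/2/18" -> "C2".
# Assumption: states are listed shallowest first, separated by spaces.
deepest_cx() {
    awk '{ split($NF, parts, "/"); print parts[1] }'
}

# Usage on the machine itself:
#   sysctl -n dev.cpu.0.cx_supported | deepest_cx
#   sysctl dev.cpu.0.cx_lowest          # deepest state currently allowed
#   sysctl hw.acpi.cpu.cx_lowest=C1     # as root: restrict all CPUs to C1
```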
 
Hi! I have a Ryzen 5 3600 and an ASRock B450 Pro mainboard.
The keyboard and mouse take a very long time to be detected at boot time, and the disk subsystem (ZFS raidz1 on 3 HDDs) is very slow. Under Windows 7, the keyboard and mouse only work normally over PS/2; for a USB keyboard and mouse, the AMD drivers have to be installed. EHCI USB support has ended.

What should I do?
 
Chiming in as well: like rsz, I'm running a 1700X, and I'm now on the 2nd motherboard, RAM, PSU, everything, trying to figure out the stability issue. A friend who was trying to do the same thing on Linux finally found the C6 state issue, and I applied the idle current setting. It's too soon to tell, as my system would run fine for anywhere from a day to almost a week before hanging, so I need 2 full weeks before I can call it good. But I'm experiencing the same thing.

My friend with identical mobo (ASRock Rack x470 mobo) with a 3800X with Linux wouldn't stay up for more than a few hours, even after he applied the mitigation fix, so apparently YMMV. I'm still hopeful about mine. This has been bugging me for nearly a year!
 