Solved: VM vCPU performance inconsistent between host reboots

I've been trying to benchmark the performance of my Windows 10 bhyve VM for a while now, comparing various CPU topologies and pinning configurations. I plan to start a blog to share my findings, with this as one of the options for the first post.

The problem is that I kept giving up because the benchmark programs gave me completely different results under exactly the same conditions. Today I found something.

I heard that the CPU frequency is controlled by the host, and one shouldn't pay attention to what is displayed in the VM, but I have noticed a clear correlation.

My CPU is an AMD Ryzen 9 5950X. In Geekbench 6, there is a “Maximum Frequency” field. When it displays something around 5000 MHz, I get more or less the same benchmark results, and restarting the VM does not spoil them in any way. However, after a host restart, with some probability, this number drops to something close to the base frequency of my processor. In this case, the benchmark results are significantly lower. Restarting the VM, as in the previous case, does not change anything. Restarting the host may help.

There is another program: PassMark PerformanceTest. Its results also change along with that number from Geekbench. By the way:

Attachments: bad.png, good.png


So, the performance of my VM becomes significantly lower than it should be from time to time, for an unknown reason. What if the same thing is happening to my poudriere build VM? This is a major blocker for my experience with bhyve. Any help is greatly appreciated.
 
My CPU is an AMD Ryzen 9 5950X. In Geekbench 6, there is a “Maximum Frequency” field. When it displays something around 5000 MHz, I get more or less the same benchmark results, and restarting the VM does not spoil them in any way. However, after a host restart, with some probability, this number drops to something close to the base frequency of my processor. In this case, the benchmark results are significantly lower. Restarting the VM, as in the previous case, does not change anything. Restarting the host may help.
I think the way your processor works is that only a single core can achieve near-maximum performance at a time, not all cores. The rest of the cores stay at the base frequency or lower. When you rebooted the host, you had the bad luck to get a core which ran at about the base frequency.

AMD Ryzen™ 9 5950X Desktop Processor
 
Why did you stuff poudriere into a VM instead of using its jails?
This may sound silly, but I find a certain elegance in separating the responsibilities between my host and the poudriere host. It's like a remote server whose responsibilities are clearly defined: packages are prepared there, obhttpd is running, ccache is simply pointed at /var/cache/ccache, and so on.

But there's also a more compelling reason why I decided to do it this way. I'm not happy with the advice to use PARALLEL_JOBS=1, because I want the CPU to be maxed out, even if I lose a bit on context switching. But, naturally, the desktop becomes almost unusable during heavy builds, even if it's just a few ports. So I thought I could put it in a VM, where I can limit the number of cores and let the builds max them out, while the desktop remains usable. It worked pretty well.
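To make the idea concrete, here is a minimal sketch of the kind of guest config I mean (vm-bhyve syntax; the names and numbers are just examples, not my actual setup):
Code:
# hypothetical vm-bhyve guest config for the build VM
loader="bhyveload"
cpu=8                       # cap the builder at 8 vCPUs, leave the rest to the desktop
memory=16G
disk0_type="virtio-blk"
disk0_name="disk0.img"
network0_type="virtio-net"
network0_switch="public"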

I suspect there are better solutions, but I haven't looked for them yet, taking the path of least resistance.
 
I think the way your processor works is that only a single core can achieve near-maximum performance at a time, not all cores. The rest of the cores stay at the base frequency or lower.
I heard something like that, but I don't get what it has to do with reboots, or how it could cause the problem.

When you rebooted the host, you had the bad luck to get a core which ran at about the base frequency.
Why should it randomly choose one of its 16 cores every time I power on the PC? For what purpose?

Are you saying that the cores I have pinned for the VM using bhyve -p (0th, 2nd, 4th, etc.) are not the same between host reboots, and that next time it could be that these cores are not capable of boosting?
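For context, this is roughly the shape of the invocation I mean; most options are omitted and "win10" is just a placeholder guest name:
Code:
# guest vCPU N pinned to host CPU 2*N, i.e. every second logical CPU
bhyve -c 8 \
    -p 0:0 -p 1:2 -p 2:4 -p 3:6 \
    -p 4:8 -p 5:10 -p 6:12 -p 7:14 \
    ... win10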
 
Are you saying that the cores I have pinned for the VM using bhyve -p (0th, 2nd, 4th, etc.) are not the same between host reboots, and that next time it could be that these cores are not capable of boosting?
I don't have an explanation. It just crossed my mind...
 
I heard something like that, but I don't get what it has to do with reboots, or how it could cause the problem.


Why should it randomly choose one of its 16 cores every time I power on the PC? For what purpose?

Are you saying that the cores I have pinned for the VM using bhyve -p (0th, 2nd, 4th, etc.) are not the same between host reboots, and that next time it could be that these cores are not capable of boosting?
Just out of curiosity, how are the cores shown? Asking because your CPU has two chiplets with 8C/16T each.
 
I don't have an explanation. It just crossed my mind...
That would kind of explain it, but it seems this is not the case. In the same session (no host reboots), I pinned the first 8 cores to the VM, and Geekbench showed 5 GHz. Then I pinned the last 8 cores, rebooted the VM, and it still showed 5 GHz. I would expect to see the base frequency if there were such a lottery. But even then, this would be weird behavior, where in order to benefit from CPU pinning you have to play some stupid reboot lottery.

Just out of curiosity, how are the cores shown? Asking because your CPU has two chiplets with 8C/16T each.
The output of which command can I provide you with? I'm just pinning them by their numbers from 0 to 31. I believe that every second core (1, 3, etc.) is a thread, while the others are real cores, as explained in this post, and as I believe I verified myself with the benchmarks.
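In case that's what you're after, I believe the scheduler also exposes the grouping directly, so something like this should show which logical CPUs are SMT siblings of the same physical core:
Code:
# groups flagged THREAD/SMT in the output list the logical CPUs sharing one physical core
sysctl kern.sched.topology_spec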
 

This command is taken from the link:
sysctl -a | grep -i cpu | less
If I understand correctly, it shows some info about which cores are pinned/taken/used?

Also, if I understand correctly, cache could be a player in your results.
Taken from TechPowerUp:
With the new "Zen 3" microarchitecture, the biggest high-level change with the CCD is AMD's enlargement of the CCX to now include up to eight cores (essentially taking up the whole CCD). There's now one 8-core CCX per CCD. The biggest dividend of this change has to be improved inter-core latency as the eight cores now share the same L3 cache; the other big dividend has to be cache size. Each core on the CCD now has access to the full 32 MB L3 as a victim cache, so lightly threaded workloads should see a performance uplift.

So can it be that the cache is being used a bit more by another process and your result dropped dramatically?
I'm just guessing... don't take it as "how it works" :)
 
Character limit :)
Pastebin link


This is a nice hint, I guess, for playing further with the pinning, but I believe it's unrelated to the described problem.
I was wrong about CPU pinning in that info. I have a dual-CPU system, my info differs a bit from yours, and I got confused by it. My bad.
P.S. Have you tried using sysutils/cpu-x while you do your tests?
P.P.S. Read this: https://linustechtips.com/topic/159...not-boosting/?do=findComment&comment=16611890 - open up Task Manager and do your tests again after a reboot, etc. Try to replicate it...
Also try removing the pinned cores/threads. You don't have P-cores and E-cores, so I'm not sure it's worth pinning them.
And if I understand correctly, any core can hit the highest frequency, not specific ones like core 0 or core 2. Plus temps, VRMs, cooling - maybe it's too much?
 

Attachments: cpu-x.png
There is no such thing as a real core and a hyperthreaded one. They are identical and both run full speed - when they run without their sibling.

The whole situation smells like a CPU clock problem.
 
I can't believe it. I killed the whole day on this, not to mention my previous attempts to figure out this sorcery. I found the culprit.

It's not bhyve or any of its options. It's not FreeBSD. It's not Windows. It's not some boot lottery.

It's my UEFI firmware.

When I just power on or reboot my box, after boot I have this:
Code:
dev.cpu.0.freq_levels: 3400/3740 2800/2800 2200/1980
dev.cpu.0.freq: 3400

Then the CPU is able to boost in FreeBSD, bhyve VMs, and “bare-metal” Windows.

But if I press Del during the firmware splash screen and boot from the firmware settings UI, after boot I have this:
Code:
dev.cpu.0.freq_levels: 3400/3740 2800/2800 2200/1980 (unchanged)
dev.cpu.0.freq: 2800

When I boot this way, the CPU is unable to boost even in “bare-metal” Windows.

On FreeBSD, I can run sysctl dev.cpu.0.freq=3400, or use powerd{,xx}, but this won't change anything except the reported number.
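In case this bites me again, I'll probably drop a trivial check like this onto the host (the 3400 threshold is simply the top non-boost level from my freq_levels above):
Code:
#!/bin/sh
# run once after boot (e.g. from /etc/rc.local); warns if the firmware came up in the non-boosting state
freq=$(sysctl -n dev.cpu.0.freq)
if [ "$freq" -lt 3400 ]; then
    echo "CPU came up at ${freq} MHz -- boost seems disabled, a clean reboot may be needed" | logger -t boostcheck
fi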

Nice one, thanks. And thank you for the hint about grepping the sysctls, that helped a lot.

allocate memory for guest without ballooning
Just out of curiosity, could you elaborate? I heard that bhyve doesn't support ballooning, whatever this word means (I only have a superficial idea).

You don't have P-cores and E-cores, so I'm not sure it's worth pinning them
My experience shows that pinning improves the performance of both the host and the VMs when the CPU is under load. Now I can continue comparing benchmarks to prove or disprove this, since cracauer@ disputes the entire theoretical basis:
There is no such thing as a real core and a hyperthreaded one. They are identical and both run full speed - when they run without their sibling.
Please, is the information in this post wrong? Or is it only applicable to Intel CPUs?
 
Just in case: my mobo is ASRock B550 PG Velocita, and the firmware is up-to-date. I have notified the manufacturer about the bug.
 
Just out of curiosity, could you elaborate? I heard that bhyve doesn't support ballooning, whatever this word means (I only have a superficial idea).

From config.sample (vm-bhyve):
Code:
# wired memory
# All requested memory should be wired to the guest
# 
wired_memory="no"

If I understand correctly, this controls dynamic vs. static RAM allocation for the guest (ballooning).
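If I read it right (not verified), that option simply makes vm-bhyve pass bhyve's -S flag, which wires all guest memory up front, roughly:
Code:
# my understanding: wired_memory="yes" ends up as something like this on the bhyve command line
bhyve -c 4 -m 8G -S ... guestname    # -S = wire guest memory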
 
wired memory
Ah, I have to use that for passthrough.

I don't know whether this information from 2018 is still relevant, but anyway:
Unfortunately, the -S option has nothing to do with virtio_balloon(4). The virtio_balloon module exists in FreeBSD; however, it only helps when FreeBSD runs as a guest in a hypervisor with balloon support (Xen, ESX, KVM). At the moment, bhyve is not able to act as a virtio-balloon backend.
It is very easy to see that the bhyve guest never returns memory (regardless of virtio-balloon inside the guest); for example, run this in the guest:

tail -f /dev/zero

or malloc + free (in any accessible programming language), and watch from the bhyve side with top(1) or

rctl -hu process:<pid>
By the way, modern hypervisors have for a very long time (in addition to a balloon) been able to do memory compression and deduplication (KSM) ( https://www.kernel.org/doc/ols/2009/ols2009-pages-19-28.pdf ). Unfortunately, this also does not apply to bhyve.
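To make that host-side check concrete, something like this is what I would watch (the guest name is just a placeholder):
Code:
# on the host: poll the bhyve process's resident set while the guest allocates and frees memory
pid=$(pgrep -f 'bhyve: poudriere')      # 'poudriere' is a placeholder guest name
while sleep 5; do
    ps -o rss= -p "$pid"
done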
 
That post matches what I think. There is no difference between AMD and Intel when it comes to Hyperthreading.
Thank you. After a bit of thinking, I believe I finally understand the whole thing.

My initial confusion seems to stem from the fact that I didn't interpret this correctly:
But now we no longer have four equal vcpus in the guest, because vcpu 0+1 are just threads on the same physical core, as are 2+3. Running a program on vcpu 0+2 will give us two full cores performance, but running it on 0+1 will give only 1.3 performance.

So, vCPUs 1+3 should give two full cores' performance just like 0+2, but only if their siblings are idle (in both cases). That makes sense.
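For anyone who wants to see this effect on the host without a VM, I suppose a quick check would be to run two copies of a single-threaded benchmark (here a placeholder ./bench) on different pinnings:
Code:
# two copies on two different physical cores (FreeBSD numbering: 0 and 2)
cpuset -l 0 ./bench & cpuset -l 2 ./bench & wait
# two copies on SMT siblings of the same physical core (0 and 1)
cpuset -l 0 ./bench & cpuset -l 1 ./bench & wait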

Let me ask you one more question on this. Do threads come in pairs on all OSes? IIRC, on Linux a 4c8t CPU (for example) is reported as (0,4) (1,5) (2,6) (3,7), while on FreeBSD it's (0,1) (2,3) (4,5) (6,7).
 
So, vCPUs 1+3 should give two full cores' performance just like 0+2, but only if their siblings are idle (in both cases). That makes sense.

Let me ask you one more question on this. Do threads come in pairs on all OSes? IIRC, on Linux a 4c8t CPU (for example) is reported as (0,4) (1,5) (2,6) (3,7), while on FreeBSD it's (0,1) (2,3) (4,5) (6,7).

Both Linux and FreeBSD give you a hyperthreaded sibling pair if you specify 0 and 1. I don't think the nomenclature or the counting order is any different.

As mentioned, that should be taken into account when strictly allocating CPUs, to VMs or otherwise.
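If in doubt on a particular box, the kernel will tell you the actual pairing; on FreeBSD it's the THREAD/SMT groups in kern.sched.topology_spec, and on Linux each CPU's sysfs topology entry lists its siblings:
Code:
# FreeBSD: sibling pairs appear as THREAD/SMT groups
sysctl kern.sched.topology_spec
# Linux: show which logical CPUs share each physical core
grep . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list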
 