Audio glitches, hot NVME drive -- Looking for advice -- 14.0 Release

Hey there, long-time Linux user but somewhat new to FreeBSD. I've had zero issues using it in the past, provided I got all my hardware working. I recently decided to move my main machine to 14.0-RELEASE.

Computer is a Kaby Lake based Lenovo ThinkCentre. I'm concerned about two things, one is more of an annoyance I'd like to find a fix for.

My audio glitches every so often. I'm not sure exactly how to describe the sound, but it's pretty similar to the kind of glitch / blip noise you hear when a machine locks up in the midst of playing audio/video and needs a hard reboot. My machine hasn't crashed, but that's how I'd describe the audio distortion. It doesn't happen all the time, or even often -- sort of. I could play YouTube videos for an hour or two and not hear it at all. But odds are that I do hear it, and it happens at least twice a day. I know very little about audio on FreeBSD, but I am assuming this is some kind of driver / kernel module issue.

Here's some info from dmesg:
Code:
# dmesg | grep hda
hdac0: <Intel Kaby Lake-H HDA Controller> mem 0xf7240000-0xf7243fff,0xf7220000-0xf722ffff irq 16 at device 31.3 on pci0
hdacc0: <Realtek ALC235 HDA CODEC> at cad 0 on hdac0
hdaa0: <Realtek ALC235 Audio Function Group> at nid 1 on hdacc0
pcm0: <Realtek ALC235 (Analog)> at nid 20 and 25 on hdaa0
pcm1: <Realtek ALC235 (Front Analog Headphones)> at nid 33 on hdaa0
hdacc1: <Intel Kaby Lake HDA CODEC> at cad 2 on hdac0
hdaa1: <Intel Kaby Lake Audio Function Group> at nid 1 on hdacc1

And the other thing, which honestly I don't know if it is even really an issue. My Samsung nvme ssd is HOT. Too hot to leave my finger tip on, and this is after the machine sits idle doing nothing at all. I'm not 100% sure, but I do not think this was the case when I was using Linux. Should I be concerned? Do I have some misconfiguration perhaps, or can I somehow mitigate this thermal situation? Obviously a heat sink would be beneficial no matter what, but mostly I just want to make sure I'm not doing something wrong here.
I'm using UFS with softupdates on, softjournal off and trim on. Are these settings ideal? I'm using a 256 GB Samsung m.2 drive. PCI-E / nvme. From 2016/2017, I think.

Code:
# dmesg |grep nvme
nvme0: <Generic NVMe Device> mem 0xf7100000-0xf7103fff irq 16 at device 0.0 on pci1
nda0 at nvme0 bus 0 scbus7 target 0 lun 1
nda0: nvme version 1.2
 
Moving this to "Multimedia/Gaming" for the audio issues. Might want to split it up into two posts, deal with one problem at a time.
 
You should be able to get a temperature reading for the M.2 drive with smartctl. Then compare FreeBSD and Linux.

For audio I would compare to some generic USB "card" to see whether it is something specific with your onboard sound or whether your FreeBSD is glitching generally.
 
I don't know if it works.
nvmecontrol(8)
Code:
   power
     Manage the power modes of the NVMe controller.

     -l      List all supported power modes.

     -p mode
             Set the power mode to mode.  This must be a mode listed with the
                   nvmecontrol power -l
             command.

     -w hint
             Set the workload hint for automatic power mode control.
 
Okay, so unfortunately after about a week or two of using FreeBSD on my ThinkCentre, I'm back to Linux. Biggest reason being WiFi support -- the WiFi 6 AX 5 GHz network I use can't be seen / connected to on FreeBSD, or at least I couldn't figure it out. I was able to use 2.4 GHz wireless N, but for some reason only ever got about 2.5 / 3 Mbps and honestly that's a no go, since I use this machine to pass internet on to another switch connected through wired Ethernet. The bottleneck was just too real. I didn't even need to run any benchmarks; it was painfully obvious I was dealing with a fraction of the previous throughput. If I get another box to act as a WiFi to wired bridge, so I can just use wired Ethernet on the ThinkCentre, I'd probably switch back to FreeBSD.

Biggest reason I'm back to post in this thread is, I wasn't crazy, the SSD runs much cooler on Linux. How much cooler? Well, I can leave my finger pressed on it, as long as I want, and it isn't warm in the slightest. Probably about the same temperature as body temperature. On FreeBSD it was hot. So hot, that it was painful to press a fingertip to the label for more than a whole second. I'd guess that on Linux it runs around 100 F, while on BSD probably around 200 F or close to it. And I did turn off swap... Didn't seem to make a big difference.

This seems like a legitimate concern, and I don't think slapping a heatsink on is really a proper fix. I like my systems to last a long healthy life. Please let me know if there is any kind of diagnostic steps or testing I could do in order to benefit this situation for myself and others too of course. I'd be happy to help... If this is a one off / edge case, let me know.
 
The WiFi 6 AX 5 GHz network I use can't be seen \ connected to on FreeBSD, or at least I couldn't figure it out. I was able to use 2.4 GHz wireless N
It's nothing you can change or configure:
Code:
     While iwlwifi supports all 802.11 a/b/g/n/ac/ax the compatibility code
     currently only supports 802.11 a/b/g modes.  Support for 802.11 n/ac is
     to come. 802.11ax and 6Ghz support are planned.
Driver works but only on the lower modes at the moment.

This seems like a legitimate concern
It doesn't sound good and I agree it's a concern. Turning off swap probably wouldn't do much; there's usually not that much swapping happening anyway.
Please let me know if there is any kind of diagnostic steps or testing I could do in order to benefit this situation for myself and others too of course. I'd be happy to help... If this is a one off / edge case, let me know.
You could install FreeBSD on a memory stick and boot from that. Just for testing.

If this is a one off / edge case, let me know.
It may not be a one off. There are certainly fewer FreeBSD users than Linux users. Lenovo is a popular brand among FreeBSD users, but there might be a low chance of someone else having the exact same ThinkCentre and NVMe combination.
 
Thanks for the reply. Here's some more information, conjured up on the Linux side.

From lspci, so this would be my exact NVMe device:
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963
And dmesg from Linux as well. I don't think this is particularly useful personally, but I'll admit that there are some much smarter people than me in this world :) Included for completeness:
ben@thinkcentre:~$ sudo dmesg|grep nvm
[ 1.701942] nvme nvme0: pci function 0000:01:00.0
[ 1.715064] nvme nvme0: 4/0/0 default/read/poll queues
[ 1.734300] nvme0n1: p1 p2 p3
[ 2.515659] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Quota mode: none.
[ 3.045094] EXT4-fs (nvme0n1p2): re-mounted. Quota mode: none.
[ 3.269960] Adding 1000444k swap on /dev/nvme0n1p3. Priority:-2 extents:1 across:1000444k SSFS
 
I was wondering if my nvme disk runs hot but couldn't check the temperature with smartmontools, I got INQUIRY ERROR. I guess because it is my system disk.

# smartctl -A /dev/nda0
smartctl 7.4 2023-08-01 r5530 [FreeBSD 14.0-RELEASE-p5 amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/nda0 failed: INQUIRY failed


# pciconf -lv | grep -A4 nv
nvme0@pci0:7:0:0: class=0x010802 rev=0x03 hdr=0x00 vendor=0x126f device=0x2263 subvendor=0x126f subdevice=0x2263
vendor = 'Silicon Motion, Inc.'
device = 'SM2263EN/SM2263XT (DRAM-less) NVMe SSD Controllers'
class = mass storage
subclass = NVM



Edit: I found out that /dev/nvme0 and /dev/nvme0ns1 work with smartctl.
# smartctl -a /dev/nvme0ns1 | grep -i temperature
Temperature: 40 Celsius
 
There is also nvmecontrol(8), it can display the log pages from the controller:
Code:
root@kg-core2:~ # nvmecontrol logpage -p 2 nvme0
SMART/Health Information Log
============================
Critical Warning State:         0x00
 Available spare:               0
 Temperature:                   0
 Device reliability:            0
 Read only:                     0
 Volatile memory backup:        0
Temperature:                    311 K, 37.85 C, 100.13 F
Available spare:                100
Available spare threshold:      10
Percentage used:                0
Data units (512,000 byte) read: 3736263
Data units written:             3318355
Host read commands:             183139019
Host write commands:            28055936
Controller busy time (minutes): 4164
Power cycles:                   29
Power on hours:                 43024
Unsafe shutdowns:               20
Media errors:                   0
No. error info log entries:     0
Warning Temp Composite Time:    0
Error Temp Composite Time:      0
Temperature 1 Transition Count: 0
Temperature 2 Transition Count: 0
Total Time For Temperature 1:   0
Total Time For Temperature 2:   0
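
To watch the temperature over time rather than eyeballing one log page, the reading can be pulled out of this output with a small filter. This is only a sketch: the function name nvme_temp_c is made up for illustration, and it assumes the "Temperature: 311 K, 37.85 C, 100.13 F" line layout shown above, which may differ between nvmecontrol versions.

```shell
# Print the composite temperature in Celsius from SMART/Health log text on stdin.
# Assumes the "Temperature: <n> K, <n> C, <n> F" layout from the output above.
# The indented "Temperature: 0" flag line has no "K," field, so it is skipped.
nvme_temp_c() {
    awk '$1 == "Temperature:" && $3 == "K," { print $4; exit }'
}

# Hypothetical usage on a live system (needs root), one sample every 10 seconds:
#   while true; do
#       printf '%s %s\n' "$(date +%T)" "$(nvmecontrol logpage -p 2 nvme0 | nvme_temp_c)"
#       sleep 10
#   done
```

Logging a few minutes of idle readings this way gives numbers that can be compared directly against smartctl output on the Linux side.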
 
Just wanted to report back, I'm now running 14.1 on the same machine. Audio is no longer having any issues. Installed to an mSATA SSD instead of my NVMe drive, and despite that, it still got HOT and quick. So pulled it out for now. Going to test 14.1 on here for a while. So far so good!
 
Below I've provided a small summary on how one can try to localize the issue with hot NVMe using nvmecontrol (in all examples nvme0 should be replaced with device node in question):
1. nvmecontrol power nvme0 to get the controller's current power state;
2. nvmecontrol power -l nvme0 to list all supported power states;
3. nvmecontrol power -pN nvme0 to manually change the power state to N-th. Note: when switching to non-operational power states (marked with an asterisk) it will only take effect until the system asks the controller to perform any IO. In this scenario the device will wake up and enter the last used operational power state. When IO is complete and if the Autonomous Power State Transition feature is disabled (see also https://forums.freebsd.org/threads/nvme-autonomous-power-state-transition-apst.90837), the controller won't switch back on its own. When APST is enabled, the controller will return to its preconfigured non-operational state, but it may not be the one you specified. So, to effectively check all the states APST needs to be disabled and the device needs to be not in use;
4. nvmecontrol admin-passthru -o 10 --cdw10 12 nvme0 to get APST status from the controller (0x1 - enabled, 0x0 - disabled);
5. nvmecontrol admin-passthru -o 9 --cdw10 12 --cdw11 0 nvme0 to disable APST manually (or --cdw11 1 to enable it *back*, so it shouldn't work if the feature was initially disabled);
6. nvmecontrol logpage -p 2 nvme0 to get the controller temperature.
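
The read-only steps from the list (1, 2, 4 and 6) can be collected into one helper so the same report is easy to repeat and paste into a thread. This is a rough sketch, not a tested tool: the names run, nvme_power_report and NVME_DRYRUN are invented here, and it deliberately leaves out the commands that change the power state or APST setting.

```shell
# Run a command, or just print it when NVME_DRYRUN=1 (handy for checking first).
run() {
    if [ "${NVME_DRYRUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Collect the read-only diagnostics from the list above for one controller.
# The device defaults to nvme0; pass another node as the first argument.
nvme_power_report() {
    dev="${1:-nvme0}"
    run nvmecontrol power "$dev"                            # current power state
    run nvmecontrol power -l "$dev"                         # supported power states
    run nvmecontrol admin-passthru -o 10 --cdw10 12 "$dev"  # APST status (Get Features)
    run nvmecontrol logpage -p 2 "$dev"                     # SMART/Health log with temperature
}

# Hypothetical usage (as root):
#   nvme_power_report nvme0
```

Setting NVME_DRYRUN=1 first prints the four commands without touching the device, which is a cheap way to confirm the node name before running it for real.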

I don't know of any straightforward way to monitor power consumption in real time other than unplugging your laptop's power cable and running while true; do acpiconf -i0 | fgrep rate; sleep 10; done. It would be great to find a better way to do this.
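
Since single readings from that loop jump around, averaging a handful of samples per power state makes the comparison a bit easier. The filter below is a sketch under a loud assumption: that the matching acpiconf line looks roughly like "Present rate: 9820 mW", with the first bare integer on the line being the rate. Check your own output before trusting it. The name avg_rate is made up for illustration.

```shell
# Average the first bare integer on each stdin line that contains "rate:".
# ASSUMPTION: lines look roughly like "Present rate:   9820 mW"; verify this
# against your own acpiconf output, as the exact format is not shown in this thread.
avg_rate() {
    awk '/rate:/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) { sum += $i; n++; break }
         }
         END { if (n) printf "%d\n", sum / n; else print "no samples" }'
}

# Hypothetical usage on battery power, six samples ten seconds apart:
#   for i in 1 2 3 4 5 6; do acpiconf -i0 | grep -F rate; sleep 10; done | avg_rate
```

Running it once per power state, with the machine otherwise idle, gives one number per state to compare.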

Bottom line: if APST is disabled by default and all states seem to be working correctly, there is little you can do at this point other than set another operational state with a lower maximum power consumption and measure the result. If APST is enabled and the controller autonomously switches to a non-operational state that draws abnormally high power, you can simply turn the whole feature off and leave the controller running in an operational state.
 