Which Kernel Module Is Using All My Memory?

After my system has been up for a few days, it crashes. Before it crashes I can see that basically all of my memory has been used; it's almost all used up as I type this. Looking at top, I see that I have 26GB wired! Is there an easy way to find out which kernel module is using all of my memory (assuming it is a kernel module, that is)?

I thought that maybe it could be my network driver (if_re). I had to compile it myself, and when compiling I got an error. Upon investigation, it seems the driver was written for 12.something and I am on 13.0, and I think one of the random number functions was removed. To get it to compile, I removed an if statement that called one of two random number functions (one of which was the one that didn't seem to exist anymore). The driver seems to have been working fine, and there's been no problem at all with my network connectivity. I unloaded the module and it didn't change anything. I know it's obviously bad practice to alter a driver when I don't fully understand the code, but I really don't think it's the problem anymore.

Especially since, when my system crashes, I see a message on the console from the GPU driver (something about waiting for a message, if I remember correctly). I'll try unloading my GPU driver after I post this to see if it's the problem. However, if it is, that sucks, because I'll have to restart Xorg, and then I might as well have restarted my whole computer haha. Anyway, I know I'm rambling, but if anyone has any ideas I would appreciate it. I'm using an R9 290 BTW.
 
Hmm... after looking at the output of kldstat again I've noticed that there are a lot of kernel modules related to my GPU. Should there be so many loaded?


Code:
Id Refs Address                Size Name
 1   93 0xffffffff80200000  1f10db0 kernel
 3    1 0xffffffff82520000     3378 acpi_wmi.ko
 4    1 0xffffffff82524000     3218 intpm.ko
 5    1 0xffffffff82528000     2180 smbus.ko
 6    1 0xffffffff82600000   31fd70 amdgpu.ko
 7    2 0xffffffff8252b000    7f4c8 drm.ko
 8    3 0xffffffff825ab000     cbc8 linuxkpi_gplv2.ko
 9    1 0xffffffff825b8000     2328 lindebugfs.ko
10    1 0xffffffff825bb000     e778 ttm.ko
11    1 0xffffffff825ca000     a0f8 amdgpu_hawaii_mc_bin.ko
12    1 0xffffffff825d5000     4358 amdgpu_hawaii_pfp_bin.ko
13    1 0xffffffff825da000     4358 amdgpu_hawaii_me_bin.ko
14    1 0xffffffff825df000     4358 amdgpu_hawaii_ce_bin.ko
15    1 0xffffffff825e4000     6358 amdgpu_hawaii_mec_bin.ko
16    1 0xffffffff825eb000     41d8 amdgpu_hawaii_rlc_bin.ko
17    1 0xffffffff825f0000     3240 amdgpu_hawaii_sdma_bin.ko
18    1 0xffffffff825f4000     3240 amdgpu_hawaii_sdma1_bin.ko
19    1 0xffffffff82920000    3ae08 amdgpu_hawaii_uvd_bin.ko
20    1 0xffffffff8295b000    1aba8 amdgpu_hawaii_vce_bin.ko
21    1 0xffffffff82976000    21fc8 amdgpu_hawaii_smc_bin.ko
22    1 0xffffffff825f8000     2340 uhid.ko
23    1 0xffffffff825fb000     4350 ums.ko
24    1 0xffffffff82998000     3380 usbhid.ko
25    1 0xffffffff8299c000     31f8 hidbus.ko
26    1 0xffffffff829a0000     3320 wmt.ko
27    1 0xffffffff829a4000     2a08 mac_ntpd.ko
28    1 0xffffffff829a7000    a32a0 if_re.ko
 
Should there be so many loaded?
Yes. drm is the generic infrastructure, linuxkpi is the in-kernel compatibility layer (the drm drivers are taken from linux and only slightly modified), amdgpu is the actual driver and all the others starting with amdgpu are firmware images for your GPU.
I wasn't able to unload any of the amdgpu kernel modules as I got "Device busy" for all of them.
Those drivers can't be unloaded at runtime. I guess one reason is that they take over the console.
 
Is it technically even possible for the kernel to completely take over all memory? If memory serves, on 64-bit systems FreeBSD maps kernel memory into the top of the address space, which I'd expect would limit it from gobbling up everything?

Not sure how memory used by kernel modules is treated, though I'd assume it comes out of regular kernel memory. Hope someone with more kernel experience can shed some light on this?

Edit: I should probably add that your kldstat nicely confirms that you're running on a 64-bit system; you can see that all addresses are in the upper range.
Edit2: and I just realized I was confused; virtual memory space would certainly allow the kernel to eat everything...
 
Either way, your kldstat output kind of points towards the kernel itself being the culprit that's eating your memory, not the graphics card module. Though I'm unsure how a memory leak in a module would be accounted for by the kernel?

If I'm reading the man page correctly, size in kldstat states the amount of memory in bytes allocated by each module (in hex). That should be some answer to your question, even if it probably won't resolve your issue.
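
To put a number on that, here is a minimal sketch that adds up the hex Size column of kldstat-style output. The function name kld_total_bytes is made up for illustration, and it assumes the five-column layout shown in the listing above (Id, Refs, Address, Size, Name). As far as I understand it, Size is the size of each loaded module image rather than whatever the module allocates while running, so treat the total as a lower bound only.

```sh
#!/bin/sh
# Sum the Size column (4th field, hexadecimal without a 0x prefix) of
# kldstat-style output read from stdin and print the total in bytes.
# kld_total_bytes is a hypothetical helper name, not a system command.
kld_total_bytes() {
    total=0
    while read -r id refs addr size name; do
        # Skip the header line and any line whose 4th field is not pure hex.
        case "$size" in
            ''|*[!0-9a-fA-F]*) continue ;;
        esac
        total=$((total + 0x$size))
    done
    echo "$total"
}

# Intended use on the affected machine:
#   kldstat | kld_total_bytes
```

For the listing in this thread the images add up to a few tens of megabytes, nowhere near 26GB of wired memory, so the Size column on its own can't point at the module responsible.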

You can also check vmstat to get an overview of your virtual memory stats and how your userspace is impacted.
 
What version of FreeBSD? Are you sure that there aren't any big userland programs using memory? vmstat -m shows how much memory each module allocated with malloc. If I remember correctly sysutils/htop shows the kernel size.
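
A rough sketch for ranking the vmstat -m output by memory use. It assumes the MemUse column is printed as a number followed by K and sits right after InUse, which is what I'd expect on 13.0 but is worth checking against your own output; malloc_top is a made-up helper name. Type names can contain spaces, so the script looks for the first K-suffixed field instead of relying on a fixed column number.

```sh
#!/bin/sh
# Read `vmstat -m` style output on stdin and print the N largest malloc
# types as "<KiB> <type name>", biggest first (N defaults to 10).
# malloc_top is a hypothetical helper name, not a system command.
malloc_top() {
    awk '
        {
            # MemUse is assumed to be the first field that looks like "123K";
            # InUse is the field before it, the type name is everything earlier.
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^[0-9]+K$/) {
                    kb = substr($i, 1, length($i) - 1)
                    name = $1
                    for (j = 2; j <= i - 2; j++) name = name " " $j
                    print kb, name
                    break
                }
            }
        }' | sort -rn | head -n "${1:-10}"
}

# Intended use on the affected machine:
#   vmstat -m | malloc_top 15
```

Running it a few hours apart and comparing the top entries should show whether one type keeps growing. malloc is only one of the ways the kernel can wire memory, so a flat result here doesn't rule the kernel out.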
 
Is there anything in /var/log/messages at this time of the crash?
I'm not sure. Most of the time it just freezes but a few times it's dropped back to the console and then frozen.

I've looked in the file and I think the point before "---<<BOOT>>---" is the last message.

I think the kldunload is from when I tried to unload the AMD GPU stuff. So I don't think there is anything in there. I think it just froze (assuming something would have been written to the log file otherwise.)


Code:
Sep 11 18:06:37 chronos kernel: kldunload: attempt to unload file that was loaded by the kernel
Sep 11 18:07:08 chronos syslogd: last message repeated 5 times
Sep 11 18:07:36 chronos syslogd: last message repeated 5 times
Sep 11 20:05:20 chronos syslogd: kernel boot file is /boot/kernel/kernel
Sep 11 20:05:20 chronos kernel: ---<<BOOT>>---
Sep 11 20:05:20 chronos kernel: Copyright (c) 1992-2021 The FreeBSD Project.
Sep 11 20:05:20 chronos kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Sep 11 20:05:20 chronos kernel:     The Regents of the University of California. All rights reserved.
Sep 11 20:05:20 chronos kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
 
Could you post more of the log?
 
OP, if it's repeatable, I'd keep top running in a window; it may give a better view of where all the memory is.
Your kldstat output, is that the full list? If so, did you install to UFS? How much memory is in the machine in total? How is it booted, BIOS or UEFI? When you are in X, what desktop environment or window manager are you using?
Why did you need to build/rebuild the re driver? I'm running a 13.0-RELEASE system with the GENERIC kernel that is using the native re driver just fine. Where did you get the source for the re driver you built?
What kernel are you running? Can you provide the output of "freebsd-version -kru"?
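
If watching top by hand gets tedious, the wired figure can also be logged at intervals. The following is only a sketch: it reads the vm.stats.vm.v_wire_count and hw.pagesize sysctls, and the helper names pages_to_mib and log_wired, the log file name and the five-minute interval are all arbitrary choices for illustration.

```sh
#!/bin/sh
# Convert a page count ($1) and a page size in bytes ($2) to whole MiB.
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Print one timestamped sample of wired memory (FreeBSD sysctl names).
log_wired() {
    pages=$(sysctl -n vm.stats.vm.v_wire_count)
    pagesize=$(sysctl -n hw.pagesize)
    echo "$(date '+%F %T') wired $(pages_to_mib "$pages" "$pagesize") MiB"
}

# Intended use, sampling every five minutes until interrupted:
#   while :; do log_wired >> wired.log; sleep 300; done
```

Because every line carries a timestamp, the log survives a freeze up to the last sample and gives a growth rate that can be compared between boots with and without a suspect module loaded.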
 
So my motherboard is an ASUS B550 Gaming Plus and it has a Realtek RTL8125B NIC. When I installed FreeBSD I couldn't see the network interface with ifconfig. After searching around I found that Realtek released a driver for the NIC for FreeBSD, and I followed the steps in some article to compile it. It almost worked, but there was one error.

There was an if statement (somewhere in the 30,000 line file?) that called a function that had something to do with getting random numbers. When trying to compile the driver I got an error saying something to the effect of there being no matching definition for the function. The driver was for version 12.something and I installed FreeBSD 13. I can't remember the exact details, but basically the if statement had its body and then an else after that, and whichever way the if went it would call one of two functions, both related to random numbers. I researched it a little bit and both of those functions end up calling the same function in the kernel (Random(), I think?). So I basically just changed it so that it would always call one of them, I think. Now maybe that's my problem. I think maybe from 12.something to 13.0 they removed or changed the interface to that function in the kernel? I've had no problems with network connectivity though.

Also, I've been running top since I restarted my computer and I can see that the number of MB that are wired is slowly rising. It's at 2146M at the moment, and I've probably had my computer on for about 5 or 6 hours. Maybe I could try unloading the network driver and see if the memory being used keeps rising. Or even start the computer without loading the network driver in the first place; then I could eliminate it as a possibility by seeing if the problem still happens.

Yes it's the full list for the kldstat.
I installed to UFS.
It's got 32GB.
I can't remember but I can check next time I need to restart. I'm using an NVMe SSD though if that helps.
I'm using i3-gaps and I'm also running xcompmgr (although I'm not sure it's working 100% correctly).


Code:
% freebsd-version -kru
13.0-RELEASE
13.0-RELEASE
13.0-RELEASE

Um, I can't remember exactly where I got the driver source from. I'm pretty sure it was from the manufacturer (or designer?) of the NIC. I've still got the source code though.
 
The NIC is a 2.5G NIC so I think it's fairly new.
 
I wasn't able to unload any of the amdgpu kernel modules

If I recall correctly, it's advisable to not attempt to unload modules relating to DRM.

Loosely speaking: if you're unlucky, the attempt will cause a kernel panic.

installed to UFS.

How is it tuned?

tunefs -p /

Code:
% freebsd-version -kru
13.0-RELEASE
13.0-RELEASE
13.0-RELEASE

It's outdated. To update, as root, with csh:

  1. su -l root -c /bin/csh
  2. setenv PAGER cat && freebsd-update fetch install
  3. restart the OS
Then upgrade packages and restart the OS again (reboot -r, etc.).

freebsd-update(8)
 