Xorg won't start with officially supported NVIDIA 5070 GPU?

This may seem elaborate, but I found it rather difficult to get this graphics driver issue clear for myself, and it comes up quite regularly in the forums. Perhaps this is already known.

Using Nvidia graphics hardware and using Xorg, you can use either:
  1. 100 % Nvidia proprietary drivers:
    - as kernel module: /boot/modules/nvidia-modeset.ko*, contained in x11/nvidia-driver
    - as Xorg driver (proprietary Nvidia): /usr/local/lib/xorg/modules/drivers/nvidia_drv.so, also contained in x11/nvidia-driver
  2. mixed: open source driver and Nvidia proprietary driver:
    - as open source kernel module /boot/modules/nvidia-drm.ko, contained in graphics/nvidia-drm-61-kmod (or graphics/nvidia-drm-515-kmod), or the meta port graphics/nvidia-drm-kmod
    - as Xorg driver (proprietary Nvidia): /usr/local/lib/xorg/modules/drivers/nvidia_drv.so, contained in x11/nvidia-driver and also in graphics/nvidia-drm-61-kmod
The kernel modules are referenced in /etc/rc.conf:
  • kld_list="nvidia-modeset" for #1
  • kld_list="nvidia-drm"** for #2; must be accompanied/preceded by hw.nvidiadrm.modeset=1 in /boot/loader.conf
Refer also to the descriptions in the package message.
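
As a minimal sketch of the two alternatives above: /etc/rc.conf is read by sh(1), so the setting is a plain variable assignment. Only option #1 is active here; option #2 is shown commented out as the alternative. The sysrc(8) line in the comment is a similar way to append to an existing list.

```sh
# /etc/rc.conf fragment for option #1 (proprietary stack only).
# Loading nvidia-modeset also pulls in nvidia.ko.
kld_list="nvidia-modeset"
# Similar effect, appending to an existing list (run as root):
#   sysrc kld_list+="nvidia-modeset"

# Option #2 (mixed) would instead use:
#   kld_list="nvidia-drm"
# plus, in /boot/loader.conf:
#   hw.nvidiadrm.modeset=1
```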

The proprietary Nvidia Xorg driver /usr/local/lib/xorg/modules/drivers/nvidia_drv.so is referenced in an appropriate Xorg .conf file embedded in Section "Device" as in:
Code:
        Driver       "nvidia"
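
For context, a complete minimal Device section could look like the sketch below. The Identifier "NVIDIA Card" and the file name 20-nvidia.conf are arbitrary choices for illustration, and /usr/local/etc/X11/xorg.conf.d/ is assumed as the drop-in directory.

```sh
# Print a minimal Xorg "Device" section that selects the proprietary driver.
nvidia_device_section() {
    cat <<'EOF'
Section "Device"
    Identifier "NVIDIA Card"
    Driver     "nvidia"
EndSection
EOF
}

# Usage (as root); directory and file name are assumptions:
#   nvidia_device_section > /usr/local/etc/X11/xorg.conf.d/20-nvidia.conf
```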

For normal Xorg usage the proprietary Nvidia solution usually works well. You may be experiencing problems because you are using a fairly new Nvidia graphics card.




Using a mixed port like graphics/nvidia-drm-515-kmod or graphics/nvidia-drm-61-kmod (or graphics/nvidia-drm-66-kmod on -CURRENT) is, as a general case, not necessary for Xorg use. You will also get these mixed drivers when building the meta port graphics/nvidia-drm-kmod; on 14.2-RELEASE this results in building and installing graphics/nvidia-drm-61-kmod. The exception is Wayland, where you will likely need the mixed open source-Nvidia proprietary drivers.

Moreover, relating to the quote above, it seems you are in effect inter-mixing these two solutions:
  1. solely Nvidia proprietary drivers as in x11/nvidia-driver (this is not a meta port)
  2. mixed open source and Nvidia proprietary drivers as in graphics/nvidia-drm-kmod (meta port), graphics/nvidia-drm-515-kmod and graphics/nvidia-drm-61-kmod
It is unclear how such 'double mixing' affects a working graphics solution in your case; it doesn't help in debugging the issue either.
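
To see which of the two stacks is actually loaded, one could classify the output of kldstat(8). This is a minimal sketch, assuming the module names appear in the last column as nvidia-drm.ko and nvidia-modeset.ko; the function name and the labels it prints are made up for illustration.

```sh
# Classify the NVIDIA kernel-module stack from kldstat(8)-style output on stdin.
# Prints one of: mixed, proprietary, none.
nvidia_stack() {
    awk '
        $NF == "nvidia-drm.ko"     { drm = 1 }
        $NF == "nvidia-modeset.ko" { modeset = 1 }
        END {
            if (drm)          print "mixed"
            else if (modeset) print "proprietary"
            else              print "none"
        }
    '
}

# Usage on the affected machine:
#   kldstat | nvidia_stack
```

Since nvidia-drm.ko is reported as "mixed" whether or not nvidia-modeset.ko is also present, the sketch only tells you that the DRM module is in play, not how it got loaded.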

I suggest, as mentioned by T-Aoki and ashafer, that you do not use the mixed open source-proprietary option #2 and concentrate on the Nvidia proprietary drivers option #1: x11/nvidia-driver (or an original tarball from Nvidia if need be).

___
* this automatically chain loads /boot/modules/nvidia.ko when /boot/modules/nvidia-modeset.ko is kldloaded by specifying kld_list="nvidia-modeset" in /etc/rc.conf

** for some elaborate Wayland setups, I believe, you may need both, that is: kld_list="nvidia-drm nvidia-modeset"
I previously had a 4070 working with that exact configuration, i.e., the proprietary and the open-source DRM driver co-existed (installed from pkgs), both being kldloaded in rc.conf, as you mention.

The NVIDIA tarball works; I have a "usable" environment. Ideally, I want to go back to my previous setup; that's my motivation here.

I will follow the recommendations as suggested above, namely making sure /usr/src is up to date with my running kernel.
 
FYI: Below is the /etc/X11/xorg.conf that I used before switching to auto configuration by graphics/nvidia-drm-61-kmod (now renamed so that it is not picked up).

I picked only the GPU-related sections: ServerLayout, Monitor, Device and Screen. Others, such as Input and Files, are not included, and comments are stripped.

It is a merge of the one I had been using before I started using an nvidia GPU and the one generated by x11/nvidia-xconfig, with some customization. It works fine on my ThinkPad P52 with an nvidia Quadro P1000 (notebook) and internal 4k display when I temporarily switch back to running without DRM for testing.

Code:
Section "ServerLayout"
    Identifier     "PCI"
    Screen      0  "Screen PCI" 0 0
    Inactive       "InactiveDevice1"
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
    Option         "AIGLX" "True"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Sharp"
    ModelName      "DP-2"
    HorizSync       47.2 - 56.6
    VertRefresh     50.0 - 60.0
    ModeLine       "3840x2160_60.00" 712.75 3840 4160 4576 5312 2160 2163 2168 2237 -hsync +vsync
EndSection

Section "Device"
    Identifier     "NV PCI"
    Driver         "nvidia"
    VendorName     "nvidia"
    BoardName      "Quadro P1000"
    BusID          "PCI:1:0:0"
EndSection

Section "Device"
    Identifier     "InactiveDevice1"
    Driver         "modesetting"
    VendorName     "Unknown"
EndSection

Section "Screen"
    Identifier     "Screen PCI"
    Device         "NV PCI"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "RegistryDwords" "EnableBrightnessControl=1"
    Option         "UseEdidDpi" "FALSE"
    Option         "DPI" "284 x 284"
    Option         "AddARGBGLXVisuals" "True"
    Option         "UBB" "True"
    Option         "RenderAccel" "True"
    Option         "AllowGLXWithComposite" "True"
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
 
The panic you encountered doesn't seem to be nvidia specific.
A quick search for the keywords "drm_gem_plane_helper_prepare_fb() freebsd" turned up 2 seemingly related topics.
This page also came up in the same search, including:
Code:
[RFC PATCH 0/3] drm/virtio: synchronous guest framebuffer update
 2023-08-20 20:58 UTC  (4+ messages)
` [RFC PATCH 3/3] drm/virtio: drm_gem_plane_helper_prepare_fb for obj synchronization
I am not sure from which version onward the Linux kernel has had this functionality, though.

Another developer, whose opinion I asked (just after my previous post) in the same email I sent to ashafer, kindly replied as follows:
Code:
Hmm, the boot hangs are particularly curious -- that may somehow be
related to how new the card is and whether it works well with or
without GSP as you have been working through.  It could also be
related to some kernel ABI issue, it's hard to know if the user is
getting an appropriate build with /usr/src sync'd with the running
kernel from what we have read so far.  This is particularly critical
for the drm driver where the KPIs are also changeable in the ports
build but it seems like he is using the nvidia-modeset driver only
which is generally less fragile but does still need a consistent KABI.

The overall erratic situation going on makes me wonder a bit about the
user's installation environment, particularly when they talk about the
GL issues.  I use poudriere for everything to build in a sandbox and
only manage my system with pkg.  But some people mix all sorts of ways
of building, and don't really understand the implications and it can
be impossible to understand the state of their system through
messaging.

In the first log blob it never enumerates any screens.  I'd suggest
maybe running 'nvidia-xconfig' and giving X and the driver a bit more
clue as to the configuration as a first step and see if it gets any
farther.. but it seems like he does eventually get a running Xserver
so I am not really sure what to make of that.

That is all my thoughts for now, you are welcome to relay them if you
want or if not I can go and find my forum credentials.

I assume your /usr/src is in sync with your running kernel, right? If not, try syncing again with the releng/14.2 branch. The commit log indicates no kernel changes after the -p1 patch release, so the latest -p3 should be OK.
And as ashafer already mentioned, it would be worth trying to stop using nvidia-drm.ko and let x11/nvidia-xconfig auto-configure Xorg while nvidia-drm.ko is NOT loaded.
At least the panic you've encountered (as opposed to the hangs) is related to the DRM code.
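
One rough way to check whether /usr/src and the running kernel belong together is to compare their __FreeBSD_version numbers. This is a minimal sketch with made-up function names, assuming the source tree lives in /usr/src. As far as I know, patch-level updates (-p1, -p3) normally do not change this number, so a match is necessary but not sufficient.

```sh
# Extract __FreeBSD_version from a param.h file given as $1.
src_osreldate() {
    awk '$1 == "#define" && $2 == "__FreeBSD_version" { print $3; exit }' "$1"
}

# Compare two version numbers; an empty first value counts as a mismatch.
compare_osreldate() {
    if [ -n "$1" ] && [ "$1" = "$2" ]; then
        echo match
    else
        echo mismatch
    fi
}

# Usage on FreeBSD (uname -K prints the running kernel's version number):
#   compare_osreldate "$(src_osreldate /usr/src/sys/sys/param.h)" "$(uname -K)"
```
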
I usually don't mix up ports with pkgs, but if necessary, I go with poudriere. I might have fallen behind on /usr/src since freebsd-update ignores its existence as it was not created by bsdinstall.
 
So how did you obtain the source code?
If you cloned the official git repository (or one of the official mirrors, such as GitHub) and made no local customizations, make sure you've chosen the releng/14.2 branch and pull the latest commits.
It should be the same as 14.2-p3 for now.
 
Also, if I understand it correctly, this is a direct consequence of mixing up the NVIDIA tarball with graphics/nvidia-drm-515-kmod, as suggested by reading PR 282659.
I ended up there because pkg x11/nvidia-driver didn't work for me (because of the missing GSP firmware?).
 
Code:
git clone --branch releng/14.2 https://git.FreeBSD.org/src.git /usr/src
Cloning into '/usr/src'...


I pulled it fresh now and I will start over.
 
The graphics/nvidia-drm-*-kmod ports use the official nvidia tarball with patches.
But as the upstream tarball doesn't include any DRM-related code from outside nvidia, these ports also extract and use the distfiles of the corresponding graphics/drm-*-kmod port (thus, BUILD_DEPENDS on the corresponding one). At the same time, some of the kernel modules from the corresponding graphics/drm-*-kmod port are required at run time, thus RUN_DEPENDS, too.

So mixing graphics/nvidia-drm-515-kmod and graphics/drm-61-kmod should not work (I assume it would not even build, because of conflicts).
Using ports for both is needed.
 
You can indeed get your /usr/src up to date by untarring the base source or with git(7); the latter is practically mandatory when you build the base install from source, and then you are responsible for 'manually' syncing the source.

However, when you build only (some) ports (and not the base install) from source, you should get by using the base source code as installed by bsdinstall(8); this source is not under the control of git but under the control of freebsd-update(8), as specified in freebsd-update.conf(5)*. The thread "How to get sources installed and have these updated by freebsd-update(8) after a base-install" contains some further info.

___
* this source of the base is mainly intended for human consumption: reading. However, some of the files in the base source are needed when building ports from source, especially when building kernel modules like /boot/modules/nvidia-drm.ko (open source) that have to be mated to a specific OS kernel version. AFAIK the proprietary Nvidia files, like the kernel module /boot/modules/nvidia-modeset.ko, are already built/compiled by Nvidia and put in a binary blob, but I'm not 100% sure about this.
 
So I decided to put everything down in a file, documenting every step and the options used for each scenario. It might be a sub-optimal process, but at least I think I covered all use cases in this thread:
  1. DRM setup
    1. PKG: FAIL
    2. From ports: FAIL
  2. X11
    1. PKG: FAIL
    2. From ports: FAIL
    3. NVIDIA tarball: OK
I put everything in this file; if someone has the patience to go through it and tell me what I have missed, I am happy to do more testing. Either way, thank you for the support so far!
 


mfoacs You'll want to disable the commit from this review: https://reviews.freebsd.org/D49828

I'm not sure which way you'll want to do it in your local setup but as long as you check out a ports tree before that commit or revert it locally you should be fine. GSP will be enabled by default in your ports builds again. Note that by enabling GSP your setup should work and you'll be back on a port based driver install, but you may run into the suspend/resume issue which prompted us to disable GSP temporarily. Your configuration seems sane so I think this is the only bit you'll need.
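
To locate the commit in a local ports checkout, something like the sketch below might help. It assumes the ports tree is a git checkout (for example in /usr/ports) and that the commit message mentions the review ID; the function name is made up for illustration, and the hash has to be taken from the actual output.

```sh
# List commits in a ports tree whose message mentions a given review ID.
# $1: path to a git checkout of the ports tree, $2: review ID, e.g. D49828
find_review_commit() {
    git -C "$1" log --format='%H %s' --grep="$2"
}

# Usage (path is an assumption; inspect the result before reverting anything):
#   find_review_commit /usr/ports D49828
#   git -C /usr/ports revert --no-commit <hash printed above>
```

Reverting with --no-commit leaves the change in the working tree only, so it can be dropped again once GSP is re-enabled upstream.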
 
Could this be a problem specific to certain VGA BIOS (firmware) versions in flash ROM (regardless of whether it is on the main board or the VGA card)?

If so, should we OPTION'ify the application of the patch in D49828 (enabled by default for now, so as not to break suspend/resume again)?
Maybe we can conditionally exclude the option depending on NVVERSION, so as not to confuse users of GPUs without GSP.
 
I can confirm: by reverting the patch D49828 locally, I can successfully rebuild x11/nvidia-driver and have a usable X session.
 