I was happily converting bash scripts to csh when, all of a sudden, the server froze. Before the freeze, it had been working well for several days.
After rebooting, the server would hang at Consoles: EFI Consoles. I suspected that a big spider had done damage in the server, but no bugs were to be found in it. However, I noticed that removing one of the five Seagate Exos 7E8 drives would render the server bootable again. I have not yet tried to see whether it is that particular drive or whether it is enough to remove any one of the five. I tested different configurations with four Exoses installed at a time, and it did not matter which one of them was out. Ultimately, the whole thing started booting again with all five Exoses in. Leaving the two Kingston KC600s in the machine did not hinder the boot.
All of the disks, the Seagates and the Kingstons alike, are connected to a Supermicro LSI 3108 controller running in JBOD mode with firmware 6.36. The Kingstons are SATA, the Seagates are SAS. The driver used for the controller is mrsas(4) because mfi(4) provided intolerably slow data transfers.
What tools exist for me to see what might have caused the freeze and what would be blocking the boot? /var/log/messages does not contain anything about the crash, and dmesg provides a perfectly normal boot log.

Edit 2: on a possibly related note, dmesg reports the transfer speeds as 150.000MB/s, which is not correct. A dd or cat measurement yields much higher rates.
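
For comparison with the 150.000MB/s figure from dmesg, a sequential-read measurement in the spirit of dd could be sketched roughly as below. This is only an illustration and not the exact commands used above: the function name measure_read_throughput, the 1 MiB block size, and the 1 GiB read cap are all assumptions, and the device path is supplied on the command line rather than taken from this setup. Results on a regular file may be inflated by the OS cache.

```python
import sys
import time


def measure_read_throughput(path, block_size=1024 * 1024, max_bytes=1024 ** 3):
    """Read up to max_bytes sequentially from path.

    Returns (bytes_read, rate) where rate is in MB/s (1 MB = 10**6 bytes).
    Hypothetical helper for illustration; block_size and max_bytes are
    arbitrary defaults, not values from the original post.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python-level buffering; OS caching may still apply.
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(block_size, max_bytes - total))
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    # Guard against a zero interval on extremely short reads.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed / 1e6


if __name__ == "__main__":
    # Usage: python3 readrate.py <device-or-file>
    # The path is whatever you pass in; no device name is assumed here.
    if len(sys.argv) != 2:
        sys.exit("usage: readrate.py <device-or-file>")
    nbytes, rate = measure_read_throughput(sys.argv[1])
    print(f"read {nbytes} bytes at {rate:.1f} MB/s")
```

Reading a raw disk device usually requires root, and the figure it prints is an observed throughput, whereas the dmesg line is what the driver reports at attach time, so the two need not agree.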