Hello,
I found myself in a situation where I don't know where to start. I have a 3.5" spinning-rust disk in an external USB enclosure. I am using this disk for backups (I am learning how to do this nicely and automatically via zfs export | zfs import and a bunch of scripts). The problem is that the disk tends to go into some kind of sleep mode where it spins down and parks; I am not certain how long it needs to be idle before it falls asleep. When the backup tries to start while the disk is sleeping, I get a bunch of these errors:
Code:
Nov 1 21:21:36 slaanesh kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 02 10 00 00 10 00
Nov 1 21:21:36 slaanesh kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Nov 1 21:21:36 slaanesh kernel: (da0:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Nov 1 21:21:42 slaanesh kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 02 10 00 00 10 00
Nov 1 21:21:42 slaanesh kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Nov 1 21:21:42 slaanesh kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Nov 1 21:21:47 slaanesh kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 02 10 00 00 10 00
Nov 1 21:21:47 slaanesh kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Nov 1 21:21:47 slaanesh kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Nov 1 21:21:47 slaanesh ZFS[45352]: vdev I/O failure, zpool=backupPool path=/dev/diskid/DISK-ABCDEFA74638 offset=4000786423808 size=8192 error=5
Nov 1 21:21:47 slaanesh ZFS[47051]: vdev I/O failure, zpool=backupPool path=/dev/diskid/DISK-ABCDEFA74638 offset=4000786685952 size=8192 error=5
Nov 1 21:21:47 slaanesh ZFS[48916]: vdev I/O failure, zpool=backupPool path=/dev/diskid/DISK-ABCDEFA74638 offset=270336 size=8192 error=5
Nov 1 21:21:47 slaanesh ZFS[50168]: vdev probe failure, zpool=backupPool path=/dev/diskid/DISK-ABCDEFA74638
Nov 1 21:21:53 slaanesh kernel: (da0:umass-sim0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Nov 1 21:21:53 slaanesh kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Nov 1 21:21:53 slaanesh kernel: (da0:umass-sim0:0:0:0): Retrying command, 0 more tries remain
Nov 1 21:21:58 slaanesh kernel: (da0:umass-sim0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Nov 1 21:21:58 slaanesh kernel: (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Nov 1 21:21:58 slaanesh kernel: (da0:umass-sim0:0:0:0): Error 5, Retries exhausted
Nov 1 21:21:58 slaanesh kernel: Solaris: WARNING: Pool 'backupPool' has encountered an uncorrectable I/O failure and has been suspended.
Nov 1 21:21:58 slaanesh kernel:
Nov 1 21:21:58 slaanesh ZFS[89529]: vdev state changed, pool_guid=9509300292886972309 vdev_guid=3802336103444525098
Nov 1 21:21:58 slaanesh ZFS[91159]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[92189]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[93331]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[95323]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[96943]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[97912]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[98843]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[590]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[2519]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[3690]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[5207]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[6461]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[7440]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[8999]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[9865]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[10454]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[12207]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[13702]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[14956]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[16804]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[17923]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[19114]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[20539]: pool I/O failure, zpool=backupPool error=28
Nov 1 21:21:58 slaanesh ZFS[21390]: catastrophic pool I/O failure, zpool=backupPool
Nov 1 21:22:53 slaanesh kernel: (da0:umass-sim0:0:0:0): got CAM status 0x44
Nov 1 21:22:53 slaanesh kernel: (da0:umass-sim0:0:0:0): fatal error, failed to attach to device
Nov 1 21:22:53 slaanesh kernel: da0 at umass-sim0 bus 0 scbus7 target 0 lun 0
Nov 1 21:22:53 slaanesh kernel: da0: <ST4000NE 001-2MA101 EN01> s/n ABCDEFA74638 detached
Nov 1 21:22:53 slaanesh kernel: (da0:umass-sim0:0:0:0): Periph destroyed
Now, that wouldn't be that much of a problem in itself (I can hack my way around the sleeping).
The biggest problem is that after this, I am unable to clear the error, or really do anything with the disk:
Code:
root@slaanesh:~ # zpool status -v backupPool
pool: backupPool
state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
scan: scrub repaired 0B in 00:36:16 with 0 errors on Sun Oct 30 14:30:18 2022
config:
    NAME                        STATE     READ WRITE CKSUM
    backupPool                  ONLINE       0     0     0
      diskid/DISK-ABCDEFA74638  ONLINE       3   117     0
errors: List of errors unavailable: pool I/O is currently suspended
root@slaanesh:~ # zpool clear backupPool
cannot clear errors for backupPool: I/O error
root@slaanesh:~ # zpool export backupPool
<hangs indefinitely>
It is just frozen. I cannot even shut down my PC because of the I/O error, and I have to power it off by holding the power button.
So my two questions:
1. How do I properly and elegantly prevent the disk from sleeping? (I can run ls on the disk contents every few minutes, but that does not seem very elegant. Any other ideas?)
2. When the above situation happens, how do I clear the error, or do anything other than trying to power off the PC and, when that inevitably hangs on disk sync, holding the power button?
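For reference, the periodic-access workaround from question 1 could be sketched roughly as below. This is only a sketch under assumptions: the function name keepalive_once is made up for illustration, /dev/da0 is the device node taken from the log above, and whether one small read is enough to reset the enclosure's idle timer is untested. It reads from the device node instead of running ls, because a directory listing may be answered from the ZFS cache without ever touching the disk.

```shell
#!/bin/sh
# Hypothetical keep-alive helper (name and device path are assumptions).
# Reads one block from the given device node and discards it, so that the
# request has to reach the drive rather than being served from a file cache.

keepalive_once() {
    dev="$1"
    # One 4096-byte read; a multiple of the sector size for both 512-byte
    # and 4K-sector drives. The data goes to /dev/null and dd's transfer
    # statistics are silenced. The return status is dd's exit status.
    dd if="$dev" of=/dev/null bs=4096 count=1 2>/dev/null
}

# Example call (not executed here):
#   keepalive_once /dev/da0 || echo "keep-alive read failed" >&2
```

Something like this could be run from cron every few minutes, with the interval chosen shorter than the spin-down timeout once that is known. It keeps the disk spinning all the time, which is the same trade-off as the ls approach.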