I have an application where we run many services in jails. Each service is isolated from the others and keeps its local data in a ZFS dataset that is passed into the jail. We're working on increasing redundancy in the system. Right now we use ZFS snapshots sent to a backup server to cover disaster-recovery (DR) needs: if we need to recover, we spin up a new VM, restore everything from the backup server onto it, and start things up again.
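Roughly, the current flow is this (hostnames and snapshot names here are just placeholders):

```sh
# Periodic snapshot on the live host, streamed incrementally to the backup box
zfs snapshot pool1/service@backup-20240601
zfs send -i @backup-20240531 pool1/service@backup-20240601 \
    | ssh backuphost zfs recv backuppool/service
```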
I want to break the services and hosts apart, and have been considering a multi-active-node setup. Our total data volume is relatively small (and *incredibly* compressible with ZFS compression), so I don't mind having multiple copies of it across multiple servers.
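For scale, a single dataset-level setting is doing the heavy lifting there; something like:

```sh
# Ordinary inline compression on the dataset (lz4 is the common choice);
# compressratio reports how well the service data actually packs
zfs set compression=lz4 pool1/service
zfs get compressratio pool1/service
```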
Let's assume for the use case I am describing that we have very good tracking and orchestration in place.
Is it feasible to move the "live head" of a dataset from server to server with ZFS send and recv? I'll sketch out the sort of thing I would expect to do (a command-level version follows the list):
- 3 servers: VM1, VM2, VM3
- 3 zpools: pool1, pool2, pool3 (hosted on the same-numbered VMs)
- 1 service, which moves around between them
1. Service starts on VM1.
2. zfs snapshot pool1/service@snap1
3. zfs send to each of VM2 and VM3
4. Stop the service on VM1.
5. (open question) zfs clone / zfs rollback / zfs promote snap1 on VM2 to pool2/service
6. Start the service on VM2.
7. zfs snapshot pool2/service@snap2
8. zfs send to each of VM1 and VM3
9. Stop the service on VM2.
10. Repeat steps 5-9, migrating the service to VM3 and snapshotting to snap3.
11. Eventually land the service back on VM1 and start using snap3 as the basis for the service.
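In actual commands, I'd expect one hop of the list above (steps 2 through 6) to look roughly like this. The hostnames, the jail start/stop lines, and snap0 (the most recent snapshot all three pools already share from a prior full send) are placeholders:

```sh
# --- on VM1, the current live node ---
zfs snapshot pool1/service@snap1

# Incremental send from the last common snapshot; recv -F rolls each
# target back to that common snapshot so the stream applies cleanly.
zfs send -i @snap0 pool1/service@snap1 | ssh vm2 zfs recv -F pool2/service
zfs send -i @snap0 pool1/service@snap1 | ssh vm3 zfs recv -F pool3/service

# Quiesce: stop the jail so nothing writes to pool1/service any longer.
service jail onestop service1    # placeholder for our orchestration

# --- on VM2, the new live node ---
# pool2/service now sits at @snap1; bring the jail up on top of it.
service jail onestart service1   # placeholder for our orchestration
```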
Is such a setup viable? Are there major gotchas in doing something like this? Are there issues from the pools having different names while the datasets inside them are named the same? What mechanism would be best for bringing the snapshot up on another VM: clone, rollback, or promote?
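For concreteness, the receive-side variants I'm weighing look like this (names as above):

```sh
# Option 1: roll the replica back to the received snapshot and use it directly
# (-r destroys any snapshots newer than snap1 on this copy)
zfs rollback -r pool2/service@snap1

# Option 2: clone the snapshot, run the service on the clone, and promote it
# so the clone becomes the parent and takes over the snapshot history
zfs clone pool2/service@snap1 pool2/service-live
zfs promote pool2/service-live
```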
Am I completely missing anything?