Implementing HSM for ZFS

I'm thinking about building a server that uses an SSD vdev as a write cache to absorb writes before moving them to a much larger HDD vdev for long-term storage.

To implement the HSM, wouldn't it basically be: take a ZFS snapshot of the SSD vdev, move the data to the HDD vdev, take a snapshot of the HDD vdev, and run zfs diff? If they don't match, move the mismatched parts from the SSDs to the HDDs again.
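
For the "move the data to the HDD vdev" step, the standard building block is a snapshot plus zfs send/receive rather than a file-level copy. Here is a minimal sketch under that assumption; the pool and dataset names are hypothetical placeholders:

```python
# Minimal sketch of the "move the data" step: snapshot the SSD-backed dataset
# and replicate it to the HDD pool with zfs send/receive. Pool and dataset
# names are hypothetical placeholders.
import subprocess
import time

SRC = "ssdpool/ingest"     # assumed SSD-backed dataset (fast landing zone)
DST = "hddpool/archive"    # assumed HDD-backed dataset (long-term tier)

def run(cmd: str) -> None:
    """Run a shell command and raise if it exits non-zero."""
    subprocess.run(cmd, shell=True, check=True)

# 1. Freeze a consistent point-in-time view of the SSD dataset.
snap = f"{SRC}@migrate-{int(time.time())}"
run(f"zfs snapshot {snap}")

# 2. Stream it to the HDD pool; the receive side verifies the stream and the
#    destination pool checksums every block it writes.
run(f"zfs send {snap} | zfs receive -F {DST}")

# 3. Only after the receive has succeeded, the SSD-side copy can be released.
# run(f"zfs destroy -r {SRC}")   # verify the copy on the HDD pool first
```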

Years ago I was reading Mike Acton, and these C++ devs told me to use managed pointers. But Mike Acton gave a lecture about C++ devs writing slow code. "You just need a char*. Throw the STL out," he said. So I spent three months debugging my char* class until it could ingest an entire XML file and I could manipulate it with other code without bugs. It was basically: call malloc once at the start of the program, read the entire file in, use memmove with start_block + (end_block - start_block) +/- 1, and then deallocate before the program closes.

Shouldn't all of the data transfers be sequential reads and writes, and couldn't my existing string class potentially work for correcting differences between snapshots? Maybe I'd have to add some things to make it work with ZFS, but I believe I already have most of the pointer math and bit twiddling needed to correct errors between snapshots. If there's data corruption from sequential writes, won't it look like { good_block } { bad_block } { good_block }, so that I can just do some pointer math between the two good_blocks to overwrite the bad_block?
 
I'm thinking about building a server that uses an SSD vdev as a write cache to absorb writes before moving them to a much larger HDD vdev for long-term storage.
I'm not sure what you mean by HSM, but I believe what you're describing here is basically a zpool on the HDD vdev using an SSD as a SLOG device for that zpool.
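
For reference, attaching a SLOG to an existing HDD pool is a single zpool command. A minimal sketch, with hypothetical pool and device names; keep in mind that a SLOG only accelerates synchronous writes:

```python
# Sketch: attach an SSD as a separate intent log (SLOG) to an existing HDD pool.
# The pool name and device path are hypothetical; a SLOG only speeds up
# synchronous writes -- asynchronous writes are already buffered in RAM.
import subprocess

POOL = "hddpool"              # assumed existing HDD-backed zpool
SLOG_DEVICE = "/dev/nvme0n1"  # assumed fast SSD/NVMe device

subprocess.run(["zpool", "add", POOL, "log", SLOG_DEVICE], check=True)
subprocess.run(["zpool", "status", POOL], check=True)  # confirm the log vdev shows up
```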
 
I'm not sure what you mean by HSM, but I believe what you're describing here is basically a zpool on the HDD vdev using an SSD as a SLOG device for that zpool.
I mean Hierarchical Storage Management: moving data from a much smaller but extremely fast SSD zpool to a much larger but substantially slower HDD zpool. I don't think the SSD zpool will need any SLOGs or anything like that, though maybe it will. Either way, it should definitely be faster to write to than an HDD zpool, even one with all of the special devices and such.

I'm trying to see whether, if the server has a 10 TB SSD write-cache zpool, I can write from my desktop's SSD to the server as fast as I can read from it, minus a bit for overhead. Below that would be a 400 TB zpool of HDDs for long-term storage.
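
As a rough sanity check on that, the copy can only run as fast as the slowest link between the desktop and the SSD tier. A back-of-envelope sketch; every number in it is a made-up placeholder, not a measurement:

```python
# Back-of-envelope check: a bulk copy runs at the speed of the slowest link in
# the chain. All numbers below are made-up placeholders, not measurements.

def hours_to_copy(data_tb: float, *link_rates_gb_per_s: float) -> float:
    """Hours to move data_tb terabytes through a chain of links."""
    bottleneck = min(link_rates_gb_per_s)        # the slowest link sets the pace
    return (data_tb * 1000) / bottleneck / 3600  # TB -> GB, GB / (GB/s) -> s -> h

# Hypothetical example: desktop NVMe reads ~3 GB/s, a 10 GbE link carries
# ~1.1 GB/s, and the SSD-tier pool writes ~2 GB/s.
print(f"{hours_to_copy(10, 3.0, 1.1, 2.0):.1f} h to fill a 10 TB SSD tier")
```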
 
Several questions, starting with the most important one: what are you trying to accomplish? What is your goal, and what are your requirements? You are asking an XY question: "I want to do X, how do I do it?" But we need to know why you want X, because there are many ways to do it, with different costs and benefits.

Are you interested in write speed? In that case, if all the data written to the SSD tier also eventually migrates to HDD (meaning you have long-lived data), all the SSDs are doing is acting as a shock absorber or write cache, allowing the system to quickly ride out spikes of write traffic and then smoothly spool them to the HDD tier. On the other hand, if your data is short-lived, you may be able to write it to SSD only, and most of it will be deleted before the migration to HDD ever happens. An extreme example of such a system is the constant writing of checkpoints in HPC: they get written continuously, are usually never read, and are deleted after a few minutes; they exist only to speed up restarts of crashed processes. But note that using the SSD tier as a write buffer for data that is usually not read back may cause flash wear-out problems.

On the other hand, you might be interested in read speed, and that is an application where SSDs shine. This is particularly true if you have an expectation that recently written data is also frequently read, while older data may be archival; the extreme example is compliance data that has to be kept online for many years, but the expectation is that it is never accessed.

If your issue is neither read speed nor write speed, why are you using SSDs at all? It's obvious that HDDs are much cheaper (and tape or MAID cheaper still, but neither is really viable for individual, small-business, or home users today). What is less obvious is that HDD is also cheaper when measuring bandwidth ($ per GB/s); the only reason SSDs make sense is random access (they do have good $ per IOPS). And this is where access patterns come in: if all your files or objects are large, and read and written sequentially with large IOs (or good read-ahead caching), then SSDs make no sense at all. At that point, the whole concept of HSM becomes questionable.
 