Replies: 2 comments 1 reply
Hi,
Thank you!
Disk Cache: Caches raw SSTable bytes from the object store on local disk, split into 4MB parts. It knows nothing about the database internals, it's just a local mirror of remote objects to avoid network round-trips.
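To make the 4 MB part granularity concrete, here is a small sketch (the function name and layout are my own, not ZeroFS's actual code) of which disk-cache parts a given byte range of a remote object touches:

```python
# Hypothetical sketch, not ZeroFS's real API: map a byte range of a
# remote object to the 4 MB disk-cache parts that cover it.
PART_SIZE = 4 * 1024 * 1024  # 4 MB part granularity (the default)

def parts_for_range(offset: int, length: int) -> list[int]:
    """Return the part indices covering bytes [offset, offset + length)."""
    first = offset // PART_SIZE
    last = (offset + length - 1) // PART_SIZE
    return list(range(first, last + 1))

# A 128 KB read straddling a 4 MB boundary touches two parts:
print(parts_for_range(4 * 1024 * 1024 - 64 * 1024, 128 * 1024))  # [0, 1]
```

A cache miss on any byte inside a part pulls the whole 4 MB part from the object store, which is why misses are coarser than the reads that trigger them.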
In-Memory Block Cache: Caches decoded, deserialized database structures in memory.
Since these are already deserialized, a hit skips both I/O and decoding. This cache is populated on reads and also when flushing memtables (so freshly written data is immediately cached). Compaction does not populate it. The two caches are independent. The memory cache does not evict to the disk cache.
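The lookup order described above can be sketched as follows (illustrative only, not ZeroFS's real code; the function and cache parameters are stand-ins):

```python
# Illustrative sketch of the two-level lookup: memory cache first,
# then disk cache, then the object store. The caches are independent:
# a memory miss never demotes the entry to disk; each cache is
# populated from its own source on the way back up.
def read_block(key, memory_cache, disk_cache, fetch_from_s3, decode):
    block = memory_cache.get(key)      # hit: skips both I/O and decoding
    if block is not None:
        return block
    raw = disk_cache.get(key)          # hit: skips only the network
    if raw is None:
        raw = fetch_from_s3(key)       # cold miss: HTTP range request
        disk_cache[key] = raw          # raw bytes land on local disk
    block = decode(raw)                # deserialize the SSTable block
    memory_cache[key] = block          # decoded form lands in memory
    return block
```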
ZeroFS splits files into 32KB chunks, each stored as a separate database key. When ZFS reads a 128K block, zerofs issues a database scan over the relevant chunk range, configured with 10MB read-ahead. So no, it won't download the full SST. The database uses SSTable indexes to locate the exact blocks needed, then issues HTTP range requests. With the disk cache enabled (default 4MB part granularity), you'd pull down roughly 4MB-aligned chunks per cache miss, and subsequent reads hitting that same region are served from local disk. On a cold read of a 128K ZFS block, you'd see roughly a 10MB S3 fetch (due to read-ahead), with 4MB chunks landing in the disk cache and the decoded blocks in memory. Subsequent nearby reads are fast.
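The chunk math above is easy to verify: a 128 KB ZFS block maps to exactly four 32 KB chunk keys (sketch below; the key layout is an assumption, and the 10 MB read-ahead sits on top of this range):

```python
# Hypothetical sketch: which 32 KB chunk keys a file read maps to,
# before read-ahead widens the scan.
CHUNK = 32 * 1024  # ZeroFS chunk size

def chunk_range(file_offset: int, length: int) -> range:
    """Chunk indices covering bytes [file_offset, file_offset + length)."""
    first = file_offset // CHUNK
    last = (file_offset + length - 1) // CHUNK
    return range(first, last + 1)

# A 128 KB block at offset 0 spans exactly 4 chunks:
print(list(chunk_range(0, 128 * 1024)))  # [0, 1, 2, 3]
```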
ZeroFS stores data in fixed 32KB chunks. Partial writes and sparse regions still produce a full 32KB value. LZ4 is nearly free CPU-wise and squashes those zero-padded tails to almost nothing. Data is encrypted with XChaCha20-Poly1305, and the Poly1305 tag is verified on every read. If a single bit gets corrupted, you'd get a hard error instead of silently reading garbage. XChaCha20 is dirt cheap on any modern CPU, so you're basically getting free end-to-end data integrity over everything sitting in S3. This overlaps with ZFS's own integrity checking, but the cost is negligible so there's little reason to add a configuration option there.
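Both effects are easy to demonstrate with stdlib stand-ins: zlib in place of LZ4, and HMAC-SHA256 in place of the Poly1305 tag (ZeroFS actually uses LZ4 and XChaCha20-Poly1305; the substitutes below only illustrate the behavior, not the real format):

```python
import zlib, hmac, hashlib, os

CHUNK = 32 * 1024
key = os.urandom(32)

# 1. A partial write: 1 KB of data zero-padded to a full 32 KB value.
#    The zero tail compresses to almost nothing.
value = os.urandom(1024) + b"\x00" * (CHUNK - 1024)
compressed = zlib.compress(value)
print(len(compressed))  # a small fraction of the 32 KB raw value

# 2. Authenticated read: the tag is checked before the bytes are trusted,
#    so a single flipped bit surfaces as a hard error, not silent garbage.
tag = hmac.new(key, compressed, hashlib.sha256).digest()
corrupted = bytes([compressed[0] ^ 0x01]) + compressed[1:]
assert hmac.compare_digest(tag, hmac.new(key, compressed, hashlib.sha256).digest())
assert not hmac.compare_digest(tag, hmac.new(key, corrupted, hashlib.sha256).digest())
```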
This sounds like a nice feature to add.
Love it! Thanks for answering all my questions in such detail. Merci!
Hi there,
great project! I’ve been using it for a few days with an nbd+zpool setup and Hetzner S3, mainly to send ZFS snapshots for offsite backups. It works amazingly well for my use case and feels much more responsive and performant than my previous setup using s3backer/zpool.
Nevertheless, I have two questions and, at the end, some thoughts I'd like to share.
Questions:
Could someone explain the two caching options (disk_size_gb, memory_size_gb) in more detail? How are the caches used? What is cached in memory, and what is cached on disk? Is older cached data evicted from memory to disk, or how do the two interact? If I write some blocks to the pool, is that data immediately placed in the cache as well, or only when it is fetched again from S3 on the next access? The documentation doesn't say much here, or I didn't find it.
When I look at S3 directly: with s3backer + ZFS + autotrim, for 300 GB of backups I literally ended up with 600,000 x 1 MB files. Currently, with my zerofs setup, it’s a 150 GB backup with roughly 670 files and most of them around 290 MB each.
What happens if a 128 K ZFS block is not in the ZFS/zerofs cache and gets accessed? Does zerofs download a full 300 MB block from S3? Or does it use a HTTP range request to download only 128 K? Or maybe some look‑ahead logic that downloads, say, 10 MB of a 300 MB S3 file? And again: what ends up in the zerofs cache in the end?
Thoughts:
Some of my thoughts after a few days of use:
Since ZFS already has great options for encryption and compression, it would be nice if zerofs eventually offered options for encryption=none and compression=none.
What I miss the most is that s3backer had a very useful “stats” file that you could simply watch -n 1 to get good insight into what the engine was currently doing with S3 and its caches. When I first started using s3backer, this really helped me understand what was happening in the background. I even built a fancy web UI for it. I'd like to build the same for zerofs.