Replies: 2 comments 1 reply
Hi,
Thank you!
Disk Cache: Caches raw SSTable bytes from the object store on local disk, split into 4MB parts. It knows nothing about the database internals, it's just a local mirror of remote objects to avoid network round-trips.
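To make the 4 MB part granularity concrete, here is a small sketch (the function name and layout are my own, not ZeroFS's actual code) of which disk-cache parts a given byte range of a remote object touches:

```python
# Hypothetical sketch, not ZeroFS's real API: map a byte range of a
# remote object to the 4 MB disk-cache parts that cover it.
PART_SIZE = 4 * 1024 * 1024  # 4 MB part granularity (the default)

def parts_for_range(offset: int, length: int) -> list[int]:
    """Return the part indices covering bytes [offset, offset + length)."""
    first = offset // PART_SIZE
    last = (offset + length - 1) // PART_SIZE
    return list(range(first, last + 1))

# A 128 KB read straddling a 4 MB boundary touches two parts:
print(parts_for_range(4 * 1024 * 1024 - 64 * 1024, 128 * 1024))  # [0, 1]
```

A cache miss on any byte inside a part pulls the whole 4 MB part from the object store, which is why misses are coarser than the reads that trigger them.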
In-Memory Block Cache: Caches decoded, deserialized database structures in memory.
Since these are already deserialized, a hit skips both I/O and decoding. This cache is populated on reads and also when flushing memtables (so freshly written data is immediately cached). Compaction does not populate it. The two caches are independent. The memory cache does not evict to the disk cache.
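The lookup order described above can be sketched as follows (illustrative only, not ZeroFS's real code; the function and cache parameters are stand-ins):

```python
# Illustrative sketch of the two-level lookup: memory cache first,
# then disk cache, then the object store. The caches are independent:
# a memory miss never demotes the entry to disk; each cache is
# populated from its own source on the way back up.
def read_block(key, memory_cache, disk_cache, fetch_from_s3, decode):
    block = memory_cache.get(key)      # hit: skips both I/O and decoding
    if block is not None:
        return block
    raw = disk_cache.get(key)          # hit: skips only the network
    if raw is None:
        raw = fetch_from_s3(key)       # cold miss: HTTP range request
        disk_cache[key] = raw          # raw bytes land on local disk
    block = decode(raw)                # deserialize the SSTable block
    memory_cache[key] = block          # decoded form lands in memory
    return block
```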
ZeroFS splits files into 32KB chunks, each stored as a separate database key. When ZFS reads a 128K block, zerofs issues a database scan over the relevant chunk range, configured with 10MB read-ahead. So no, it won't download the full SST. The database uses SSTable indexes to locate the exact blocks needed, then issues HTTP range requests. With the disk cache enabled (default 4MB part granularity), you'd pull down roughly 4MB-aligned chunks per cache miss, and subsequent reads hitting that same region are served from local disk. On a cold read of a 128K ZFS block, you'd see roughly a 10MB S3 fetch (due to read-ahead), with 4MB chunks landing in the disk cache and the decoded blocks in memory. Subsequent nearby reads are fast.
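The chunk math above is easy to verify: a 128 KB ZFS block maps to exactly four 32 KB chunk keys (sketch below; the key layout is an assumption, and the 10 MB read-ahead sits on top of this range):

```python
# Hypothetical sketch: which 32 KB chunk keys a file read maps to,
# before read-ahead widens the scan.
CHUNK = 32 * 1024  # ZeroFS chunk size

def chunk_range(file_offset: int, length: int) -> range:
    """Chunk indices covering bytes [file_offset, file_offset + length)."""
    first = file_offset // CHUNK
    last = (file_offset + length - 1) // CHUNK
    return range(first, last + 1)

# A 128 KB block at offset 0 spans exactly 4 chunks:
print(list(chunk_range(0, 128 * 1024)))  # [0, 1, 2, 3]
```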
ZeroFS stores data in fixed 32KB chunks. Partial writes and sparse regions still produce a full 32KB value. LZ4 is nearly free CPU-wise and squashes those zero-padded tails to almost nothing. Data is encrypted with XChaCha20-Poly1305, and the Poly1305 tag is verified on every read. If a single bit gets corrupted, you'd get a hard error instead of silently reading garbage. XChaCha20 is dirt cheap on any modern CPU, so you're basically getting free end-to-end data integrity over everything sitting in S3. This overlaps with ZFS's own integrity checking, but the cost is negligible so there's little reason to add a configuration option there.
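Both effects are easy to demonstrate with stdlib stand-ins: zlib in place of LZ4, and HMAC-SHA256 in place of the Poly1305 tag (ZeroFS actually uses LZ4 and XChaCha20-Poly1305; the substitutes below only illustrate the behavior, not the real format):

```python
import zlib, hmac, hashlib, os

CHUNK = 32 * 1024
key = os.urandom(32)

# 1. A partial write: 1 KB of data zero-padded to a full 32 KB value.
#    The zero tail compresses to almost nothing.
value = os.urandom(1024) + b"\x00" * (CHUNK - 1024)
compressed = zlib.compress(value)
print(len(compressed))  # a small fraction of the 32 KB raw value

# 2. Authenticated read: the tag is checked before the bytes are trusted,
#    so a single flipped bit surfaces as a hard error, not silent garbage.
tag = hmac.new(key, compressed, hashlib.sha256).digest()
corrupted = bytes([compressed[0] ^ 0x01]) + compressed[1:]
assert hmac.compare_digest(tag, hmac.new(key, compressed, hashlib.sha256).digest())
assert not hmac.compare_digest(tag, hmac.new(key, corrupted, hashlib.sha256).digest())
```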
This sounds like a nice feature to add.
Love it! Thanks for answering all my questions in such detail. Merci!
Hi there,
great project! I’ve been using it for a few days with an nbd+zpool setup and Hetzner S3, mainly to send ZFS snapshots for offsite backups. It works amazingly well for my use case and feels much more responsive and performant than my previous setup using s3backer/zpool.
Nevertheless, I have two questions and, at the end, some thoughts I'd like to share.
Questions:
Could someone explain the two caching options (disk_size_gb, memory_size_gb) in more detail? How are the caches used? What is cached in memory, and what is cached on disk? Is older cached data evicted from memory to disk, or how do the two interact? If I write some blocks to the pool, is that data immediately placed in the cache as well, or only when it is fetched again from S3 on the next access? The documentation doesn't say much here, or I didn't find it.
When I look at S3 directly: with s3backer + ZFS + autotrim, for 300 GB of backups I literally ended up with 600,000 x 1 MB files. Currently, with my zerofs setup, it’s a 150 GB backup with roughly 670 files and most of them around 290 MB each.
What happens if a 128 K ZFS block is not in the ZFS/zerofs cache and gets accessed? Does zerofs download a full 300 MB block from S3? Or does it use a HTTP range request to download only 128 K? Or maybe some look‑ahead logic that downloads, say, 10 MB of a 300 MB S3 file? And again: what ends up in the zerofs cache in the end?
Thoughts:
Some of my thoughts after a few days of use:
Since ZFS already has great options for encryption and compression, it would be nice if zerofs eventually offered options for encryption=none and compression=none.
What I miss the most is that s3backer had a very useful “stats” file that you could simply watch -n 1 to get good insight into what the engine was currently doing with S3 and its caches. When I first started using s3backer, this really helped me understand what was happening in the background. I even built a fancy web UI for it. I'd like to build the same for zerofs.