Let’s talk about the pros: why choose ZFS at all?
Integrity and consistency: ZFS is built for maximum reliability. By default, checksums are calculated for all data, and at least two copies of all metadata are written to different places on the disk. There is a myth that ECC memory is specifically required for ZFS; in reality, any file system benefits from ECC to avoid writing corrupted data. The ZFS documentation is simply honest about it.
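The checksum-on-write, verify-on-read idea can be sketched in a few lines of Python. This is a toy model, not ZFS internals: every block is stored together with a checksum computed at write time, and every read re-verifies it, so silent corruption is detected instead of being handed back to the application.

```python
import hashlib

class ChecksummedStore:
    """Toy model of checksum-on-write / verify-on-read, as ZFS does per block."""

    def __init__(self):
        self.blocks = {}  # block id -> (payload, checksum)

    def write(self, block_id, payload: bytes):
        # The checksum is computed once, when the block is written.
        self.blocks[block_id] = (payload, hashlib.sha256(payload).hexdigest())

    def read(self, block_id) -> bytes:
        payload, stored = self.blocks[block_id]
        # Every read re-verifies; a mismatch means silent on-disk corruption.
        if hashlib.sha256(payload).hexdigest() != stored:
            raise IOError(f"checksum mismatch in block {block_id}")
        return payload

store = ChecksummedStore()
store.write(0, b"hello zfs")
assert store.read(0) == b"hello zfs"

# Simulate bit rot: the payload changes on "disk", the checksum does not.
payload, stored = store.blocks[0]
store.blocks[0] = (b"hellO zfs", stored)
try:
    store.read(0)
except IOError as e:
    print(e)  # checksum mismatch in block 0
```

In real ZFS the checksum lives in the parent block pointer rather than next to the data, which is what turns the whole pool into a Merkle tree.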
Compression on the fly: ZFS supports transparent compression. With the LZ4 algorithm, a single core can comfortably compress around 800 MB per second on write and decompress up to 4.5 GB per second on read, so under a multi-threaded load it scales well whenever spare CPU time is available. You trade processor resources for disk space and hard disk IOPS: fewer bytes reach the disk, so fewer disk operations are needed. An interesting case is MySQL on an inexpensive SSD or a plain HDD: enabling LZ4 can noticeably improve multi-threaded performance at the cost of slightly higher latency for each individual thread.
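The trade-off is easy to demonstrate. LZ4 itself is not in the Python standard library, so the sketch below uses zlib at its fastest level as a stand-in; the principle is the same: spend CPU cycles to cut the number of blocks that reach the disk. The 4 KiB block size is an assumption for illustration.

```python
import zlib

# A block of repetitive, database-like data compresses well.
block = b"INSERT INTO t VALUES (42, 'some repetitive row payload');\n" * 200

# zlib level 1 stands in for LZ4: fast, modest ratio, cheap on CPU.
compressed = zlib.compress(block, 1)
ratio = len(block) / len(compressed)

BLOCK_SIZE = 4096  # hypothetical disk block size
writes_raw = -(-len(block) // BLOCK_SIZE)             # ceiling division
writes_compressed = -(-len(compressed) // BLOCK_SIZE)

print(f"compression ratio: {ratio:.1f}x")
print(f"disk blocks written: {writes_raw} -> {writes_compressed}")
assert writes_compressed < writes_raw  # fewer IOPS for the same data
```

The fewer physical writes are exactly where the IOPS savings mentioned above come from; whether the CPU cost pays off depends on how much idle processor time the workload leaves.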
Atomicity: In ZFS, everything is atomic because the on-disk structure is a Merkle tree. Since the file system is always consistent, an application can in principle abandon its own WAL: block integrity is guaranteed by the transactional nature of the file system. There are nuances here: you need to be able to speak the file system’s language and manipulate its transactions, and some databases also use their logs for replication, which is a separate question. But in general, this is a way to save resources.
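A minimal sketch of why a Merkle tree gives atomicity, with simplified pairwise hashing (real ZFS uses block pointers with a configurable fan-out): new data blocks are written out of place, a new root hash is computed over them, and the transaction becomes visible only when the single root pointer is swapped.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Hash each block, then hash pairs upward until one root hash remains."""
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last hash on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# A toy "file system": data blocks plus one root pointer (the uberblock).
blocks = [b"block A", b"block B", b"block C", b"block D"]
root = merkle_root(blocks)

# A transaction: modified blocks are written out of place (copy on write),
# and a new root is computed over the new tree.
new_blocks = [b"block A", b"block B2", b"block C", b"block D"]
new_root = merkle_root(new_blocks)

# Until the swap below, readers following `root` still see the old,
# fully consistent tree; there is no intermediate half-written state.
assert root != new_root
root, blocks = new_root, new_blocks   # the atomic "publish" of the transaction
```

This also makes the cost visible: every change must propagate hash recalculations up to the root, which is the CPU overhead discussed next.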
This nuance can also be a disadvantage: working through a Merkle tree is not free. To reach a consistent state, every change requires propagating hash calculations up the tree, which is expensive primarily in terms of CPU. So ZFS needs more resources than classical file systems in some places, while saving them in others on the data access side, for example on that same journal.
“Free” snapshots: Creating a snapshot in ZFS takes constant time and imposes no additional cost on subsequent work with the data. Snapshots are also convenient to transfer, including incrementally. Anyone who backs up with rsync or similar tools knows the problem of determining which part of the data has changed. With ZFS you can send a snapshot incrementally, and its integrity will be verified and confirmed on the receiving side.
A snapshot is a tag: a reference to a specific version of all the data, all the blocks it consists of, starting from the root block, the specially labeled topmost block where the tree begins. Because of this, producing an incremental slice of changed data amounts to collecting the blocks belonging to the transactions after the one the snapshot points to; there is no need to scan for changed blocks, since we already have references only to the modified ones.
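The mechanism above can be modeled in a few lines. In this toy sketch (names and structure are illustrative, not ZFS code), every block records the transaction group in which it was born, and a snapshot is nothing but a remembered transaction number; the incremental slice is then a simple filter, not a scan.

```python
# Toy model of incremental send: a snapshot is just a transaction group
# (txg) number; the blocks changed since it are exactly those born later.

class Pool:
    def __init__(self):
        self.txg = 0
        self.blocks = {}        # block id -> (payload, birth txg)
        self.snapshots = {}     # snapshot name -> txg

    def write(self, block_id, payload):
        self.txg += 1
        self.blocks[block_id] = (payload, self.txg)

    def snapshot(self, name):
        self.snapshots[name] = self.txg   # constant time: remember the txg

    def incremental(self, since):
        base = self.snapshots[since]
        # No scanning or comparison of contents: the birth txg already
        # tells us which blocks were modified after the snapshot.
        return {bid: p for bid, (p, born) in self.blocks.items() if born > base}

pool = Pool()
pool.write("a", b"v1")
pool.write("b", b"v1")
pool.snapshot("@monday")
pool.write("b", b"v2")               # only this block changed since @monday

print(pool.incremental("@monday"))   # {'b': b'v2'}
```

This is why `zfs send -i` does not need an rsync-style walk over the data: the copy-on-write block tree already encodes what changed.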
A good example of using snapshots is the rapid deployment of different database versions for testing, for instance with PostgreSQL: restore a dump once, snapshot it, and give each test run its own instant copy.
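As a sketch of that workflow (the pool and dataset names are hypothetical; adjust them to your layout), the standard `zfs snapshot` and `zfs clone` commands are all that is needed:

```shell
# Restore the PostgreSQL dump once into a dataset, then snapshot it:
zfs snapshot tank/pgdata@restored

# Each test run gets its own instant, writable copy of the data:
zfs clone tank/pgdata@restored tank/pgtest1

# After the test, throw the clone away; the original snapshot is untouched:
zfs destroy tank/pgtest1
```

Both the snapshot and the clone are created in constant time, so resetting a multi-gigabyte test database takes seconds instead of re-importing the dump.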