DwarFS is a filesystem developed by the user mhx on GitHub [1], which is self-described as "A fast high compression read-only file system for Linux, Windows, and macOS." One of my ideas for blendOS was to layer different packages, and combined with its compression and option to be mounted as a FUSE-based filesystem, it's an appealing option for this use case - blendOS is immutable, so it might as well have some compression.
The datasets being used for this test will be the following:
00000000
in
binary)00000000
in binary)All this data should cover both latency and read speed testing for data that compresses differently - extremely compressible files with null data, decently compressible files, and random data which can't be compressed well.
I'll be benchmarking DwarFS, fuse-archive (with tar files),
and btrfs. In some early, basic testing, I found that mounting
any compressed archives with fuse-archive
,
a tool for mounting archive file formats as read-only
filesystems, took far too long. Additionally, being FUSE-based,
these would have slightly worse performance than kernel
filesystems, so I tried to use a FUSE driver as well for btrfs.
Unforunately, I ran into a bug, so I won't be able to quite do
an equivalent test; btrfs will only be running in the
kernel.
During said early testing, I also ran into the fact that most compressed archives, like Gzip-compressed tar archives, also took far too long to create, because Gzip is single-threaded. So all the options with no chance of being used have been marked off, and I'll only be looking into these three.
DwarFS also took far too long to create on its default setting, but on compression level 1, it's much faster - 11m2.738s for the ~80 GiB total, and considering
First installed it by cloning the repository, installing it using Cargo, then added its completions to fish (just for this session):
git clone https://git.askiiart.net/askiiart/disk-read-benchmark
cd ./disk-read-benchmark
cargo install --path .
disk-read-benchmark generate-fish-completions | source
Then I prepared all the data:
disk-read-benchmark prep-dirs
disk-read-benchmark grab-data
./prepare.sh
disk-read-benchmark
prepares all the
directories, generates the data to be used for testing, then
./prepare.sh
uses the data to generate the DwarFS
and tar archives.
To run it, I just ran this:
disk-read-benchmark benchmark
Which outputs the data at
data/benchmark-data.csv
and
data/bulk.csv
for the single and bulk files,
respectively.
After processing the data with this script to make it a bit easier, I put the resulting graphs in here ↓
The FUSE-based filesystems run into a bit of trouble here -
with incompressible data, DwarFS has a hard time keeping up for
some reason, despite keeping up just fine with larger random
reads on the same data, and so it takes 3 to 4 seconds to run
random read latency testing on the 25 GiB random file.
Meanwhile, when testing random read latency in
fuse-archive
pretty much just dies, becoming
ridiculously slow (even compared to DwarFS), so I didn't test
its random read latency at all and just had its results put as 0
milliseconds.
My code can generate up to 25 GB/s. However, it does random writes to my drive, which is much slower. So on one hand, you could say my code is so amazingly fast that current day technologies simply can't keep up. Or you could say that I have no idea how to code for real world scenarios.↩︎
[Vertex { position: Pos([0.5, 0.0, 0.0]), color: Col([0.5310667, 0.7112941, 0.7138775]) }, Vertex { position: Pos([-0.25000003, 0.4330127, 0.0]), color: Col([0.7492257, 0.3142163, 0.49905664]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }, Vertex { position: Pos([-0.25000003, 0.4330127, 0.0]), color: Col([0.6389981, 0.5204368, 0.077735074]) }, Vertex { position: Pos([-0.24999996, -0.43301272, 0.0]), color: Col([0.8869035, 0.30709425, 0.8658899]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }, Vertex { position: Pos([-0.24999996, -0.43301272, 0.0]), color: Col([0.6236294, 0.03584433, 0.7590722]) }, Vertex { position: Pos([0.5, 8.742278e-8, 0.0]), color: Col([0.6105084, 0.3593351, 0.85544324]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }]