<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Benchmarking and comparing DwarFS</title>
<link href="/style.css" type="text/css" rel="stylesheet" />
<link href="/prism.css" type="text/css" rel="stylesheet" />
</head>
<body class="line-numbers">
<h1 id="benchmarking-and-comparing-dwarfs">Benchmarking and
comparing DwarFS</h1>
<p>DwarFS is a filesystem developed by the user mhx on GitHub
(<em>mhx/dwarfs</em>), self-described as "A fast high
compression read-only file system for Linux, Windows, and
macOS." One of my ideas for blendOS was to layer different
packages, and with its compression and its option to be mounted
as a FUSE-based filesystem, DwarFS is an appealing option for
this use case - blendOS is immutable, so it might as well have
some compression.</p>
<h2 id="methodology">Methodology</h2>
|
||
<p>The datasets being used for this test will be the
|
||
following:</p>
|
||
<ul>
|
||
<li>25 GiB of null data (just <code>00000000</code> in
|
||
binary)</li>
|
||
<li>25 GiB of random data<a href="#fn1" class="footnote-ref"
|
||
id="fnref1" role="doc-noteref"><sup>1</sup></a></li>
|
||
<li>Data for a 100 million-sided regular polygon; ~26.5 GiB<a
|
||
href="#fn2" class="footnote-ref" id="fnref2"
|
||
role="doc-noteref"><sup>2</sup></a></li>
|
||
<li>The current Linux longterm release source (<a
|
||
href="https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.58.tar.xz">6.6.58</a>
|
||
(<em>The Linux Kernel Archives</em>)); ~1.5 GB</li>
|
||
<li>For some rough latency testing:
|
||
<ul>
|
||
<li>1024 4 KiB files filled with null data (again, just
|
||
<code>00000000</code> in binary)</li>
|
||
<li>1024 4 KiB files filled with random data</li>
|
||
</ul></li>
|
||
</ul>
|
||
<p>All this data should cover both latency and read speed
testing for data that compresses differently - extremely
compressible files with null data, decently compressible files,
and random data which can't be compressed well.</p>
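<p>For reference, roughly equivalent data could be generated by
hand with something like the sketch below
(<code>disk-read-benchmark</code> generates its own test data, so
this is only an illustration, and the paths here are made up):</p>
<div class="sourceCode"><pre
class="language-sh"><code class="language-bash"># 25 GiB of null data and 25 GiB of random data
dd if=/dev/zero of=data/null.bin bs=1M count=25600
dd if=/dev/urandom of=data/random.bin bs=1M count=25600

# 1024 4 KiB files of each kind for the latency tests
mkdir -p data/small-null data/small-random
for i in $(seq 1 1024); do
    dd if=/dev/zero of=data/small-null/$i.bin bs=4K count=1 status=none
    dd if=/dev/urandom of=data/small-random/$i.bin bs=4K count=1 status=none
done</code></pre></div>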
<h3 id="what-filesystems">What filesystems?</h3>
<p>I'll be benchmarking DwarFS (<em>mhx/dwarfs</em>),
fuse-archive (<em>Google/Fuse-Archive</em>) (with tar files),
and btrfs. In some early, basic testing, I found that mounting
any <em>compressed</em> archives with <code>fuse-archive</code>,
a tool for mounting archive file formats as read-only
filesystems, took far too long. Additionally, being FUSE-based,
these would have slightly worse performance than kernel
filesystems, so I tried to use a FUSE driver for btrfs as well.
Unfortunately, I ran into a bug, so I won't quite be able to do
an equivalent test; btrfs will only be running in the
kernel.</p>
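<p>For context, a kernel-mounted btrfs test filesystem can be set
up on a loopback image with something like this (a minimal
sketch, not necessarily the exact setup used for these
benchmarks; the paths and size are made up):</p>
<div class="sourceCode"><pre
class="language-sh"><code class="language-bash"># Create a sparse 100 GiB image and format it as btrfs
truncate -s 100G btrfs.img
mkfs.btrfs btrfs.img

# Mount it through a loop device and copy the test data in
sudo mount -o loop btrfs.img mnt/btrfs
sudo cp -r data/* mnt/btrfs/</code></pre></div>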
<p>During said early testing, I also ran into the fact that most
compressed archives, like Gzip-compressed tar archives, also
took far too long to <em>create</em>, because Gzip is
single-threaded. So all the options with no chance of being used
have been marked off, and I'll only be looking into these
three.</p>
<p>DwarFS also took far too long to create an archive on its
default setting, but at compression level 1, it's much faster -
11m2.738s for the ~80 GiB total, and considering my entire
system is about 20 GiB, that should be about 2-3 minutes, which
is reasonable. With no compression, tar took 3m3.378s. Mounting
the DwarFS archive was nearly instant (0.022s), while mounting
the tar archive took 1.352s - not bad, but not ideal when
mounting many archives, and something that will absolutely be
taken into consideration.</p>
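<p>The archive creation and mounting steps look roughly like this
(a sketch with made-up paths, not the exact invocations used for
the benchmark):</p>
<div class="sourceCode"><pre
class="language-sh"><code class="language-bash"># DwarFS image at compression level 1, then mount it with the FUSE driver
mkdwarfs -i data/ -o test.dwarfs -l 1
dwarfs test.dwarfs mnt/dwarfs

# Uncompressed tar archive, mounted read-only with fuse-archive
tar -cf test.tar data/
fuse-archive test.tar mnt/tar</code></pre></div>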
<h2 id="running-the-benchmark">Running the benchmark</h2>
<p>First off, I installed my benchmark (<em>Disk Read
Benchmark</em>) by cloning the repository, installing it using
Cargo, then adding its completions to fish (just for this
session):</p>
<div class="sourceCode" id="cb2"><pre
class="language-sh"><code class="language-bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">git</span> clone https://git.askiiart.net/askiiart/disk-read-benchmark</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="bu">cd</span> ./disk-read-benchmark</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="ex">cargo</span> install <span class="at">--path</span> .</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> generate-fish-completions <span class="kw">|</span> <span class="bu">source</span></span></code></pre></div>
<p>Then I prepared all the data:</p>
<div class="sourceCode" id="cb3"><pre
class="language-sh"><code class="language-bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> prep-dirs</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> grab-data</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="ex">./prepare.sh</span></span></code></pre></div>
<p><code>disk-read-benchmark</code> prepares all the
directories and generates the data to be used for testing, then
<code>./prepare.sh</code> uses that data to generate the DwarFS
and tar archives.</p>
<p>To run it, I just ran this:</p>
<div class="sourceCode" id="cb4"><pre
class="language-sh"><code class="language-bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> benchmark</span></code></pre></div>
<p>This outputs the data to
<code>data/benchmark-data.csv</code> and
<code>data/bulk.csv</code> for the single and bulk files,
respectively.</p>
<h2 id="results">Results</h2>
|
||
<p>After processing <a
|
||
href="/assets/benchmarking-dwarfs/data/">the data</a> with <a
|
||
href="/assets/benchmarking-dwarfs/process-data.py">this
|
||
script</a> to make it a bit easier, I put the resulting graphs
|
||
in here ↓</p>
|
||
<h3 id="sequential-read">Sequential read</h3>
|
||
<p>These results interest me quite a bit; unsurprisingly, DwarFS
|
||
has an advantage on the null file, due to its compression,
|
||
though it's disappointing the difference in time wasn't greater.
|
||
However, it does far worse on the random file, and I'm not sure
|
||
why; as discussed further down, DwarFS doesn't try to compress
|
||
incompressible files as far as I know, but I could be wrong. As
|
||
for the 100 million-sided polygon, it's somewhere in between,
|
||
with an advantage due to its compression, but still taking
|
||
longer than expected.</p>
|
||
<p>As for fuse-archive, it handles the null file well, but takes
|
||
longer on the others; not much to say.</p>
|
||
<div>
|
||
<canvas id="seq_read_chart" class="chart">
|
||
</canvas>
|
||
</div>
|
||
<h3 id="random-read">Random read</h3>
|
||
<p>There's nothing much to say here; although DwarFS took
|
||
significantly longer, it's still pretty fast - a different of
|
||
about 14 milliseconds worst case, across a 25 GiB file; similar
|
||
resuls for the 100 million-sided polygon, though to a less
|
||
extent, given it can be compressed better. With the null file,
|
||
due to its compression, DwarFS was actually on par with
|
||
fuse-archive, but it can't compete with btrfs's performance,
|
||
given it's so heavily optimized, and in the kernel.</p>
|
||
<div>
|
||
<canvas id="rand_read_chart" class="chart">
|
||
</canvas>
|
||
</div>
|
||
<h3 id="sequential-read-latency">Sequential read latency</h3>
|
||
<p>As expected, DwarFS performs a bit worse on the
|
||
incompressible random data, but otherwise they'll all roughly
|
||
equal. I wasn't expecting this, given btrfs is in the kernel,
|
||
while the other two are using FUSE.</p>
|
||
<div>
|
||
<canvas id="seq_read_latency_chart" class="chart">
|
||
</canvas>
|
||
</div>
|
||
<h3 id="random-read-latency">Random read latency</h3>
|
||
<p>Both DwarFS and fuse-archive had some trouble with this test.
|
||
DwarFS doesn't seem to handle random access very well; this is
|
||
supposedly fixed, as seen in issue 139 (<em>Issue #139 ·
|
||
mhx/dwarfs</em>), but the performance issues are obvious
|
||
regardless; I'm not sure why, given it doesn't compress
|
||
uncompressible data, not to mention it does just fine on the
|
||
random read test, where the only difference is that it reads
|
||
<em>more</em> data. But regardless, DwarFS ended up performing
|
||
far worse than expected on both the incompressible random data,
|
||
and the highly compressible null data.</p>
|
||
<p>Meanwhile, when testing random read latency in
|
||
<code>fuse-archive</code> pretty much just dies, becoming
|
||
ridiculously slow (even compared to DwarFS), so I didn't include
|
||
its single-file results. It succeeds on the bulk files, but
|
||
given it just shows as 0 seconds anyways, given the massive
|
||
scale, I opted to not include it in this graph at all.</p>
|
||
<div>
|
||
<canvas id="rand_read_latency_chart" class="chart">
|
||
</canvas>
|
||
</div>
|
||
<h2 id="misc-notes">Misc notes</h2>
|
||
<p>DwarFS can take up a fair amount of memory if mounting it
|
||
many times (<em>Issue #219 · mhx/dwarfs</em>), and this should
|
||
be kept in mind for use in BlendOS.</p>
|
||
<hr />
|
||
<p>Ratarmount (<em>mxmlnkn/ratarmount</em>) should also be
investigated; it's similar to fuse-archive, but with some
improvements, and some important notes. From its README
file:</p>
<blockquote>
<p>Note that fuse-archive daemonizes instantly but the mount
point will not be usable for a long time and everything trying
to use it will hang until then when not using
--asyncprogress</p>
</blockquote>
<blockquote>
<p>Mounting bzip2 and xz archives has actually become faster
than archivemount and fuse-archive with ratarmount -P 0 on most
modern processors because it actually uses more than one core
for decoding those compressions. indexed_bzip2 supports block
parallel decoding since version 1.2.0.</p>
</blockquote>
<p>Despite being written in Python, Ratarmount seems to have
significant performance improvements over fuse-archive.</p>
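<p>For a future comparison, the equivalent ratarmount invocation
would look something like this (a sketch; <code>-P 0</code> tells
it to use all available cores, as mentioned in the quote
above):</p>
<div class="sourceCode"><pre
class="language-sh"><code class="language-bash"># Mount a tar archive with ratarmount, using all cores for decoding
ratarmount -P 0 test.tar mnt/ratarmount

# Unmount when done (standard FUSE unmount)
fusermount -u mnt/ratarmount</code></pre></div>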
<hr />
<p>This should also be tested on systems with different specs,
like my Chromebook and laptop, and I should try getting the
btrfs FUSE driver working and benchmarking that.</p>
<h2 id="summary">Summary</h2>
|
||
<p>DwarFS, or just the normal filesystem plus overlayfs, seem
|
||
like they may be the best options - DwarFS's compression and
|
||
deduplication are great, and the deduplication could probably be
|
||
used in way I haven't even thought of yet, but it has some niche
|
||
issues. Overall, I'm leaning towards using DwarFS as an option,
|
||
with just overlayfs as the default, but further testing is
|
||
needed.</p>
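<p>For reference, the plain-filesystem-plus-overlayfs approach
would look something like this (a minimal sketch with made-up
paths, just to illustrate the layering; this isn't blendOS's
actual setup):</p>
<div class="sourceCode"><pre
class="language-sh"><code class="language-bash"># Stack a writable layer on top of a read-only package layer
mkdir -p overlay/upper overlay/work overlay/merged
sudo mount -t overlay overlay \
    -o lowerdir=packages/base,upperdir=overlay/upper,workdir=overlay/work \
    overlay/merged</code></pre></div>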
<h2 id="footnotes">Footnotes</h2>
<h2 id="sources">Sources</h2>
<p> - “Confused_ace_noises/Maths-Demos - Branch:
Headless-Deterministic.” Forgejo: Git with a Cup of Tea,
git.askiiart.net/confused_ace_noises/maths-demos/src/branch/headless-deterministic.<br />
 - “Disk Read Benchmark - A Simple and Performant Read-Only Disk
Benchmark, Written in Rust.” Forgejo: Git with a Cup of Tea,
git.askiiart.net/askiiart/disk-read-benchmark.<br />
 - Google. “Google/Fuse-Archive: Fuse File System for Archives
and Compressed Files (ZIP, RAR, 7z, ISO, TGZ, Xz...).” GitHub,
github.com/google/fuse-archive.<br />
 - The Linux Kernel Archives, Linux Kernel Organization, Inc.,
www.kernel.org.<br />
 - mhx. “Feature Request: Improve Block Management for
Uncompressed Blocks to Save Memory and Enhance Deduplication ·
Issue #139 · mhx/dwarfs.” GitHub,
github.com/mhx/dwarfs/issues/139.<br />
 - mhx. “mhx/dwarfs: A Fast High Compression Read-Only File
System for Linux, Windows and macOS.” GitHub,
github.com/mhx/dwarfs.<br />
 - mhx. “[Feature Request] Mounting Multiple Archives to the
Same Path · Issue #219 · mhx/dwarfs.” GitHub,
github.com/mhx/dwarfs/issues/219.<br />
 - mxmlnkn. “mxmlnkn/ratarmount: Access Large Archives as a
Filesystem Efficiently, e.g., Tar, Rar, Zip, Gz, BZ2, XZ, ZSTD
Archives.” GitHub, github.com/mxmlnkn/ratarmount.</p>
<!-- JavaScript for graphs goes hereeeeeee -->
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
<script src="/assets/benchmarking-dwarfs/js/declare_vars.js"></script>
<script src="/assets/benchmarking-dwarfs/js/seq_read.js"></script>
<script src="/assets/benchmarking-dwarfs/js/rand_read.js"></script>
<script src="/assets/benchmarking-dwarfs/js/seq_latency.js"></script>
<script src="/assets/benchmarking-dwarfs/js/rand_latency.js"></script>
<section id="footnotes"
class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p>My code can generate up to 25 GB/s. However, it
does random writes to my drive, which is <em>much</em> slower.
So on one hand, you could say my code is so amazingly fast that
current day technologies simply can't keep up. Or you could say
that I have no idea how to code for real world scenarios.<a
href="#fnref1" class="footnote-back"
role="doc-backlink">↩︎</a></p></li>
<li id="fn2">This data is from a modified version of an
|
||
abandoned math demonstration program
|
||
(<em>confused_ace_noises/maths-demos</em>) made by a friend; it
|
||
generates regular polygons and writes their data to a file. I
|
||
chose this because it was an artificial and reproducible yet
|
||
fairly compressible dataset (without being extremely
|
||
compressible like null data).
|
||
<details open>
|
||
<summary>
|
||
3-sided regular polygon data
|
||
</summary>
|
||
<br>
|
||
<!-- I put it in here just as a `style`, it didn't work. I put it in as a div with that `style`, it didn't work. I put it in as a div of that class which has those properties in style.css, it works -->
|
||
<!-- i hate webdev i hate webdev i hate webdev i hate webdev i hate webdev i hate webdev -->
|
||
<div class="force-word-wrap">
|
||
<pre><code>[Vertex { position: Pos([0.5, 0.0, 0.0]), color: Col([0.5310667, 0.7112941, 0.7138775]) }, Vertex { position: Pos([-0.25000003, 0.4330127, 0.0]), color: Col([0.7492257, 0.3142163, 0.49905664]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }, Vertex { position: Pos([-0.25000003, 0.4330127, 0.0]), color: Col([0.6389981, 0.5204368, 0.077735074]) }, Vertex { position: Pos([-0.24999996, -0.43301272, 0.0]), color: Col([0.8869035, 0.30709425, 0.8658899]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }, Vertex { position: Pos([-0.24999996, -0.43301272, 0.0]), color: Col([0.6236294, 0.03584433, 0.7590722]) }, Vertex { position: Pos([0.5, 8.742278e-8, 0.0]), color: Col([0.6105084, 0.3593351, 0.85544324]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }]</code></pre>
</div>
</details>
<a href="#fnref2" class="footnote-back"
role="doc-backlink">↩︎</a></li>
</ol>
</section>
<iframe src="https://john.citrons.xyz/embed?ref=askiiart.net" style="margin-left:auto;display:block;margin-right:auto;max-width:732px;width:100%;height:94px;border:none;"></iframe>
<script src="/prism.js"></script>
<footer>
<p><a href="https://git.askiiart.net/askiiart/engl-2311-blog">Source code</a> | <a href="/feed.xml">RSS</a> | <a href="/glossary.html">Glossary</a> | <a href="/about.html">About</a></p>
<small>Image captions are the same as the alt text; assuming you're sighted, you can most likely ignore them.</small>
</footer>
</body>
</html>