engl-2311-blog/blog/benchmarking-dwarfs.html

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta content="width=device-width, initial-scale=1" charset="utf-8" />
        <title>Benchmarking and comparing DwarFS</title>
        <link href="/style.css" type="text/css" rel="stylesheet" />
        <link href="/prism.css" type="text/css" rel="stylesheet" />
    </head>
    <body class="line-numbers">
        <h1 id="benchmarking-and-comparing-dwarfs">Benchmarking and
        comparing DwarFS</h1>
        <p>DwarFS is a filesystem developed by the user mhx on GitHub
        (<em>mhx/dwarfs</em>), which is self-described as "A fast high
        compression read-only file system for Linux, Windows, and
        macOS." One of my ideas for blendOS was to layer different
        packages, and combined with its compression and option to be
        mounted as a FUSE-based filesystem, it's an appealing option for
        this use case - blendOS is immutable, so it might as well have
        some compression.</p>
        <h2 id="methodology">Methodology</h2>
        <p>The datasets being used for this test will be the
        following:</p>
        <ul>
        <li>25 GiB of null data (just <code>00000000</code> in
        binary)</li>
        <li>25 GiB of random data<a href="#fn1" class="footnote-ref"
        id="fnref1" role="doc-noteref"><sup>1</sup></a></li>
        <li>Data for a 100 million-sided regular polygon; ~26.5 GiB<a
        href="#fn2" class="footnote-ref" id="fnref2"
        role="doc-noteref"><sup>2</sup></a></li>
        <li>The current Linux longterm release source (<a
        href="https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.58.tar.xz">6.6.58</a>
        (<em>The Linux Kernel Archives</em>)); ~1.5 GB</li>
        <li>For some rough latency testing:
        <ul>
        <li>1024 4 KiB files filled with null data (again, just
        <code>00000000</code> in binary)</li>
        <li>1024 4 KiB files filled with random data</li>
        </ul></li>
        </ul>
        <p>All this data should cover both latency and read speed
        testing for data that compresses differently - extremely
        compressible files with null data, decently compressible files,
        and random data which can't be compressed well.</p>
        <h3 id="what-filesystems">What filesystems?</h3>
        <p>I'll be benchmarking DwarFS (<em>mhx/dwarfs</em>),
        fuse-archive (<em>Google/Fuse-Archive</em>) (with tar files),
        and btrfs. In some early, basic testing, I found that mounting
        any <em>compressed</em> archives with <code>fuse-archive</code>,
        a tool for mounting archive file formats as read-only
        filesystems, took far too long. Additionally, being FUSE-based,
        these would have slightly worse performance than kernel
        filesystems, so I tried to use a FUSE driver as well for btrfs.
        Unforunately, I ran into a bug, so I won't be able to quite do
        an equivalent test; btrfs will only be running in the
        kernel.</p>
        <p>During said early testing, I also ran into the fact that most
        compressed archives, like Gzip-compressed tar archives, also
        took far too long to <em>create</em>, because Gzip is
        single-threaded. So all the options with no chance of being used
        have been marked off, and I'll only be looking into these
        three.</p>
        <p>DwarFS also took far too long to create an archive on its
        default setting, but on compression level 1, it's much faster -
        11m2.738s for the ~80 GiB total, and considering my entire
        system is about 20 GiB, that should be about 2-3 minutes, which
        is reasonable; With no compression, tar took 3m3.378s. Mounting
        the DwarFS archive was nearly instant (0.022s), while mounting
        the tar archive took 1.352s - not bad, but not ideal when
        mounting many, and will absolutely be taken into
        consideration.</p>
        <h2 id="running-the-benchmark">Running the benchmark</h2>
        <p>First off, installed I installed my benchamark (<em>Disk Read
        Benchmark</em>) by cloning the repository, installing it using
        Cargo, then added its completions to fish (just for this
        session):</p>
        <div class="sourceCode" id="cb2"><pre
        class="language-sh"><code class="language-bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">git</span> clone https://git.askiiart.net/askiiart/disk-read-benchmark</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="bu">cd</span> ./disk-read-benchmark</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="ex">cargo</span> install <span class="at">--path</span> .</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> generate-fish-completions <span class="kw">|</span> <span class="bu">source</span></span></code></pre></div>
        <p>Then I prepared all the data:</p>
        <div class="sourceCode" id="cb3"><pre
        class="language-sh"><code class="language-bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> prep-dirs</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> grab-data</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="ex">./prepare.sh</span></span></code></pre></div>
        <p><code>disk-read-benchmark</code> prepares all the
        directories, generates the data to be used for testing, then
        <code>./prepare.sh</code> uses the data to generate the DwarFS
        and tar archives.</p>
        <p>To run it, I just ran this:</p>
        <div class="sourceCode" id="cb4"><pre
        class="language-sh"><code class="language-bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="ex">disk-read-benchmark</span> benchmark</span></code></pre></div>
        <p>Which outputs the data at
        <code>data/benchmark-data.csv</code> and
        <code>data/bulk.csv</code> for the single and bulk files,
        respectively.</p>
        <h2 id="results">Results</h2>
        <p>After processing <a
        href="/assets/benchmarking-dwarfs/data/">the data</a> with <a
        href="/assets/benchmarking-dwarfs/process-data.py">this
        script</a> to make it a bit easier, I put the resulting graphs
        in here ↓</p>
        <h3 id="sequential-read">Sequential read</h3>
        <p>These results interest me quite a bit; unsurprisingly, DwarFS
        has an advantage on the null file, due to its compression,
        though it's disappointing the difference in time wasn't greater.
        However, it does far worse on the random file, and I'm not sure
        why; as discussed further down, DwarFS doesn't try to compress
        incompressible files as far as I know, but I could be wrong. As
        for the 100 million-sided polygon, it's somewhere in between,
        with an advantage due to its compression, but still taking
        longer than expected.</p>
        <p>As for fuse-archive, it handles the null file well, but takes
        longer on the others; not much to say.</p>
        <div>
        <canvas id="seq_read_chart" class="chart">
        </canvas>
        </div>
        <h3 id="random-read">Random read</h3>
        <p>There's nothing much to say here; although DwarFS took
        significantly longer, it's still pretty fast - a different of
        about 14 milliseconds worst case, across a 25 GiB file; similar
        resuls for the 100 million-sided polygon, though to a less
        extent, given it can be compressed better. With the null file,
        due to its compression, DwarFS was actually on par with
        fuse-archive, but it can't compete with btrfs's performance,
        given it's so heavily optimized, and in the kernel.</p>
        <div>
        <canvas id="rand_read_chart" class="chart">
        </canvas>
        </div>
        <h3 id="sequential-read-latency">Sequential read latency</h3>
        <p>As expected, DwarFS performs a bit worse on the
        incompressible random data, but otherwise they'll all roughly
        equal. I wasn't expecting this, given btrfs is in the kernel,
        while the other two are using FUSE.</p>
        <div>
        <canvas id="seq_read_latency_chart" class="chart">
        </canvas>
        </div>
        <h3 id="random-read-latency">Random read latency</h3>
        <p>Both DwarFS and fuse-archive had some trouble with this test.
        DwarFS doesn't seem to handle random access very well; this is
        supposedly fixed, as seen in issue 139 (<em>Issue #139 ·
        mhx/dwarfs</em>), but the performance issues are obvious
        regardless; I'm not sure why, given it doesn't compress
        uncompressible data, not to mention it does just fine on the
        random read test, where the only difference is that it reads
        <em>more</em> data. But regardless, DwarFS ended up performing
        far worse than expected on both the incompressible random data,
        and the highly compressible null data.</p>
        <p>Meanwhile, when testing random read latency in
        <code>fuse-archive</code> pretty much just dies, becoming
        ridiculously slow (even compared to DwarFS), so I didn't include
        its single-file results. It succeeds on the bulk files, but
        given it just shows as 0 seconds anyways, given the massive
        scale, I opted to not include it in this graph at all.</p>
        <div>
        <canvas id="rand_read_latency_chart" class="chart">
        </canvas>
        </div>
        <h2 id="misc-notes">Misc notes</h2>
        <p>DwarFS can take up a fair amount of memory if mounting it
        many times (<em>Issue #219 · mhx/dwarfs</em>), and this should
        be kept in mind for use in BlendOS.</p>
        <hr />
        <p>Ratarmount (<em>mxmlnkn/ratarmount</em>) should also be
        investigated; it's similar to fuse-archive, but with some
        improvements, and some important notes. From its README
        file:</p>
        <blockquote>
        <p>Note that fuse-archive daemonizes instantly but the mount
        point will not be usable for a long time and everything trying
        to use it will hang until then when not using
        --asyncprogress</p>
        </blockquote>
        <blockquote>
        <p>Mounting bzip2 and xz archives has actually become faster
        than archivemount and fuse-archive with ratarmount -P 0 on most
        modern processors because it actually uses more than one core
        for decoding those compressions. indexed_bzip2 supports block
        parallel decoding since version 1.2.0.</p>
        </blockquote>
        <p>Despite being written in Python, Ratarmount seems to have
        significant performance improvements over fuse-archive.</p>
        <hr />
        <p>This should also be tested on systems with different specs,
        like my Chromebook and laptop, and should try getting the btrfs
        FUSE driver working and benchmarking that.</p>
        <h2 id="summary">Summary</h2>
        <p>DwarFS, or just the normal filesystem plus overlayfs, seem
        like they may be the best options - DwarFS's compression and
        deduplication are great, and the deduplication could probably be
        used in way I haven't even thought of yet, but it has some niche
        issues. Overall, I'm leaning towards using DwarFS as an option,
        with just overlayfs as the default, but further testing is
        needed.</p>
        <h2 id="footnotes">Footnotes</h2>
        <h2 id="sources">Sources</h2>
        <p> - “Confused_ace_noises/Maths-Demos - Branch:
        Headless-Deterministic.” Forgea: Git with a Cup of Jea,
        git.askiiart.net/confused_ace_noises/maths-demos/src/branch/headless-deterministic.<br />
         - “Disk Read Benchmark - A Simple and Performant Read-Only Disk
        Benchmark, Written in Rust.” Forgea: Git with a Cup of Jea,
        git.askiiart.net/askiiart/disk-read-benchmark.<br />
         - Google. “Google/Fuse-Archive: Fuse File System for Archives
        and Compressed Files (ZIP, RAR, 7z, ISO, TGZ, Xz...).” GitHub,
        github.com/google/fuse-archive.<br />
         - The Linux Kernel Archives, Linux Kernel Organization, Inc.,
        &lt;www.kernel.org/&gt;.<br />
         - Mhx. “Feature Request: Improve Block Management for
        Uncompressed Blocks to Save Memory and Enhance Deduplication ·
        ISSUE #139 · MHX/Dwarfs.” GitHub,
        github.com/mhx/dwarfs/issues/139.<br />
         - mhx. “mhx/Dwarfs: A Fast High Compression Read-Only File
        System for Linux, Windows and Macos.” GitHub,
        github.com/mhx/dwarfs.<br />
         - mhx. “[Feature Request] Mounting Multiple Archives to the
        Same Path · Issue #219 · MHX/Dwarfs.” GitHub,
        github.com/mhx/dwarfs/issues/219.<br />
         - mxmlnkn. “mxmlnkn/ratarmount: Access Large Archives as a
        Filesystem Efficiently, e.g., Tar, Rar, Zip, Gz, BZ2, XZ, ZSTD
        Archives.” GitHub, github.com/mxmlnkn/ratarmount.</p>
        <!-- JavaScript for graphs goes hereeeeeee -->
        <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
        <script src="/assets/benchmarking-dwarfs/js/declare_vars.js"></script>
        <script src="/assets/benchmarking-dwarfs/js/seq_read.js"></script>
        <script src="/assets/benchmarking-dwarfs/js/rand_read.js"></script>
        <script src="/assets/benchmarking-dwarfs/js/seq_latency.js"></script>
        <script src="/assets/benchmarking-dwarfs/js/rand_latency.js"></script>
        <section id="footnotes"
        class="footnotes footnotes-end-of-document" role="doc-endnotes">
        <hr />
        <ol>
        <li id="fn1"><p>My code can generate up to 25 GB/s. However, it
        does random writes to my drive, which is <em>much</em> slower.
        So on one hand, you could say my code is so amazingly fast that
        current day technologies simply can't keep up. Or you could say
        that I have no idea how to code for real world scenarios.<a
        href="#fnref1" class="footnote-back"
        role="doc-backlink">↩︎</a></p></li>
        <li id="fn2">This data is from a modified version of an
        abandoned math demonstration program
        (<em>confused_ace_noises/maths-demos</em>) made by a friend; it
        generates regular polygons and writes their data to a file. I
        chose this because it was an artificial and reproducible yet
        fairly compressible dataset (without being extremely
        compressible like null data).
        <details open>
        <summary>
        3-sided regular polygon data
        </summary>
        <br>
        <!-- I put it in here just as a `style`, it didn't work. I put it in as a div with that `style`, it didn't work. I put it in as a div of that class which has those properties in style.css, it works -->
        <!-- i hate webdev i hate webdev i hate webdev i hate webdev i hate webdev i hate webdev -->
        <div class="force-word-wrap">
        <pre><code>[Vertex { position: Pos([0.5, 0.0, 0.0]), color: Col([0.5310667, 0.7112941, 0.7138775]) }, Vertex { position: Pos([-0.25000003, 0.4330127, 0.0]), color: Col([0.7492257, 0.3142163, 0.49905664]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }, Vertex { position: Pos([-0.25000003, 0.4330127, 0.0]), color: Col([0.6389981, 0.5204368, 0.077735074]) }, Vertex { position: Pos([-0.24999996, -0.43301272, 0.0]), color: Col([0.8869035, 0.30709425, 0.8658899]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }, Vertex { position: Pos([-0.24999996, -0.43301272, 0.0]), color: Col([0.6236294, 0.03584433, 0.7590722]) }, Vertex { position: Pos([0.5, 8.742278e-8, 0.0]), color: Col([0.6105084, 0.3593351, 0.85544324]) }, Vertex { position: Pos([0.0, 0.0, 0.0]), color: Col([0.2046682, 0.25598457, 0.72071356]) }]</code></pre>
        </div>
        </details>
        <a href="#fnref2" class="footnote-back"
        role="doc-backlink">↩︎</a></li>
        </ol>
        </section>
        <iframe src="https://john.citrons.xyz/embed?ref=askiiart.net" style="margin-left:auto;display:block;margin-right:auto;max-width:732px;width:100%;height:94px;border:none;"></iframe>
        <script src="/prism.js"></script>
    </body>
    <footer>
        <p><a href="https://git.askiiart.net/askiiart/engl-2311-blog">Source code</a>&ensp;|&ensp;<a href="/feed.xml">RSS</a>&ensp;|&ensp;<a href="/glossary.html">Glossary</a>&ensp;|&ensp;<a href="/about.html">About</a></p>
        <small>Image captions are the same as the alt text; assuming you're sighted, you can most likely ignore them.</small>
    </footer>
</html>
No results found.