walrus roundtrip inflates size of WebAssembly binaries #142
Comments
A curious bug! I don't think your input and output are for the same file. Using the input file from the gist you linked above, I get: input
output
The differences here are small differences in the elem and code sections. The elem section seems entirely accounted for because walrus will renumber function indices. The code section, I presume, is similar. Walrus sorts functions biggest first to give engines with streaming translation the chance to start on the big functions first, and it looks like that happens to cause the size to increase here, presumably because small functions are referenced far more often than large functions, so the most frequently referenced functions end up with larger indices that take more bytes to encode. Walrus isn't really a size-optimizing compiler. It's possible that we could add an entirely new emission mode to optimize for this though.
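To make the index-renumbering effect concrete, here is a minimal sketch of how function order changes the bytes spent on function indices. It relies only on the fact that Wasm encodes indices as unsigned LEB128, so indices 0 to 127 take one byte and indices 128 to 16383 take two. The helper names `uleb128_len` and `index_bytes` and the reference counts are hypothetical illustrations, not walrus code or measurements from this issue.

```rust
/// Number of bytes needed to encode `value` as unsigned LEB128,
/// the variable-length integer format Wasm uses for function indices.
fn uleb128_len(mut value: u32) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Total bytes spent on function-index immediates.
/// `refs_by_index[i]` is the (hypothetical) number of times the function
/// that was assigned index `i` is referenced by `call` instructions or
/// elem entries.
fn index_bytes(refs_by_index: &[u64]) -> u64 {
    refs_by_index
        .iter()
        .enumerate()
        .map(|(index, &refs)| refs * uleb128_len(index as u32) as u64)
        .sum()
}
```

As an illustration with made-up numbers: take 100 small functions referenced 50 times each and 100 large functions referenced once each. With the small functions first, `index_bytes` gives 5172 bytes; with the large functions first, most of the hot small functions land at indices 128 and above and the total rises to 8700 bytes.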
Sorry, I'm not sure what you meant here. Your output seems to match mine at first glance? (Except that I accidentally omitted the last line of the wasm-objdump output for the input when copy-pasting.)
Yeah, could be, although I'm surprised it makes such a big difference. I think size is often important for Wasm users, so adding such a mode would be beneficial, but if not, maybe it's worth documenting as an explicit non-goal in the docs or README?
I also wonder if it would be possible to have this reordering as an optional pass, so that by default functions would appear in their original order and the roundtrip would be preserved more closely. This would basically delegate the order of most functions to the original Wasm compiler, which might choose to optimise for speed or perf as well.
That would require keeping track of the original order, which we don't currently do.
@fitzgen As far as I understand from the code, functions should already be added to the arena in their original order during parsing, and that order is preserved throughout most basic transformations. The actual reordering seems to be happening just here - walrus/src/module/functions/mod.rs Line 444 in 35bbfb0
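The optional-reordering idea proposed above could look roughly like the following sketch. Everything in it is hypothetical: `FuncEntry`, `EmitOrder`, and `emission_order` are illustrative names, not walrus types or options, and the sketch assumes the input vector is already in original parse order.

```rust
/// Hypothetical stand-in for a function awaiting emission (not a walrus type).
#[derive(Debug, Clone, PartialEq)]
struct FuncEntry {
    name: &'static str,
    body_size: usize,
}

/// Hypothetical emission option; walrus may not expose anything like this.
#[derive(Debug, Clone, Copy)]
enum EmitOrder {
    /// Keep the order in which functions were parsed.
    Original,
    /// Put the biggest bodies first so streaming engines can start on them early.
    LargestFirst,
}

/// Returns the functions in the order they would be emitted.
/// Assumes `funcs` arrives in original parse order.
fn emission_order(mut funcs: Vec<FuncEntry>, order: EmitOrder) -> Vec<FuncEntry> {
    match order {
        EmitOrder::Original => {}
        // `sort_by` is stable, so equally sized functions keep their parse order.
        EmitOrder::LargestFirst => funcs.sort_by(|a, b| b.body_size.cmp(&a.body_size)),
    }
    funcs
}
```

With `EmitOrder::Original` as the default, a plain roundtrip would leave function order to whatever compiler produced the input, and size-first sorting would become an explicit opt-in.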
Yes, the difference is that the input binary you listed above doesn't look like it has a data section, but the data section is there (and is the same size in both). I don't really think this warrants any notes in the README or anything like that, it's not like we're not size-averse or we specifically don't optimize for size in walrus. This is a micro-optimization in one use case where re-sorting functions causes bloat. If you want to 100% optimize for size that's not really what
Oh yeah, sorry, as I said, it was just a copy-paste mistake.
To me, it's not so much about intentionally optimising for size as about the roundtrip producing different results. Other tools usually either preserve the input by default or, if they also use a different internal IR (such as Binaryen), clearly document that a roundtrip can produce very different Wasm than the input. I think in this case, as a minimum, it would make sense to follow what Binaryen does and just document this constraint. On the other hand, the fix to keep the original order (and keep reordering as a separate option) seems so simple that it seems worth just implementing it instead.
I experimented with this a bit, and, while disabling function reordering brings the size a bit closer to the original, it seems that the main overhead remains and is caused by locals reordering instead.
I think it's reasonable to document yeah that
Describe the Bug
Parsing and re-emitting WebAssembly, even without modifications, often produces a larger binary than the original.
Steps to Reproduce
Expected Behavior
Output is the same size as (or maybe even smaller than) the original.
Actual Behavior
The input file is 286727 bytes long and the output is 290568 bytes, making it 3841 bytes (about 3.75 KiB) larger.
Additional Context
wasm-objdump -h dump for input:
and output:
As you can see, most sections remained unchanged, but Elem and Code became larger.
Compressed sizes differ as well (though less so), e.g. with Brotli the input is 103410 bytes and the output is 103887 bytes.
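For reference, the growth figures quoted above can be checked with a small sketch. The helper name `growth` is hypothetical; the byte counts are the ones reported in this issue.

```rust
/// Returns (absolute growth in bytes, relative growth in percent)
/// when a file goes from `input` bytes to `output` bytes.
fn growth(input: u64, output: u64) -> (i64, f64) {
    let delta = output as i64 - input as i64;
    (delta, delta as f64 * 100.0 / input as f64)
}
```

Calling `growth(286727, 290568)` gives a delta of 3841 bytes, roughly 1.34% of the raw input, while `growth(103410, 103887)` gives 477 bytes, roughly 0.46% after Brotli compression.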