Comparison with a Rust WASI binary #2135

Open
hassan-shahbazi opened this issue Oct 30, 2020 · 7 comments
@hassan-shahbazi

hassan-shahbazi commented Oct 30, 2020

Compared to a Rust-generated binary, SwiftWasm produces binaries that are slower and larger. The results can be reproduced here: https://github.com/hassan-shahbazi/swiftwasm-go/tree/benchmark

Performance

With the same imported and exported functions and the same _start entry point, the Swift binary runs about 86x slower than the Rust one: 18.350 seconds versus 0.214 seconds.

$ # RUST - generate binary.wasm from a rust code
$ cd rust && rustc binary.rs --target wasm32-wasi && cd ../
$ go test ./... -v -race -count 1 -run _Rust  
=== RUN   TestStartBinary_Rust
Hello, world!
--- PASS: TestStartBinary_Rust (0.07s)
=== RUN   TestExportedFunction_Rust
--- PASS: TestExportedFunction_Rust (0.06s)
=== RUN   TestImportedFunction_Rust
--- PASS: TestImportedFunction_Rust (0.06s)
PASS
ok      github.com/hassan-shahbazi/swiftwasi/src        0.214s

$ # SWIFT - generate binary.wasm from the swift package
$ TOOLCHAIN_PATH=$(cd $(dirname "$(swiftenv which swiftc)") && cd ../share && pwd)
$ cd swift && swift build --triple wasm32-unknown-wasi -c release --toolchain $TOOLCHAIN_PATH -Xlinker --export=fetch -Xlinker --export=sum -Xlinker --allow-undefined
$ go test ./... -v -race -count 1 -run _Swift
=== RUN   TestStartBinary_Swift
Hello World!
--- PASS: TestStartBinary_Swift (6.42s)
=== RUN   TestExportedFunction_Swift
--- PASS: TestExportedFunction_Swift (6.28s)
=== RUN   TestImportedFunction_Swift
--- PASS: TestImportedFunction_Swift (5.52s)
PASS
ok      github.com/hassan-shahbazi/swiftwasi/src        18.350s

Size

In addition to the performance gap, the Swift-generated binary is almost 6x larger than the Rust-generated one (9.8 MB versus 1.7 MB).

$ ls -lh rust | grep wasm
-rwxrwxr-x 1 hassan hassan 1.7M Oct 30 20:33 binary.wasm
$ ls -lh swiftwasm | grep wasm 
-rwxrwxr-x 1 hassan hassan 9.8M Oct 26 17:08 binary.wasm
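
As a quick sanity check on the figures above, here is a minimal Go sketch that recomputes both ratios. The `ratio` helper is only illustrative and is not part of the benchmark repository; the numbers are copied from the logs in this comment.

```go
package main

import "fmt"

// ratio reports how many times larger a is than b.
func ratio(a, b float64) float64 {
	return a / b
}

func main() {
	// Figures copied from the test logs and ls output above.
	fmt.Printf("time: %.1fx slower\n", ratio(18.350, 0.214)) // prints "time: 85.7x slower"
	fmt.Printf("size: %.1fx larger\n", ratio(9.8, 1.7))      // prints "size: 5.8x larger"
}
```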

@MaxDesiatov

Thanks for raising this issue. Unfortunately, the Swift compiler doesn't run all of the required size optimizations yet, but you can run them manually with wasm-strip from WABT and wasm-opt -Os from Binaryen; this should reduce the size of the binaries somewhat. I hope it will also reduce the amount of time Wasmer spends on AOT compilation. I also recommend following issue #7 for more details on binary size.

I need to have a closer look at the benchmark to understand what other optimizations are missing to make performance comparable with the Rust version.
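
The two post-processing steps suggested above could be scripted from Go roughly as follows. This is a sketch under assumptions: the file names are placeholders, both tools are assumed to be on PATH, and `main` only prints the commands rather than running them. The helper names `stripCmd` and `optCmd` are hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// stripCmd builds the wasm-strip invocation (WABT); by default the tool
// rewrites the given file in place.
func stripCmd(path string) *exec.Cmd {
	return exec.Command("wasm-strip", path)
}

// optCmd builds the wasm-opt invocation (Binaryen) that optimizes for size
// and writes the result to a separate output file.
func optCmd(in, out string) *exec.Cmd {
	return exec.Command("wasm-opt", "-Os", in, "-o", out)
}

func main() {
	// Placeholder file names; call Run() on each command to execute it.
	cmds := []*exec.Cmd{
		stripCmd("binary.wasm"),
		optCmd("binary.wasm", "binary.opt.wasm"),
	}
	for _, c := range cmds {
		fmt.Println(strings.Join(c.Args, " "))
	}
}
```

Writing the optimized module to a separate file makes it easy to compare sizes and timings before and after.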

@hassan-shahbazi
Author

Thanks for your help, @MaxDesiatov. Although wasm-strip reduced the size (to 8.1 MB), it made the performance even worse (21.44 seconds). However, wasm-opt was a good suggestion: it cut the size nearly in half (4.4 MB) and doubled the performance (9.155 seconds). Still, neither result is anywhere close to Rust.

@MaxDesiatov

@hassan-shahbazi I've reviewed the benchmark and it looks like Wasmer's Compile function is repeatedly invoked on every test iteration. In my opinion, this benchmarks Wasmer's compilation speed first, not the actual speed of execution of code produced by SwiftWasm.

I'd recommend that you pre-compile the .wasm binaries with Wasmer before running the tests, and then test the actual execution speed. We know that Wasmer is slow to compile larger binaries, but once those binaries are compiled, I think their performance should be comparable to that of Rust's binaries.

Until Wasmer improves their compilation speed, or until #7 is resolved on our side, I recommend that you use wasm-strip and wasm-opt in the meantime.
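
To illustrate the compile-once idea, here is a minimal Go sketch using `sync.Once`. The `module` type and `compile` function are hypothetical stand-ins for the runtime's real compile step, not Wasmer's actual API; only the caching pattern is the point.

```go
package main

import (
	"fmt"
	"sync"
)

// module is a hypothetical stand-in for a compiled WebAssembly module.
type module struct{ size int }

var (
	compileOnce  sync.Once
	cached       *module
	compileCalls int // counts how often the expensive step actually ran
)

// compile is a placeholder for the runtime's real (slow) compile call.
func compile(wasm []byte) *module {
	compileCalls++
	return &module{size: len(wasm)}
}

// sharedModule compiles the binary on first use and returns the cached
// result on every later call, so each test pays only for execution.
func sharedModule(wasm []byte) *module {
	compileOnce.Do(func() { cached = compile(wasm) })
	return cached
}

func main() {
	wasm := []byte{0x00, 0x61, 0x73, 0x6d} // "\0asm" magic bytes only, not a valid module
	for i := 0; i < 3; i++ {
		_ = sharedModule(wasm)
	}
	fmt.Println("compile calls:", compileCalls) // prints "compile calls: 1"
}
```

In a real test suite, each test function would call something like `sharedModule` instead of compiling the binary itself.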

@alejandroq

Is some of the size inherent to Swift packaging its stdlib and ARC? Given Rust's fine-tuned memory control via ownership, Swift will likely carry more overhead in comparison, though expressing one's application in Swift has its benefits over Rust for a large pool of developers. @MaxDesiatov, with appropriate Swift compiler optimizations, do you have any notion of the theoretical smallest size for a Swift hello world function compiled to WASM? I believe WASM-specific experiments like AssemblyScript produce the smallest sizes, and Rust, due to a number of factors, also produces relatively small .wasm files.

@MaxDesiatov

MaxDesiatov commented Nov 29, 2020

I don't think ARC has any relation to the size of the binary. With the latest development snapshots I think we can DCE (dead-code eliminate) unused parts of stdlib, so it's up to users whether they rely on stdlib and bear the cost. If they don't use it, we'll strip it out. The biggest overhead in binary size currently comes from ICU and WASI. As far as I understand, we either need to find a way to replace ICU with something smaller, or eliminate it completely. As for WASI, I think we only rely on its allocators, so we could potentially drop WASI as soon as we have custom allocators.

These are the biggest parts of the overhead that I currently know of. As soon as they're eliminated, I don't see why a simple "Hello world" binary written in Swift should be bigger than one produced by Rust or AssemblyScript. As for more complex cases, it again depends on what code developers bring in. If binary size is important to a developer, they should do size profiling regardless of what language they use.

@kateinoigakukun
Member

To be precise, the size of stdlib is not optimized yet because the current toolchain doesn't provide prebuilt sib and swiftmodulesummary files. I have to update the build script to distribute them within our toolchain and implement a file search mechanism for those supplementary files.

@hassan-shahbazi
Author

hassan-shahbazi commented Nov 30, 2020

> @hassan-shahbazi I've reviewed the benchmark and it looks like Wasmer's Compile function is repeatedly invoked on every test iteration. In my opinion, this benchmarks Wasmer's compilation speed first, not the actual speed of execution of code produced by SwiftWasm.
>
> I'd recommend that you pre-compile the .wasm binaries with Wasmer before running the tests, and then test the actual execution speed. We know that Wasmer is slow to compile larger binaries, but once those binaries are compiled, I think their performance should be comparable to that of Rust's binaries.
>
> Until Wasmer improves their compilation speed, or until #7 is resolved on our side, I recommend that you use wasm-strip and wasm-opt in the meantime.

Thanks for the comment, @MaxDesiatov. After updating to wasm-5.3.1-RELEASE and refactoring the code so the binary is compiled only once, I get much better performance, comparable to Rust.

Update

The verbose log suggests the problem lies in Wasmer's compile performance rather than in executing the binary: every test now reports 0.00s, yet the Swift run still takes 3.401s in total versus 0.118s for Rust.

$ binary=rust go test ./... -v -race -count 1 -run _Rust
=== RUN   TestStartBinary_Rust
Hello, world!
--- PASS: TestStartBinary_Rust (0.00s)
=== RUN   TestExportedFunction_Rust
--- PASS: TestExportedFunction_Rust (0.00s)
=== RUN   TestImportedFunction_Rust
--- PASS: TestImportedFunction_Rust (0.00s)
PASS
ok  	github.com/hassan-shahbazi/swiftwasi/src	0.118s

$ binary=swift go test ./... -v -race -count 1 -run _Swift
=== RUN   TestStartBinary_Swift
Hello World!
--- PASS: TestStartBinary_Swift (0.00s)
=== RUN   TestExportedFunction_Swift
--- PASS: TestExportedFunction_Swift (0.00s)
=== RUN   TestImportedFunction_Swift
--- PASS: TestImportedFunction_Swift (0.00s)
PASS
ok  	github.com/hassan-shahbazi/swiftwasi/src	3.401s
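
One way to confirm where the remaining time goes is to time the compile step and the execution step separately. The following Go sketch shows the idea; `simulatedCompile` and `simulatedRun` are hypothetical stand-ins that merely sleep, and the delays are arbitrary values chosen for illustration, not measurements.

```go
package main

import (
	"fmt"
	"time"
)

// Arbitrary delays used only to simulate a slow compile and a fast call.
const (
	compileDelay = 30 * time.Millisecond
	runDelay     = 1 * time.Millisecond
)

// simulatedCompile stands in for the runtime's one-time compile step.
func simulatedCompile() { time.Sleep(compileDelay) }

// simulatedRun stands in for calling an exported function.
func simulatedRun() { time.Sleep(runDelay) }

// timed runs fn and returns how long it took.
func timed(fn func()) time.Duration {
	start := time.Now()
	fn()
	return time.Since(start)
}

func main() {
	fmt.Println("compile:", timed(simulatedCompile))
	fmt.Println("execute:", timed(simulatedRun))
}
```

If compilation happens in shared setup code outside the individual tests, its cost shows up in the package total rather than in the per-test timings, which would match the logs above.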
