Track original wasm instructions location #71

yurydelendik · 2019-05-07T14:56:04Z

Work toward #67

when decoding, store all offsets in the ExprId -> offset map
when encoding, remember all ExprId -> offset
on finalize of writing, merge information to get source offset -> encoded offset transform

fitzgen

So is the idea here that we would use the resulting offsets map to modify the DWARF in place outside of walrus the same way a linker might?

A few comments/questions/suggestions in line below.

src/module/functions/local_function/mod.rs

src/module/mod.rs

yurydelendik · 2019-05-07T21:12:29Z

So is the idea here that we would use the resulting offsets map to modify the DWARF in place outside of walrus the same way a linker might?

The wasm DWARF will need the code_transform (probably even more in the future) to be properly transformed. There is https://github.com/yurydelendik/wtmaps-utils/tree/master/wdwarf-cp project -- I'm planing to extend it to accept this data and/or be make it as a library to be used with Module.generate_dwarf option.

yurydelendik · 2019-05-21T17:42:06Z

@fitzgen additional info (based on standup meeting), DWARF transform logic is currently located at https://github.com/yurydelendik/wtmaps-utils in the wdwarf library -- the provided by this PR data will be used for final DWARF sections generation.

fitzgen · 2019-05-22T18:37:28Z

So I whipped up a smoke test to check that inserting (drop (i32.const 0)) at the start of a function results in the expected code transformations.

That is, going from input.wasm to output.wasm:

$ wasm-objdump -d crates/tests/input.wasm crates/tests/output.wasm 

input.wasm:	file format wasm 0x1

Code Disassembly:

00002e func[0] <f>:
 00002f: 41 b9 0a                   | i32.const 1337
 000032: 0b                         | end

output.wasm:	file format wasm 0x1

Code Disassembly:

00002e func[0] <f>:
 00002f: 41 00                      | i32.const 0
 000031: 1a                         | drop
 000032: 41 b9 0a                   | i32.const 1337
 000035: 0b                         | end

The i32.const 1337 has moved from offset 0x2f to offset 0x32, however the code_transform contains [(0x32, 0x31)].

I rebased this branch on top of master and added a commit that contains this failing test:

@yurydelendik could you look into why the test's assertions about the code transforms are failing?

RReverser · 2019-10-08T15:26:16Z

Is this PR / feature planned to be revived?

Currently wasm-pack supports dwarf-debug-info = true in the config, but it doesn't appear very useful since all the locations are off, and can't be used for source mapping.

fitzgen · 2019-10-09T16:53:49Z

@RReverser yes this feature is somethign we definitely want, but this PR is quite bit rotted at this point.

rustwasm#71

yurydelendik · 2019-10-22T16:07:55Z

Rebased and changed logic to use InstrLocId instead of ExprId. InstrLocId is just a debug location token, though for wasm purposes (and by default) it can be used to store bytecode offset.

fitzgen

Looks great, thanks for including that smoke test, Yury.

So the idea is that InstrLocId is not necessarily a code section offset, and users can customize its representation with config.on_instr_pos? Why would someone want to use something else? What would we be losing by only supporting code section offsets?

yurydelendik · 2019-10-22T22:18:11Z

Why would someone want to use something else?

If tools can read/write debug location format, they can allow better integration.

What would we be losing by only supporting code section offsets?

At this point nothing. Right now it is module bytecode offset (I would prefer to extend bytecode to entire usize type (instead of u32), if we are planing to support only that). Or, do we need to switch InstrLocId to code section offsets format, which will require to offset bytecode position to code section start?

fitzgen · 2019-10-22T22:38:33Z

Gotcha.

Can you fix the fuzz targets? Then we can merge this 👍

https://travis-ci.org/rustwasm/walrus/jobs/601361918#L950

  --> fuzz_targets/raw.rs:11:22
   |
7  |     let module = match walrus::Module::from_buffer(data) {
   |         ------ help: consider changing this to be mutable: `mut module`
...
11 |     let serialized = module.emit_wasm();
   |                      ^^^^^^ cannot borrow as mutable
error[E0596]: cannot borrow `module` as mutable, as it is not declared as mutable
  --> fuzz_targets/raw.rs:14:24
   |
12 |     let module =
   |         ------ help: consider changing this to be mutable: `mut module`
13 |         walrus::Module::from_buffer(&serialized).expect("we should only emit valid Wasm data");
14 |     let reserialized = module.emit_wasm();
   |                        ^^^^^^ cannot borrow as mutable

Just the raw target is showing up, presumably because it is the first we try to run, but the others may need to be updated as well.

fitzgen · 2019-10-23T15:38:02Z

Thanks!

RReverser · 2019-10-23T17:01:40Z

Woot, this was fast. Thanks!

RReverser · 2019-10-23T17:08:01Z

@fitzgen Do you plan to make a wasm-bindgen release with these changes?

fitzgen · 2019-10-23T18:05:34Z

@RReverser we can publish a new walrus and use that in wasm-bindgen, but I suspect what you're really interested in is using this to fix up llvm's emitted DWARF, right? If so, then that requires writing a little bit more than what this PR provided.

That code could theoretically go in either walrus or wasm-bindgen. I guess I would prefer it live in walrus, since it isn't really bindings-specific, and any sort of wasm transform will need this functionality to preserve debug info.

We will need CustomSection implementations for each .debug_foo section that we wish to preserve (can probably start with just a DebugLines struct for .debug_lines) that use the new origin-tracking infra introduced in this PR. These should probably be behind a cargo feature and live in src/dwarf.rs. Then we would have a config option for enabling DWARF support, and when set we would create the DebugLines instance from the .debug_lines custom section if it exists, and add it to the module's custom section registry so that when we write the wasm back out it updates all code offsets. Once that is working, we would do the same for each of the other DWARF sections.

RReverser · 2019-10-23T18:51:00Z

but I suspect what you're really interested in is using this to fix up llvm's emitted DWARF, right?

I guess I'm confused, I thought that's what this PR is doing? If not:

We will need CustomSection implementations for each .debug_foo section ...

I had an idea that these operations could be done in gimli::write::Dwarf instead - it already accepts a callback that converts any addresses in constructor, and, while it's not the original goal, same callback could be used for returning an adjusted address in the new binary.

Then the rest of it is as easy as reading DWARF custom sections of Wasm into read::Dwarf and re-emitting them back immediately after construction of write::Dwarf with such callback.

yurydelendik · 2019-10-23T18:55:15Z

I had an idea that these operations could be done in gimli::write::Dwarf instead

@RReverser There is a project started at https://github.com/yurydelendik/wtmaps-utils that will demonstrate the scope of the work needs to accomplish the task -- it is not trivial.

fitzgen · 2019-10-23T19:01:51Z

Then the rest of it is as easy as reading DWARF custom sections of Wasm into read::Dwarf and re-emitting them back immediately after construction of write::Dwarf with such callback.

This is basically what those CustomSection implementations would be doing.

philipc · 2019-10-23T22:00:26Z

Looking at convert.rs in wdwarf, a lot of it is copied from gimli. Is there a way we can adapt gimli's convert to do what you need? I had originally thought that transformations would work by calling gimli's convert, and then mutating the resulting write::Dwarf, but maybe that is not feasible? I would like to avoid the duplication of logic somehow though.

yurydelendik · 2019-10-23T22:11:26Z

Is there a way we can adapt gimli's convert to do what you need?

@philipc yes, that's an ideal situation. This project is created to identify what needs to be done for this use case (on gimli or user side). I'm planning to resume a work on it soon.

Let's create an issue at wdwarf or gimli repo to continue this discussion. I'll just mention that this is a problem not just translation of address, but changed data to still has be a valid DWARF:

keep .debug_line data in order
keep .debug_info ranges valid
(in the future) modify DWARF expressions if wasm locals/globals were changed

RReverser · 2019-10-24T12:17:20Z

I'll just mention that this is a problem not just translation of address, but changed data to still has be a valid DWARF:
keep .debug_line data in order
keep .debug_info ranges valid

AFAIK the things listed here are already taken care of by convert_address callback in write::Dwarf, hence the suggestion above to reuse it.

RReverser · 2019-10-29T02:12:40Z

@fitzgen I've looked into what you meant by implementing CustomSection + apply_code_transform btw. I think I see how it could work, but this API doesn't feel ideal, as it allows to only process one section at a time, whereas DWARF info is emitted across a bunch of different section.

If walrus could just produce a map of original -> final locations, then it seems much easier to reuse read::Dwarf + write::Dwarf that can read and update all the DWARF info at once (and extend support to more sections over time), rather than handling each section manually in wasm-bindgen itself.

I've tried to experiment with this in a wasm-bindgen branch, but looks like there are some incompatibilities with new walrus, particularly in anyhow::Error vs failure::Error used in two projects.

fitzgen · 2019-10-29T16:31:26Z

I've tried to experiment with this in a wasm-bindgen branch

As I mentioned above, I think walrus is the place this should live, not wasm-bindgen.

If walrus could just produce a map of original -> final locations, then it seems much easier to reuse read::Dwarf + write::Dwarf that can read and update all the DWARF info at once (and extend support to more sections over time), rather than handling each section manually in wasm-bindgen itself.

We could do that, but I also don't think that doing each section one at a time is bad, because then we can easily do each section in parallel.

RReverser · 2019-10-29T20:55:19Z

because then we can easily do each section in parallel

It feels like something that should be done inside gimli, as it 1) knows more about sections and can parallelise work better and 2) could use such parallelisation on other platforms too.

fitzgen reviewed May 7, 2019

View reviewed changes

src/module/functions/local_function/mod.rs Outdated Show resolved Hide resolved

src/module/functions/local_function/mod.rs Outdated Show resolved Hide resolved

src/module/functions/local_function/mod.rs Outdated Show resolved Hide resolved

src/module/mod.rs Outdated Show resolved Hide resolved

yurydelendik force-pushed the debuginfo branch from f681aea to 3d0474e Compare May 7, 2019 21:06

yurydelendik marked this pull request as ready for review May 21, 2019 17:30

yurydelendik force-pushed the debuginfo branch from 3d0474e to ee00067 Compare May 21, 2019 17:35

yurydelendik force-pushed the debuginfo branch from 580cc47 to a771b18 Compare May 22, 2019 22:34

yurydelendik added a commit to yurydelendik/walrus that referenced this pull request Oct 21, 2019

Track wasm location (redo)

532bef0

rustwasm#71

yurydelendik force-pushed the debuginfo branch 2 times, most recently from ac129c4 to 61fe12e Compare October 22, 2019 16:06

yurydelendik requested a review from fitzgen October 22, 2019 16:08

fitzgen reviewed Oct 22, 2019

View reviewed changes

Track original wasm instructions location

9d77db2

yurydelendik force-pushed the debuginfo branch from 61fe12e to 9d77db2 Compare October 22, 2019 23:09

fitzgen merged commit 7d2007d into rustwasm:master Oct 23, 2019

yurydelendik mentioned this pull request Oct 29, 2019

Can gimli's convert be adapted to do most of the work yurydelendik/wtmaps-utils#1

Open

RReverser mentioned this pull request Jun 20, 2021

Allow remapping all of line program when rewriting gimli-rs/gimli#563

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Track original wasm instructions location #71

Track original wasm instructions location #71

yurydelendik commented May 7, 2019

fitzgen left a comment •

edited

Loading

yurydelendik commented May 7, 2019

yurydelendik commented May 21, 2019

fitzgen commented May 22, 2019

RReverser commented Oct 8, 2019

fitzgen commented Oct 9, 2019

yurydelendik commented Oct 22, 2019

fitzgen left a comment

yurydelendik commented Oct 22, 2019

fitzgen commented Oct 22, 2019

fitzgen commented Oct 23, 2019

RReverser commented Oct 23, 2019

RReverser commented Oct 23, 2019

fitzgen commented Oct 23, 2019

RReverser commented Oct 23, 2019

yurydelendik commented Oct 23, 2019 •

edited

Loading

fitzgen commented Oct 23, 2019

philipc commented Oct 23, 2019

yurydelendik commented Oct 23, 2019

RReverser commented Oct 24, 2019

RReverser commented Oct 29, 2019 •

edited

Loading

fitzgen commented Oct 29, 2019

RReverser commented Oct 29, 2019

Track original wasm instructions location #71

Track original wasm instructions location #71

Conversation

yurydelendik commented May 7, 2019

fitzgen left a comment • edited Loading

Choose a reason for hiding this comment

yurydelendik commented May 7, 2019

yurydelendik commented May 21, 2019

fitzgen commented May 22, 2019

RReverser commented Oct 8, 2019

fitzgen commented Oct 9, 2019

yurydelendik commented Oct 22, 2019

fitzgen left a comment

Choose a reason for hiding this comment

yurydelendik commented Oct 22, 2019

fitzgen commented Oct 22, 2019

fitzgen commented Oct 23, 2019

RReverser commented Oct 23, 2019

RReverser commented Oct 23, 2019

fitzgen commented Oct 23, 2019

RReverser commented Oct 23, 2019

yurydelendik commented Oct 23, 2019 • edited Loading

fitzgen commented Oct 23, 2019

philipc commented Oct 23, 2019

yurydelendik commented Oct 23, 2019

RReverser commented Oct 24, 2019

RReverser commented Oct 29, 2019 • edited Loading

fitzgen commented Oct 29, 2019

RReverser commented Oct 29, 2019

fitzgen left a comment •

edited

Loading

yurydelendik commented Oct 23, 2019 •

edited

Loading

RReverser commented Oct 29, 2019 •

edited

Loading