disasm: Consider using evmole as an optional backend? #80

Open
shazow opened this issue Jan 17, 2024 · 16 comments
Labels
maybe Not sure if this is a good idea yet

Comments

@shazow
Owner

shazow commented Jan 17, 2024

evmole is doing great work at extracting selectors and guessing arguments.

  • The implementation is fairly small and efficient, so it wouldn't add a lot of overhead to whatsabi. (Much smaller impact than sevm; see #31, "disasm: Allow different static analysis backends?")
  • It has a minimal EVM implementation focused specifically on extracting the data we care about, which was out of scope for whatsabi's approach (though I'm increasingly convinced it's not a bad idea).
  • It could start as an optional peer dependency. If it works well, we can make it the main backend over time, which would reduce the maintenance burden on whatsabi and let us focus more on other developer UX (proxies, loaders, etc).
  • I'm not sure we can entirely abandon whatsabi's static analysis, since we still need parts of it for proxy detection and other analysis I want to do in the future. Maybe some upstream changes would be possible; not sure yet.

cc @cdump -- do you have any thoughts/feelings about this?
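
For illustration, the "optional peer dependency" idea could look roughly like the sketch below. Everything here is an assumption except the `functionSelectors` export of evmole, which is mentioned later in this thread: `loadOptional`, `makeSelectorBackend`, and `builtinSelectors` are hypothetical names, the fallback is a naive PUSH4 scan rather than whatsabi's real analysis, and it assumes `functionSelectors` accepts hex-encoded bytecode.

```javascript
// Hypothetical fallback: collect the 4 bytes after every PUSH4 (0x63).
// This is a naive stand-in, not whatsabi's actual static analysis; it
// will also pick up non-selector constants such as masks.
function builtinSelectors(codeHex) {
  const code = Buffer.from(codeHex.replace(/^0x/, ''), 'hex');
  const found = new Set();
  for (let pc = 0; pc < code.length; pc++) {
    const op = code[pc];
    if (op === 0x63 && pc + 4 < code.length) {
      found.add(code.subarray(pc + 1, pc + 5).toString('hex'));
    }
    // Skip the immediate bytes of PUSH1..PUSH32 so data is not read as opcodes.
    if (op >= 0x60 && op <= 0x7f) pc += op - 0x5f;
  }
  return [...found];
}

// Try to load an optional dependency; resolve to null if it is not installed.
async function loadOptional(name) {
  try {
    return await import(name);
  } catch {
    return null;
  }
}

// Use evmole when the module is available, otherwise the built-in fallback.
// Assumption: evmole.functionSelectors takes hex-encoded bytecode.
function makeSelectorBackend(evmole) {
  if (evmole && typeof evmole.functionSelectors === 'function') {
    return { name: 'evmole', selectors: (codeHex) => evmole.functionSelectors(codeHex) };
  }
  return { name: 'builtin', selectors: builtinSelectors };
}
```

A caller would do something like `makeSelectorBackend(await loadOptional('evmole'))`, so users who never install evmole keep the existing behavior.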

shazow added the "maybe" label (Not sure if this is a good idea yet) on Jan 17, 2024
@cdump

cdump commented Jan 18, 2024

I like this idea

@shazow
Owner Author

shazow commented Feb 7, 2024

@cdump Any chance you'd be willing to expose some of the evm implementation in the js library? I'd like to hook into a few opcodes. For example, I need to keep track of SLOAD and DELEGATECALL for things like proxy detection, LOG* for events, etc.
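
To make the opcodes of interest concrete, here is a minimal sketch of a linear sweep that reports where SLOAD, DELEGATECALL, and LOG0..LOG4 appear in bytecode. This is not the hook mechanism being requested and not evmole's or whatsabi's code; `findOpcodesOfInterest` is a hypothetical helper. The opcode values are the standard EVM ones.

```javascript
// Standard EVM opcode values.
const SLOAD = 0x54;
const DELEGATECALL = 0xf4;
const LOG0 = 0xa0;
const LOG4 = 0xa4;
const PUSH1 = 0x60;
const PUSH32 = 0x7f;

// Hypothetical helper: walk the bytecode once and record matching opcodes.
function findOpcodesOfInterest(codeHex) {
  const code = Buffer.from(codeHex, 'hex');
  const hits = [];
  for (let pc = 0; pc < code.length; pc++) {
    const op = code[pc];
    if (op === SLOAD) {
      hits.push({ pc, name: 'SLOAD' });
    } else if (op === DELEGATECALL) {
      hits.push({ pc, name: 'DELEGATECALL' });
    } else if (op >= LOG0 && op <= LOG4) {
      hits.push({ pc, name: `LOG${op - LOG0}` });
    } else if (op >= PUSH1 && op <= PUSH32) {
      // PUSHn carries n immediate bytes; skip them so data is not misread.
      pc += op - PUSH1 + 1;
    }
  }
  return hits;
}

// PUSH1 0x54 (data, skipped), SLOAD, DELEGATECALL, LOG1
console.log(findOpcodesOfInterest('605454f4a1'));
```

A sweep like this only sees which opcodes exist, not which storage slot or call target is on the stack when they run, which is why hooking into an executing VM is the more useful option here.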

@cdump

cdump commented Feb 9, 2024

@shazow I was thinking about this: making the evmole EVM extensible. I will probably add that in about 1 month.

@shazow
Owner Author

shazow commented Feb 9, 2024

@cdump Sounds good. Could also make it a separate package, I'd probably use it without the rest of the stuff. I like the idea of a js EVM implementation that is focused on efficiency and gas-bounded analysis. Things like ethereumjs have far too many other things to be viable for this, and I'm really not excited about maintaining yet another separate EVM implementation with the various upgrades coming up, but would definitely contribute to one. :)

@cdump

cdump commented Feb 12, 2024

> separate package

Maybe later. I think it will be easier to start by just exporting the evm module from EVMole for now.

What do you think about the following approach?

```javascript
import { Vm } from 'evmole/evm/vm';
import { Element } from 'evmole/evm/element';
import { Op } from 'evmole/evm/opcodes';

class MyVm extends Vm {
  constructor(code, calldata) {
    super(code, calldata);
  }

  // Intercept JUMPDEST; every other opcode falls through to the default handler.
  exec_opcode(op) {
    if (op === Op.JUMPDEST) {
      console.log('hooked JUMPDEST');
      return [1];
    }
    return super.exec_opcode(op);
  }
}

// Runtime bytecode of a sample contract, hex-encoded.
const code = '608060405260043610610033575f3560e01c8063b69ef8a814610037578063d0e30db01461005d578063dd5d521114610067575b5f80fd5b348015610042575f80fd5b5061004b5f5481565b60405190815260200160405180910390f35b610065610086565b005b348015610072575f80fd5b506100656100813660046100bb565b61009d565b345f8082825461009691906100e5565b9091555050565b8063ffffffff165f808282546100b391906100e5565b909155505050565b5f602082840312156100cb575f80fd5b813563ffffffff811681146100de575f80fd5b9392505050565b8082018082111561010457634e487b7160e01b5f52601160045260245ffd5b9291505056';

const code_arr = Buffer.from(code, 'hex');
// Calldata is just the 4-byte selector b69ef8a8.
const cd_arr = Buffer.from('b69ef8a8', 'hex');

const vm = new MyVm(code_arr, new Element(cd_arr, 'calldata'));

// Step one opcode at a time until the VM stops, printing the gas of each step.
while (!vm.stopped) {
  const ret = vm.step();
  console.log('step gas:', ret[1]);
}
```

  1. Export src/evm/*.js as evm/.
  2. Make #exec_opcode public instead of private.
  3. Use class inheritance: it's possible to override vm methods and still call the original methods (super.method()).

```
$ node a.mjs

step gas: 3
step gas: 3
step gas: 3
step gas: 3
step gas: 2
step gas: 3
step gas: 3
step gas: 10
step gas: 2
step gas: 3
step gas: 3
step gas: 3
step gas: 3
step gas: 3
step gas: 3
step gas: 3
step gas: 10
hooked JUMPDEST
step gas: 1
step gas: 2
step gas: 3
step gas: 3
step gas: 3
step gas: 10
hooked JUMPDEST
step gas: 1
...
```
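
The same inheritance pattern can be tried without evmole installed. The sketch below is self-contained and uses a hypothetical stand-in, `TinyVm`, that models only four opcodes with illustrative gas costs; it is not evmole's `Vm`, and the `stack` field, `TracingVm`, and `runBounded` are assumptions made for the example. It shows a hook that reads the stack before SLOAD runs, plus a gas-bounded step loop.

```javascript
// Hypothetical stand-in for the proposed Vm; models only a few opcodes.
class TinyVm {
  constructor(code) {
    this.code = code;
    this.pc = 0;
    this.stack = [];
    this.stopped = false;
  }

  // Returns [gasUsed]. Gas costs here are illustrative only.
  exec_opcode(op) {
    switch (op) {
      case 0x00: // STOP
        this.stopped = true;
        return [0];
      case 0x5f: // PUSH0
        this.stack.push(0n);
        this.pc += 1;
        return [2];
      case 0x60: // PUSH1
        this.stack.push(BigInt(this.code[this.pc + 1]));
        this.pc += 2;
        return [3];
      case 0x54: // SLOAD: storage is unknown here, so push 0
        this.stack.pop();
        this.stack.push(0n);
        this.pc += 1;
        return [100];
      default:
        throw new Error(`unsupported opcode 0x${op.toString(16)}`);
    }
  }

  // Returns [op, gasUsed].
  step() {
    const op = this.code[this.pc];
    if (op === undefined) {
      this.stopped = true; // ran off the end of the code
      return [op, 0];
    }
    return [op, ...this.exec_opcode(op)];
  }
}

// Hook: record the storage slot on top of the stack before each SLOAD.
class TracingVm extends TinyVm {
  constructor(code) {
    super(code);
    this.sloads = [];
  }

  exec_opcode(op) {
    if (op === 0x54) {
      this.sloads.push(this.stack[this.stack.length - 1]);
    }
    return super.exec_opcode(op);
  }
}

// Step until the VM stops or the gas budget is exceeded.
function runBounded(vm, gasLimit) {
  let gasUsed = 0;
  while (!vm.stopped) {
    const [, gas] = vm.step();
    gasUsed += gas;
    if (gasUsed > gasLimit) break;
  }
  return gasUsed;
}

// PUSH1 0x07, SLOAD, STOP
const tracer = new TracingVm(Buffer.from('60075400', 'hex'));
console.log(runBounded(tracer, 1000), tracer.sloads);
```

Because the hook calls `super.exec_opcode(op)` after recording, the base behavior is unchanged and the subclass only observes state.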

@shazow
Owner Author

shazow commented Feb 12, 2024

I'll need to prototype some code to be sure, but this should be a good starting point, as long as I can access the stack and memory inside the hook too (I believe it should work as-is).

Would it be possible to also get access to something like src/selectors.js:process and src/arguments.js:functionArguments?

@cdump

cdump commented Feb 12, 2024

> can access the stack and memory inside the hook too

Yes, you can.

> Would it be possible to also get access to something like src/selectors.js:process and src/arguments.js:functionArguments?

These functions are already exported, and you can import them using import {functionArguments, functionSelectors} from 'evmole' (example).

@shazow
Owner Author

shazow commented Feb 12, 2024

> These functions are already exported, and you can import them using import {functionArguments, functionSelectors} from 'evmole' (example)

I mean that if we're getting access to the vm implementation, it would be useful to also get access to the functions that take vm instances as inputs (rather than the wrappers that parse bytecode into a vm again).

@shazow
Owner Author

shazow commented Feb 12, 2024

I respect the desire to keep the exposed API surface as small as possible. Worst case I can also fork/vendor parts that aren't exposed if necessary.

@cdump

cdump commented Feb 12, 2024

> can be useful to get access to functions that deal with vm instances as inputs (rather than the wrappers that parse bytecode into a vm again)

I can think about it when you have a real example where it could be useful, but you are right—I want to keep the public API small and stable right now :)

@shazow
Owner Author

shazow commented Feb 12, 2024

Sounds good, let's start with just the evm stuff and we'll go from there. :)

@cdump

cdump commented Feb 12, 2024

OK, I will finish the first version of #80 (comment) and push it to a dedicated git branch this week.

@shazow
Owner Author

shazow commented Feb 12, 2024

@cdump Thanks! Looking forward to playing with it. :)

@cdump

cdump commented Feb 13, 2024

@shazow
Owner Author

shazow commented Feb 13, 2024

@cdump That looks like a great start, thank you! I probably won't have time to prototype this into whatsabi until after ETHDenver though (any chance I'll see you there?)

@cdump

cdump commented Feb 13, 2024

> any chance I'll see you there?

no :(
