Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proxies: Multiple instances of the same resolver #142

Open
manuelwedler opened this issue Oct 16, 2024 · 14 comments
Open

proxies: Multiple instances of the same resolver #142

manuelwedler opened this issue Oct 16, 2024 · 14 comments
Labels
bug Something isn't working soon Planned, targeting a near-future release

Comments

@manuelwedler
Copy link

I noticed a few more issues while running the proxy detection on the Sourcify database. Let's have a look at two example contracts on Ethereum mainnet:

0x56C5Aef1296d004707475c8440f540DdA409b53D

https://eth.blockscout.com/address/0x56C5Aef1296d004707475c8440f540DdA409b53D?tab=contract

For this contract I got 14 instances of the FixedProxyResolver. They resolved only to 4 distinct addresses:

  • 0xdd0694ed7ecb6494a9055a78cf848a6b146c88b5
  • 0x652c78d85cdf25e85fe7df8d4c61b2a0f3c25d4f
  • 0x2184298d600b5f2ff2febca0993626581a994095
  • 0xcc985298cc007901c9994920fecedadbeba8380f

All of the 4 above addresses are libraries.

0x7dB8637A5fd20BbDab1176BdF49C943A96F2E9c6

https://eth.blockscout.com/address/0x7dB8637A5fd20BbDab1176BdF49C943A96F2E9c6?tab=contract

For this contract I got 1 FixedProxyResolver and 5 EIP1967ProxyResolver instances. The FixedProxyResolver resolved to 0xf208a8fe1ef2b31b0e408c7399a5d696cadb0be9 which is a library again. All EIP1967ProxyResolver instances resolved to 0x0000000000000000000000000000000000000000, but the above address is actually no proxy. It is only able to deploy new EIP1967 proxies.

Problems

There are multiple problems here. I would expect the following behavior:

  • don't return a FixedProxy for a library
  • don't return FixedProxy multiple times for the same implementation address
  • don't return a resolver for a factory contract that deploys proxies

If you want to see more examples, you can have a look here: https://github.com/sourcifyeth/data-analysis-scripts/blob/main/multi-proxy-analysis-results/multi-implementation-addresses.json
There, I collected all contracts that resolved to multiple implementation addresses (except DiamondProxies). The format is {chainId: [{proxyAddress: [[detectedProxies], ...distinctImplementationAddresses]}]}.

@shazow
Copy link
Owner

shazow commented Oct 16, 2024

@manuelwedler Thank you for reporting these! It has been hard to find good edge cases to test against.

Definitely need to fix this. :)

@shazow shazow added bug Something isn't working soon Planned, targeting a near-future release labels Oct 16, 2024
@shazow
Copy link
Owner

shazow commented Oct 16, 2024

don't return a FixedProxy for a library

Do we have a distinguishing definition for what constitutes as a library from an EVM bytecode perspective?

I think traditionally we consider contracts called with STATICCALL as libraries, but in this case they're being DELEGATECALL'ed.

don't return FixedProxy multiple times for the same implementation address

This one is easy to fix, on it!

don't return a resolver for a factory contract that deploys proxies

Hmm this one is tricky too, not sure we can reliably distinguish this.

The other perspective I'm considering: Is there any additional information I can include with the resolver results that would allow you to productively decide how to treat the detected resolvers?

@manuelwedler
Copy link
Author

Yeah, libraries and factories are really difficult to detect. At least it's good to be aware of these issues now :)

I think you could potentially differentiate a proxy from a library by detecting CALLDATACOPY and/or CALLDATASIZE opcodes before a DELEGATECALL, since a proxy should forward any call. Libraries would have the function selectors pushed to the stack before the DELEGATECALL. Still, coming up with a solid algorithm is hard.

For our purposes, I will discuss with the team how we go about it. We should actually be able to tell if a detected proxy is a library, since we have the source code. We would only need to check against our DB.
Factories would be harder for us, but I also don't know at the moment if you could include any information that would help. I will have another thought about all of this.

@shazow
Copy link
Owner

shazow commented Oct 17, 2024

I think you could potentially differentiate a proxy from a library by detecting CALLDATACOPY and/or CALLDATASIZE opcodes before a DELEGATECALL, since a proxy should forward any call. Libraries would have the function selectors pushed to the stack before the DELEGATECALL. Still, coming up with a solid algorithm is hard.

That's still tricky since it could be in the stack, and unfortunately WhatsABI doesn't have full stack tracing yet. (This is on the table but would substantially change how WhatsABI works, so I'm kinda punting it to post-EOF at the moment.)

I think it would be a matter of how conservative we want the results to be. In the case of sourcify, I imagine it's better to present a subset of proxies rather than a superset?

Off the top of my head, some available information that you can influence how you use the proxies result:

  1. Type of proxy detected. FixedProxy is the main issue here, since it's basically anything that does DELEGATECALL to a predetermined address.
  2. Length of bytecode (minimal proxies are tiny), or presence of some codes
  3. Number of selectors detected in the jump table (simple proxies like minimal proxies have none, but complex/hybrid proxies would have some -- could simply skip those?)

@manuelwedler
Copy link
Author

  • Type of proxy detected. FixedProxy is the main issue here, since it's basically anything that does DELEGATECALL to a predetermined address.
  • Length of bytecode (minimal proxies are tiny), or presence of some codes
  • Number of selectors detected in the jump table (simple proxies like minimal proxies have none, but complex/hybrid proxies would have some -- could simply skip those?)

These are some good hints. Thanks.

A subset is definitely better than a superset. I think regarding FixedProxies the easiest for us is to only support EIP-1167 proxies. We can just check if the bytecode matches the opcodes of the EIP-1167 standard. Other detected FixedProxies would be non-standard proxies in my understanding and it's totally fine to not support such at the moment.

I still wonder what we can do about factories. If I understand it correctly, any type of detected proxy could be a factory. I guess a factory would usually resolve to 0x0. If a contract resolves to 0x0 we should simply not consider it as a proxy.

@shazow
Copy link
Owner

shazow commented Oct 18, 2024

I still wonder what we can do about factories. If I understand it correctly, any type of detected proxy could be a factory. I guess a factory would usually resolve to 0x0. If a contract resolves to 0x0 we should simply not consider it as a proxy.

Trying to imagine a more general case: You could treat any contract that has CREATE/CREATE2 opcodes as factories?

I don't know how reliably the resolved address would be 0x0, though.

@manuelwedler
Copy link
Author

manuelwedler commented Oct 21, 2024

You could treat any contract that has CREATE/CREATE2 opcodes as factories?

I think yes, if there is any CREATE opcode, we shouldn't consider a contract as a proxy. However, it is still possible that you embed a contract's bytecode but only pass this bytecode to another contract which would call a CREATE opcode with the passed bytecode. So the contract that embeds the bytecode would not have a CREATE opcode itself.

I don't know how reliably the resolved address would be 0x0, though.

For EIP-1967, it should not be impossible to write to the storage slot by accident in Solidity. You would have to so on purpose in assembly code, and that could be a problem.

@shazow
Copy link
Owner

shazow commented Oct 21, 2024

Or we could contrive a contract that is both a valid proxy and also acts a factory in some conditions. Plenty of proxies that also have non-proxied selectors that it checks first.

I might need to figure out a sensible way to reveal CREATE opcodes with AutoloadResult, or have a different call flow altogether... Hmm.

@shazow
Copy link
Owner

shazow commented Oct 21, 2024

One more thing I'm thinking of adding is a priority property on ProxyResolver types, and sorting them before returning the result so that FixedProxyResolvers are last.

  1. This would make it easier to check if there's a non-FixedProxyResolver present (just check the first member of result.proxies).
  2. In many cases, people only really care about resolving the first one.

I'll also see if it's feasible to detect more specific types of FixedProxy types that can be distinguished from weird usage of DELEGATECALL.

@manuelwedler
Copy link
Author

I might need to figure out a sensible way to reveal CREATE opcodes with AutoloadResult, or have a different call flow altogether... Hmm.

Revealing CREATE opcodes would be great.

One more thing I'm thinking of adding is a priority property on ProxyResolver types, and sorting them before returning the result so that FixedProxyResolvers are last.

My data (here) actually showed that only contracts with various combinations of FixedProxy and EIP1967Proxy resolvers also resolved to multiple different implementation addresses. In all of these cases, the EIP1967Proxy resolvers resolved to 0x0.

So not sure if this priorization is perfect. Maybe it is worth to look more deeply into why there are EIP1967Proxy contracts with resolution to 0x0.

I'll also see if it's feasible to detect more specific types of FixedProxy types that can be distinguished from weird usage of DELEGATECALL.

That would also be great :)

@shazow
Copy link
Owner

shazow commented Oct 22, 2024

@manuelwedler Are any of these are blockers for sourcify at this time? Trying to figure out my priorities for the next sprint. :)

@manuelwedler
Copy link
Author

None of these are blockers, but I still need to figure out how we should handle embedded bytecodes. The only thing that might be useful for this are the CREATE opcodes.

Repository owner deleted a comment Oct 26, 2024
@shazow
Copy link
Owner

shazow commented Oct 28, 2024

Adding AutoloadResult.isFactory in #144

@shazow
Copy link
Owner

shazow commented Nov 4, 2024

@manuelwedler Just released v0.16.0 which has some of these improvements. :) https://github.com/shazow/whatsabi/releases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working soon Planned, targeting a near-future release
Projects
None yet
Development

No branches or pull requests

2 participants