Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Configuration flag to allow possible 'null' blocks on a network #4729

Open
1 of 2 tasks
emmanuelm41 opened this issue Jun 28, 2023 · 10 comments
Open
1 of 2 tasks
Labels
enhancement New feature or request

Comments

@emmanuelm41
Copy link

emmanuelm41 commented Jun 28, 2023

Description

Filecoin blockchain has added an EVM runtime on top of its blockchain some months ago. From Filecoin docs web:

"The FVM contains an Ethereum Virtual Machine (EVM) runtime, allowing Ethereum and Solidity developers to run their contracts on the FVM with little to no modifications. This page details what exactly this EVM compatibility means, and any other information that Ethereum developers may need to build applications on the FVM."

Because of the Filecoin's miner election process nature, "null" blocks can happen. When the random miner election process occurs, sometimes no winer is chosen. In that scenario, no tipset (group of blocks) is generated. On the ethereum rpc side, this is reflected with an error message indicating the block is null. Please, refer to Ethereum RPC doc.

We are currently trying to use the ethereum rpc on the Filecoin network to connect a graph node so that subgraphs can be deployed. However, the moment a null block is found, the process gets stuck there. Some possible fixes has been explored, but none of them have been effective. In particular, a proxy has been established between the graph node and the filecoin node to produce fake blocks when a null one is found. This an entire valid block with no txs on it. However, the chain is actually broken, as block hashes are not correctly chained (it is impossible to do it in this proxy). For this reason, we see constantly reverted blocks while new blocks are ingested.

One possible workaround to this could be a new flag for specific chains which can allow to "skip" blocks when they are "null".

Some screenshots from our grafana dashboard
image

These are some of the logs we see on the node

Jun 29 13:56:14.343 INFO Scanning blocks [2991101, 2991101], range_size: 2000, sgd: 2, subgraph_id: QmebGEv1gdcQfGM5RkiDbx421Kw4JEPTMmUvN8wc6NU3XL, component: BlockStream
Jun 29 13:56:14.343 DEBG Requesting hashes for blocks [2991101, 2991101], sgd: 2, subgraph_id: QmebGEv1gdcQfGM5RkiDbx421Kw4JEPTMmUvN8wc6NU3XL, component: BlockStream
Jun 29 13:56:14.356 DEBG Requesting 0 block(s), sgd: 5, subgraph_id: QmQzU2mwuhMZvSXam1jPcWhBMpnyFBL7NudkmJkyduQ3F5, component: BlockStream
Jun 29 13:56:14.405 INFO Reverting block to get back to main chain, revert_to_ptr: #2991100 (6718cfaf170b5e8e1e67bec45f07d08768f36d2ee420a4e2d7d6acdcf615f8d0), subgraph_ptr: #2991101 (fcf0abfb1c450019f8305472c962a8ed445d4804d727ca5f5e4adfa6c4f52979), sgd: 5, subgraph_id: QmQzU2mwuhMZvSXam1jPcWhBMpnyFBL7NudkmJkyduQ3F5, component: SubgraphInstanceManager
Jun 29 13:56:14.746 DEBG Found 1 relevant block(s), sgd: 2, subgraph_id: QmebGEv1gdcQfGM5RkiDbx421Kw4JEPTMmUvN8wc6NU3XL, component: BlockStream
Jun 29 13:56:14.768 INFO Scanning blocks [2991101, 2991101], range_size: 2000, sgd: 5, subgraph_id: QmQzU2mwuhMZvSXam1jPcWhBMpnyFBL7NudkmJkyduQ3F5, component: BlockStream
Jun 29 13:56:14.770 DEBG Requesting logs for blocks [2991101, 2991101], contract 0x9be9b0cfa89ea800556c6efba67b455d336db1d0, 5 events, sgd: 5, subgraph_id: QmQzU2mwuhMZvSXam1jPcWhBMpnyFBL7NudkmJkyduQ3F5, component: BlockStream
Jun 29 13:56:14.835 INFO Reverting block to get back to main chain, revert_to_ptr: #2991100 (6718cfaf170b5e8e1e67bec45f07d08768f36d2ee420a4e2d7d6acdcf615f8d0), subgraph_ptr: #2991101 (fcf0abfb1c450019f8305472c962a8ed445d4804d727ca5f5e4adfa6c4f52979), sgd: 2, subgraph_id: QmebGEv1gdcQfGM5RkiDbx421Kw4JEPTMmUvN8wc6NU3XL, component: SubgraphInstanceManagerJu

Are you aware of any blockers that must be resolved before implementing this feature? If so, which? Link to any relevant GitHub issues.

No response

Some information to help us out

  • Tick this box if you plan on implementing this feature yourself.
  • I have searched the issue tracker to make sure this issue is not a duplicate.
@emmanuelm41 emmanuelm41 added the enhancement New feature or request label Jun 28, 2023
@emmanuelm41 emmanuelm41 changed the title [Feature] Flag to allow possible 'null' blocks on chain [Feature] Configuration flag to allow possible 'null' blocks on a network Jun 28, 2023
@fridrik01
Copy link

I collaborated with @emmanuelm41 on this issue and want to share that in addition to the issue of constantly reverted blocks we also notice that the graph node falls behind about 200 blocks from the latest block, so it does not seem to keep up due to always reverting.

Also, when querying the graph node in GraphQL we very often see an error message "the chain was reorganized while executing the query" which further indicates that this is a revert issue:

image

@SozinM
Copy link
Contributor

SozinM commented Jul 14, 2023

Also, erigon clients for Polygon and BSC have kinda same problems. Some blocks (state transactions) do not return transactions and it causes the graph to treat it as reorg and wait until it gives up.
I think it would be perfect to refactor the graph reorg detection logic to make it more strict so it triggers only on real reorgs.

@maoueh
Copy link
Contributor

maoueh commented Jul 14, 2023

@SozinM The graph-node logic for revert is done on two places. One when we are far beyond chain head block, in which case a call to eth_getBlockByNumber is made and we check if the block's hash is the same as the one returned by the RPC call (https://github.com/graphprotocol/graph-node/blob/master/chain/ethereum/src/ethereum_adapter.rs#L648 via https://github.com/graphprotocol/graph-node/blob/master/graph/src/blockchain/polling_block_stream.rs#L298-L312)

When we are live, we check the continuity of the chain by checking if current head block's parent hash is the same as the previous block we have seen: https://github.com/graphprotocol/graph-node/blob/master/graph/src/blockchain/polling_block_stream.rs#L430-L450 (the else part).

I don't think the issue description applies to your use case @SozinM if only transactions are missing, it shouldn't cause weird re-org issue. Of course if the block is fully empty, that's another thing.

@leoyvens
Copy link
Collaborator

At high-level, two approaches would be possible here:

  1. Have Graph Node be able to handle gaps in an EVM chain.
  2. Fill in the gaps with a dummy block.

Approach 2 is what you attempted with the proxy, but it runs into the issue which is that if block N is a dummy between blocks, then what should the parent hash of N + 1 be? In some situations we'd want it to be the hash of N - 1 as the chain says, and in others the hash of N to keep a continuous chain. I don't see a good way to solve this problem which leads me to think we should explore approach 1.

There is a precedent in Graph Node for a chain which skips blocks, which is Near. In Near, block headers have both a prev_hash and prev_height. Graph Node uses this in fn parent_ptr, which looks like this for Near:

pub fn parent_ptr(&self) -> Option<BlockPtr> {
match (self.prev_hash.as_ref(), self.prev_height) {
(Some(hash), number) => Some(BlockPtr::from((H256::from(hash), number))),
_ => None,
}
}

Contrast it to Ethereum, which assumes the parent block number is n - 1, as is the case in all other EVM compatible chains we support:

fn parent_ptr(&self) -> Option<BlockPtr> {
match self.number() {
0 => None,
n => Some(BlockPtr::from((self.parent_hash, n - 1))),
}
}

We'd have to support a prev_height for EVM chains. This support would be opt-in, enabled in the toml config. Adding that extra field to the block will be a bit annoying, as the Ethereum block type is currently taken straight from the web3 crate, we'd need to change it from a type alias to a struct:

pub type LightEthereumBlock = Block<Transaction>;

A difference between Near and Ethereum is that Near is Firehose-only, so it uses a Firehose block ingestor. This will use an RPC block ingestor, which lives in chain/ethereum/src/ingestor.rs. That will need to be modified to calculate and add the prev_height information to blocks, and to ignore null as a skipped (in chains that opt-in).

This would be my suggestion at this point, hopefully it helps and I'm available to further discuss the approach or answer any questions about the code details.

@raulk
Copy link

raulk commented Jul 26, 2023

I'm going to try and stake a stab at this.

@eshon
Copy link

eshon commented Aug 7, 2023

In case anyone needs to do testing, there is a null Filecoin Tipset round at height #2991095. Zondax's original logs above seem to describe an error at height #2991101 but that round has blocks/messages. The null round occurred some Tipsets before.

@xianglinhe
Copy link

xianglinhe commented Nov 21, 2023

I'm going to try and stake a stab at this.

Hi raulk, is there any update of your test?

@dboreham
Copy link
Contributor

dboreham commented Feb 6, 2024

Hello @xianglinhe here is a set of changes that we have tested against a live Lotus node handling the "null blocks present" case: vulcanize#1

@azf20
Copy link
Contributor

azf20 commented Feb 22, 2024

Hey @dboreham, it would be great to upstream these changes, could you open a PR?

@AFDudley
Copy link

#5247

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests