Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit bitswap wants by server and/or client backpressure #644

Open
gammazero opened this issue Jul 29, 2024 · 3 comments
Open

Limit bitswap wants by server and/or client backpressure #644

gammazero opened this issue Jul 29, 2024 · 3 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@gammazero
Copy link
Contributor

This issue captures information from a PR discussion in case it is useful for future Bitswap work. If this no longer applies, then this issue can be closed.

It would be great if Bitswap clients could avoid sending more than a protocol-wide constant by using back-pressure to limit limit the number of wants.

From @aschmahmann:

Here are 3 options, but these might not be correct:

  1. Having the server backpressure the client on the other side of the wire
    • Similar to my comment above, and AFAICT requires some protocol changes to accommodate
  2. Have the bitswap client internally batch GetBlocks calls that are for larger than maxWantlistSize into batches before returning them in case the batch is fully returned before the rebroadcast interval?
    • Doable if it'll be helpful, although it's a bit gross since A) maxWantlistSize shouldn't really be a protocol/network-wide thing and be per-client B) this batching can be done outside of the bitswap client package.
  3. The bitswap client backpressuring the caller (e.g. code that's walking a DAG)

From @Wondertan:

My suggestion was some form of 2 and 3. The client should be able to deal with the server-side request rate-limiting; otherwise, the client gets stuck expecting to be served, while the server simply cuts off his wants. The rebroadcasting after one minute helps, but that's still one minute and request spamming overhead over the wire.

Ideally, we would do a protocol change as described in 1, but as it's breaking, we may consider other less clean options, like or similar to 2. Setting protocol-wide maxWantlistSize is gross, I agree. Another option might be negotiating the limit between the client and the server so the client knows it should never exceed it.

The 3 is complimentary and provides a new, powerful way to interface with a client. However, I just realized that it is not necessary for our case if the client is smart enough. In our case, we have a flat structure, i.e. we don't traverse a DAG where we unpack IPLD nodes to get more CID links to fetch them and unpack again to get to the data. Essentially, we know all the CIDs in advance and could simply ask Bitswap to get all of them over GetBlocks as long the client is smart enough not to get into any limitations of its immediate peers, which we are currently facing with this issue.

From @gammazero:

With the latest changes in PR #629, the main problem should be solved. The problem was that wants that have DONT_HAVE responses remained on the queue, and caused new incoming messages to be dropped when there is no more free space on the queue. The same failure also prevented new wants from updating existing wants. This is fixed: Incoming wants are now applied to the queue, and any new ones that cannot be handled without increasing the queue over the size limit are handled specially as overflow, where their priority is considered for replacing existing wants.

Having the client limit its message sizes may be useful to avoid oversized wantlists from being truncated and allowing any overflow is handled. The client can group batches of wants by priority if applicable. In the future, priority can be influenced by DAG path length (higher priority for items closer to root).

@gammazero gammazero added the need/triage Needs initial labeling and prioritization label Jul 29, 2024
Copy link

welcome bot commented Jul 29, 2024

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

@Wondertan
Copy link
Member

Wondertan commented Jul 30, 2024

@gammazero, i believe your change is useful and should help with the problem, but no completely solve it.

In case server side has all the data client wants, but client wants more than queue size, the wants are gonna be dropped.

This might not be the biggest problem for IPFS due its Dag structure, but is a problem for other use cases like ours. We have a "flat" structure and may request 256*256 items "at once", hitting this issue.

Ideally, we would do some form of backpressure as @aschmahmann suggested, but that's gonna be a large protocol change and we want assess other options before pushing such a change.

@gammazero
Copy link
Contributor Author

@Wondertan Yes, this is unfortunately the case that when client wants more than queue size the wants are dropped. The client must batch requests to be within the queue limit. This is why we need backpressure or flow-control, and why this issues is here. As you said, this will likely be a significant protocol change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/triage Needs initial labeling and prioritization
Projects
None yet
Development

No branches or pull requests

2 participants