Keep scanner alive (auto-renew) OR handle expired scanner and restart from previous row #91

Open
jonbonazza opened this issue May 15, 2018 · 2 comments

@jonbonazza
Contributor

Is it possible to support configurable scanner and client RPC timeouts?
These would be equivalent to hbase.client.scanner.timeout.period and hbase.rpc.timeout, respectively, in the Java HBase client. This would allow fine-tuning scans (and other requests) to prevent scanner leases from expiring prematurely.

Also, what are the current values for these timeouts?

@timoha
Collaborator

timoha commented May 24, 2018

AFAIK, scanner timeout values are configurable on the server side: https://github.com/apache/hbase/blob/branch-2.0/hbase-common/src/main/resources/hbase-default.xml#L597

I don't see any relevant fields in ScanRequest (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L289) or Scan (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L246) that would allow the client to specify the timeout.

So the problem you are trying to solve is how to renew a scanner, or automatically recover from a scanner failure (due to a timeout or a regionserver going down). That is tricky because HBase can return partial results: if only half of a row has been returned, the client has to restart the request from the second half of that row and then continue scanning. It looks like HBase already has some logic to handle cases like this (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L295 and https://issues.apache.org/jira/browse/HBASE-5974), but I haven't spent much time researching the nuances.
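To make the recovery idea concrete, here is a minimal sketch in Go of restarting a scan after the last row that was fully read. Everything in it is an assumption for illustration: `openScan`, `errScannerExpired`, `scanAll` and `demoScan` are hypothetical names, not gohbase API, and the scan is replaced by a fake in-memory table. It also assumes rows arrive whole, so it does not address the partial-row case described above. The one HBase-specific detail is the restart key: appending a zero byte to the last row key gives the smallest key that sorts strictly after it.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errScannerExpired stands in for whatever error a client would surface
// when the scanner lease has expired (hypothetical).
var errScannerExpired = errors.New("scanner expired")

// openScan is a hypothetical hook: it opens a scan at startRow (inclusive)
// and returns a function that yields one complete row key per call.
// The returned function reports ok=false at the end of the scan.
type openScan func(startRow []byte) func() ([]byte, bool, error)

// nextStartRow returns the smallest row key strictly greater than row,
// which is row with a zero byte appended.
func nextStartRow(row []byte) []byte {
	out := make([]byte, len(row)+1)
	copy(out, row)
	return out
}

// scanAll collects every row, reopening the scan just after the last
// complete row whenever the scanner expires, at most maxRestarts times.
func scanAll(open openScan, startRow []byte, maxRestarts int) ([]string, error) {
	var rows []string
	restarts := 0
	next := open(startRow)
	for {
		row, ok, err := next()
		switch {
		case errors.Is(err, errScannerExpired) && restarts < maxRestarts:
			restarts++
			next = open(startRow)
		case err != nil:
			return rows, err
		case !ok:
			return rows, nil
		default:
			rows = append(rows, string(row))
			startRow = nextStartRow(row)
		}
	}
}

// demoScan runs scanAll against a fake sorted table whose scanner expires
// once, after it has served two rows.
func demoScan() ([]string, error) {
	table := []string{"a", "b", "c", "d"}
	expired := false
	open := func(startRow []byte) func() ([]byte, bool, error) {
		i := sort.SearchStrings(table, string(startRow))
		served := 0
		return func() ([]byte, bool, error) {
			if !expired && served == 2 {
				expired = true
				return nil, false, errScannerExpired
			}
			if i >= len(table) {
				return nil, false, nil
			}
			row := table[i]
			i++
			served++
			return []byte(row), true, nil
		}
	}
	return scanAll(open, nil, 3)
}

func main() {
	rows, err := demoScan()
	fmt.Println(rows, err) // prints: [a b c d] <nil>
}
```

The restart cap is there so that a scanner which keeps expiring fails loudly instead of looping forever.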

@devoxel
Contributor

devoxel commented Oct 7, 2021

Necro-ing because we hit this recently.

I think we can fix this by sending a ScanRequest with the renew bool set to true (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L289).

So I guess the question of implementation is either:

  1. handle this transparently in gohbase by starting a goroutine per scan that renews the scanner on an interval
  2. add a function (like Renew) to the Scanner

I personally think option 1 is preferable, especially if it is optional and off by default. We could also add a configurable renew limit, which would act as our own per-scanner timeout and cut off slow clients.
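As a rough illustration of the first option, here is a minimal sketch of a per-scan goroutine that renews on an interval and stops at a renew limit. The names are assumptions, not gohbase API: `renewFunc` is a hypothetical hook that would send the ScanRequest with renew set to true, and `keepAlive`, `errRenewLimit` and `demoRenewals` exist only for this example. The demo uses a fake hook that just counts calls.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// errRenewLimit reports that the configured renew limit was reached.
var errRenewLimit = errors.New("scanner renew limit reached")

// renewFunc is a hypothetical hook. In a real client it would send a
// ScanRequest with renew set to true for the open scanner.
type renewFunc func(ctx context.Context) error

// keepAlive calls renew once per interval until ctx is cancelled, renew
// fails, or limit renewals have been sent (limit <= 0 means no limit).
// The returned channel yields the stop reason and is then closed; a nil
// value means ctx was cancelled.
func keepAlive(ctx context.Context, interval time.Duration, limit int, renew renewFunc) <-chan error {
	done := make(chan error, 1)
	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for sent := 0; limit <= 0 || sent < limit; sent++ {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if err := renew(ctx); err != nil {
					done <- err
					return
				}
			}
		}
		done <- errRenewLimit
	}()
	return done
}

// demoRenewals runs keepAlive against a fake renew hook that only counts
// calls, and returns the number of renewals plus the stop reason.
func demoRenewals(limit int) (int, error) {
	var calls int32
	renew := func(context.Context) error {
		atomic.AddInt32(&calls, 1)
		return nil
	}
	err := <-keepAlive(context.Background(), time.Millisecond, limit, renew)
	return int(atomic.LoadInt32(&calls)), err
}

func main() {
	n, err := demoRenewals(3)
	fmt.Println(n, err) // prints: 3 scanner renew limit reached
}
```

In this shape the scan would cancel ctx when the caller closes the scanner, and a renew-limit error would be the signal to close a scanner whose consumer is too slow.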

Thoughts?

@dethi dethi changed the title Configurable timeouts Keep scanner alive (auto-renew) OR handle expired scanner and restart from previous row Mar 28, 2023