Create route "routes_vip_offset_index" duplicate errors #3525

Closed
renelehmann opened this issue Nov 24, 2023 · 4 comments · Fixed by #3697

@renelehmann

Issue

For several minutes, repeated v2 calls to POST /v2/routes for an apps.internal route failed with errors like:
error creating route Mysql2::Error: Duplicate entry '24218' for key 'routes_vip_offset_index'

Context

We are running capi/1.161.0 (cf-deployment/v32.8.0) on a busy environment with multiple cc VMs.

It seems we ran into a race condition that caused the "create route" failures.
There were no unusual signs in the DB stats/logs.

Shortened log lines from the API, following a specific b3_trace_id. They refer to duplicate key 24218 and, on retry, 24219:

21:01:30.554638520Z","message":"Started POST \"/v2/routes?async=true&inline-relations-depth=1
21:01:33.258274448Z","message":"error creating route Mysql2::Error: Duplicate entry '24218' for key 'routes_vip_offset_index', retrying once","log_level":"warn"
21:01:35.743954090Z","message":"exception not translated: Sequel::UniqueConstraintViolation - Mysql2::Error: Duplicate entry '24219' for key 'routes_vip_offset_index'
21:01:35.744681358Z","message":"Request failed: 500: {\"description\"=>\"Database error\"
21:01:35.745450613Z","message":"Completed 500 vcap-request-id:

VMs:
API (cc/cc_worker): 12 x 8vCPU,8GB / 6 x 4vCPU,8GB
Database: galera/mysql 5.7; 3 x 44vCPU/64GB

Related slack chat:
https://cloudfoundry.slack.com/archives/C07C04W4Q/p1700675754670279

Steps to Reproduce

We did not try to reproduce the issue yet.

Expected result

With respect to routes_vip_offset_index, route creation should handle a higher volume of parallel create-route requests more safely (e.g. via a request queue or synchronous processing).

Additional logs

Create route activity:
Screenshot of the time window in which the condition described above occurred.
It shows all requests with Started POST for /v2/routes and /v3/routes within a 1-minute window.

[Screenshot: CAPI_POST_V2_V3_ROUTES_2101-2102_Started-POST_DATA_CSV_marked]

@renelehmann
Author

Friendly "bump" ;-)

@anyandrea

Any updates on this issue?

@philippthun
Member

After looking again at the differences between the V2 and V3 implementations, I'm convinced that this problem is solved in V3 due to 'endless' retries. Nevertheless, there was a small bug in V3 that broke the retry mechanism, which will be fixed with #3697. This PR also includes a test for the parallel creation of internal routes that shows that the retries are working as expected.

Please change your client code to use the V3 API (after this fix is merged and released). As the V2 API is deprecated, it won't receive any more bug fixes. This means that in V2 the parallel creation of internal routes is still limited to a single retry.
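
To illustrate the difference between a single retry and 'endless' retries, here is a minimal Python sketch. It is not the Cloud Controller code (which is Ruby and talks to a real database). `RouteTable`, `DuplicateOffset`, `create_route` and `create_in_parallel` are hypothetical names, and the idea that a new route takes the smallest unused offset is an assumption made only for this example. The unique index is modeled by a set that rejects duplicates.

```python
import threading


class DuplicateOffset(Exception):
    """Stands in for the database's unique-constraint violation."""


class RouteTable:
    """Toy stand-in for a routes table with a unique index on vip_offset."""

    def __init__(self):
        self._lock = threading.Lock()
        self._offsets = set()

    def next_free_offset(self):
        # Read step: smallest offset not taken in this snapshot (assumption).
        # Another writer may claim the same value before we insert.
        with self._lock:
            taken = set(self._offsets)
        offset = 1
        while offset in taken:
            offset += 1
        return offset

    def insert(self, offset):
        # Write step: the "unique index" rejects an offset already present.
        with self._lock:
            if offset in self._offsets:
                raise DuplicateOffset(offset)
            self._offsets.add(offset)

    def offsets(self):
        with self._lock:
            return sorted(self._offsets)


def create_route(table, max_retries=None):
    """Allocate an offset, retrying when the insert hits a duplicate.

    max_retries=1 mimics the "retrying once" behaviour in the V2 log above;
    max_retries=None keeps retrying until the insert succeeds.
    """
    attempt = 0
    while True:
        offset = table.next_free_offset()
        try:
            table.insert(offset)
            return offset
        except DuplicateOffset:
            if max_retries is not None and attempt >= max_retries:
                raise
            attempt += 1


def create_in_parallel(table, count, max_retries=None):
    """Run `count` concurrent create_route calls; return (offsets, errors)."""
    results, errors = [], []

    def worker():
        try:
            results.append(create_route(table, max_retries))
        except DuplicateOffset as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, errors
```

Calling `create_in_parallel(RouteTable(), 50)` lets every caller end up with a distinct offset, because a loser of the race simply re-reads and tries again. With `max_retries=1`, a caller that loses the race twice in a row gives up, which matches the pattern in the log where the first attempt collides on 24218 and the retry on 24219.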

@renelehmann
Author

Thanks a lot @philippthun for checking, your suggestion and the fix.

@philippthun philippthun linked a pull request Apr 2, 2024 that will close this issue