[HUDI-8590] fix: wrong file path for consistent-bucket-commit-marker-file #12344
base: master
1. wrong file path for consistent-bucket-commit-marker-file Signed-off-by: TheR1sing3un <[email protected]>
@beyond1920 @danny0405 Hi, I found a bug: the path used for the consistent-bucket commit marker file is incorrect. Please have a look. Thanks!
1. fix unable to load latest committed consistent-bucket-hash-metadata Signed-off-by: TheR1sing3un <[email protected]>
final List<StoragePathInfo> hashingMetaFiles = metaFiles.stream().filter(hashingMetadataFilePredicate)
    .sorted(Comparator.comparing(f -> f.getPath().getName()))

final TreeMap<String/*instantTime*/, Pair<StoragePathInfo/*hash metadata file path*/, Boolean/*commited*/>> versionedHashMetadataFiles = metaFiles.stream()
@beyond1920 can you review, and add test cases please.
Thanks! I will add more test cases to verify metadata correctness.
Hi, could you please describe what problems will occur if this fix is not made? Are there any correctness issues?
The commit marker file is always created at the wrong path. As a result, you can never tell that a given version of the hash metadata has been committed, because the path being scanned differs from the path the marker was written to. Every load therefore has to go to the timeline and then write the marker file again, which puts a lot of pressure on the timeline. Separately, in the current code, if there is only one hash metadata file that is not committed and it is still pending, the loader returns empty instead of the most recently committed hash metadata. This PR fixes both problems.
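The second problem can be illustrated with a minimal sketch. This is not the code from this PR: the class name, the method name, and the file suffixes `.hashing_meta` and `.commit` are assumptions made for illustration, and the timeline fallback is left out. It only shows the intended selection rule, which is to walk the versions from newest to oldest and return the newest one that has a commit marker, rather than giving up when the newest version is still pending.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Hudi: picks the latest committed
// hash metadata version from a flat list of file names.
final class LatestCommittedHashMeta {
    // Assumed suffixes, used here only for illustration.
    static final String META_SUFFIX = ".hashing_meta";
    static final String COMMIT_SUFFIX = ".commit";

    static Optional<String> latestCommitted(List<String> fileNames) {
        // Instants that have a hash metadata file, kept in sorted order.
        TreeSet<String> metaInstants = new TreeSet<>();
        // Instants that have a commit marker file.
        Set<String> committedInstants = new HashSet<>();

        for (String name : fileNames) {
            if (name.endsWith(META_SUFFIX)) {
                metaInstants.add(name.substring(0, name.length() - META_SUFFIX.length()));
            } else if (name.endsWith(COMMIT_SUFFIX)) {
                committedInstants.add(name.substring(0, name.length() - COMMIT_SUFFIX.length()));
            }
        }

        // Walk from the newest instant to the oldest and skip pending versions
        // instead of returning empty as soon as the newest one is uncommitted.
        for (String instant : metaInstants.descendingSet()) {
            if (committedInstants.contains(instant)) {
                return Optional.of(instant);
            }
        }
        return Optional.empty();
    }
}
```

With the files `001.hashing_meta`, `001.commit` and `002.hashing_meta`, this sketch returns instant `001`: version `002` has no marker, so it is treated as pending and skipped. The lookup only works if the marker is written into the same directory that is scanned, which is the first problem described above.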
issue: #12338
Change Logs
Describe context and summary for this change. Highlight if any code was copied.
Impact
Describe any public API or user-facing feature change or any performance impact.
none
Risk level (write none, low medium or high below)
low
If medium or high, explain what verification was done to mitigate the risks.
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".