-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix nullable bug of Arrow MapVector in Bridge.cpp #11214
base: main
Are you sure you want to change the base?
Fix nullable bug of Arrow MapVector in Bridge.cpp #11214
Conversation
✅ Deploy Preview for meta-velox canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a test for this as well?
@Yuhta thanks for your review. I add some assertions in the |
@mbasmanova Would you please take a look at this? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whutjs Thank you for the fix. Would you update the PR description to clarify which constraint was violated and, if possible, provide a reference to Arrow documentation that describes that constraint.
velox/vector/arrow/Bridge.cpp
Outdated
@@ -1447,6 +1451,11 @@ void exportToArrow( | |||
maps.getNullCount()); | |||
exportToArrow(rows, *child, options); | |||
child->name = "entries"; | |||
// Map data should be a non-nullable struct type | |||
child->flags = clearArrowNullableFlag(child->flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to change clearArrowNullableFlag take flags as in/out parameter?
clearArrowNullableFlag(child->flags);
Would it also make sense to remove 'Arrow' from the function name? It seems to be implied.
edd660c
to
244462e
Compare
@mbasmanova thanks for your review. I have modified this PR according to your suggestion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@whutjs Can you rebase against latest main and update ? |
Verify MapVector schema is not null in ArrowBridgeSchemaTest.cpp
244462e
to
ae1f99a
Compare
@kgpai DONE |
There is a constraint defined in arrow/format/Schema.fbs, which requires the MapVector itself and the key of MapVector to be not nullable.
There are also two not-nullable constraints in the Java implementation of MapVector:
https://github.com/apache/arrow/blob/d4516c5386f84619dfdf2a9f72fed6d7df89704c/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java#L98
https://github.com/apache/arrow/blob/d4516c5386f84619dfdf2a9f72fed6d7df89704c/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java#L105
However, When the velox convert the MapVector to Arrow MapVector, velox will set the flag of MapVector to nullable, which violates these constraints:
velox/velox/vector/arrow/Bridge.cpp
Line 1388 in acd5717