-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix array_has_all and array_has_any with empty array #15039
base: main
Are you sure you want to change the base?
Conversation
@jayzhan211 Could you help approve the test workflows and review the PR? Thanks in advance |
ComparisonType::All => true, | ||
ComparisonType::Any => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's weird to me that All
is true
and Any
is false
but I agree it matches the descriptions:
array_has_all: Returns true if all elements of sub-array exist in array.
array_has_any: Returns true if any elements exist in both arrays.
I guess, from a set perspective, we have "true if the intersection of the two sets has the same length as needle" (all) and "true if the intersection of the two sets is non-empty" (any).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D select array_has_any([1,2,3], []);
┌────────────────────────────────────────────────────────────┐
│ array_has_any(main.list_value(1, 2, 3), main.list_value()) │
│ boolean │
├────────────────────────────────────────────────────────────┤
│ false │
└────────────────────────────────────────────────────────────┘
D select array_has_all([1,2,3], []);
┌────────────────────────────────────────────────────────────┐
│ array_has_all(main.list_value(1, 2, 3), main.list_value()) │
│ boolean │
├────────────────────────────────────────────────────────────┤
│ true │
└────────────────────────────────────────────────────────────┘
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, yes. I just meant it was strange intuition (I was rambling). I was not arguing correctness. I agree this is good.
ComparisonType::All => true, | ||
ComparisonType::Any => false, | ||
}; | ||
let result = BooleanArray::from(vec![result_value; haystack.len()]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using BooleanBuffer::new_set
and BooleanBuffer::new_unset
should be slightly more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using
if needle.values().len() == 0 {
return match comparison_type {
ComparisonType::All => Ok(Arc::new(BooleanBuffer::new_set(haystack.len()))),
ComparisonType::Any => Ok(Arc::new(BooleanBuffer::new_unset(haystack.len())))
};
}
Runs into
the trait bound `BooleanBuffer: arrow::array::Array` is not satisfied
the following other types implement trait `arrow::array::Array`:
&T
BooleanArray
DictionaryArray<T>
FixedSizeBinaryArray
FixedSizeListArray
GenericByteArray<T>
GenericByteViewArray<T>
GenericListArray<OffsetSize>
and 10 others
required for the cast from `std::sync::Arc<BooleanBuffer>` to `std::sync::Arc<dyn arrow::array::Array>`rustc[Click for full compiler diagnostic](rust-analyzer-diagnostics-view:/diagnostic%20message%20[0]?0#file:///Users/lu/Projects/Github/Work/datafusion/datafusion/functions-nested/src/array_has.rs)
So confusing about the different types loll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use BooleanArray::from(BooleanBuffer::new_/* ... */)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! Changed to use BooleanBuffer
Which issue does this PR close?
What changes are included in this PR?
Fix array_has_any and array_has_all with empty array input
Are these changes tested?
Added the unit test in array.slt
Are there any user-facing changes?
NA