-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow math functions to parse input from Utf8 #9302
Comments
@alamb Maybe I need some opinions from you. In my opinion, this is also a complicated issue. |
Datafusion does actually support this it seems: ❯ select log(arrow_cast('+Infinity', 'Float32'));
+------------------------+
| log(Utf8("+Infinity")) |
+------------------------+
| inf |
+------------------------+
1 row in set. Query took 0.002 seconds.
❯ select log(arrow_cast('-Infinity', 'Float32'));
+------------------------+
| log(Utf8("-Infinity")) |
+------------------------+
| NaN |
+------------------------+
1 row in set. Query took 0.002 seconds.
❯ Could you elaborate on which math functions are throwing error in cases of infinity input? |
Thank you for your reply. It seems that you need to do some cast conversion. Can't you log2('-Infinity') directly? |
Yeah that would seem reasonable. I guess it would involve changing the signatures of the math functions to also accept Utf8 data type, then try parse into float (or something like that) |
I've updated the issue title to better reflect this 👍 (Since less of an issue with infinity, and more just a general data type issue with math functions) |
Very happy to see the way arrow-datafusion handles this, it's a way I would like to see, I think doing a cast is a very rigorous behavior |
@Jefffrey But I think this cast should be an implicit conversion. If not, maybe we should indicate this in the tests and documentation. |
Yes, I think this is what I was trying to convey, that we should allow |
Maybe I want to try it, but I'm not sure I can pull it off |
@Jefffrey I will try to add some tests first and then find the problem. At present, the math function should have some negative tests that need to be improved. |
I think this process is called "coercion" in the datafusion codebase. So we would need to add a coercion from |
double or float? |
I think we would probably need the coercion for both |
@alamb I have some ideas about type conversion. I am trying out my ideas on calcite. If it makes sense, I will transplant it to arrow-datafusion. This may take some time. |
Thanks @caicancai -- BTW I find following the example of existing systems is helpful in cases like this (there is no need to reinvent different coercion rules) For example, postgres appears to happily coerce UTF8 -> float: postgres=# select sqrt('1.4');
sqrt
--------------------
1.1832159566199232
(1 row) |
Is your feature request related to a problem or challenge?
log10('-Infinity');
log2('-Infinity');
power('Infinity',2)
As above, using the + infinity function in some math functions will report an error in arrow-datafusion, but mysql and posgtres will not report an error. I think the + infinity parameter is meaningful in mathematics.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: