Rust API: UUID support in pola.rs Series #7175

Open
caniko opened this issue Feb 25, 2023 · 7 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@caniko

caniko commented Feb 25, 2023

Problem description

I was trying to create a pola.rs DataFrame with a column of UUIDs from the uuid crate.

Something like this:

let ser = Series::new("ID", vec![Uuid::new_v4(), Uuid::new_v4()]);

Gives the error: the trait bound polars::prelude::Series: polars::prelude::NamedFrom<Vec<uuid::Uuid>, _> is not satisfied

Explanation: the trait polars::prelude::NamedFrom<Vec<uuid::Uuid>, _> is not implemented for polars::prelude::Series

@caniko caniko added the enhancement New feature or an improvement of an existing feature label Feb 25, 2023
@jeremychone

This would be great.

And in the meantime, I'd love to know how we can support this ourselves. Which Polars type should we translate the uuid to? It seems there's a Polars Decimal type, which is i128, but I'm not sure if that would work.

let ser = Series::new("ID", vec![Uuid::new_v4().to_what_type(), Uuid::new_v4().to_what_type()]);

@uditrana

uditrana commented Sep 4, 2023

On a similar note, it seems that Apache Parquet supports a UUID type.

Would Polars be able to read a Parquet file with a column of that data type? It seems Polars would need a u128 type for that.

@jeremychone

After further reading, I believe there might be some issues with using u128 as a UUID representation due to endianness concerns.
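The comment above does not spell out the endianness concern, but one reading is that the same 16 UUID bytes map to two different integers depending on the byte order chosen. A minimal sketch of that, using only the Rust standard library (the constant and helper names are hypothetical, and neither polars nor the uuid crate is involved):

```rust
// Hypothetical names for illustration; only the standard library is used.

/// The 16 bytes of one UUID in canonical (big-endian) order,
/// i.e. the text form 67e55044-10b1-426f-9247-bb680e5fe0c8.
const EXAMPLE: [u8; 16] = [
    0x67, 0xe5, 0x50, 0x44, 0x10, 0xb1, 0x42, 0x6f,
    0x92, 0x47, 0xbb, 0x68, 0x0e, 0x5f, 0xe0, 0xc8,
];

/// Interpret the bytes with the first byte as the most significant.
fn uuid_bytes_to_u128_be(bytes: [u8; 16]) -> u128 {
    u128::from_be_bytes(bytes)
}

/// Interpret the bytes with the first byte as the least significant.
fn uuid_bytes_to_u128_le(bytes: [u8; 16]) -> u128 {
    u128::from_le_bytes(bytes)
}

fn main() {
    let be = uuid_bytes_to_u128_be(EXAMPLE);
    let le = uuid_bytes_to_u128_le(EXAMPLE);

    println!("big-endian:    {be:032x}");
    println!("little-endian: {le:032x}");

    // Same bytes, two different integers.
    assert_ne!(be, le);
    // Each integer round-trips only with the byte order it was built with.
    assert_eq!(be.to_be_bytes(), EXAMPLE);
    assert_eq!(le.to_le_bytes(), EXAMPLE);
    // The two values are byte-swapped versions of each other.
    assert_eq!(be, le.swap_bytes());
}
```

A writer and a reader of a u128 column would therefore have to agree on the byte order, whereas a fixed-size 16-byte binary value carries the bytes as-is.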

It appears that the feature supporting this is FixedSizeBinary, and I came across a bug that seems related: #9373

Regardless, I hope that we'll get UUID support in the future. It's crucial for our use case, particularly for RequestLogLine analytics.

@mjclarke94

+1! We're also running into issues with this. We're ending up with massive files because we have to cast UUIDs to UTF8. The fixed size binary seems like a nice workaround, but given that UUID is a built-in type in Python, it would be nice to be able to preserve the functionality attached to that type.
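To put a rough number on the size difference mentioned above: the hyphenated text form of a UUID is 36 bytes of UTF-8, while the raw value is 16 bytes. The sketch below uses only the Rust standard library, and its helper names are hypothetical. It compares raw payload sizes only; it ignores offsets, encoding, and compression, so real file sizes will differ.

```rust
// Hypothetical helpers for illustration; only the standard library is used.

/// One UUID as 16 raw bytes in canonical order.
const SAMPLE: [u8; 16] = [
    0x67, 0xe5, 0x50, 0x44, 0x10, 0xb1, 0x42, 0x6f,
    0x92, 0x47, 0xbb, 0x68, 0x0e, 0x5f, 0xe0, 0xc8,
];

/// Format 16 bytes as the usual 8-4-4-4-12 lowercase hex string.
fn to_hyphenated(bytes: &[u8; 16]) -> String {
    let mut s = String::with_capacity(36);
    for (i, b) in bytes.iter().enumerate() {
        // Hyphens go before bytes 4, 6, 8 and 10.
        if matches!(i, 4 | 6 | 8 | 10) {
            s.push('-');
        }
        s.push_str(&format!("{b:02x}"));
    }
    s
}

/// Returns (bytes needed as text, bytes needed as fixed-size binary),
/// counting only the values themselves.
fn payload_sizes(ids: &[[u8; 16]]) -> (usize, usize) {
    let text: usize = ids.iter().map(|id| to_hyphenated(id).len()).sum();
    let binary: usize = ids.iter().map(|id| id.len()).sum();
    (text, binary)
}

fn main() {
    let ids = vec![SAMPLE; 1_000_000];
    let (text, binary) = payload_sizes(&ids);
    println!("as text:   {text} bytes");
    println!("as binary: {binary} bytes");
    println!("ratio:     {:.2}", text as f64 / binary as f64);
}
```

Under these assumptions the text representation carries 2.25 times as many bytes per value as the 16-byte binary form.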

@scur-iolus

Would be great to have this feature in v1!

@deanm0000
Collaborator

A UUID type is a subset of the larger Extension Type request, which has been requested before in #9112.

I'm a big proponent of Extension Type support.

@pol-todo

Yes please!


7 participants