As of polars 1.8.x (but not 1.7.x), partition_by with as_dict=True strips leading zeros from strings on the partition column #18895

Closed
2 tasks done
ptomecek opened this issue Sep 24, 2024 · 4 comments
Labels
bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), python (Related to Python Polars)

Comments

@ptomecek

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
pl.DataFrame({"s": ["01", "0010", "10", "foo"]}).partition_by("s", as_dict=True).keys()

Log output

No response

Issue description

Leading zeros have been stripped from the keys, so "01" comes back as '1' and "0010" collapses into the same key as "10":
dict_keys([('1',), ('10',), ('foo',)])

Expected behavior

I would have expected every key to remain unchanged, with its leading zeros preserved:
dict_keys([('01',), ('0010',), ('10',), ('foo',)])
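For anyone who wants to check their own install, here is a rough sketch of a regression check built on the example above. The helper names `missing_keys` and `check_partition_keys` are made up for illustration and are not part of Polars; the only Polars calls used are `pl.DataFrame` and `partition_by(..., as_dict=True)` from the report.

```python
def missing_keys(values, keys):
    """Return the input values that have no exact 1-tuple match among the keys.

    Hypothetical helper: `keys` is expected to look like the dict keys that
    partition_by(..., as_dict=True) returns, i.e. tuples such as ('01',).
    """
    key_set = set(keys)
    # dict.fromkeys de-duplicates the values while keeping first-seen order
    return [v for v in dict.fromkeys(values) if (v,) not in key_set]


def check_partition_keys(values):
    """Partition a one-column frame on "s" and report values whose key changed."""
    # Imported lazily so missing_keys stays usable without polars installed
    import polars as pl

    df = pl.DataFrame({"s": values})
    return missing_keys(values, df.partition_by("s", as_dict=True).keys())
```

Feeding `missing_keys` the keys reported above for 1.8.1 flags "01" and "0010", while the expected keys give an empty list. Calling `check_partition_keys(["01", "0010", "10", "foo"])` runs the same comparison against whatever Polars version is installed.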

Installed versions

Broken on 1.8.1. This used to work on 1.7.1.

--------Version info---------
Polars:              1.8.1
Index type:          UInt32
Platform:            Linux-4.18.0-372.9.1.el8.x86_64-x86_64-with-glibc2.28
Python:              3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          2.2.1
connectorx           0.3.3
deltalake            0.18.2
fastexcel            <not installed>
fsspec               2024.6.1
gevent               <not installed>
great_tables         0.10.0
matplotlib           3.9.1
nest_asyncio         1.6.0
numpy                1.24.4
openpyxl             3.1.5
pandas               2.1.4
pyarrow              16.1.0
pydantic             2.8.2
pyiceberg            <not installed>
sqlalchemy           2.0.32
torch                2.3.1.post100
xlsx2csv             <not installed>
xlsxwriter           <not installed>
ptomecek added the bug, needs triage, and python labels on Sep 24, 2024
@coastalwhite
Collaborator

I can reproduce on 1.8.1 but not on main anymore. I think this is already fixed as part of #18888.

@ritchie46
Member

Fixed by #18893

@ritchie46
Member

Will patch tonight.

@ptomecek
Author

Thanks everyone, appreciate the speedy response!
