If there is any way to estimate parquet file size in `df.write_iceberg()`, it would be really nice to try to get parquet files close in size to the Iceberg table property `write.target-file-size-bytes` (default is 512 MiB). Having parquet files close to that size makes Iceberg reads more efficient, and there is less table maintenance (compaction) to perform.
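For reference, a minimal sketch of reading that property with pyiceberg; the catalog name `default` and table identifier `db.events` are placeholders, not anything from this thread:

```python
from pyiceberg.catalog import load_catalog

# Placeholder catalog and table names.
table = load_catalog("default").load_table("db.events")

# Iceberg table properties are stored as strings; fall back to the
# spec default of 512 MiB when the property is unset.
target_bytes = int(
    table.properties.get("write.target-file-size-bytes", 512 * 1024 * 1024)
)
print(f"target file size: {target_bytes / 1024**2:.0f} MiB")
```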
As an example, where I know the total data is small, I'm doing `df.into_partitions(1)` right before `df.write_iceberg()` to get a single file per write (sketched below). Thanks in advance for taking a look and for making Daft awesome!
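A minimal, self-contained sketch of that workaround, reusing the placeholder table from above:

```python
import daft
from pyiceberg.catalog import load_catalog

# Placeholder catalog and table names, as above.
table = load_catalog("default").load_table("db.events")

df = daft.from_pydict({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Collapse to a single partition so the write emits one parquet file;
# this only makes sense when the total data is well below the
# write.target-file-size-bytes target.
df = df.into_partitions(1)
df.write_iceberg(table)
```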
Replies: 1 comment

Hey @gero90, thanks for bringing this up! Created an issue to track this feature request's progress: #3823