-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move FileFormat
and related pieces to datafusion-datasource
#14873
Move FileFormat
and related pieces to datafusion-datasource
#14873
Conversation
FileFormat
and related pieces to datafusion-datasource
.FileFormat
and related pieces to datafusion-datasource
/// Coerces the file schema if the table schema uses a view type. | ||
#[cfg(not(target_arch = "wasm32"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions used to be public but are only used in parquet (especially transform_binary_to_string
), so I think it makes sense to put them here in the long-term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(so they will long term become part of the datasource/parquet crate?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's my plan. I expect most of this module (and json
, csv
and avro
) to move to the new format-specific crates, hopefully without too many leftovers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already ran into one small issue with csv/json when I tried to move the Decoder
trait, but I think that once I make that big split it'll be easier to find a home for all the shared things, hopefully without having to change them too much.
pub use datafusion_datasource::file_compression_type; | ||
use datafusion_datasource::file_scan_config::FileScanConfig; | ||
pub use datafusion_datasource::file_format::*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trying to keep existing use
statements working
/// relevant methods. [FileType] is only used in logical planning and only implements | ||
/// the subset of methods required during logical planning. | ||
#[derive(Debug)] | ||
pub struct DefaultFileType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is really nice to see this broken up into smaller modules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me -- thank you @AdamGS
/// Coerces the file schema if the table schema uses a view type. | ||
#[cfg(not(target_arch = "wasm32"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(so they will long term become part of the datasource/parquet crate?)
Thanks again @AdamGS |
Which issue does this PR close?
Part of #14444.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?