Implement tree explain for DataSourceExec
#15029
DisplayFormatType::Default | DisplayFormatType::Verbose => {
    write!(f, ", has_header={}", self.has_header)
}
DisplayFormatType::TreeRender => Ok(()),
Per the description of TreeRender:
datafusion/datafusion/physical-plan/src/display.rs
Lines 48 to 74 in 3dc212c
/// TreeRender, displayed in the `tree` explain type.
///
/// This format is inspired by DuckDB's explain plans. The information
/// presented should be "user friendly", and contain only the most relevant
/// information for understanding a plan. It should NOT contain the same level
/// of detail information as the [`Self::Default`] format.
///
/// In this mode, each line contains a key=value pair.
/// Everything before the first `=` is treated as the key, and everything after the
/// first `=` is treated as the value.
///
/// For example, if the output of `TreeRender` is this:
/// ```text
/// partition_sizes=[1]
/// partitions=1
/// ```
///
/// It is rendered in the center of a box in the following way:
///
/// ```text
/// ┌───────────────────────────┐
/// │       DataSourceExec      │
/// │    --------------------   │
/// │    partition_sizes: [1]   │
/// │       partitions: 1       │
/// └───────────────────────────┘
/// ```
TreeRender mode should contain only the most relevant details for understanding the high-level plan.
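To make the key=value convention from the doc comment concrete, here is a minimal sketch of how such lines could be split and centered in a box. It is not DataFusion's actual renderer: `split_key_value`, `render_box`, and the `inner_width` parameter are hypothetical names invented for illustration, and the exact padding may differ from the real output.

```rust
/// Splits one line of `TreeRender` output at the first `=`.
/// Everything before it is the key, everything after it is the value.
/// Returns `None` when the line contains no `=`.
fn split_key_value(line: &str) -> Option<(&str, &str)> {
    line.split_once('=')
}

/// Renders a node name and its `key=value` lines as a box with centered text.
/// `inner_width` is the number of columns between the two vertical borders
/// (a hypothetical parameter; the real renderer chooses its own width).
fn render_box(name: &str, details: &str, inner_width: usize) -> String {
    let mut out = String::new();
    out.push_str(&format!("┌{}┐\n", "─".repeat(inner_width)));
    out.push_str(&format!("│{:^w$}│\n", name, w = inner_width));
    out.push_str(&format!("│{:^w$}│\n", "-".repeat(20), w = inner_width));
    for line in details.lines() {
        // Lines without `=` are skipped in this sketch.
        if let Some((key, value)) = split_key_value(line) {
            let text = format!("{key}: {value}");
            out.push_str(&format!("│{:^w$}│\n", text, w = inner_width));
        }
    }
    out.push_str(&format!("└{}┘\n", "─".repeat(inner_width)));
    out
}
```

Calling `render_box("DataSourceExec", "partition_sizes=[1]\npartitions=1", 27)` yields a six-line box similar to the one in the doc comment. Splitting only at the first `=` means a value may itself contain `=` without being truncated.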
}

write!(f, "partitions={}", partition_sizes.len())
let total_rows = self.partitions.iter().map(|b| b.len()).sum::<usize>();
Likewise, I think the previous version is too verbose.
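For readers following the hunk, a rough sketch of the kind of summary being computed, under the assumption that the partitions are stored as a nested `Vec` of batches. `Batch` is a hypothetical stand-in for an Arrow record batch, and all function names are invented for illustration. Note that calling `len()` on each partition counts its batches, whereas counting rows requires summing over the batches themselves.

```rust
/// Hypothetical stand-in for a record batch; only the row count matters here.
struct Batch {
    num_rows: usize,
}

/// Number of batches in each partition (what `Vec::len` gives per partition).
fn partition_sizes(partitions: &[Vec<Batch>]) -> Vec<usize> {
    partitions.iter().map(|p| p.len()).collect()
}

/// Total number of batches across all partitions.
fn total_batches(partitions: &[Vec<Batch>]) -> usize {
    partitions.iter().map(|p| p.len()).sum::<usize>()
}

/// Total number of rows across all batches in all partitions.
fn total_rows(partitions: &[Vec<Batch>]) -> usize {
    partitions.iter().flatten().map(|b| b.num_rows).sum::<usize>()
}

/// A terse, tree-style summary line with only the partition count.
fn tree_summary(partitions: &[Vec<Batch>]) -> String {
    format!("partitions={}", partitions.len())
}
```

Keeping the tree output to a single `partitions=N` line matches the goal of showing only high-level details; the per-partition sizes stay available for the more verbose formats.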
15)│ -------------------- │
16)│ files: 1 │
17)│ format: parquet │
18)│ │
I don't know why there is an extra newline here 🤔
DataSourceExec
lgtm thanks @alamb
Thanks @comphead
Btw, I like this simple approach to calculating batch sizes; perhaps we can also reuse it in #14510 to sum up all incoming or outgoing batches.
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Thanks again @alamb
Which issue does this PR close?
SQL EXPLAIN Tree Rendering #14914
Rationale for this change
We want to have nice explain plans for users. Let's add some detail to the DataSourceExec.
What changes are included in this PR?
DataSource and FileSource
Are these changes tested?
yes
Are there any user-facing changes?