-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add tree
/ pretty explain mode
#14677
Conversation
😍 I have been dreaming about this @irenjj -- can't wait to check it out |
ad83ec7
to
d143783
Compare
PTAL @alamb @xudong963 👀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to see this happen.
Maybe it's better to open a tracking issue to record next todos, such as rich other physical operator's information to display. Then we can make our new explain perfect in parallel after the PR is merged.
I will do it later today.❤️ |
Looks awesome -- I have put this on my review queue. So exciting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very cool 💯
I suggest to add more tests to make sure the tree rendering is correct for complex plans:
- Very long metric entry like
partition_sizes: [20, 20, 20, 30, 20, 20, 20...]
- Complex query: perhaps pick some from TPCH and make sure the tree structure is rendered correct.
Then we can write instructions and open tickets to support pretty explain for other executor nodes. I think now the default explain info is very detailed debug-style messages, perhaps we can make this pretty explain more concise (by looking at what metrics DuckDB includes for a certain plan node 😄 )
datafusion/datasource/src/source.rs
Outdated
@@ -43,6 +44,7 @@ pub trait DataSource: Send + Sync { | |||
) -> datafusion_common::Result<SendableRecordBatchStream>; | |||
fn as_any(&self) -> &dyn Any; | |||
fn fmt_as(&self, t: DisplayFormatType, f: &mut Formatter) -> fmt::Result; | |||
fn collect_info(&self) -> HashMap<String, String>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a new interface to collect info for pretty explain? Looks like it's possible to implement inside fmt_as()
.
DuckDB explains also have many non key-value patterns like the following example, we will want the formater to be more flexible.
┌─────────────┴─────────────┐
│ READ_PARQUET │
│ ──────────────────── │
│ Function: │
│ READ_PARQUET │
│ │
│ Projections: │
│ l_orderkey │
│ l_partkey │
│ l_suppkey │
│ l_linenumber │
│ l_quantity │
│ l_extendedprice │
│ │
│ ~6046496 Rows │
└───────────────────────────┘
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this -- it would be much better to avoid a new interface and instead update the existing fmt_as
API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @irenjj
I took the liberty of merging up from main to resolve a conflict and fixing the CI failure
I am still going through this PR but so far I think it looks great
datafusion/common/src/config.rs
Outdated
@@ -704,6 +704,10 @@ config_namespace! { | |||
|
|||
/// When set to true, the explain statement will print schema information | |||
pub show_schema: bool, default = false | |||
|
|||
/// Display format of explain. Default is "indent". | |||
/// When set to "pretty", it will print the plan in a tree-rendered format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than "pretty"
that prints out a tree I suggest we change the option name to "tree"
|
||
use crate::ExecutionPlan; | ||
|
||
#[allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it still deadcode? it doesn't look like it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will warn while cargo build:
warning: fields `x` and `y` are never read
--> datafusion/physical-plan/src/render_tree.rs:29:9
|
28 | pub struct Coordinate {
| ---------- fields in this struct
29 | pub x: usize,
| ^
30 | pub y: usize,
| ^
|
= note: `#[warn(dead_code)]` on by default
the Coordinate
is inspired from duckdb's implementation, I'll figure out if it's necessary to introduce such a structure.
I hope to look at this later today Looks like the follow on ticket is filed. |
tree
/ pretty explaintree
/ pretty explain mode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great @irenjj
Thanks so much 🙏
I merged up from main and will touch up the ticket you filed
This is next on my list of things to merge What i would like ot do is merge this PR in and then file follow on subtasks in I am sorry I have been slammed trying to get the DataFusion 46 release out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I merged up from main to resolve some conflicts in this PR
Which issue does this PR close?
SQL EXPLAIN
Tree Rendering #14914Rationale for this change
What changes are included in this PR?
Prettify explain display format like DuckDB.
In this pr, implemented TreeRender structure, and DataSourceExec info:
Are these changes tested?
Are there any user-facing changes?