Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement tree explain for DataSourceExec #15029

Merged
merged 6 commits into from
Mar 6, 2025

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Mar 5, 2025

Which issue does this PR close?

Rationale for this change

We want to have nice explain plans for users. Let's add some detail to the DataSourceExec

What changes are included in this PR?

  • Implement tree explain for DataSourceExec
  • Implement for DataSource and FileSource
  • Tests

Are these changes tested?

yes

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) datasource Changes to the datasource crate labels Mar 5, 2025
DisplayFormatType::Default | DisplayFormatType::Verbose => {
write!(f, ", has_header={}", self.has_header)
}
DisplayFormatType::TreeRender => Ok(()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the description of TreeRender:

/// TreeRender, displayed in the `tree` explain type.
///
/// This format is inspired by DuckDB's explain plans. The information
/// presented should be "user friendly", and contain only the most relevant
/// information for understanding a plan. It should NOT contain the same level
/// of detail information as the [`Self::Default`] format.
///
/// In this mode, each line contains a key=value pair.
/// Everything before the first `=` is treated as the key, and everything after the
/// first `=` is treated as the value.
///
/// For example, if the output of `TreeRender` is this:
/// ```text
/// partition_sizes=[1]
/// partitions=1
/// ```
///
/// It is rendered in the center of a box in the following way:
///
/// ```text
/// ┌───────────────────────────┐
/// │ DataSourceExec │
/// │ -------------------- │
/// │ partition_sizes: [1] │
/// │ partitions: 1 │
/// └───────────────────────────┘
/// ```

TreeRender mode should have only the most relevant details for understanding the high level plan

}

write!(f, "partitions={}", partition_sizes.len())
let total_rows = self.partitions.iter().map(|b| b.len()).sum::<usize>();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, the previous version is too verbose I think

15)│ -------------------- │
16)│ files: 1 │
17)│ format: parquet │
18)│ │
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why there is an extra newline here 🤔

@alamb alamb changed the title Implement tree explain for DataSourceExec Implement tree explain for DataSourceExec Mar 5, 2025
@alamb alamb marked this pull request as ready for review March 5, 2025 15:03
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @alamb

Copy link
Contributor Author

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @comphead

@comphead
Copy link
Contributor

comphead commented Mar 6, 2025

Btw I like this simple approach to calc batch sizes, we can also reuse it in #14510 perhaps to sum up all incoming or outcoming batches.

@comphead comphead merged commit 986be19 into apache:main Mar 6, 2025
24 checks passed
@comphead
Copy link
Contributor

comphead commented Mar 6, 2025

Thanks again @alamb

@alamb alamb deleted the alamb/less_details branch March 6, 2025 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasource Changes to the datasource crate sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants