Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add release notes for 0.216 #12194

Merged
merged 1 commit into from
Jan 24, 2019
Merged

Conversation

mbasmanova
Copy link
Contributor

@mbasmanova mbasmanova commented Jan 8, 2019

* Fix rare (but practical) incorrect query result for queries involving timestamp,
when the time zone database embedded in the version of Joda-Time that Presto depends on
does not match the time zone database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when optimize_mixed_distinct_aggregation is enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what used to happen and what happens now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sopel39 Karol, could you help answer Rebecca's question?

* Improve performance of table scan on structured types by removing unnecessary memory accounting.
* Add :func:`reduce_agg` aggregate function.
* Add :func:`milliseconds` function (issue #9524).
* Add support for predicate pushdown for partition symbols for window operators.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's super clear what "predicate pushdown" means. I think it would be clearer to say Push filters on window function partition symbols below window operators. (I assume that's what this is about)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't mention "partition symbols" at all since that is an internal detail of the planner. This should be explained in end-user terms.

* Add :func:`reduce_agg` aggregate function.
* Add :func:`milliseconds` function (issue #9524).
* Add support for predicate pushdown for partition symbols for window operators.
* Remove SHOW STATS ON syntax
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add that SHOW STATS FOR should be used instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, it looks like we put SQL syntax in code backticks. so that should be done here too (or removed at show create table. whichever is our standard practice)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please put code, SQL stuff in double backticks. There are other places that need to be updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need a . at the end.

* Add :func:`milliseconds` function (issue #9524).
* Add support for predicate pushdown for partition symbols for window operators.
* Remove SHOW STATS ON syntax
* Restrict WHERE clause in SHOW STATS to filters that can be pushed down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarify that it's pushed down to the connector. We sometimes use "pushed down" to mean pushed down within the plan and sometimes pushed down to the connector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a . at the end.


* Queries reading Parquet files will now fail by default if statistics
in those Parquet files are corrupt (e.g., min > max).
To disable this behavior, set configuration property ``hive.parquet.fail-on-corrupted-statistics``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set the configuration property (add "the")

Verifier
---------------

* Add config run-teardown-on-result-mismatch. When set to false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surround "run-teardown-on-result-mismatch" with double backticks like the other configuration properties

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add ``run-teardown-on-result-mismatch`` config property. ...

---------------

* Add config run-teardown-on-result-mismatch. When set to false,
temporary table will not be dropped after checksum failure to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temporary tables (plural)

---------------

* Fix accounting of time spent reading Parquet data.
* Fix rare (but practical) incorrect query result for queries involving timestamp,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"rare (but practical)" doesn't mean a whole lot to me. Get rid of those words.

@mbasmanova mbasmanova force-pushed the release-notes-0.216 branch from a9f3c5c to 2762328 Compare January 8, 2019 20:16
General Changes
---------------

* Fix accounting of time spent reading Parquet data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should put parquet related docs under Hive Changes.

* Fix rare (but practical) incorrect query result for queries involving timestamp,
when the time zone database embedded in the version of Joda-Time that Presto depends on
does not match the time zone database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when optimize_mixed_distinct_aggregation is enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put optimize_mixed_distinct_aggregation in double backticks.

* Add :func:`reduce_agg` aggregate function.
* Add :func:`milliseconds` function (issue #9524).
* Add support for predicate pushdown for partition symbols for window operators.
* Remove SHOW STATS ON syntax
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please put code, SQL stuff in double backticks. There are other places that need to be updated.

* Add :func:`milliseconds` function (issue #9524).
* Add support for predicate pushdown for partition symbols for window operators.
* Remove SHOW STATS ON syntax
* Restrict WHERE clause in SHOW STATS to filters that can be pushed down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a . at the end.

* Add :func:`reduce_agg` aggregate function.
* Add :func:`milliseconds` function (issue #9524).
* Add support for predicate pushdown for partition symbols for window operators.
* Remove SHOW STATS ON syntax
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need a . at the end.

MySQL, PostgreSQL, Redshift, and SQL Server Changes
---------------------------------------------------

* Add support for predicate pushdown for ``char(x)`` columns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add support for predicate pushdown for columns with ``char(x)`` type.

Verifier
---------------

* Add config run-teardown-on-result-mismatch. When set to false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add ``run-teardown-on-result-mismatch`` config property. ...

does not match the time zone database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when optimize_mixed_distinct_aggregation is enabled.
* Improve performance of table scan on structured types by removing unnecessary memory accounting.
* Add `reduce_agg` aggregate function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a doc that we can link to here for reduce_agg?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nezihyigitbasi @wenleix Currently, there is not. Wenlei will be adding one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add the documentation before releasing. The code should not have been merged without documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to get in #12195.

Web UI
---------------

* Fix failure when rendering live plan view for queries involving index joins (issue #12054).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the rendering of live plan view for ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nezihyigitbasi This way is it not clear what was wrong? Whether the UI didn't look nice or it failed to render at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ... in the end means the rest is the same, sorry for the confusion.

Fix the rendering of live plan view for queries involving index joins (:issue:`12054 `)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we still want to go one level deeper in terms of detail, we can also add what exactly is broken with rendering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nezihyigitbasi Updated

* Remove SHOW STATS ON syntax
* Restrict WHERE clause in SHOW STATS to filters that can be pushed down
* Remove ``node_id`` column from ``system.runtime.queries`` table.
* Immediately return final results to client for a failed query.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return final results to clients immediately for failed queries.

@mbasmanova mbasmanova force-pushed the release-notes-0.216 branch from 2762328 to 844c200 Compare January 8, 2019 21:39
@mbasmanova
Copy link
Contributor Author

@nezihyigitbasi @rschlussel Nezih, Rebecca, thank you for reviews. I made the changes you suggested. Would you take another look?

@mbasmanova mbasmanova force-pushed the release-notes-0.216 branch from 844c200 to 96e2f50 Compare January 8, 2019 22:22
Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple additional comments.

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some more comments.

* Fix incorrect query result for queries involving timestamp, when the time zone database
embedded in the version of Joda-Time that Presto depends on does not match the time zone
database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when ``optimize_mixed_distinct_aggregation`` is enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need double backticks around count(*) as * will make the rest of the sentence italic.

``count(*)``

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "fix" mean? Did it fail during planning? Return an incorrect result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sopel39 Karol, could you answer David's question?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

Do not return ``NULL`` as ``count(*)`` aggregation result when ``optimize_mixed_distinct_aggregation`` is enabled.

?

* Improve performance of table scan on structured types by removing unnecessary memory accounting.
* Add `reduce_agg` aggregate function.
* Add :func:`millisecond` function (:issue:`#9524`).
* Add an optimizer rule to push filters on window function partition symbols under the window
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This release note item is talking in terms of the plan tree. Can we make it more user friendly?
We should say something like Add an optimizer rule to filter the window partitions before executing the window operators. Please check with the author of this change if this statement is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sopel39 Karol, could you review the proposed wording?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better, thanks!

Copy link
Contributor

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to comments, this needs the additional release notes that were recently added to the release notes issue

* Fix incorrect query result for queries involving timestamp, when the time zone database
embedded in the version of Joda-Time that Presto depends on does not match the time zone
database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when ``optimize_mixed_distinct_aggregation`` is enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "fix" mean? Did it fail during planning? Return an incorrect result?

does not match the time zone database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when optimize_mixed_distinct_aggregation is enabled.
* Improve performance of table scan on structured types by removing unnecessary memory accounting.
* Add `reduce_agg` aggregate function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add the documentation before releasing. The code should not have been merged without documentation.

Kudu Changes
------------

* Update client to 1.8.0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an implementation detail and should not be a release note item. If there is a user visible change, such as allowing connecting to Kudu servers newer than version X, then mention that directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kokosing Grzegorz, is there any user-visible change?

* Add ``run-teardown-on-result-mismatch`` configuration property. When set to false,
temporary tables will not be dropped after checksum failure to facilitate debugging.

SPI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPI Changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@electrum David, would you update https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines with the new guidelines?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@mbasmanova mbasmanova force-pushed the release-notes-0.216 branch 2 times, most recently from 16c2284 to 1a3c056 Compare January 18, 2019 20:16
@mbasmanova
Copy link
Contributor Author

@rschlussel @nezihyigitbasi @electrum Comments addressed. Please take another look.

does not match the time zone database used by the JVM that Presto runs on.
* Fix count(*) aggregation on empty relation when optimize_mixed_distinct_aggregation is enabled.
* Improve performance of table scan on structured types by removing unnecessary memory accounting.
* Add `reduce_agg` aggregate function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to get in #12195.

---------------

* Fix a corner case where ORC writer fails with "integer overflow" when writing
highly compressible data using dictionary encoding (#11930).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova please fix the issue link as commented above.

Kudu Connector Changes
----------------------

* Add table property ``number_of_replicas`` to ``SHOW CREATE TABLE`` output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add ``number_of_replicas`` table property to ...

* Restrict ``WHERE`` clause in ``SHOW STATS`` to filters that can be pushed down to the connectors.
* Remove ``node_id`` column from ``system.runtime.queries`` table.
* Return final results to clients immediately for failed queries.
* Performance improvement on ARRAY_INTERSECT.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as another item above.

* Improve performance for :func:`array_intersect`.

corrupt (e.g., min > max). To disable this behavior, set the configuration
property ``hive.parquet.fail-on-corrupted-statistics``
or session property ``parquet_fail_with_corrupted_statistics`` to false.
* Add support for :ref:`s3selectpushdown`, which enables pushing down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I build the release notes this item gets rendered like below, which is not what we want.

Add support for Amazon S3 Configuration, which enables pushing down projections and predicates into S3 for text files.

To fix this we can provide a custom text for the link:

:ref:`S3 select pushdown <s3selectpushdown>`

General Changes
---------------

* Fix correctness issue for operations on timestamps in certain time zones (:issue:`11891`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be removed since we are reverting the change

* Fix correctness issue for :func:`array_intersect` and :func:`array_distinct` when input contains
both zeros and nulls.
* Fix ``count(*)`` aggregation on empty relation when ``optimize_mixed_distinct_aggregation`` is enabled.
* Improve the table scan performance for structured types.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove "the" and change to "structured" to "structural"

Is this an engine change that applies to all connectors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC: @yingsu00 Ying, "Is this an engine change that applies to all connectors?"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova These are array function changes in the engine, and it should apply to all connectors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an engine change that applies to all connectors?

I think this could be mine. Originally it was: Improve performance of table scan on structured types by removing unnecessary memory accounting.

Yes. this applies to all connectors

Web UI
------

* Fix the rendering of live plan view for queries involving index joins.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "the"

* Restrict ``WHERE`` clause in ``SHOW STATS`` to filters that can be pushed down to the connectors.
* Remove ``node_id`` column from ``system.runtime.queries`` table.
* Return final results to clients immediately for failed queries.
* Performance improvement on ARRAY_INTERSECT.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to the function and move above to be with the other optimizations

* Add :func:`millisecond` function.
* Add an optimizer rule to filter the window partitions before executing the window operators.
* Remove ``ON`` keyword for :doc:`/sql/show-stats`.
* Restrict ``WHERE`` clause in ``SHOW STATS`` to filters that can be pushed down to the connectors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change SHOW STATS here to be a link so that it is consistent with the one above. Or change the one above to be text.

SPI Changes
-----------

* All connectors that use Joda-Time are now required to declare a ``runtime`` dependency on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be removed as we are reverting this change

@mbasmanova
Copy link
Contributor Author

@nezihyigitbasi @electrum Nezih, David, thank you for reviewing. Comments addressed.

Copy link
Contributor

@nezihyigitbasi nezihyigitbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with two changes:

@mbasmanova
Copy link
Contributor Author

@nezihyigitbasi Thank you, Nezih. Addressed all comments.

@mbasmanova mbasmanova merged commit 868c9f3 into prestodb:master Jan 24, 2019
@mbasmanova mbasmanova deleted the release-notes-0.216 branch February 15, 2019 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants