Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix equals implementation to respect that anyone is allowed to implement IonElement #88

Merged
merged 4 commits into from
Mar 21, 2024

Conversation

popematt
Copy link
Contributor

Issue #, if available:

None

Description of changes:

I discovered that all of the equals() implementations for IonElement implementations check that the class is the same. That would mean that if someone else implemented IonElement, two Ion-equivalent values would not be equal.

However, the README says:

As long as the behavior is implemented correctly (see the documentation of each interface), the implementations of IonElement provided by this library are fully interoperable with user supplied implementations.

I've updated all of the equals() implementations to allow other implementations of the public API be potentially equal.

In some cases, I did a little more than just change the type check because some of the equals() functions were relying on implementation details rather than fields in the public API.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link

codecov bot commented Mar 12, 2024

Codecov Report

Attention: Patch coverage is 86.45833% with 13 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (master@5b9ec91). Click here to learn what that means.

Files Patch % Lines
src/com/amazon/ionelement/api/Equivalence.kt 89.47% 2 Missing and 4 partials ⚠️
...rc/com/amazon/ionelement/impl/ByteArrayViewImpl.kt 33.33% 0 Missing and 2 partials ⚠️
...rc/com/amazon/ionelement/impl/StructElementImpl.kt 83.33% 0 Missing and 2 partials ⚠️
src/com/amazon/ionelement/impl/StructFieldImpl.kt 66.66% 0 Missing and 2 partials ⚠️
...com/amazon/ionelement/impl/BigIntIntElementImpl.kt 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master      #88   +/-   ##
=========================================
  Coverage          ?   89.40%           
  Complexity        ?      507           
=========================================
  Files             ?       32           
  Lines             ?     1038           
  Branches          ?      140           
=========================================
  Hits              ?      928           
  Misses            ?       66           
  Partials          ?       44           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

type != other.type -> false
!bytes.contentEquals(other.bytes) -> false
bytesValue != other.bytesValue -> false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This is now comparing the ByteArrayView rather than comparing the ByteArray directly (since we can no longer be sure that other is backed by a ByteArray).

other as LongIntElementImpl

if (longValue != other.longValue) return false
when (other.integerSize) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ If both IntElements are IntElementSize.LONG, then it's slightly more efficient to compare using longValue instead of bigIntegerValue.

@@ -121,43 +122,40 @@ internal class StructElementImpl(

override fun equals(other: Any?): Boolean {
if (this === other) return true
if (other !is StructElementImpl) return false
if (other !is StructElement) return false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This change had quite a large effect. The prior implementation use a private field, so I had to rewrite it to not use that field.

Copy link

@rmarrowstone rmarrowstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is anyone allowed to implement IonElement?

Why not prevent (or discourage) that?

The issue with this approach is that it is super easy for them to implement equals in a way that is not symmetric.

I suppose one could move the equals and hashcodes into the interfaces.

@popematt
Copy link
Contributor Author

Why is anyone allowed to implement IonElement?

Why not prevent (or discourage) that?

In no particular order:

  • This already part of the API contract, so taking it back is a breaking change.
  • We have all of the *Element interfaces in order hide the implementation details. Since we have the API decoupled from the implementation, why not let users implement it?
  • As we have learned with IonValue, people will create their own implementations even if we say not to. We should embrace what our users want rather than trying to fight them on it.

The issue with this approach is that it is super easy for them to implement equals in a way that is not symmetric.

I suppose one could move the equals and hashcodes into the interfaces.

It's also super easy for someone to create an implementation that is wrong in other ways. We can improve the documentation to make it clear. If it's really important, we could also export an ion-element-tests jar that provides some conformance tests that people can use (though I'd be inclined to let people copy/paste the existing tests for now).

We can't add default implementations of equals or hashcode to the interfaces because kotlinc complains that "An interface may not implement a method of 'Any'". However, we could provide functions that users could leverage to implement hashCode() and equals() (and toString()) in their own IonElement implementations.

@rmarrowstone
Copy link

We have all of the *Element interfaces in order hide the implementation details. Since we have the API decoupled from the implementation, why not let users implement it?

"Design and document for inheritance or else prohibit it" -- Joshua Bloch.

So when I see an example, such as this, where we didn't properly design for inheritance, it makes me wonder what else we might be missing.

As we have learned with IonValue, people will create their own implementations even if we say not to. We should embrace what our users want rather than trying to fight them on it.

It's fair to say that most users would rather prefer carefully constructed abstractions and guardrails to foot-guns. Understanding why someone would want to implement their own, would guide to better abstractions for all.

So I would be inclined to mark the interfaces sealed (though it looks like this is still using Kotlin 1.4). Or at least document, document, document the expectations for users.

@popematt
Copy link
Contributor Author

We have all of the *Element interfaces in order hide the implementation details. Since we have the API decoupled from the implementation, why not let users implement it?

"Design and document for inheritance or else prohibit it" -- Joshua Bloch.

I didn't mean to sound cavalier about this. We have an interface that was ostensibly designed for users to implement, and I believe it is reasonably well documented. I would also argue that this isn't really the sort of inheritance that Joshua Bloch was referring to. There is no behavior to inherit since only the interfaces are public. Interfaces are not the problem in Java, it's the fact that classes can extend other classes and override or modify all sorts of functionality of the parent class. IonElement does not have that problem. (It's classes are both final and internal.)

So when I see an example, such as this, where we didn't properly design for inheritance, it makes me wonder what else we might be missing.

What exactly was not properly designed for inheritance? This PR is for a bugfix where our implementation is actually not fulfilling the expectations of the clearly specified API. That doesn't mean that the API is wrong.

It's fair to say that most users would rather prefer carefully constructed abstractions and guardrails to foot-guns. Understanding why someone would want to implement their own, would guide to better abstractions for all.

My point (that I failed to state explicitly) is that we cannot predict everything that our users' will want. Yes, we can make better abstractions, and that is one of the reasons that IonElement exists, but I think it's naïve to say that we can design a perfect abstraction and that no users will need an escape hatch for some use case. We can plan for and design an escape hatch, or someone will eventually create their own escape hatch against our wishes, which (if we later try to close it) could then be forced upon us because of the institutional weight behind it.

So I would be inclined to mark the interfaces sealed (though it looks like this is still using Kotlin 1.4). Or at least document, document, document the expectations for users.

I think that ship has already sailed. We're at v1.2.0, and marking the interfaces as sealed (even if we were on a more recent Kotlin version) would be a backwards incompatible change.

Comment on lines 106 to 108
override fun equals(other: Any?): Boolean = delegate == other
override fun hashCode(): Int = delegate.hashCode()
override fun toString(): String = "AnonymousProxy($delegate)"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the override logic across each anonymous class for each IonElement types is consistent, should we consider abstract the common logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's particularly important here, but if you feel strongly about it, I'll do it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's particularly important here, but if you feel strongly about it, I'll do it.

Thanks for considering the feedback, I just see this could potentially enhance the maintainability. However, if there are other higher priorities at this time we can postpone this suggested change.

linlin-s
linlin-s previously approved these changes Mar 14, 2024
Copy link

@linlin-s linlin-s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I've learned a lot from the use of the proxy pattern in testing.

@rmarrowstone
Copy link

What exactly was not properly designed for inheritance? This PR is for a bugfix where our implementation is actually not fulfilling the expectations of the clearly specified API. That doesn't mean that the API is wrong.

It just means that we didn't fully think it and work it through.

I think that ship has already sailed. We're at v1.2.0, and marking the interfaces as sealed (even if we were on a more recent Kotlin version) would be a backwards incompatible change.

This repo has < 10 forks and stars. It strikes me as short-sighted to fear a major revision.

But you seem pretty dug in here, and we're not making progress, so OK.

rmarrowstone
rmarrowstone previously approved these changes Mar 14, 2024
Copy link

@rmarrowstone rmarrowstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my glance at IonElement.kt I did not find the docs to clearly state the expectation for implements WRT equals and hashCode. I still think remaining open to other implementations is more pain than gain, but please ensure the expectations are crystal clear.

@popematt popematt dismissed stale reviews from rmarrowstone and linlin-s via 4d78b4e March 18, 2024 21:28
@popematt
Copy link
Contributor Author

A user might want to create their own IonElement implementation in order to e.g. have a lazy implementation or to be able to create a view over data that is not Ion. If we close the IonElement interface, then for any of that to happen, it has to go through us (the Ion maintainers) in some form. That would make us a potential bottleneck for innovation, and it seems antithetical to the spirit of open source software.

Re. documenting clear expectations:

* All implementations of [IonElement] implement [Object.equals] and [Object.hashCode] according to the Ion
* specification.

I thought that "anyone can implement IonElement" and this were pretty clear, but you made me realize that not everyone interprets it the same way.

I've gone a made a big update w.r.t. to equals() and hashCode() so that it's more clear.

  • Add Equivalence.kt that defines the following public functions: areElementsEqual(), hashElement(), areFieldsEqual(), hashField(). The are*Equals() methods are normative—all implementations must be equivalent to these functions—and the hash*() methods are the only allowed functions for hashing any implementations of IonElement.
  • Add equals() and hashCode() to IonElement interface with extra doc comments.
  • Update ByteArrayView to be clear about its equality semantics. Since it's just a thin wrapper around ByteArray, I updated the documentation to explain how that relates to equals() and hashCode().
  • Update IntElementSize to make it clear that two equal Ion ints must have the same IntElementSize.
  • Update all *ElementImpl to use hashElement() and areElementsEqual() (except for StructElement, which has a specialized implementation with some specific optimizations)
  • Update the tests to check that an IonElement can be equal to the same Ion value from another implementation, ensure that the implementations match hashElement() and areElementsEqual(), and clean up some of the redundancy in EquivsTestCase.
  • Incidental changes include fixing some typos and adding some missing copyright headers.

rmarrowstone
rmarrowstone previously approved these changes Mar 19, 2024
Copy link

@rmarrowstone rmarrowstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only actual concern is the "deep" comparison of struct fields in Equivalence.kt. I may be missing something obvious there, if so please lmk.

* This function is normative for equality of [IonElement]s.
* All [IonElement] implementations must override [Any.equals] in a way that is equivalent to calling this function.
*/
public fun areElementsEqual(left: IonElement, right: IonElement): Boolean {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider "equivalent" vs "equal" (which is clearer but also more verbose/obtuse)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, but I thought that "equal" was a little more clear about the intention that this is the authoritative function that determines if two elements are equal.

false
} else
// Matching an enum rather than a type allows the Kotlin compiler
// to use a table switch instead of a chain of if/else comparisons.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting. nice comment.

thisFields === otherFields -> true
thisFields.size != otherFields.size -> false
// We've tried the inexpensive checks, now do a deep comparison
else -> thisFields.groupingBy { it }.eachCount() == otherFields.groupingBy { it }.eachCount()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comparing the occurrences of each field or am i missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It compares a Map<StructField, Int>. So for something like this Ion:

{ a: b, a: b, c: d }

It would create a map like this, which allows two structs to be compared even if their fields are in different orders, without having to impose a total ordering over all Ion values.

{
  a:b = 2, 
  c:d = 1,
}

It's not a one-way door. If it proves to be too costly of a comparison, we could (internally) create a total order over all IonElement so that we can compare two sorted lists of StructFields.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right... it is the key and the value. I read it as being just the key.

IntElementSize.LONG -> element.longValue.hashCode()
IntElementSize.BIG_INTEGER -> element.bigIntegerValue.hashCode()
}
// Adding compareTo(0.0) causes 0e0 to have a different hash code than -0e0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... which isn't incorrect though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right—it's not incorrect for them to have the same hashcode. This is just explaining why we are using compareTo. If this ends up being a problem, we can change it.

/**
* Calculates the hash code of an [IonElement].
*
* Implementations of [IonElement] MAY NOT calculate their own hash codes, but MAY cache the result of this function.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAY cache in-memory, but not persist. (that may be too prescriptive, just consider it).

/**
* The size of this [IntElement].
*
* This indicates the magnitude of the integer, but does not imply a particular in-memory representation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@popematt popematt merged commit 718acb2 into amazon-ion:master Mar 21, 2024
5 checks passed
@popematt popematt deleted the fix_equals branch March 21, 2024 20:27
@popematt popematt mentioned this pull request Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants