Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: unencoded subpath cannot contain . or .. #368

Merged
merged 4 commits into from
Mar 4, 2025

Conversation

jeremylong
Copy link
Contributor

Add test cases to check for invalid . and .. segments in the subpath.

supercedes package-url#52

Signed-off-by: Jeremy Long <jeremy.long@gmail.com>
jkowalleck
jkowalleck previously approved these changes Feb 13, 2025
jkowalleck
jkowalleck previously approved these changes Feb 13, 2025
@jkowalleck jkowalleck mentioned this pull request Feb 13, 2025
3 tasks
@jkowalleck
Copy link
Member

@matt-phylum
Copy link
Contributor

I'm wondering if this is actually a good idea. I can agree that the subpath should not contain . or .. segments, and I would prefer if implementations didn't need path normalization logic, but I don't think parsers and formatters should validate that the subpath does not contain . or .. segments. The problem is that if you observe a PURL pkg:golang/google.golang.org/genproto@abcdedf#/googleapis/../api/annotations/ it cannot be understood at all if the parser rejects the .., and it cannot be canonicalized or otherwise parsed and then written back out if the formatter rejects the ... It's syntactically valid and at least the type name version have an clear, unambiguous meaning, but because of the .. everything is thrown away.

@jkowalleck
Copy link
Member

jkowalleck commented Feb 13, 2025

[...] but I don't think parsers and formatters should validate that the subpath does not contain . or .. segments.

why? i mean, this is already part of the spec:

The problem is that if you observe a PURL pkg:golang/google.golang.org/genproto@abcdedf#/googleapis/../api/annotations/ it cannot be understood at all if the parser rejects the .., and it cannot be canonicalized or otherwise parsed and then written back out if the formatter rejects the ... It's syntactically valid and at least the type name version have an clear, unambiguous meaning, but because of the .. everything is thrown away.

per spec, the .. and . are to be discarded, rest is to be kept untouched. to attempt of path-resolution or anything. foo/../bar becomes foo/bar. (not bar)

@matt-phylum
Copy link
Contributor

matt-phylum commented Feb 13, 2025

Then these tests are wrong, right? They have "is_invalid": true, and they include the . and .. in the parsed subpaths.

@jkowalleck

This comment was marked as outdated.

@jkowalleck jkowalleck self-requested a review February 13, 2025 15:10
@dwalluck
Copy link

dwalluck commented Mar 3, 2025

pkg:golang/google.golang.org/genproto@abcdedf#/googleapis/../api/annotations/ it cannot be understood at all if the parser rejects the .., and it cannot be canonicalized or otherwise parsed and then written back out if the formatter rejects the ... It's syntactically valid and at least the type name version have an clear, unambiguous meaning, but because of the .. everything is thrown away.

My problem is that a human reading /googleapis/../api/annotations/ would assume that it means /api/annotations/ when in fact it means /googleapis/api/annotations/. So, I'd prefer an error instead of silently dropping.

The same could be said for qualifiers keys that are not lowercase or are empty, too, in that what the parser does, and what a human might assume that they mean are two different things.

@jkowalleck
Copy link
Member

pkg:golang/google.golang.org/genproto@abcdedf#/googleapis/../api/annotations/ it cannot be understood at all if the parser rejects the .., and it cannot be canonicalized or otherwise parsed and then written back out if the formatter rejects the ... It's syntactically valid and at least the type name version have an clear, unambiguous meaning, but because of the .. everything is thrown away.

My problem is that a human reading /googleapis/../api/annotations/ would assume that it means /api/annotations/ when in fact it means /googleapis/api/annotations/. So, I'd prefer an error instead of silently dropping.

The same could be said for qualifiers keys that are not lowercase or are empty, too, in that what the parser does, and what a human might assume that they mean are two different things.

this would be a change of the current spec.
please open an extra ticket for that.

Copy link
Member

@johnmhoran johnmhoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dwalluck
Copy link

dwalluck commented Mar 4, 2025

Sorry to comment on a closed pull request, but the subpath in the test refers to the non-normalized subpath. This doesn't seem right---aren't the components supposed to refer to the components in the canonical purl? For example, like type in this same example refers to "golang" and not "GOLANG".

After the segments are dropped, the resulting subpath component will not contain the dots but the test has them.

@jkowalleck
Copy link
Member

Sorry to comment on a closed pull request, but the subpath in the test refers to the non-normalized subpath. This doesn't seem right---aren't the components supposed to refer to the components in the canonical purl? For example, like type in this same example refers to "golang" and not "GOLANG".

After the segments are dropped, the resulting subpath component will not contain the dots but the test has them.

please open a pull request or annotate the code you are referring to.

please be aware of the purpose of the test components:

  • To test ``purl`` parsing and building, a tool can use this test suite and for
    every listed test object, run these tests:
    - parsing the test canonical ``purl`` then re-building a ``purl`` from these parsed
    components should return the test canonical ``purl``
    - parsing the test ``purl`` should return the components parsed from the test
    canonical ``purl``
    - parsing the test ``purl`` then re-building a ``purl`` from these parsed components
    should return the test canonical ``purl``
    - building a ``purl`` from the test components should return the test canonical ``purl``

@dwalluck
Copy link

dwalluck commented Mar 4, 2025

Opened #404 with an explanation.

dwalluck added a commit to dwalluck/packageurl-java that referenced this pull request Mar 4, 2025
This change makes the two test pass that were introduced in
package-url/purl-spec#368.

See also package-url/purl-spec#404.
dwalluck added a commit to dwalluck/packageurl-java that referenced this pull request Mar 11, 2025
This change makes the two test pass that were introduced in
package-url/purl-spec#368.

See also package-url/purl-spec#404.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants