Statement by reference #35

letmaik · 2022-11-14T18:03:10Z

At IETF 115 there were discussions on having some kind of standard way to deal with statements by reference. Here are also the two relevant slides from https://datatracker.ietf.org/doc/slides-115-scitt-combined-scitt-presentations/:

Simply using COSE detached payloads as defined in the RFC would not be sufficient as the payload would still be required during signature validation when registering the signed statement.

Instead, having a specific content type for referencing external statements may be useful. Note that this format by itself would be a statement.

RFC 9054 gives two examples for such hash structures:

COSE_Hash_V = (
    1 : int / tstr, ; Algorithm identifier
    2 : bstr, ; Hash value
    ? 3 : tstr, ; Location of object that was hashed
    ? 4 : any   ; Object containing other details and things
    )

and

COSE_Hash_Find = [
    hashAlg : int / tstr,
    hashValue : bstr
]

SUIT's digest container defines this as:

SUIT_Digest = [
  suit-digest-algorithm-id : suit-cose-hash-algs,
  suit-digest-bytes : bstr,
  * $$SUIT_Digest-extensions   ; described as optional extra values required by a hash alg (?)
]
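To make the shape and size of such a structure concrete, here is a minimal Python sketch that hand-encodes a COSE_Hash_Find-style array as CBOR. It is an illustration only: it assumes SHA-256 (COSE algorithm identifier -16), and the function name `encode_hash_find` is hypothetical. A real implementation would use a CBOR library instead of writing header bytes by hand.

```python
import hashlib

# COSE algorithm identifier for SHA-256, as registered by RFC 9054.
COSE_ALG_SHA_256 = -16


def encode_hash_find(payload: bytes) -> bytes:
    """Hand-encode [hashAlg, hashValue] as CBOR (SHA-256 only, hypothetical helper)."""
    digest = hashlib.sha256(payload).digest()
    array_header = bytes([0x82])  # CBOR array with 2 items
    # CBOR negative integer n is stored as (-1 - n) in major type 1: -16 -> 0x2f
    alg = bytes([0x20 | (-1 - COSE_ALG_SHA_256)])
    bstr_header = bytes([0x58, len(digest)])  # byte string with 1-byte length prefix
    return array_header + alg + bstr_header + digest


encoded = encode_hash_find(b"example statement")
print(encoded.hex())
print(len(encoded))  # 4 header bytes + 32 digest bytes
```

The whole reference is 36 bytes here, which is the baseline any text-based alternative would be compared against.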

Would having a variant of one of the above as a CBOR content type address this issue?
Should the location of the referenced content be included? If so, how? Should location hints be globally unique? Resolvable?
Should a SCITT transparency service know about this content type and at least validate its CDDL schema?


letmaik · 2022-11-16T08:29:47Z

Another thing to include in the reference should be the content type.

OR13 · 2022-11-17T16:49:48Z

The COSE representations are killing me... I find them incredibly hard to process.

Here is an example I am familiar with:

https://blog.cloudflare.com/cloudflare-distributed-web-resolver/
https://docs.ipfs.tech/concepts/content-addressing/

https://ipfs.io/ipfs/QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR
https://ipfs.io/ipfs/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi

https://docs.ipfs.tech/how-to/best-practices-for-nft-data/#types-of-ipfs-links-and-when-to-use-them

// ipfs.add resolves to a result object; the CID is its `cid` property
const { cid } = await ipfs.add({ content }, {
  cidVersion: 1,
  hashAlg: 'sha2-256'
})

ipfs://bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi

If you don't want to use IPFS, but you want something similar, you can do what docker has been doing:

sha256:fc92eec5cac70b0c324cec2933cd7db1c0eae7c9e2649e42d02e77eb6da0d15f

^ this won't help you resolve or dereference, but it will help you identify.
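As a rough sketch of how such an identifier can be produced and checked, assuming only the `sha256:<hex>` form shown above (the helper names are hypothetical and not taken from any Docker tooling):

```python
import hashlib


def make_digest_id(content: bytes) -> str:
    """Build a docker-style identifier of the form sha256:<lowercase hex>."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def matches_digest_id(content: bytes, digest_id: str) -> bool:
    """Check content against an identifier; only sha256 is handled in this sketch."""
    alg, sep, hex_value = digest_id.partition(":")
    if sep != ":" or alg != "sha256":
        raise ValueError("unsupported digest identifier: " + digest_id)
    return hashlib.sha256(content).hexdigest() == hex_value.lower()


statement = b"example statement"
digest_id = make_digest_id(statement)
print(digest_id)
print(matches_digest_id(statement, digest_id))
print(matches_digest_id(b"tampered statement", digest_id))
```

The identifier says nothing about where the bytes live, which is exactly the identify-but-not-resolve property described above.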

rjb4standards · 2022-11-17T16:55:50Z

FYI: SHA-256 IDs have been working well for over a year for our own registry, SAG-CTR. We have not seen any collisions yet.

OR13 · 2022-11-17T16:56:45Z

If you want to build a custom identifier scheme for statements you should consider the precedent of Data URIs:

https://github.com/transmute-industries/did-method-meliorism/blob/f2a7d8673a7b49a6fae84c4348614109ff35409b/src/cli.js#L153

https://en.wikipedia.org/wiki/Data_URI_scheme

data:text/vnd-example+xyz;foo=bar;base64,R0lGODdh

data:text/vnd-scitt+claim;hash=sha256;content-type=application/vnd-cid+ipld;base64,R0lGODdh
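For illustration, a minimal Python sketch of building and parsing URIs of this shape. It assumes the base64 form only and simple `key=value` parameters without quoting; the function names are hypothetical, and the parser is not a full implementation of the Data URI syntax.

```python
import base64
from typing import Dict, Tuple


def build_data_uri(media_type: str, params: Dict[str, str], content: bytes) -> str:
    """Assemble data:<media type>;<k=v>...;base64,<payload>."""
    param_text = "".join(";{}={}".format(k, v) for k, v in params.items())
    payload = base64.b64encode(content).decode("ascii")
    return "data:{}{};base64,{}".format(media_type, param_text, payload)


def parse_data_uri(uri: str) -> Tuple[str, Dict[str, str], bytes]:
    """Split a base64 Data URI into media type, parameters, and decoded content."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, data = uri[len("data:"):].partition(",")
    parts = header.split(";")
    if not sep or parts[-1] != "base64":
        raise ValueError("this sketch only handles base64 data URIs")
    # Everything between the media type and the trailing "base64" marker is a parameter.
    params = dict(p.split("=", 1) for p in parts[1:-1])
    return parts[0], params, base64.b64decode(data)


media_type, params, content = parse_data_uri(
    "data:text/vnd-example+xyz;foo=bar;base64,R0lGODdh"
)
print(media_type, params, content)
```

Parsing the second example the same way yields `hash` and `content-type` as parameters, so the media type parameters end up carrying the reference metadata.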

See also Tag 42: https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml

letmaik · 2022-11-18T08:58:41Z

From back to front:

> See also Tag 42: https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml

CIDs use multicodec for identifying content types, and those have to be registered here: https://github.com/multiformats/multicodec/blob/master/table.csv. I think SCITT should allow CIDs in some way, but it doesn't look like a general enough mechanism.

> If you want to build a custom identifier scheme for statements you should consider the precedent of Data URIs:

I think what you're suggesting is creating a new media type where the hash of the referenced data is the content and everything else is put into media type parameters, for example:

Media type: application/scitt-statement-by-reference;hash=sha256;content-type=application/spdx
Content: binary sha256 hash of statement

You could put all that into a Data URI by base64 encoding the content (hash), but I don't see where this would go in the COSE envelope and how it interacts with the cty parameter. Base64 encoding the hash also seems a bit wasteful. To be honest, I don't see how decoding such a Data URI is easier than decoding a CBOR structure.
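To put a number on the overhead, here is a small Python comparison of the raw digest against its base64 and hex text forms. The input bytes are arbitrary; only the lengths matter.

```python
import base64
import hashlib

digest = hashlib.sha256(b"example statement").digest()

raw_len = len(digest)                    # bytes in the binary hash
b64_len = len(base64.b64encode(digest))  # characters after base64 encoding
hex_len = len(digest.hex())              # characters after hex encoding

print(raw_len, b64_len, hex_len)
```

A SHA-256 digest grows from 32 bytes to 44 base64 characters (about 38% larger), before counting the `data:` prefix and the media type parameters.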

> The COSE representations are killing me... I find them incredibly hard to process.

Where exactly do you see problems in processing the CBOR representations? Would the same be true for an equivalent JSON representation?

My general feeling is that the detached use case may become the thing that's used exclusively in some settings, and so if we define a standard mechanism for that I think it should be as efficient as possible and not necessarily rely on text representations. In that sense, CBOR CIDs (as mentioned above) go in the right direction but are quite hard to decode (and introduce yet another format next to CBOR) and too limited I think (see above).

SteveLasker · 2022-11-21T16:06:31Z

Have you considered PURLs?
Here's one we did a while back, specifically for this purpose:
https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#oci

We also spent time discussing separating identity from location:

Separating Identity From Location
and the purl discussion: Decoupling Location from Identity - Is this in the scope of purl?

OR13 · 2022-11-21T16:56:12Z

> Where exactly do you see problems in processing the CBOR representations?

Readability and types... getting in the way of "representative examples".

I prefer to argue over a representative example, and then map from it to existing building blocks, not the other way round.

letmaik · 2022-11-21T17:03:30Z

Slides:
Discussion - Statement by reference.pdf

letmaik · 2022-11-22T10:57:46Z

> > Where exactly do you see problems in processing the CBOR representations?
>
> Readability and types... getting in the way of "representative examples".
>
> I prefer to argue over a representative example, and then map from it to existing building blocks, not the other way round.

Alright, makes sense. So your concern is not about implementations but about facilitating discussion. I guess the two are sometimes intertwined, but let's try anyway.

letmaik · 2022-11-23T11:25:01Z

@OR13 This is my attempt at representative examples. Is this what you had in mind? Any others you can think of?

Statement stored in undeclared location

hash alg: sha-256
hash: abc
content type: application/foo

Statement stored on web server

hash alg: sha-256
hash: abc
content type: application/foo
location: https://example.com/statements/abc.json

Statement stored on IPFS

hash alg: sha-256
hash: abc
content type: application/foo
location: ipfs://QmPK1s3pNYLi9ERiq3BDxKa4XosgWwFRQUydHUtz4YgpqB

Note: The hash embedded within the CID is not the hash of the raw content!
See https://docs.ipfs.tech/concepts/hashing/#content-identifiers-are-not-file-hashes.

Note2: The content type embedded within the CID cannot be arbitrary.
See https://github.com/multiformats/multicodec/blob/master/table.csv and search for "ipld".
The raw type may be a reasonable fall-back and a specific content type may be stored outside
of the CID.
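Both notes can be checked by unpacking a CID. The following Python sketch decodes the CIDv1 quoted earlier in this thread (`bafybei...`). It assumes a base32 multibase prefix `b` and that version, codec, hash function, and digest length each fit in a single varint byte, which holds for this CID but not in general. The function name is hypothetical.

```python
import base64


def decode_cidv1_base32(cid: str) -> dict:
    """Unpack a base32 CIDv1 into its fields (single-byte varints assumed)."""
    if not cid.startswith("b"):
        raise ValueError("expected multibase prefix 'b' (lowercase base32)")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    raw = base64.b32decode(body)
    return {
        "version": raw[0],    # CID version
        "codec": raw[1],      # multicodec content type, 0x70 is dag-pb
        "hash_fn": raw[2],    # multihash function code, 0x12 is sha2-256
        "hash_len": raw[3],   # digest length in bytes
        "digest": raw[4:],
    }


fields = decode_cidv1_base32(
    "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"
)
print(fields["version"], hex(fields["codec"]), hex(fields["hash_fn"]), fields["hash_len"])
```

The codec byte is `dag-pb`, so the embedded digest covers an IPFS node structure rather than the raw statement bytes, and the content type slot is limited to registered multicodec values.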

Statement stored in OCI registry

hash alg: sha-256
hash: abc
content type: application/foo
location: docker.io/library/example@sha256:def

Note: The hash in the location is not the hash of the raw content, but rather of a manifest.
There are a few indirections that make it a bit hard to understand.
See the in-development ORAS artifacts spec at https://github.com/oras-project/artifacts-spec.
Does referencing the location with hash add any benefit? Would a flexible tag be enough?

Note2: The Notary project also defines signing over OCI artifacts and may be in conflict.
See https://github.com/notaryproject/notaryproject.

Statement stored in DID service endpoint

hash alg: sha-256
hash: abc
content type: application/foo
location: did:example:123?service=files&relativeRef=/statement.json

Note: DID dereferencing would be used to retrieve the statement from the given location.

Note2: The DID in the location may be distinct from the issuer of the signed statement.
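The common shape across these examples could be sketched as a small record type. This is only an illustration in Python, not a proposed wire format: the class and field names are hypothetical, only sha-256 is handled, and fetching from the location is left out entirely.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatementReference:
    """Hypothetical in-memory form of the fields used in the examples above."""
    hash_alg: str
    hash_value: bytes
    content_type: str
    location: Optional[str] = None  # absent for the undeclared-location case

    def matches(self, content: bytes) -> bool:
        """Check already-fetched bytes against the recorded hash."""
        if self.hash_alg != "sha-256":
            raise ValueError("unsupported hash algorithm: " + self.hash_alg)
        return hashlib.sha256(content).digest() == self.hash_value


def reference_for(content: bytes, content_type: str,
                  location: Optional[str] = None) -> StatementReference:
    """Build a reference for a statement whose bytes are at hand."""
    return StatementReference("sha-256", hashlib.sha256(content).digest(),
                              content_type, location)


statement = b'{"example": true}'
ref = reference_for(statement, "application/foo",
                    "https://example.com/statements/abc.json")
print(ref.matches(statement))
print(ref.matches(b"something else"))
```

In every variant the hash is the hash of the raw statement bytes, so the check is the same regardless of which kind of location is given or whether one is given at all.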

OR13 · 2022-11-30T22:01:43Z

@letmaik these are excellent examples.

Statement stored on IPFS

The 2 notes are interesting.

Perhaps it's not for this repo, but I would love to generate some "real examples" from some "safe / fake data" so we can see the actual proposed data structures.

If there is a repo where I can do that work, I'm happy to tackle the IPFS examples. I did something similar recently for this:

https://github.com/transmute-industries/ns.transmute.org

OR13 · 2022-12-02T19:23:14Z

We have a generic use case for "signed statement" by reference, which I would love to explore as well: my statement refers to several other "signed statements" or "transparent statements" by reference.

letmaik · 2023-02-08T17:09:46Z

@OR13 Let's experiment here: https://github.com/ietf-scitt/statements-by-reference

OR13 · 2023-02-08T23:44:46Z

I filed: ietf-scitt/statements-by-reference#1

yogeshbdeshpande · 2023-02-21T15:23:46Z

@letmaik to move this issue to the new repository, retaining all of the conversation history tracked here.

yogeshbdeshpande · 2023-02-21T15:31:23Z

@fournet & @letmaik to work on a PR for Architecture

SteveLasker · 2023-02-21T16:00:52Z

Closing as this should be covered by: ietf-wg-scitt/draft-ietf-scitt-architecture#8

SteveLasker mentioned this issue Nov 16, 2022

Converge Claim and Statement #34

Closed

SteveLasker closed this as completed Feb 21, 2023

SteveLasker mentioned this issue Feb 27, 2023

Statement by reference ietf-wg-scitt/draft-ietf-scitt-architecture#18

Closed
