-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OPTIMADE JSON Lines specification appendix #531
base: develop
Are you sure you want to change the base?
Conversation
Fix code block formatting Update optimade.rst Apply suggestions from code review Apply suggestions from code review Try to fix formatting again Fix formatting and add note about provider fields
6929f10
to
c5f1f53
Compare
96b0231
to
78e158c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ml-evs, and sorry for the delays. Now had some time to take a look. I like the proposal and it can be used for optimade-maker as well. But I wrote some comments for consideration.
Co-authored-by: Kristjan Eimre <eimrek@users.noreply.github.com>
Co-authored-by: Kristjan Eimre <eimrek@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good; and as we said during the web meet, very important extension.
However, how are relationships encoded? Maybe it would be nice if the example included a structure that has a relationship to a reference so one can see explicitly how that is encoded?
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
|
||
|
||
This JSONL format can also be used to share provider-specific properties. | ||
These should be consistent with any external definitions, and where appropriate, prefixes tied to the tools used to generate the file should be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not quite get the "prefixes tied to the tools used to generate the file should be used" part. Are there any examples for such cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, this has lost its meaning a bit. The intention was that a tool like aiida or ASE could support OPTIMADE JSONL as an export format, and any "tool-specific definitions" like _aiida_node_id
(or whatever) should be defined and use the tool prefix. I wouldn't be against removing this sentence as the format just follows the normal OPTIMADE rules about defining properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, thanks for the explanation. Since "MUST" is not used in this paragraph, I gather its main purpose is to recommend best practice. We may probably want that all the OPTIMADE JSONL files bear some sort of producer identification, but we probably do not want to standardize it at this point. But maybe we can reuse meta.implementation
dictionary from the main specification?
Maybe it is worth stressing that OPTIMADE JSONL files have the same requirements ("MUST/SHOULD/MAY") as OPTIMADE responses, in terms of custom properties having prefixes and metadata descriptions. But maybe it is enough just to say that?
Also, what about sharing provider-specific entry types? If we mention provider-specific properties, should be allow entry types as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding non-optimade-spec properties, doesn't the spec say they must have either
- the provider specific prefix; or
- a defininition-provider-specific prefix?
I think we can just point to section 3.5 here, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me either. The part on "prefixes tied to the tools used to generate the file" is unclear to me, otherwise happy to merge this.
Co-authored-by: Andrius Merkys <andrius.merkys@gmail.com>
An additional question about custom prefixes, that is (still) not clear to me, is what should happen when Should all or would |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, Matthew, for working on this. Left some comments.
|
||
- ``api_version``: The OPTIMADE API version used when generating the file, as described in the ``meta`` member in `JSON Response Schema: Common Fields`_. | ||
|
||
- The next line MAY contain a standard OPTIMADE ``meta`` object, following the same rules described in `JSON Response Schema: Common Fields`_, where every MUST and SHOULD rule can be reinterpreted as a MAY rule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we specify what meaning fields like meta.data_returned
take in this context?
- First the base info response MUST be provided, following the description at `Base Info Endpoint`_. | ||
- The next lines MUST contain the entry info endpoint responses for the all entry types present later in the file, as described in `Entry Listing Info Endpoints`_. | ||
- The remaining lines of the file contain data entries themselves, described in `Entry Listing JSON Response Schema`_. | ||
These entries can be provided in any order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of the data entries is not clear to me. Below, we say <entries block ordered by entry type>
, while here they can be in any order. I don't have strong feelings about this, but it should be clear. If we want to mandate block-order (e.g. all structures first, then all references, etc), i guess we should say something about both, 1) the order of the blocks (should they match the info endpoint order?); 2) the order of the entries within the blocks.
My own gut feeling is that perhaps the order of blocks is not important, but maybe it's good to have one type of entries all together in a block.
These should be consistent with any external definitions, and where appropriate, prefixes tied to the tools used to generate the file should be used. | ||
It is RECOMMENDED that custom properties are defined in full within the JSONL file, or pointed to a specific versioned property definition. | ||
|
||
Example OPTIMADE JSON Lines File |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example OPTIMADE JSON Lines File | |
Example file |
Considering there is now two OPTIMADE JSON Lines File
standards (other one being 9.4), maybe better to be more vague here (e.g. i already mixed these up when looking through the table of contents).
|
||
- The next line MAY contain a standard OPTIMADE ``meta`` object, following the same rules described in `JSON Response Schema: Common Fields`_, where every MUST and SHOULD rule can be reinterpreted as a MAY rule. | ||
- The next block of lines provides the ``info`` endpoint responses. | ||
- First the base info response MUST be provided, following the description at `Base Info Endpoint`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this sublist is not rendering correctly to me
This PR begins the work on #471, by defining the conventions on top of JSON Lines for deserializing an entire OPTIMADE API into a file.