Fairmat 2024: several new base classes in NXsample and NXsample_component #1413

Open
wants to merge 37 commits into base: main
37 commits
052672c
Readds mpes relevant changes from 4b064d9
domna Aug 15, 2023
cdd4526
Initial draft for major refactoring of NXsample.
lukaspie Aug 23, 2023
37a5815
Align description of activities and process with Area A's base section
lukaspie Aug 29, 2023
75ed0d7
Update after TF meeting discussion
lukaspie Aug 29, 2023
043212e
Add limits to no. of component sets and substances
lukaspie Aug 29, 2023
756babd
Docstring fixes, remove unneeded NX_CHAR types
lukaspie Aug 30, 2023
dd334e8
Comment out exists fields in NXsample and NXsample_component
lukaspie Aug 30, 2023
0646f1c
Make NXDLs
lukaspie Aug 30, 2023
94cff3b
Fix indentation in NXDL files
lukaspie Aug 30, 2023
3e1659f
Remove unused NXcrystal_structure from NXsample
lukaspie Aug 30, 2023
6fedba0
Regeneration of the nexus file for fixing the changes coming from old…
RubelMozumder Aug 24, 2023
5ba496b
Recreate nxdl files
lukaspie Aug 31, 2023
d76192c
Revert changes to NXsample
lukaspie Aug 31, 2023
d92ffc7
Add general concepts first defined in NXmpes_xps sub app-def
lukaspie Sep 27, 2023
881b56f
rename sample_history in NXsample(_component)
lukaspie Feb 16, 2024
5f898ee
rename sample_history to history
lukaspie Feb 19, 2024
9d5117d
revert style changes to NXsample
lukaspie Apr 4, 2024
47cd822
revert NXsample comment changes that break CI
lukaspie Apr 4, 2024
191621c
use NXidentifier in NXsample and NXfabrication
lukaspie Sep 10, 2024
766f70b
pull out modifications for fairmat-2024-nxsample
lukaspie Sep 20, 2024
dd5c9e9
revert unintentional changes from cherry-pick
lukaspie Sep 23, 2024
85bdf7d
bring in cited base classes
lukaspie Sep 23, 2024
b5106c9
bring in NXrotation_set
lukaspie Sep 23, 2024
1cb49af
bring in cited base classes
lukaspie Sep 23, 2024
e624ea8
use NX_QUATERNION in NXrotation_set
lukaspie Oct 4, 2024
67452c4
remove unneeded new classes
lukaspie Oct 7, 2024
9741b8c
remove NXsample_component_set for now
lukaspie Oct 7, 2024
a74a268
remove sample_id
lukaspie Oct 7, 2024
efba790
remove NXidentifier from NXsample
lukaspie Oct 7, 2024
b364d33
remove reference to NXsample_component_set
lukaspie Oct 7, 2024
fbd687c
use identifierNAME in NXsubstance
lukaspie Jan 16, 2025
8092ae8
remove base classes that describe sample components (will be discusse…
lukaspie Jan 16, 2025
af3ff9f
xml tag fix in NXsubstance
lukaspie Jan 16, 2025
8f27d2e
remove NXfabrication from sample base classes
lukaspie Jan 27, 2025
db4b40e
consistent indentation
lukaspie Jan 29, 2025
910d0d4
add comment to revisit enum for physical_form
lukaspie Jan 31, 2025
6180f23
address comments on NXsubstance
lukaspie Feb 3, 2025
33 changes: 26 additions & 7 deletions base_classes/NXsample.nxdl.xml
100755 → 100644
@@ -379,13 +379,33 @@
</doc>
</field>
<group type="NXpositioner">
<doc>Any positioner (motor, PZT, ...) used to locate the sample</doc>
<doc>Any positioner (motor, PZT, ...) used to locate the sample</doc>
</group>
<group type="NXoff_geometry" minOccurs="0">
<doc>
This group describes the shape of the sample
</doc>
</group>
<field name="physical_form">
paulmillar marked this conversation as resolved.
<!-- REVISIT: should this be an open enumeration? -->
<doc>
Physical form of the sample material.
Examples include single crystal, foil, pellet, powder, thin film, disc, foam, gas, liquid, amorphous.
</doc>
</field>
<group type="NXenvironment">
<doc>
Any environmental or external stimuli/measurements.
These can include, among others:
applied pressure, surrounding gas phase and gas pressure,
external electric/magnetic/mechanical fields, temperature, ...
</doc>
</group>
<group name="history" type="NXhistory">
<doc>
A set of physical processes that occurred to the sample prior to or during the experiment.
</doc>
</group>
<group type="NXoff_geometry" minOccurs="0">
<doc>
This group describes the shape of the sample
</doc>
</group>
<attribute name="default">
<doc>
.. index:: plotting
@@ -418,4 +438,3 @@
</doc>
</group>
</definition>

10 changes: 8 additions & 2 deletions base_classes/NXsample_component.nxdl.xml
@@ -38,7 +38,7 @@
</symbols>

<doc>
One group like this per component can be recorded For a sample consisting of multiple components.
One group like this per component can be recorded for a sample consisting of multiple components.
</doc>
<field name="name">
<doc>Descriptive name of sample component</doc>
@@ -139,6 +139,12 @@
<group name="transmission" type="NXdata">
<doc>As a function of Wavelength</doc>
</group>
<group name="history" type="NXhistory">
<doc>
A set of physical processes that occurred to the sample component prior to or
during the experiment.
</doc>
</group>
<attribute name="default">
<doc>
.. index:: plotting
@@ -152,4 +158,4 @@
for a summary of the discussion.
</doc>
</attribute>
</definition>
</definition>
137 changes: 137 additions & 0 deletions contributed_definitions/NXsubstance.nxdl.xml
@@ -0,0 +1,137 @@
<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type="text/xsl" href="nxdlformat.xsl"?>
<!--
# NeXus - Neutron and X-ray Common Data Format
#
# Copyright (C) 2014-2024 NeXus International Advisory Committee (NIAC)
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 3 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
# For further information, see http://www.nexusformat.org
-->
<definition xmlns="http://definition.nexusformat.org/nxdl/3.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" category="base" type="group" name="NXsubstance" extends="NXobject" xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 ../nxdl.xsd">
<doc>
A form of matter with a constant, definite chemical composition.

Examples can be single chemical elements, chemical compounds, or alloys.
For further information, see https://en.wikipedia.org/wiki/Chemical_substance.
</doc>
<field name="name">
<doc>
User-defined chemical name of the substance
</doc>
</field>
<field name="molecular_mass" type="NX_FLOAT" units="NX_MOLECULAR_WEIGHT">
<doc>
Molecular mass of the substance
</doc>
</field>
<field name="molecular_formula_hill">
<doc>
The chemical formula specified using CIF conventions.
Abbreviated version of CIF standard:107
This is the *Hill* system used by Chemical Abstracts.

* Only recognized element symbols may be used.
* Each element symbol is followed by a 'count' number. A count of '1' may be omitted.
* A space or parenthesis must separate each cluster of (element symbol + count).
* Where a group of elements is enclosed in parentheses, the multiplier for the
group must follow the closing parentheses. That is, all element and group
multipliers are assumed to be printed as subscripted numbers.
* Unless the elements are ordered in a manner that corresponds to their chemical
structure, the order of the elements within any group or moiety depends on
whether or not carbon is present.
* If carbon is present, the order should be C, then H, then the other elements
in alphabetical order of their symbol.
* If carbon is not present, the elements are listed purely in alphabetical order of their symbol.
</doc>
</field>
<field name="identifier_cas">
<doc>
Unique CAS REGISTRY URI.
For further information, see https://www.cas.org/.
</doc>
<attribute name="type">
<enumeration>
<item value="URL"/>
</enumeration>
</attribute>
<attribute name="cas_number">
<doc>
Numeric CAS REGISTRY number associated with this identifier.
</doc>
</attribute>
<attribute name="cas_name">
<doc>
CAS REGISTRY name associated with this identifier.
</doc>
</attribute>
</field>
<group name="cas_image" type="NXnote">
<doc>
CAS REGISTRY image
</doc>
</group>
<field name="identifier_inchi_str">
<doc>
Standard string InChI identifier (as per v1.02).

The InChI identifier expresses chemical structures in terms of atomic connectivity,
tautomeric state, isotopes, stereochemistry and electronic charge in order to
produce a string of machine-readable characters unique to the respective molecule.
For further information, see https://iupac.org/who-we-are/divisions/division-details/inchi/.
</doc>
</field>
<field name="identifier_inchi_key">

Having both the identifier_inchi_str field and the identifier_inchi_key seems redundant. If you have one, you should be able to figure out the other. If so, do we really need to store (or support storing) both?

Contributor Author

The idea here is that you could enter either one of these concepts and then, if you are using a proper RDM system (like the one we are building with NOMAD), all the other concepts can be filled automatically. So this is really for the case where you may have the str or the key on hand and just want to fill out the rest from the database.

<doc>
Condensed, 27-character InChI key.
Hashed version of the full InChI (using the SHA-256 algorithm).
</doc>
</field>
<field name="identifier_iupac_name">
<doc>
Name according to the IUPAC system (standard).
For further information, see https://iupac.org/.
</doc>
</field>
<field name="identifier_smiles">
<doc>
Identifier in SMILES (Simplified Molecular Input Line Entry System) notation.

Why support storing non-canonical representations?

Contributor Author

I think people often work with just the SMILES system (not the canonical one) and there is an algorithm to create the canonical version based on that. So again, you could enter the smiles and the canonical version would get calculated automatically.

But I am certainly no expert on this, so would be open to changing this.

For further information, see https://www.daylight.com/smiles/.
</doc>
</field>
<field name="identifier_canonical_smiles">
<doc>
Canonical version of the SMILES identifier
</doc>
</field>
<field name="identifier_pub_chem">
<doc>
Standard PubChem identifier (CID).

The PubChem Compound Identifier (CID) is a unique numerical identifier assigned to
a compound in the PubChem database, which contains information on the biological activities
of small molecules. The CID allows users to access detailed data about compounds, including
their chemical structure, molecular formula, and biological properties.

For further information, see https://pubchem.ncbi.nlm.nih.gov/.
</doc>
<attribute name="pub_chem_link">
<doc>
Link to the PubChem entry associated with this identifier.
</doc>
</attribute>
</field>
</definition>
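
Two of the conventions referenced in the NXsubstance docstrings can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the PR: the function names `hill_formula` and `is_valid_cas_number` are hypothetical, the Hill helper takes a flat mapping of element symbol to count (no parenthesised groups or moieties), and the CAS helper only checks the well-known check-digit rule for a number written as `NNNNNNN-NN-N`; it does not confirm that the number exists in the CAS REGISTRY.

```python
import re


def hill_formula(counts: dict[str, int]) -> str:
    """Hypothetical helper: format element counts in Hill order, CIF style."""
    symbols = sorted(counts)  # purely alphabetical when no carbon is present
    if "C" in counts:
        # Carbon first, then hydrogen (if present), then the rest alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in counts else []) + rest
    # Clusters are separated by a space; a count of 1 is omitted.
    return " ".join(s if counts[s] == 1 else f"{s}{counts[s]}" for s in symbols)


# Two to seven digits, two digits, and one check digit, separated by hyphens.
CAS_PATTERN = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def is_valid_cas_number(cas: str) -> bool:
    """Hypothetical helper: verify the check digit of a CAS number string."""
    match = CAS_PATTERN.fullmatch(cas)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left; the sum modulo 10
    # must equal the check digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


print(hill_formula({"H": 6, "O": 1, "C": 2}))  # C2 H6 O
print(hill_formula({"H": 2, "S": 1, "O": 4}))  # H2 O4 S
print(is_valid_cas_number("7732-18-5"))        # True (water)
```

A check like this could run before a value is written to `molecular_formula_hill` or to the `cas_number` attribute, so that malformed entries are caught early rather than during a later database lookup.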