-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ntuple] fix type name normalization #17637
base: master
Are you sure you want to change the base?
Conversation
- Integer template arguments are written in standard decimal notation; only if necessary (long integer types), | ||
a suffix `l` or `ul` is added according to the template parameter type. | ||
|
||
The type name of a field has typedefs and usings fully resolved (except for standard integer typedefs). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may belong sooner in the document as the above mentioned rules are applying to the name with fully resolved typedefs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm not sure I understand. The normalization rules, as far as I understand, should apply to both the original type name and the resolved type name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The replacement of the typedef is an integral type of the 'normalization' of the name. Without that part, the name are not unique and thus not normalized. I.e. if you have both const vector<vector<Int_t> >
and vector<vector<int32_t, typedef_to_regular_allocator>>
"only" pretty printing the name does not help much. Similar if the typedef are not replaced, neither would/should the Integer template arguments be calculated/compacted. In the second example, without resolving the typedef we can not even remove the default template argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user passes us vector<vector<int32_t, typedef_to_regular_allocator>>
, its "pretty printed" version will be registered as the field's alias type, i.e. the "original spelling" from the user. Otherwise (vector<vector<int>>
), the alias type is empty.
In any case, the type that is relevant is the resolved type name. The alias one is just for information, except for Double32_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the original spelling, I guess we ought to decide to either really put it as-is or we need to spell out, separately from the normalization, what applies there.
For example, it is clear that the 'whitespace' rule would apply, maybe the const
rule and what else? (Not the typedef
for sure, what about the std::[u]int(8|16|32|64)_t
or even the fully qualified)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposal is to apply all the normalization rules to the original spelling except typedef resolution.
d260d3e
to
3560223
Compare
3560223
to
5e645de
Compare
the corresponding (at the time of writing) `std::[u]int(8|16|32|64)_t` standard integer typedef. | ||
- Type names used as template arguments are themselves normalized. | ||
- For user-defined types, default-initialized template arguments are explicitly spelled out; | ||
optional template arguments of stdlib types such as allocators are omitted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that for the I/O normalization we have elected to keep the stdlib optional argument whose value is not the default value. ie.:
TClass::GetClass("vector<int, other_alloc>")->GetName()
(const char *) "vector<int,other_alloc>"
I do not recall the detailed argumentation, but at the very least they 'may' be semantically relevant to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think that should be looked / discussed in #16754.
Test Results 18 files 18 suites 4d 7h 14m 40s ⏱️ For more details on these failures, see this check. Results for commit 3187473. ♻️ This comment has been updated with latest results. |
e4045e6
to
2697372
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments inline
2697372
to
51e2e81
Compare
51e2e81
to
962b8eb
Compare
I'm still trying to wrap my head around the type alias commit. I think we should test it more with class templates, like diff --git a/tree/ntuple/v7/test/rfield_class.cxx b/tree/ntuple/v7/test/rfield_class.cxx
index 02cbc93ed6..52e1bc3ba4 100644
--- a/tree/ntuple/v7/test/rfield_class.cxx
+++ b/tree/ntuple/v7/test/rfield_class.cxx
@@ -278,6 +278,8 @@ TEST(RNTuple, TClassDefaultTemplateParameter)
model->MakeField<DataVector<int, float>>("f2");
model->AddField(RFieldBase::Create("f3", "DataVector<int>").Unwrap());
model->AddField(RFieldBase::Create("f4", "struct DataVector<bool,vector<unsigned>>").Unwrap());
+ model->AddField(RFieldBase::Create("f5", "DataVector<Double32_t>").Unwrap());
+ model->AddField(RFieldBase::Create("f6", "DataVector<int, double>").Unwrap());
auto writer = RNTupleWriter::Recreate(std::move(model), "ntpl", fileGuard.GetPath());
}
@@ -286,10 +288,18 @@ TEST(RNTuple, TClassDefaultTemplateParameter)
const auto &desc = reader->GetDescriptor();
EXPECT_EQ("DataVector<std::int32_t,double>", desc.GetFieldDescriptor(desc.FindFieldId("f1")).GetTypeName());
+ EXPECT_EQ("", desc.GetFieldDescriptor(desc.FindFieldId("f1")).GetTypeAlias());
EXPECT_EQ("DataVector<std::int32_t,float>", desc.GetFieldDescriptor(desc.FindFieldId("f2")).GetTypeName());
+ EXPECT_EQ("", desc.GetFieldDescriptor(desc.FindFieldId("f2")).GetTypeAlias());
EXPECT_EQ("DataVector<std::int32_t,double>", desc.GetFieldDescriptor(desc.FindFieldId("f3")).GetTypeName());
+ EXPECT_EQ("", desc.GetFieldDescriptor(desc.FindFieldId("f3")).GetTypeAlias());
EXPECT_EQ("DataVector<bool,std::vector<std::uint32_t>>",
desc.GetFieldDescriptor(desc.FindFieldId("f4")).GetTypeName());
+ EXPECT_EQ("", desc.GetFieldDescriptor(desc.FindFieldId("f4")).GetTypeAlias());
+ EXPECT_EQ("DataVector<double,double>", desc.GetFieldDescriptor(desc.FindFieldId("f5")).GetTypeName());
+ EXPECT_EQ("DataVector<Double32_t,double>", desc.GetFieldDescriptor(desc.FindFieldId("f5")).GetTypeAlias());
+ EXPECT_EQ("DataVector<std::int32_t,double>", desc.GetFieldDescriptor(desc.FindFieldId("f6")).GetTypeName());
+ EXPECT_EQ("", desc.GetFieldDescriptor(desc.FindFieldId("f6")).GetTypeAlias());
auto v1 = reader->GetView<DataVector<int>>("f1");
auto v3 = reader->GetView<DataVector<int>>("f3"); However, that doesn't seem to work right now:
I believe |
962b8eb
to
24c9896
Compare
e37177c
to
e54ceac
Compare
Following clarifications and fixes: - Strings in the meta-data are UTF-8 encoded (instead of ASCII), matching the section "Naming specification" - Extend restrictions on user-defined classes regarding bit-fields and template arguments - The type name normalization is specified in a dedicated section
Normalize and resolve type name only in the innermost overload.
For types that require meta (TClass, TEnum, etc.), apply the following normalization steps: - Use TClass::GetName() to get the type name normalization from meta - Renormalize that name according to the RNTuple specs
Compares a normalized original type name with the normalized resolved type name to determine if a field type alias (i.e. the use of a typedef) should be set. Previously, the logic was buggy comparing only partially normalized types.
e54ceac
to
3187473
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some last minor comments from my side
tree/ntuple/v7/test/ntuple_types.cxx
Outdated
@@ -37,7 +37,10 @@ TEST(RNTuple, TypeNameNormalization) | |||
EXPECT_EQ("std::int32_t", RFieldBase::Create("f", "signed").Unwrap()->GetTypeName()); | |||
EXPECT_EQ("std::map<std::int32_t,std::int32_t>", RFieldBase::Create("f", "map<int, int>").Unwrap()->GetTypeName()); | |||
|
|||
EXPECT_TRUE(RFieldBase::Create("f", "std::int32_t").Unwrap()->GetTypeAlias().empty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to keep this newly added test that std::int32_t
has an empty type alias
auto arrayLength = std::stoi(arrayDef[1]); | ||
auto arrayLength = std::stoull(arrayDef[1]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want more error checking here, for example std::stoull("123junk") = 123
without error reporting. std::stoull
takes a second pos
argument that stores the number of processed characters, or we can use the underlying std::strtoull
where str_end
should point to NULL
after successful conversion.
return GetNormalizedInteger(std::stoll(intTemplateArg)); | ||
return GetNormalizedInteger(std::stoull(intTemplateArg)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
Fixes type name and type alias normalization, especially for types that involve meta (TClass etc.). Introduces patch version 1.0.0.1 of the specification.
Fixes #17570