A key goal of Python is to avoid having to state the same thing more than once.
The original ("classic") `camdkit` parameters, prior to the addition of tracking info, were with one exception POD ("Plain Old Data") types. There was a paradigm of re-using POD descriptors across the parameters that shared them: for example, there could be a descriptor representing UUIDs, with the `validate()`, `to_json()`, `from_json()` and `make_json_schema()` methods on that descriptor, and every parameter whose value was a UUID could be based on it. Loosely speaking, all of the descriptor logic was in `framework.py`, all of the parameter logic was in `model.py`, and it was very readable.
When the tracking info was introduced, and the parameters were very far from being specific instances of POD types, two things happened: first, the 'backing' descriptor in `framework.py` became so specific to its use for a particular parameter that there could be no re-use, and second, in many cases the logic implementing the parameter semantics was showing up in `model.py` (i.e. the aforementioned `validate()`, `to_json()`, `from_json()` and `make_json_schema()` methods).
A goal of the Pydantic re-hosting of `camdkit` was to reduce the number of places where one was "saying the same thing". As the Pydantic re-hosting PR will show, representing our information as a set of nested models allows one to put all the details of a metadatum's representation, range restrictions, etc. in a single place, without repeating oneself.
Pydantic takes care of validation (see "Cannot construct invalid objects..." below), serialization and deserialization, and schema generation itself. Thus many small errors (e.g. use of `minLength` for an array instead of `minItems`) are prevented because hand-coding is eliminated.
This is what most of you think of as our schema: something that expresses the set of valid metadata that can be passed "through the wire".
One thing not expressed in the hand-generated JSON schema in classic `camdkit` is the way in which `None` can be a valid value of a `Parameter`, and the many places where `None` is the default. One might create a `Clip` and find that the initial value of (say) `lens_serial_number` is `None`, set it to `"foo"`, read it back to verify that the value is indeed now `"foo"`, then decide to clear it by explicitly assigning `None` to the parameter, and finally read it back to be sure that it is in fact `None` once more.
None of this would be evident from looking at the published schema generated by classic `camdkit`, because what we publish is a serialization schema.
The Pydantic re-hosting of `camdkit` produces a serialization schema when `Clip.make_json_schema()` is called, but could easily produce a validation schema if that were desirable.
Code such as

    import unittest

    from camdkit.framework import Timestamp
    from camdkit.model import TimingTimestamp


    class TempConstructionCases(unittest.TestCase):

        def test_timestamp_is_invalid(self):
            self.assertFalse(TimingTimestamp.validate(Timestamp(-1, 2)))

will fail, because in order to run `validate()` against something, that something needs to first be constructed, and Pydantic won't let you get even that far: it will raise a `ValidationError` when the attempt to construct a temporary object `Timestamp(-1, 2)` fails.
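To make the "cannot construct" behaviour concrete without depending on `camdkit` or Pydantic, here is a minimal standard-library sketch. The `Timestamp` class below is a hypothetical stand-in, its field names and ranges are assumptions, and it raises `ValueError` where Pydantic would raise its own `ValidationError`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timestamp:
    """Hypothetical stand-in for a validated model; not camdkit's class."""
    seconds: int
    nanoseconds: int

    def __post_init__(self):
        # Reject bad values at construction time, so no invalid instance
        # ever exists for a later validate() call to inspect.
        if self.seconds < 0:
            raise ValueError("seconds must be non-negative")
        if not 0 <= self.nanoseconds < 1_000_000_000:
            raise ValueError("nanoseconds out of range")


def try_construct(seconds, nanoseconds):
    """Return the new Timestamp, or None if construction was refused."""
    try:
        return Timestamp(seconds, nanoseconds)
    except ValueError:
        return None
```

With this pattern a unit test asserts that construction raises, rather than asserting that `validate()` returns `False` on an already-built object.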
Examples:
- `TimecodeFormat` objects contain a frame rate as well as the usual HH:MM:SS:FF, and this frame rate includes a `sub_frame` component, an integer representing a 0-based index into component pieces of the frame. Metadata parameters associated with the first field of an interlaced frame would have a `sub_frame` component of 0; the second field, 1. (n.b. the serialized canonical form of `sub_frame` is `sub_frame`).
  - In classic `camdkit`, the `__init__()` method of `TimecodeFormat` defined in `model.py` defaulted `sub_frame` to 0, and the `to_json()` method of `TimingTimecode` in `model.py` always wrote it out.
  - In modified classic `camdkit`, if it turns out the value of `sub_frame` is the default, it is not serialized; it is assumed that the deserialization at the other end will reconstitute it from the default.
  - In Pydantic `camdkit` it is not serialized (because all serialization takes place in `CompatibleBaseModel.to_json()`, and that method invokes Pydantic's `model_dump` with `exclude_defaults=True`).
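The "omit defaults when serializing, reconstitute them when deserializing" round trip can be sketched with the standard library alone. This is only an illustration of the idea: the dataclass, the integer `frame_rate` (a simplification) and the two helper functions are hypothetical and are not the code in `camdkit`.

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class TimecodeFormat:
    """Hypothetical stand-in; field names and types are illustrative only."""
    frame_rate: int      # simplified; no default, so always serialized
    sub_frame: int = 0   # default value, omitted from output when unchanged


def to_json(obj):
    """Serialize, omitting any field whose value equals its default."""
    out = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if f.default is not MISSING and value == f.default:
            continue
        out[f.name] = value
    return out


def from_json(data):
    """Deserialize; missing keys fall back to the dataclass defaults."""
    return TimecodeFormat(**data)
```

Here `to_json(TimecodeFormat(24))` yields `{"frame_rate": 24}`, and `from_json` of that dictionary gives back an object whose `sub_frame` is 0 again.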
Examples:
- The 0-9 range of protocol version number components is now indicated with `minimum` and `maximum`, as the prior `minValue` and `maxValue` were not valid for integers.
- The minimum of one element in arrays such as those used to carry distortion coefficients is now indicated with `minItems`, as the classic `camdkit` use of `minLength` and `maxLength` was inappropriate (`minLength` and `maxLength` are only valid for strings).
Examples:
`StrictlyPositiveRealParameter` was added to support nominal focal length and focus distance, two parameters for which zero is disallowed as well as negative values. These two parameters are now based on `StrictlyPositiveRealParameter`.
In classic `camdkit` a nominal focal length of 13 mm would be serialized as `13`. In Pydantic, the model field for a nominal focal length serializes as `13.0`. In modified classic `camdkit` the value is cast to a float before being serialized, thus producing a Pydantic-compatible `13.0`.
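The difference is visible with nothing more than the standard `json` module; the `focal_length` key below is just an illustrative name, not the serialized name used by `camdkit`.

```python
import json

value = 13  # an integer, as a caller might naturally supply it

# Serialized as given, an int has no fractional part in the output.
as_int = json.dumps({"focal_length": value})

# Casting to float first yields the same text a float-typed field produces.
as_float = json.dumps({"focal_length": float(value)})

print(as_int)    # {"focal_length": 13}
print(as_float)  # {"focal_length": 13.0}
```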
In classic `camdkit` this requirement is not expressed by the `make_json_schema()` method of `Protocol`; in the modified classic `camdkit`, this requirement is made explicit (and matches what Pydantic would produce for the corresponding `BaseModel`-based object).
This conforms to what we require of other string parameters as well.
I am having trouble finding my references here, but I believe I've seen normative use of both upper-case and lower-case hex letters (i.e. both A-F and a-f) and both colon and hyphen separators.
Example: the numerator for the frame rate of timecode was `UINT_MAX`, but should have been `INT_MAX`.
Example: lens raw encoder values previously specified a minimum of 0 and no maximum; now they specify a minimum of 0 and a maximum of UINT_MAX (i.e. the largest unsigned 32-bit integer).
Example: `"maximum": 2147483647` becomes `"maximum": INT_MAX`
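For reference, the two symbolic limits correspond to the values below, assuming 32-bit widths as stated above for `UINT_MAX`. The helper and the `encoder_schema` name are hypothetical, shown only to illustrate a schema fragment built from the named constants.

```python
INT_MAX = 2**31 - 1    # 2147483647, largest signed 32-bit integer
UINT_MAX = 2**32 - 1   # 4294967295, largest unsigned 32-bit integer


def bounded_integer_schema(minimum, maximum):
    """Build a JSON Schema fragment for an integer with inclusive bounds."""
    return {"type": "integer", "minimum": minimum, "maximum": maximum}


# e.g. a raw encoder value with a minimum of 0 and a maximum of UINT_MAX
encoder_schema = bounded_integer_schema(0, UINT_MAX)
```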
Example: `...string betwee 0 and 1023 codepoints.` becomes `...string between 0 and 1023 codepoints`
PEP 257 (referenced by the all-powerful PEP 8) allows for single-line docstrings; and PEP 8 says that lines can be up to 79 characters. This increases the readability of the code.