Add a skeleton of the trace data model #2012

jmacd · 2021-10-12T18:02:53Z

Part of #1929.

Changes

Adds text specifying how to interpret Span data in the OpenTelemetry trace data model.

Related PR #1899

Prior to incorporating the changes of OTEP 168, we need a foundation.

tigrannajaryan · 2021-10-13T18:47:43Z

specification/trace/datamodel.md

+Span context is the portion of the OpenTelemetry Context that makes up
+the tracing data model.  This is specified by reference to the [W3C


that makes up the tracing data model.

This is not quite clear. Can you expand/reword?

tigrannajaryan · 2021-10-13T18:49:23Z

specification/trace/datamodel.md

+
+## Span fields
+
+### TraceID


Can we use the same TraceId capitalization as the log data model?

tigrannajaryan · 2021-10-13T18:50:25Z

specification/trace/datamodel.md

@@ -0,0 +1,169 @@
+# Trace Data Model
+
+**Status**: [Mixed](../document-status.md)


Why is this Mixed? Do we have anything unstable in the trace data model?

yurishkuro · 2021-10-14T01:09:47Z

specification/trace/datamodel.md

+The OpenTelemetry data model for tracing consists of a protocol
+specification for encoding spans, which represent an individual unit


protocol specification for encoding spans

This combination sounds very confusing to me. I think you're defining a logical data model that includes entities, their attributes, and relationships between entities. So I don't understand what to make of the words "protocol", "specification", and "encoding" in this context.

Another question is: whose data model are you trying to describe, the API's or the SDK's? They have different shapes.

I'm trying to write a document that helps someone interpret the OTLP Span protocol without reading a .proto file, where I can add further detail about how to interpret the OTel tracestate field used for probability sampling. Right now the .proto file says this of tracestate:

// trace_state conveys information about request position in multiple distributed tracing graphs. // It is a trace_state in w3c-trace-context format: https://www.w3.org/TR/trace-context/#tracestate-header // See also https://github.com/w3c/distributed-tracing for more details about this field.

Tangentially, I'm actually not sure how multi-tenancy is expected to work in W3C and it's not a part of the SDK or the API's model. See #1852 (comment)

In any case, I think there should be one model--what differences between the API and the SDK "shape" should be reflected in this kind of document?

Each layer can have a different logical model of the data it operates on. The API data model would be the smallest surface, SDK's data model could be larger (e.g. in OpenTracing API there was nothing related to sampling, but in Jaeger SDK implementing that API there was is_sampled state on the span context, as well as other fields that were implementation details of the SDKs exposed to its internal interfaces). OTLP as a wire format is another layer which might have yet another data model (e.g. it has things like ResourceSpans, which are not a concept in the API).

So you may be trying to write a document that describes the logical data model of OTLP (aka physical data model since there's no more abstractions left), but such description would be ~90% consisting of the logical data model of the SDK, which in turn is only a small extension of the API data model.

a document that helps someone interpret the OTLP Span protocol without reading a .proto file

For the record, can't say I sympathize with this goal, what's wrong with reading the specification of the protocol to understand the protocol? It's another thing that .proto should not need to explain what a Span is or what Span name requirements are, that should come from the other parts of the spec.

The metrics and logs data model files are written without reference to the SDK and API specifications, and I'm not sure why someone should have to understand either to interpret trace data. The reason this happened in the metrics specification is that there are complicated relationships between fields that can't be documented in a single field. An example in trace is "root span". We need to use this term to explain a bunch of things, but it doesn't have a field in the proto. In metrics, we decided that datamodel.md should be the source of truth, not the .proto.

In practice, I haven't followed the trace specification as closely as I have metrics, and now that I'm invested in adding probability sampling to the specification, I think this is sorely needed. As an example, I've just read through the current api.md and sdk.md to try to understand how tracestate is specified. It's incomplete, and the ground is already unstable because W3C refers to "tenants" and we have no equivalent concepts in OTel.

There are APIs given for manipulating tracestate, but no explanation of what it is or why you'd use it (WHY WOULD A USER USE THIS API?)

There is no mention of tracestate in a Span Link

You have to read between the lines to determine that the Span data contains the tracestate returned by the Sampler, but even then it rests on assumptions. E.g., "A Tracestate that will be associated with the Span through the new
SpanContext." doesn't actually say that the recorded span object will contain the associated tracestate, but again if you go to the .proto all it says is:

// trace_state conveys information about request position in multiple distributed tracing graphs. // It is a trace_state in w3c-trace-context format: https://www.w3.org/TR/trace-context/#tracestate-header // See also https://github.com/w3c/distributed-tracing for more details about this field.

So, I'll try again.

yurishkuro · 2021-10-14T01:11:09Z

specification/trace/datamodel.md

+corresponding to a named operation, representing its part in the
+overall, distributed unit of work.
+
+### Root span


Is this actually part of the model? Is there an attribute or trait that tags a span as root span? I think this belongs to the glossary.

(This is in the Glossary section.)

This is why I bring up the question about multi-tenancy, which I don't see as a part of the OpenTelemetry data model at this time (despite @Oberon00's remarks).

yurishkuro · 2021-10-14T01:15:23Z

specification/trace/datamodel.md

+
+**Status**: [Stable](../document-status.md)
+
+The OpenTelemetry Span encodes the `tracestate` that was computed when


This is kind of a circular definition. Can we give a semantic definition of what it represents rather than how it's populated?

I don't know that there is a semantic definition for tracestate, I see it as a thing W3C gives vendors to use as they like. We're already in questionable territory, IMO, because we're calling OpenTelemetry a "vendor" when we declare a use of tracestate for ourselves.

See also #1852 (comment) where it seems there are at least two semantic definitions possible ("universal" and "per-tenant")

yurishkuro · 2021-10-14T01:16:06Z

specification/trace/datamodel.md

+**Status**: [Stable](../document-status.md)
+
+The OpenTelemetry Span name is a short, human-readable description of
+the work performed within the span's context.


Suggested change

the work performed within the span's context.

the work performed by the application and represented by the span.

I am pretty sure we already have the definition of things like span name elsewhere in the spec. I think we should think about eliminating such duplication and referencing other parts of the spec. To me the reasonable hierarchy would be: start with logical data model that describes what the model entities/attributes represent, which then refer to it in the API spec to describe the operations on those entities (but no longer describing the meaning of entities/attributes).

Yes, I'll have to refactor this PR.
There is one sentence in api.md that belongs here, and instead that document can link to this one.

iNikem · 2021-10-14T05:55:41Z

specification/trace/datamodel.md

+The OpenTelemetry SpanID is defined to identify the span that is the
+parent of a new trace context, equivalent to the W3C trace context


span that is the parent of a new trace context

This sentence is confusing. What is trace context? How can span be a parent of it? Does every SpanId identify a parent?

jmacd · 2021-10-14T10:40:57Z

I will incorporate the feedback here and re-open the PR when it is ready for review again.

@yurishkuro I am most interested in your reply to #2012 (comment) and would be glad for you to join the discussion in #1852 while I continue to edit this document.

Joshua MacDonald added 2 commits October 12, 2021 10:59

Add a skeleton of the trace data model

fa69295

changelog

ea64b53

jmacd marked this pull request as ready for review October 13, 2021 16:33

jmacd requested review from a team October 13, 2021 16:33

github-actions bot assigned jsuereth Oct 13, 2021

tigrannajaryan reviewed Oct 13, 2021

View reviewed changes

yurishkuro reviewed Oct 14, 2021

View reviewed changes

iNikem reviewed Oct 14, 2021

View reviewed changes

arminru added area:data-model For issues related to data model spec:trace Related to the specification/trace directory labels Oct 14, 2021

jmacd closed this Oct 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a skeleton of the trace data model #2012

Add a skeleton of the trace data model #2012

jmacd commented Oct 12, 2021

tigrannajaryan Oct 13, 2021

tigrannajaryan Oct 13, 2021

tigrannajaryan Oct 13, 2021

yurishkuro Oct 14, 2021

yurishkuro Oct 14, 2021

jmacd Oct 14, 2021

yurishkuro Oct 15, 2021

jmacd Oct 15, 2021

yurishkuro Oct 14, 2021

jmacd Oct 14, 2021

yurishkuro Oct 14, 2021

jmacd Oct 14, 2021

yurishkuro Oct 14, 2021

yurishkuro Oct 14, 2021

jmacd Oct 14, 2021

iNikem Oct 14, 2021

jmacd commented Oct 14, 2021

		Span context is the portion of the OpenTelemetry Context that makes up
		the tracing data model. This is specified by reference to the [W3C

		@@ -0,0 +1,169 @@
		# Trace Data Model

		Status: [Mixed](../document-status.md)

		The OpenTelemetry data model for tracing consists of a protocol
		specification for encoding spans, which represent an individual unit


		Status: [Stable](../document-status.md)

		The OpenTelemetry Span encodes the `tracestate` that was computed when

	the work performed within the span's context.
	the work performed by the application and represented by the span.

		The OpenTelemetry SpanID is defined to identify the span that is the
		parent of a new trace context, equivalent to the W3C trace context

Add a skeleton of the trace data model #2012

Add a skeleton of the trace data model #2012

Conversation

jmacd commented Oct 12, 2021

Changes

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmacd commented Oct 14, 2021