Add a skeleton of the trace data model #2012
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,169 @@ | ||||||
# Trace Data Model | ||||||
|
||||||
**Status**: [Mixed](../document-status.md) | ||||||
|
||||||
<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` --> | ||||||
|
||||||
<!-- toc --> | ||||||
|
||||||
- [Overview](#overview) | ||||||
- [Glossary](#glossary) | ||||||
* [Trace](#trace) | ||||||
* [Span](#span) | ||||||
* [Root span](#root-span) | ||||||
* [Context](#context) | ||||||
* [Span context](#span-context) | ||||||
* [Trace flags](#trace-flags) | ||||||
* [Tracestate](#tracestate) | ||||||
- [Span fields](#span-fields) | ||||||
* [TraceID](#traceid) | ||||||
* [SpanID](#spanid) | ||||||
* [TraceState](#tracestate-1) | ||||||
* [ParentSpanID](#parentspanid) | ||||||
* [Name](#name) | ||||||
* [SpanKind](#spankind) | ||||||
* [StartTimeUnixNano](#starttimeunixnano) | ||||||
* [EndTimeUnixNano](#endtimeunixnano) | ||||||
* [Attributes](#attributes) | ||||||
* [Events](#events) | ||||||
* [Links](#links) | ||||||
* [Status](#status) | ||||||
|
||||||
<!-- tocstop --> | ||||||
|
||||||
## Overview | ||||||
|
||||||
**Status**: [Stable](../document-status.md) | ||||||
|
||||||
The OpenTelemetry data model for tracing consists of a protocol | ||||||
specification for encoding spans, each of which represents an individual unit | ||||||
Comment on lines
+38
to
+39
This combination sounds very confusing to me. I think you're defining a logical data model that includes entities, their attributes, and relationships between entities. So I don't understand what to make of the words "protocol", "specification", and "encoding" in this context.

Another question is: whose data model are you trying to describe, the API's or the SDK's? They have different shapes.

I'm trying to write a document that helps someone interpret the OTLP Span protocol without reading a

Tangentially, I'm actually not sure how multi-tenancy is expected to work in W3C and it's not a part of the SDK or the API's model. See #1852 (comment) In any case, I think there should be one model--what differences between the API and the SDK "shape" should be reflected in this kind of document?

Each layer can have a different logical model of the data it operates on. The API data model would be the smallest surface, SDK's data model could be larger (e.g. in OpenTracing API there was nothing related to sampling, but in Jaeger SDK implementing that API there was So you may be trying to write a document that describes the logical data model of OTLP (aka physical data model since there's no more abstractions left), but such description would be ~90% consisting of the logical data model of the SDK, which in turn is only a small extension of the API data model.

For the record, can't say I sympathize with this goal, what's wrong with reading the specification of the protocol to understand the protocol? It's another thing that .proto should not need to explain what a Span is or what Span name requirements are, that should come from the other parts of the spec.

The metrics and logs data model files are written without reference to the SDK and API specifications, and I'm not sure why someone should have to understand either to interpret trace data. The reason this happened in the metrics specification is that there are complicated relationships between fields that can't be documented in a single field. An example in trace is "root span". We need to use this term to explain a bunch of things, but it doesn't have a field in the proto. In metrics, we decided that In practice, I haven't followed the trace specification as closely as I have metrics, and now that I'm invested in adding probability sampling to the specification, I think this is sorely needed. As an example, I've just read through the current

So, I'll try again. |
||||||
of work done in a distributed system. | ||||||
|
||||||
## Glossary | ||||||
|
||||||
### Trace | ||||||
|
||||||
A trace is a set of spans, connected with each other through | ||||||
parent-child relationships, that together describe a unit of work in a | ||||||
distributed system. | ||||||
|
||||||
### Span | ||||||
|
||||||
Each component in a distributed system contributes a span | ||||||
corresponding to a named operation, representing its part in the | ||||||
overall, distributed unit of work. | ||||||
|
||||||
### Root span | ||||||
Is this actually part of the model? Is there an attribute or trait that tags a span as root span? I think this belongs to the glossary.

(This is in the Glossary section.) This is why I bring up the question about multi-tenancy, which I don't see as a part of the OpenTelemetry data model at this time (despite @Oberon00's remarks). |
||||||
|
||||||
A root span is the span that initiates a unit of work in a distributed | ||||||
system. The root span is considered to have caused all the subsequent | ||||||
spans belonging to the trace. | ||||||
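
To illustrate the glossary terms above, here is a minimal Python sketch that locates the root span and groups the remaining spans by parent. The `SpanRecord` type and the `index_trace` helper are hypothetical names invented for this example, and the sketch assumes a root span is one that carries no parent span ID.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical, simplified span record used only for this sketch."""
    span_id: str
    parent_span_id: Optional[str]  # None is assumed to mean "root span"
    name: str


def index_trace(spans):
    """Return (roots, children), where children maps span_id -> child spans."""
    roots = [s for s in spans if s.parent_span_id is None]
    children = defaultdict(list)
    for s in spans:
        if s.parent_span_id is not None:
            children[s.parent_span_id].append(s)
    return roots, dict(children)


# Three spans of one trace: a root and two direct children.
spans = [
    SpanRecord("a1", None, "GET /checkout"),
    SpanRecord("b2", "a1", "charge-card"),
    SpanRecord("c3", "a1", "reserve-stock"),
]
roots, children = index_trace(spans)
```

A complete trace would normally yield exactly one root; a backend that receives only part of a trace may see zero roots, which is one reason the term needs a definition independent of any single field.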
|
||||||
### Context | ||||||
|
||||||
OpenTelemetry defines [Context](../context/context.md) as a means of | ||||||
passing values for use in telemetry across program execution | ||||||
boundaries. | ||||||
|
||||||
### Span context | ||||||
|
||||||
Span context is the portion of the OpenTelemetry Context that makes up | ||||||
the tracing data model. This is specified by reference to the [W3C | ||||||
Comment on lines
+70
to
+71
This is not quite clear. Can you expand/reword? |
||||||
trace context](https://www.w3.org/TR/trace-context/) specification, | ||||||
which defines four parts of the span context: | ||||||
|
||||||
1. TraceID | ||||||
2. SpanID | ||||||
3. Trace flags | ||||||
4. Tracestate | ||||||
|
||||||
The first three of these fields are included in the W3C trace context | ||||||
[`traceparent`](https://www.w3.org/TR/trace-context/#traceparent-header) | ||||||
header. | ||||||
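
As a rough illustration of how the first three fields travel in `traceparent`, the following Python sketch splits a header value into its parts. It assumes the version `00` layout from the W3C specification (`version-traceid-parentid-flags`, all lowercase hex); `parse_traceparent` and `ParsedTraceparent` are hypothetical names, not part of any OpenTelemetry API.

```python
import re
from typing import NamedTuple

# Version 00 layout: 2 hex digits, 32 hex digits, 16 hex digits, 2 hex digits.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class ParsedTraceparent(NamedTuple):
    version: str
    trace_id: str      # 128 bits as 32 hex characters
    span_id: str       # 64 bits as 16 hex characters (W3C "parent-id")
    trace_flags: int   # 8-bit flags field


def parse_traceparent(header: str) -> ParsedTraceparent:
    """Parse a version-00 traceparent value; raise ValueError if malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, span_id, flags = match.groups()
    # The W3C specification treats all-zero identifiers as invalid.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError("all-zero trace-id or parent-id is invalid")
    return ParsedTraceparent(version, trace_id, span_id, int(flags, 16))


parsed = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
```

The sketch deliberately ignores forward-compatibility rules for future header versions; a production parser would need to follow the W3C text for those cases.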
|
||||||
### Trace flags | ||||||
|
||||||
The W3C trace context defines one flag at present, `sampled`, which | ||||||
OpenTelemetry uses to make sampling decisions based on the context. | ||||||
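
A small sketch of reading the flag, assuming the trace flags are held as an 8-bit integer and that `sampled` is the least-significant bit, as the W3C specification defines it. The names `SAMPLED_FLAG` and `is_sampled` are chosen for this example only.

```python
SAMPLED_FLAG = 0x01  # least-significant bit of the W3C trace-flags octet


def is_sampled(trace_flags: int) -> bool:
    """Return True when the `sampled` bit is set, ignoring all other bits."""
    return bool(trace_flags & SAMPLED_FLAG)
```

Masking rather than comparing against `0x01` keeps the check correct if further flag bits are defined later.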
|
||||||
### Tracestate | ||||||
|
||||||
The W3C trace context defines a field known as | ||||||
[`tracestate`](https://www.w3.org/TR/trace-context/#tracestate-header) | ||||||
which enables extending the context with vendor-specific information. | ||||||
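
For orientation, a `tracestate` value is a comma-separated list of `key=value` members. The Python sketch below splits such a value into ordered pairs; `parse_tracestate` is a hypothetical helper, and it does not enforce the key and value character sets or the member-count limit that the W3C specification describes.

```python
from typing import List, Tuple


def parse_tracestate(header: str) -> List[Tuple[str, str]]:
    """Split a tracestate value into ordered (key, value) pairs."""
    members = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty list members
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed tracestate member: {item!r}")
        members.append((key, value))
    return members


entries = parse_tracestate("rojo=00f067aa0ba902b7,congo=t61rcWkgMzE")
```

A list of pairs is used instead of a dictionary because member order is significant in the header.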
|
||||||
## Span fields | ||||||
|
||||||
### TraceID | ||||||
Can we use the same |
||||||
|
||||||
**Status**: [Stable](../document-status.md) | ||||||
|
||||||
The OpenTelemetry TraceID is defined to be equivalent to the W3C trace | ||||||
context `trace-id` field, consisting of 128 bits of information and | ||||||
assigned to the new trace when starting a root span. | ||||||
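
One possible way to produce such an identifier is sketched below. Random generation is an assumption of this example, since the text above only requires 128 bits assigned when the root span starts; `new_trace_id` is a hypothetical helper name.

```python
import secrets


def new_trace_id() -> bytes:
    """Return 16 random bytes (128 bits), never the all-zero value."""
    while True:
        candidate = secrets.token_bytes(16)
        if any(candidate):  # W3C treats an all-zero trace-id as invalid
            return candidate


trace_id = new_trace_id()
trace_id_hex = trace_id.hex()  # 32 lowercase hex characters, as in traceparent
```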
|
||||||
### SpanID | ||||||
|
||||||
**Status**: [Stable](../document-status.md) | ||||||
|
||||||
The OpenTelemetry SpanID is defined to identify the span that is the | ||||||
parent of a new trace context, equivalent to the W3C trace context | ||||||
Comment on lines
+109
to
+110
This sentence is confusing. What is |
||||||
`parent-id` identifier in the context of a new span, consisting of | ||||||
64 bits of information. | ||||||
|
||||||
### TraceState | ||||||
|
||||||
**Status**: [Stable](../document-status.md) | ||||||
|
||||||
The OpenTelemetry Span encodes the `tracestate` that was computed when | ||||||
This is kind of a circular definition. Can we give a semantic definition of what it represents rather than how it's populated?

I don't know that there is a semantic definition for tracestate, I see it as a thing W3C gives vendors to use as they like. We're already in questionable territory, IMO, because we're calling OpenTelemetry a "vendor" when we declare a use of tracestate for ourselves. See also #1852 (comment) where it seems there are at least two semantic definitions possible ("universal" and "per-tenant") |
||||||
the Span started. | ||||||
|
||||||
### ParentSpanID | ||||||
|
||||||
**Status**: [Stable](../document-status.md) | ||||||
|
||||||
The OpenTelemetry Span contains a ParentSpanID field which for | ||||||
non-root spans refers to the W3C `parent-id` identifier that was in | ||||||
the trace context when it started (i.e., it is the SpanID of the | ||||||
parent span for non-root spans). | ||||||
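
The relationship between TraceID, SpanID, and ParentSpanID can be sketched as follows. `SpanIds`, `start_root`, and `start_child` are hypothetical names for this illustration, random identifiers are an assumption, and the sketch represents a root span by an empty ParentSpanID.

```python
import secrets
from dataclasses import dataclass


def _random_nonzero(n: int) -> bytes:
    """Return n random bytes, retrying on the all-zero value."""
    while True:
        value = secrets.token_bytes(n)
        if any(value):
            return value


@dataclass(frozen=True)
class SpanIds:
    trace_id: bytes        # 16 bytes, shared by every span in the trace
    span_id: bytes         # 8 bytes, unique to this span
    parent_span_id: bytes  # empty for a root span in this sketch


def start_root() -> SpanIds:
    """A root span starts a new trace and has no parent."""
    return SpanIds(_random_nonzero(16), _random_nonzero(8), b"")


def start_child(parent: SpanIds) -> SpanIds:
    """A child keeps the TraceID and records the parent's SpanID."""
    return SpanIds(parent.trace_id, _random_nonzero(8), parent.span_id)


root = start_root()
child = start_child(root)
```

Seen this way, the `parent-id` carried in the incoming trace context is simply the SpanID of whichever span was current when the child started.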
|
||||||
### Name | ||||||
|
||||||
**Status**: [Stable](../document-status.md) | ||||||
|
||||||
The OpenTelemetry Span name is a short, human-readable description of | ||||||
the work performed within the span's context. | ||||||
Suggested change
I am pretty sure we already have the definition of things like span name elsewhere in the spec. I think we should think about eliminating such duplication and referencing other parts of the spec. To me the reasonable hierarchy would be: start with logical data model that describes what the model entities/attributes represent, which then refer to it in the API spec to describe the operations on those entities (but no longer describing the meaning of entities/attributes).

Yes, I'll have to refactor this PR. |
||||||
|
||||||
### SpanKind | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. | ||||||
|
||||||
### StartTimeUnixNano | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. | ||||||
|
||||||
### EndTimeUnixNano | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. | ||||||
|
||||||
### Attributes | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. | ||||||
|
||||||
Define dropped_attributes_count here. | ||||||
|
||||||
### Events | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. | ||||||
|
||||||
Define dropped_events_count here. | ||||||
|
||||||
### Links | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. | ||||||
|
||||||
Define dropped_links_count here. | ||||||
|
||||||
### Status | ||||||
|
||||||
TODO(issue #1929): complete the span data model text. |
Why is this Mixed? Do we have anything unstable in the trace data model?