Releases: openzipkin/brave
Brave 5.12
Brave 5.12 introduces a powerful new way to handle data, completes our RPC abstraction, drops our Zipkin dependency and pours our thinking into RATIONALE docs.
There's a lot in this release for those doing advanced things like managing configuration tools or implementing custom tracing backends. Most users will do nothing except upgrade.
If you are using Brave directly, take note of the deprecations mentioned below. We do a major release every couple of years to remove deprecated code, and Brave 6 will do the same. By paying attention now, not only will your code run faster, you'll also have fewer surprises later.
Like all releases, volunteers bore a huge load to make this one happen, and a lot happened here. Please reach out and thank those who contributed, star our repo, or say hi on Gitter. If you have ideas, we'd love to hear about them, too.
On to the main show!
Introducing SpanHandler
Brave 5.12 has a cleaner integration for data than ever before. `SpanHandler` replaces `FinishedSpanHandler`. `SpanHandler` can do everything `FinishedSpanHandler` did: redacting, adding tags based on baggage, remapping trace IDs, sending to multiple systems, etc.
The more advanced `begin` hook adds much more power. You can set up default baggage only on local roots, add correlated mapped data extensions, or perform aggregations such as child counts.
This is our most powerful API yet, co-designed by @anuraaga with lots of good feedback from our usual suspects @jeqo and @jorgheymans. For now, you can simply replace `FinishedSpanHandler` with `SpanHandler`, but if you are curious, here are a few links of interest:
See https://github.com/openzipkin/brave/blob/master/brave/src/main/java/brave/handler/SpanHandler.java
See https://github.com/openzipkin/brave/tree/master/brave/src/test/java/brave/features/handler
See https://github.com/openzipkin/zipkin-reporter-java/tree/master/brave
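To give a feel for the lifecycle hooks, here is a sketch of a handler that uses `begin` to count children, one of the aggregations mentioned above. The handler class and tag name are hypothetical; only the `begin`/`end` signatures come from the `SpanHandler` type linked above:

```java
import brave.handler.MutableSpan;
import brave.handler.SpanHandler;
import brave.propagation.TraceContext;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical handler: tags each span with the number of children it started. */
final class ChildCountSpanHandler extends SpanHandler {
  final ConcurrentHashMap<TraceContext, AtomicInteger> childCounts = new ConcurrentHashMap<>();

  @Override public boolean begin(TraceContext context, MutableSpan span, TraceContext parent) {
    if (parent != null) { // increment the parent's child count as new spans begin
      childCounts.computeIfAbsent(parent, c -> new AtomicInteger()).incrementAndGet();
    }
    return true; // keep recording this span
  }

  @Override public boolean end(TraceContext context, MutableSpan span, Cause cause) {
    AtomicInteger count = childCounts.remove(context);
    if (count != null) span.tag("childCount", count.toString());
    return true; // pass to the next handler, ex. the one sending to Zipkin
  }
}
```

A handler like this would be registered via `Tracing.newBuilder().addSpanHandler(...)`, the same way as the Zipkin one.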
MutableSpan can do everything now
MutableSpan was initially a response to complaints that immutable conversions added GC pressure and generally weren't a good choice for telemetry.
Before, we paired `TraceContext` with `MutableSpan`, splitting responsibilities. However, this made things like natively writing JSON from Zipkin types difficult. Hence, we fully fleshed out `MutableSpan` so that it accompanies, but is decoupled from, `TraceContext`.
Here are some features newly available with much thanks to @anuraaga for a month of help on them!
- `MutableSpanBytesEncoder` - write `MutableSpan` directly to JSON without any dependencies or intermediating through another type such as `zipkin2.Span`.
- `MutableSpan.xxxId()` - read or remap all IDs, including trace IDs, depending on your output.
- `MutableSpan.annotations(), tags()` - read-only immutable collection views for the convenience of those not concerned with performance (internally implemented as arrays).
- `MutableSpan.annotationCount(), tagCount(), xxxValueAt(index)` - allocation-free tools to write data conversions as for loops.
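As a sketch of the allocation-free style, the loop below renders tags without creating an iterator or entry set. The helper name and tag output format are illustrative, and the index accessor names `tagKeyAt`/`tagValueAt` are assumed to match the `xxxValueAt(index)` convention above:

```java
// Sketch: convert a span's tags to a query string with a plain for loop,
// avoiding iterator/entry allocations on the hot path.
static String tagsAsQueryString(brave.handler.MutableSpan span) {
  StringBuilder result = new StringBuilder();
  for (int i = 0, length = span.tagCount(); i < length; i++) {
    if (i > 0) result.append('&');
    result.append(span.tagKeyAt(i)).append('=').append(span.tagValueAt(i));
  }
  return result.toString();
}
```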
RPC abstraction is now complete!
We started an RPC abstraction about 9 months ago, and last October we added RPC sampling support in Brave 5.8.
With a lot of thanks to our contributors @devinsba @jeqo @jcchavezs, and especially weeks of effort by volunteer @anuraaga, we have a complete product. Those using gRPC or Dubbo can now uniformly sample and parse based on RPC metadata:
By default, the following are added to both RPC client and server spans:
- Span.name is the RPC service/method. Ex. "zipkin.proto3.SpanService/Report"
- If the service is absent, the method is the name, and vice versa.
- Tags:
- "rpc.method", eg "Report"
- "rpc.service", eg "zipkin.proto3.SpanService"
- "rpc.error_code", eg "CANCELLED"
- "error", set to the RPC error code when there is no exception
- Remote IP and port information
Users familiar with how HTTP works will love the familiarity. The APIs are similar, and exactly the same features are supported, whether that's sampling, baggage, you name it. Those curious about our decision-making process can have a look at the RATIONALE, as we tried our best to make sound decisions and be transparent about them. Enjoy!
Zipkin dependency is dropped!
With the `SpanHandler` type finalized, we have deprecated support for `zipkin2.Reporter<zipkin2.Span>` in Brave and removed dependencies on Zipkin libraries.
This isn't to deprecate Zipkin support, of course, just to move the responsibility to the zipkin-reporter-brave project (even [XML beans](https://github.com/openzipkin/zipkin-reporter-java/tree/master/spring-beans) for those who need it!).
The end result is cleaner integrations for the various SaaS offerings that use Brave, but don't use Zipkin. Such use cases should be implemented directly as a `SpanHandler` now, with no need to route through Zipkin format.
Zipkin users should simply replace `AsyncReporter` with `AsyncZipkinSpanHandler` to adjust, similar to what's in our README:
// Configure a reporter, which controls how often spans are sent
// (this dependency is io.zipkin.reporter2:zipkin-sender-okhttp3)
sender = OkHttpSender.create("http://127.0.0.1:9411/api/v2/spans");
// (this dependency is io.zipkin.reporter2:zipkin-reporter-brave)
zipkinSpanHandler = AsyncZipkinSpanHandler.create(sender);
tracing = Tracing.newBuilder()
.addSpanHandler(zipkinSpanHandler)
...
Test infrastructure overhaul
As we no longer have a Zipkin dependency, we decided to make tools to help with common unit and integration tests. For example, vendors integrating with Brave should be able to assert on the data produced. Third-party libraries should be able to avoid common bugs. Beyond our normal `ITHttpServer` and similar tests, we've extracted the following in the `brave-tests` package:
- ITRemote - configures the most common test fixtures for multi-threaded integration tests
- TestSpanHandler - allows simple assertions for unit tests
- IntegrationTestSpanHandler - blocking span reporter for remote multi-threaded unit tests.
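For example, a unit test might register `TestSpanHandler` and assert on the recorded data. This is a sketch based on the package's own usage; the `get(int)` accessor is assumed here:

```java
import brave.Tracing;
import brave.test.TestSpanHandler;

TestSpanHandler spans = new TestSpanHandler();
Tracing tracing = Tracing.newBuilder().addSpanHandler(spans).build();
try {
  // Run the instrumented code under test
  tracing.tracer().startScopedSpan("encode").finish();

  // Assert directly on the collected MutableSpan data
  assert spans.get(0).name().equals("encode");
} finally {
  tracing.close();
}
```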
Rationale
We have updated and added many RATIONALE files, including the below, to better help people understand our thinking… and to help us remember our thinking!
Thanks to @jorgheymans @jeqo @jcchavezs @anuraaga and @NersesAM for the help adding content and reviewing:
brave
brave-instrumentation
brave-instrumentation-dubbo
brave-instrumentation-http
brave-instrumentation-grpc
brave-instrumentation-kafka-streams
brave-instrumentation-rpc
Other Notable Changes
Updates
- Kafka 2.5 is now supported, thanks to @jeqo
Behavior
- one-way RPC span modeling should no longer use `span.start().flush()` on one host and `span.finish()` (without start) on the other. This was implemented inconsistently and was not very compatible with most clones.
Additions
- `Tracing.Builder.clearSpanHandlers(), spanHandlers()` - allows `TracingCustomizer` instances to re-order or prune span handlers, for example, to ensure Zipkin is last or theirs is first.
- `Tracing.Builder.alwaysSampleLocal()` - special hook for metrics aggregation and secondary sampling that says the backend should always see recorded spans, even if they weren't sampled in headers.
Deprecations:
- `Tracer.propagationFactory()` is deprecated for the existing `Tracer.propagation()`, as we no longer rely on non-string keys (these were only used by gRPC, and we changed to hide this conversion).
- `brave.ErrorParser` is deprecated, as it was only used for Zipkin conversion. You can optionally specify `Tag<Throwable>` to affect the default "error" tag in zipkin-reporter-brave.
Brave 5.11
Brave 5.11 adds new APIs for tags, baggage (propagated fields) and correlation fields (MDC). These APIs were designed over many weeks of hard effort, with the goal of enabling features that would otherwise take a major release to accomplish. The result is that you can ease into them with no code impact.
Brave 5.11 also adds MongoDB instrumentation, something requested for a long time, which obviates the custom code sites formerly used to fill the gap.
As the bulk of the load is volunteer work, please thank people directly using any means you can, or chat on https://gitter.im/openzipkin/zipkin to say the same. If you rely on code here, make sure you star Brave.
Volunteers trade "couch time" to help make sure your tracing works. Stars are an easy way for volunteers to see their efforts are impactful and appreciated.
Note: Do not use Brave 5.11.0 or 5.11.1 as there were problems in these distributions. Use 5.11.2 or higher.
Tag, Tags and HttpTags
Brave 5.11 adds a long-overdue feature to ease tagging of spans. `Tag` bakes in all the logic needed to add a tag to a span, leaving the user only to decide what the key is and how to parse the value. Many thanks to volunteer @anuraaga for design and review work on this.
This not only works for both in-flight and already-finished spans, but also takes care of null checking and error handling.
Here's an example of a potentially expensive tag:
SUMMARY_TAG = new Tag<Summarizer>("summary") {
  @Override protected String parseValue(Summarizer input, TraceContext context) {
    return input.computeSummary();
  }
};
// This works for any variant of span
SUMMARY_TAG.tag(summarizer, span);
We also have constants in `Tags` and `HttpTags` so you can make type-safe updates on standard fields.
Ex.
httpTracing = httpTracing.toBuilder()
.clientRequestParser((req, context, span) -> {
HttpClientRequestParser.DEFAULT.parse(req, context, span);
HttpTags.URL.tag(req, context, span); // add the url in addition to defaults
})
.build();
All these types have Javadoc and there are introductions in Markdown here:
- https://github.com/openzipkin/brave/tree/master/brave#customizing-spans
- https://github.com/openzipkin/brave/tree/master/instrumentation/http#span-data-policy
History
We once had constants for tag names based on thrift definitions, but they were removed when Brave decoupled from the zipkin library.
The closest type we had recently is `ErrorParser`, as that does a similar dispatch. Externally, the closest is OpenTracing's Tag.
`brave.Tag` and OpenTracing's Tag share the ability to set tags on spans before they start and while they are in progress. However, there are some differences:
- `brave.Tag` has Javadoc explaining how and why you would use it.
- `brave.Tag` integrates with `FinishedSpanHandler`, so it can change tags regardless of instrumentation policy, even after spans complete.
- `brave.Tag` is sealed except for how to parse the value, which means error handling can be built in.
BaggagePropagation and BaggageField
Sometimes you need to propagate additional fields, such as a request ID or an alternate trace context. Thanks to many weeks of design and review from @anuraaga, as well as input from site owner @jorgheymans, we now have formal support for "baggage".
For example, if you need a specific request's country code, you can propagate it through the trace as an HTTP header of the same name:
import brave.baggage.BaggagePropagationConfig.SingleBaggageField;
// Configure your baggage field
COUNTRY_CODE = BaggageField.create("country-code");
// When you initialize the builder, add the baggage you want to propagate
tracingBuilder.propagationFactory(
BaggagePropagation.newFactoryBuilder(B3Propagation.FACTORY)
.add(SingleBaggageField.remote(COUNTRY_CODE))
.build()
);
// Later, you can retrieve that country code in any of the services handling the trace
// and add it as a span tag or do any other processing you want with it.
countryCode = COUNTRY_CODE.getValue(context);
This may look familiar if you've used `ExtraFieldPropagation`: it includes all the features that had and more. `BaggagePropagation` can also integrate with logging contexts and cleanly encapsulates field configuration.
Currently, `BaggagePropagationConfig` only supports predefined fields. However, dynamic fields will be supported in a future version, with no API break for you. Dynamic fields must either be in-process only, or use single-header encoding. We will likely default to W3C encoding once they decide on a header name that works with JMS.
All these types have Javadoc and there is an introduction in Markdown here:
History
The name Baggage was first introduced by Brown University in Pivot Tracing as maps, sets and tuples. They then spun baggage out as a standalone component, BaggageContext, and considered some of the nuances of making it general purpose. The implementations proposed in these papers differ from the implementation here, but conceptually the goal is the same: to propagate "arbitrary stuff" with a request.
Even though OpenTracing named propagated fields Baggage initially, we decided not to, as the APIs were not safe for arbitrary usage. For example, there was no implementation that could control which fields to propagate, set limits, or redact them. We didn't want to call anything Baggage until we could do that safely.
Instead, Brave 4.9 introduced `ExtraFieldPropagation` as a way to push other fields, such as a country code or request ID, alongside the trace context. It had `get()` and `set()` methods to retrieve values anywhere a span is active, but the above issues remained: hard issues described in #577.
The current baggage APIs resolve the design problems that limited us in the past. It took many weeks of full-time effort from volunteer co-designer @anuraaga, as well as site input from @jorgheymans, to surmount these hurdles.
CorrelationScopeDecorator (MDC integration)
`CorrelationScopeDecorator` is an advanced implementation of correlation shared by all implementations (like log4j, log4j2, slf4j). It can map field names and even allows you to flush updates of baggage synchronously to the underlying context. This integrates seamlessly with `BaggagePropagation`, thanks to many volunteered weeks of design and review from @anuraaga as well as input from site owner @jorgheymans.
All context integrations extend `CorrelationScopeDecorator.Builder`, which means you can write portable configuration.
Ex. this is the only part that has to do with the implementation:
CorrelationScopeDecorator.Builder builder = MDCScopeDecorator.newBuilder();
By default, if you call `build()`, only `traceId` and `spanId` integrate with the underlying context. This is great for performance (only better if you customize to include just `traceId`!).
A common configuration is to integrate a `BaggageField` as a correlation field in logs. Assuming the above setup for `COUNTRY_CODE`, you can integrate like this:
import brave.baggage.CorrelationScopeConfig.SingleCorrelationField;
decorator = MDCScopeDecorator.newBuilder()
.add(SingleCorrelationField.create(COUNTRY_CODE))
.build();
tracingBuilder.currentTraceContext(ThreadLocalCurrentTraceContext.newBuilder()
.addScopeDecorator(decorator)
.build()
);
// Any scope operations (updates to the current span) apply the fields defined by the decorator.
ScopedSpan span = tracing.tracer().startScopedSpan("encode");
try {
// The below log message will have %X{country-code} in the context!
logger.info("Encoding the span, hope it works");
--snip--
All these types have Javadoc and there is an introduction in Markdown here:
History
Before, we had types like `MDCScopeDecorator` for integrating extra fields as correlation fields in logging contexts. However, they were not customizable. In order to satisfy any user that needs "parentId", all scope decorators set it. This meant overhead in all cases, which adds up, especially in reactive code.
MongoDB instrumentation
`brave-instrumentation-mongodb` includes a `TraceMongoCommandListener`, a `CommandListener` for the MongoDB Java driver that reports via Brave how long each command takes, along with relevant tags like the collection/view name and the command's name (`insert`, `update`, `find`, etc.).
Volunteer @csabakos spent a month developing this for you and is owed a lot of thanks, as are volunteers @anuraaga and @kojilin for review and advice.
https://github.com/openzipkin/brave/tree/master/instrumentation/mongodb
An application registers command listeners with a `MongoClient` by configuring `MongoClientSettings` as follows:
CommandListener listener = MongoDBTracing.create(Tracing.current())
.commandListener();
MongoClientSettings settings = MongoClientSettings.builder()
.addCommandListener(listener)
.build();
MongoClient client = MongoClients.create(settings);
Support for asynchronous clients is unimplemented. To re...
DO NOT USE
This was a bad release, please use 5.11.2 or higher
Brave 5.10
Brave 5.10 completes the migration to our new HTTP instrumentation types, `HttpRequest` and `HttpResponse`, by introducing new parsers. It also makes it easier to access request and error details from a response. Finally, we lower overhead relating to scoping across the board.
This release was the sum of many contributors, but a special shout-out goes to @anuraaga, who reviewed every change and provided a lot of excellent feedback, which we need to keep documentation and design as clean and understandable as possible.
Let's get to it!
Introducing HttpRequestParser and HttpResponseParser
Those of you doing custom data policy are familiar with the `HttpAdapter` type introduced in Brave v4. This type allowed you to take a raw request, such as `HttpServletRequest`, and pull something portable out of it, such as the HTTP url via `adapter.url(request)`. In Brave 5.7, we replaced this with wrapper types that do the same thing. For example, `request.url()` dispatches to the corresponding framework-specific implementation.
We decided to do wrappers as even if there is an extra allocation to instantiate one (performance hit), in practice we often needed to combine multiple types to achieve a single goal. Let's take the URL example. Sometimes, you need to access a route object and also an HTTP request object in order to build the actual URL called. In other words, the assumption that a single type could work as a raw request was faulty. We decided in Brave 5.7 to fix that, starting with sampling. Brave 5.10 completes the task by migrating all work to the new types, including data policy.
It was not easy to migrate while keeping compatibility with old code. The way we did it was to introduce new parsers to be used instead of the former `HttpParser`: `HttpRequestParser` and `HttpResponseParser`. Those who made custom policy should be able to migrate easily.
Ex. To add the URL tag in addition to defaults:
httpTracing = httpTracing.toBuilder()
- .clientParser(new HttpClientParser() {
- @Override
- public <Req> void request(HttpAdapter<Req, ?> adapter, Req req, SpanCustomizer span) {
- super.request(adapter, req, span);
- span.tag("http.url", adapter.url(req)); // add the url in addition to defaults
- }
+ .clientRequestParser((req, context, span) -> {
+ HttpClientRequestParser.DEFAULT.parse(req, context, span);
+ span.tag("http.url", req.url()); // add the url in addition to defaults
})
.build();
What's subtle about the design is that splitting request and response allowed us to use lambdas. Future functionality, such as composable units, can be built more easily with functional code like the above.
The other subtlety is that we now pass the `context` argument explicitly. This was a choice due to the performance overhead of the prior design. "Scoping" is when you use thread-locals (and anything hung off them) to make something implicit. We found users do a lot in scope functions, and that has a cost. This design moves implicit to explicit, reducing scope operations 2x per HTTP call.
Meanwhile, commands such as tagging extra fields are still possible, using explicit parameters, like so:
httpTracing = httpTracing.toBuilder()
.clientRequestParser((req, context, span) -> {
HttpClientRequestParser.DEFAULT.parse(req, context, span);
String userName = ExtraFieldPropagation.get(context, "user-name");
if (userName != null) span.tag("user-name", userName);
})
.build();
Introducing Response.request() and Response.error()
A great idea @anuraaga had was to avail more properties at response time than we did before. We first felt this tension with the "http.route" tag. A matched route is usually unknown until late in processing; hence, we formerly had `HttpResponse.route()` to avail this property late in the cycle. Over time, you could imagine other request properties useful at response time. Instead of adding these piece by piece, we now have an optional `Response.request()` accessor. Building off that, we realised that the error associated with a response could be more convenient if available directly as `Response.error()`. This helps in functional code, as it allows a single argument, `Response`, to reach all parseable data.
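For instance, a response parser can now read the error through the response itself. This is a sketch in the style of the request-parser examples above; the "exception" tag name is illustrative:

```java
httpTracing = httpTracing.toBuilder()
  .serverResponseParser((res, context, span) -> {
    HttpResponseParser.DEFAULT.parse(res, context, span); // keep the defaults
    Throwable error = res.error(); // no need to thread the exception in separately
    if (error != null) span.tag("exception", error.getClass().getSimpleName());
  })
  .build();
```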
Other changes
Brave 5.10 includes a lot of other work, less API-impacting, but still critical to things getting better each time. We are lucky to have so much feedback and help continuing Brave's nearly 7 years of service to users.
- Adds `Tracer.nextSpanWithParent()` for explicit propagation
- Lowers the minimum version of Apache HttpClient instrumentation from 4.4 to 4.3
- Enforces that the sampled propagation field must be well formed
- Handles the special case when a JMS destination is both queue and topic. Thanks @nomeez for the investigation
- Stops writing the gRPC "method" propagated tag
- Ensures gRPC client response callbacks happen in the invocation context
- Makes the HTTP request method mandatory when parsing
Brave 5.9.5
Brave 5.9.5 notably preserves the case of local and remote service names. This allows non-Zipkin destinations, such as cloud service providers, to see the raw data in their `FinishedSpanHandler` exporters.
Thanks @csabakos for the investigation and code to fix this!
Brave 5.9.2
Brave 5.9.2 is our first release in the year 2020, brought to you by the direct efforts of ten volunteers. Notably, @anuraaga reviewed nearly all changes in this release and deserves credit for keeping the fire burning. If you are happy with the teamwork, please star this repo or say hi on https://gitter.im/openzipkin/zipkin. There's plenty to do and we appreciate help if you can spare time.
Without further ado, here are the release notes:
Features:
- Async HTTP client callbacks are now invoked in the invocation context of the caller (ex. the call site of the initial HTTP client request). Otherwise, HTTP client spans can appear nested, which is confusing and affects statistics. Formerly, we were inconsistent on this, but now it is enforced across the board with integration tests. Thanks for the analysis and help by @simontoens (#1055, #1067)
- kafka-clients users can now control whether to continue a trace or fork a new one with the flag `singleRootSpanOnReceiveBatch`. (#1033) Thanks @jeqo for leading this and @jorgheymans, @anuraaga for the detailed review
Fixes:
- accidental inheritance of shared flag (#1071) Thanks @narayaruna and @devinsba for the investigation!
- workaround for non-compliant JAX-RS client libraries which return immutable views of headers when they shouldn't (#1046) Thanks @SimY4 for the analysis and code.
- accidental wrapping of null JMS message listeners (#1065) Thanks @ohr for the fix!
- spring-rabbit interceptor order was incorrect, so other interceptors couldn't see the trace context (#1051) Thanks @kubamarchwicki for getting to the bottom of this and volunteering to fix it
- certain interceptor usage patterns could lead to a leaked scope in httpasyncclient. (#1050) thanks @andylintner for tracking this down and correcting the code
- a fragment of code modified from guava was not properly attributed. It is now in the NOTICE file in brave's jar (#1056)
Notes:
- Our gRPC instrumentation was written to be compatible with the Census project. Starting with gRPC v1.22, the Census integration stopped propagating the tag "method". While we still propagate this, the next minor version of Brave will also stop, and save needless in-process and wire overhead. As gRPC dropped Census as a core dependency in v1.27, we will similarly also de-prioritize interop with Census.
Brave 5.9.1
Brave 5.9.1 adds some minor fixes and additional features:
Fixes:
- Fix path parsing of HttpClient to return the whole path (#1036)
- Catch Throwable instead of RuntimeException or Error on Kafka Streams (#1030)
- Remove duplicate mockito dependency (#1027)
Thanks to @jcchavezs, @anuraaga, @jorgheymans and @worldtiki for your contributions!
Brave 5.9
Brave 5.9 notably begins a messaging abstraction. It also allows customizing of which B3 format is used based on Span kinds (CLIENT, SERVER, etc.). If you enjoy this work, please star our repo or join gitter to thank folks!
New Messaging abstraction
In Brave's main repository, we have three messaging instrumentation libraries: Kafka, JMS, and RabbitMQ. We've started a messaging abstraction with sampling.
Ex. Here's a sampler that traces 100 consumer requests per second, except for the "alerts" channel. Other requests will use a global rate provided by the Tracing component.
import brave.sampler.Matchers;
import static brave.messaging.MessagingRequestMatchers.channelNameEquals;
messagingTracingBuilder.consumerSampler(MessagingRuleSampler.newBuilder()
.putRule(channelNameEquals("alerts"), Sampler.NEVER_SAMPLE)
.putRule(Matchers.alwaysMatch(), RateLimitingSampler.create(100))
.build());
This code is 100% portable across traced libraries. In other words, JMS tracing uses exactly the same `MessagingTracing` component as Spring Rabbit: rules can be mixed in the same way as they can with our `HttpTracing` component. We hope this can help you prune traces to the most impactful!
Thanks very much to @anuraaga and @jeqo for design and code review.
Propagation customization
It is already possible to make custom propagation components to address different header formats, such as Amazon's. One repeated concern was controlling which of the B3 formats is used when sending headers. For example, during a transition, you may want to send both our single- and multi-header formats. However, new sites may choose to only send the single format, as it is cheaper.
In a pragmatic move, we've retro-fitted the default B3Propagation
implementation to consider the kind of span when choosing formats to write. This inherits the default formats used prior: "x-b3-" prefixed for client/server spans and the single "b3" format for producer/consumer spans.
To override this policy, you can use a builder like below. The following makes RPC and HTTP instrumentation write "b3" single format such as messaging spans do.
tracingBuilder.propagationFactory(B3Propagation.newFactoryBuilder()
.injectFormat(Format.SINGLE) // things that don't extend `brave.Request`
.injectFormat(Span.Kind.CLIENT, Format.SINGLE)
.injectFormat(Span.Kind.SERVER, Format.SINGLE)
.build())
Regardless of this policy, both "b3" and "x-b3-" headers are read; if we changed that, it would break existing sites! Tolerant reads are best for compatibility and interop!
Note: Our HTTP, RPC, and Messaging abstractions employ these by default by extending `brave.Request`. If you are not using our abstractions, please consider updating to also use `brave.Request` for remote instrumentation, so that users can easily tune header formats!
Minor changes
- OSGi manifests are now present in all jars
- @jeqo fixed where Kafka and JMS callbacks couldn't read no-op trace contexts
- @simondean allowed kafka-streams to inherit policy from kafka-clients instrumentation
- @rmichela fixed where SQS used with JMS could raise exceptions where it shouldn't
Brave 5.8
Brave 5.8 significantly improves sampling infrastructure and begins an RPC abstraction.
Functional sampling with SamplerFunction
Over time, we've accumulated a number of sampling tools. For example, we have `DeclarativeSampler`, which can be used to process Java annotations, and `HttpRuleSampler` for path and method rules. We've consolidated all parameterized samplers under a functional interface, `SamplerFunction`, which allows better re-use as we journey into more models such as RPC and Messaging.
Here's an example using functional expressions to declare that "/health" should never be sampled, while `POST /api` requests are limited to 100 traces per second:
import static brave.http.HttpRequestMatchers.*;
import static brave.sampler.Matchers.and;
httpTracing = httpTracingBuilder.serverSampler(HttpRuleSampler.newBuilder()
.putRule(pathStartsWith("/health"), Sampler.NEVER_SAMPLE)
.putRule(and(methodEquals("POST"), pathStartsWith("/api")), RateLimitingSampler.create(100))
.build()).build();
As a side effect, we've deprecated `Tracer.withSampler` in favor of lighter methods that accomplish the same:
- `Tracer.startScopedSpan(name, sampler, param)`
- `Tracer.nextSpan(sampler, param)`
New RPC abstraction
In Brave's main repository, we have two different RPC libraries: Dubbo and gRPC. We've started an RPC abstraction with sampling.
For example, here's a sampler that traces 100 "GetUserToken" requests per second. This doesn't start new traces for requests to the health check service. Other requests will use a global rate provided by the tracing component.
import static brave.rpc.RpcRequestMatchers.*;
rpcTracing = rpcTracingBuilder.serverSampler(RpcRuleSampler.newBuilder()
.putRule(serviceEquals("grpc.health.v1.Health"), Sampler.NEVER_SAMPLE)
.putRule(methodEquals("GetUserToken"), RateLimitingSampler.create(100))
.build()).build();
While the conventions above are gRPC-centric, the code is 100% portable. In other words, Dubbo tracing uses exactly the same `RpcTracing` component, and rules can be mixed in the same way as they can with our `HttpTracing` component.
Thanks very much to @trustin @anuraaga and @jeqo for design and code review.
Tracing.Builder.alwaysReportSpans()
Most users will want the defaults when deciding where to report trace data. Some advanced sites have a trace forwarder: a proxy that sends data to one or more places. One open source example is https://github.com/HotelsDotCom/pitchfork. In cases like this, a proxy may want all data, regardless of whether it is B3 sampled. This change introduces `Tracing.Builder.alwaysReportSpans()`, primarily in support of the secondary sampling project.
Brave 5.7
Brave 5.7 introduces configuration customizers and revamps our HTTP abstraction.
XXXCustomizer
Users want to customize only certain parts of the tracing subsystem, letting frameworks configure the rest. For example, a user wants to override the sampler, but not affect the span reporter. This split of concerns works as long as it is possible to run multiple configurations before constructing a component.
Along with a dependency injection tool, customizers help decouple configuration and also ease testing of new features. Users provide XXXCustomizer instances, and frameworks like spring-cloud-sleuth call them at the right time.
In Brave 5.7, we expose several interfaces to customize configuration and use them in our spring-beans integration.
- `CurrentTraceContextCustomizer` - called before invoking `CurrentTraceContext.Builder.build()`
- `ExtraFieldCustomizer` - called before invoking `ExtraFieldPropagation.FactoryBuilder.build()`
- `HttpTracingCustomizer` - called before invoking `HttpTracing.Builder.build()`
- `TracingCustomizer` - called before invoking `Tracing.Builder.build()`
HTTP abstraction overhaul
Before, we used an adapter model to parse a framework type, such as `HttpServletRequest`. This was a partial function, and aimed to eliminate an extra object allocation otherwise needed during parsing. This worked well until we started to see propagation integration.
For example, Netflix wanted to inspect the HTTP path to make a secondary sampling decision. They could see the request type as a parameter of `Extractor.extract()`. However, this function had no access to the adapter that would provide the means to parse the path. In other words, without a known HTTP request type, secondary sampling could not be accomplished portably.
Through quite a lot of effort, we overhauled every HTTP integration so that they directly use the new types `brave.http.HttpClientRequest` and `brave.http.HttpServerRequest`. This opens up new integration possibilities: not just secondary sampling, but also deciding at runtime which headers to use on a per-request basis.
Thanks @basvanbeek for the design help and @anuraaga @devinsba and @jeqo for review
External timestamps in HTTP client and server handling
Armeria supports `RequestLog.responseEndTimeNanos()` for more precise timestamps, which line up with metrics. Our HTTP handlers now allow external timestamps, so that advanced frameworks can collaborate in timing decisions. Thanks for the help by @trustin and @kojilin
FinishedSpanHandler.supportsOrphans()
@lambcode noticed that `FinishedSpanHandler` instances were not called when data is orphaned by buggy instrumentation. While this is technically correct (the "finished" in `FinishedSpanHandler` is about the correct case), there are use cases for processing orphaned data. You can now implement `FinishedSpanHandler.supportsOrphans()` to indicate the handler desires all data, not just complete data. Thanks @lambcode for the code spike and @devinsba @anuraaga for review.
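Opting in is a one-method override. The handler below is a hypothetical sketch: the class name, the log output, and the zero-finish-timestamp check for incomplete data are all illustrative, not part of the API:

```java
import brave.handler.FinishedSpanHandler;
import brave.handler.MutableSpan;
import brave.propagation.TraceContext;

/** Hypothetical handler that surfaces orphaned data to find buggy instrumentation. */
final class OrphanLoggingSpanHandler extends FinishedSpanHandler {
  @Override public boolean handle(TraceContext context, MutableSpan span) {
    if (span.finishTimestamp() == 0L) { // illustrative check: orphans may lack a finish time
      System.err.println("possibly orphaned span: " + span);
    }
    return true; // pass to the next handler
  }

  @Override public boolean supportsOrphans() {
    return true; // opt in to receiving orphaned data, not just complete data
  }
}
```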