Releases: openzipkin/brave

Brave 5.12

20 May 09:00

Brave 5.12 introduces a powerful new way to handle data, completes our RPC abstraction, drops our Zipkin dependency and pours our thinking into RATIONALE docs.

There's a lot in this release for those doing advanced things like managing configuration tools or implementing custom tracing backends. Most users will do nothing except upgrade.

If you are using Brave directly, you should take note of the deprecations mentioned below. We do a major release every couple of years to remove deprecated code, and Brave 6 will do the same. By paying attention now, not only will your code work faster, but you'll have fewer surprises later.

As with all releases, volunteers bore a huge responsibility for this one. As so much happened here, it was quite a load. Please reach out and thank those who contributed, star our repo or say hi on gitter. If you have ideas, we'd love to hear about them, too.

On to the main show!

Introducing SpanHandler

Brave 5.12 has a cleaner integration for data than ever before. SpanHandler replaces FinishedSpanHandler. SpanHandler can do everything FinishedSpanHandler did: redacting, adding tags based on baggage, remapping trace IDs, sending to multiple systems etc.

The more advanced begin hook adds much more power. You can set up default baggage only on local roots, add correlated mapped data extensions, or perform aggregations such as child counts.

This is our most powerful API, co-designed by @anuraaga with lots of good feedback from our usual suspects @jeqo and @jorgheymans. For now, you can just replace FinishedSpanHandler with SpanHandler, but if you are curious, here are a few links of interest:

See https://github.com/openzipkin/brave/blob/master/brave/src/main/java/brave/handler/SpanHandler.java
See https://github.com/openzipkin/brave/tree/master/brave/src/test/java/brave/features/handler
See https://github.com/openzipkin/zipkin-reporter-java/tree/master/brave

MutableSpan can do everything now

MutableSpan was initially a response to complaints that immutable conversions added GC pressure and generally weren't a good choice for telemetry.

Before, we paired TraceContext with MutableSpan, splitting responsibilities. However, this would make things like natively writing JSON from Zipkin types difficult. Hence, we fully fleshed out MutableSpan so that it accompanies, but is decoupled from TraceContext.

Here are some features newly available with much thanks to @anuraaga for a month of help on them!

  • MutableSpanBytesEncoder - allows you to write MutableSpan directly to JSON without any dependencies or intermediating through another type such as zipkin2.Span.
  • MutableSpan.xxxId() - allows you to read or remap all IDs, including trace IDs, depending on your output
  • MutableSpan.annotations(), tags() - read-only immutable collection views for convenience of those not concerned with performance (internally implemented as arrays)
  • MutableSpan.annotationCount(), tagCount(), xxxValueAt(index) - allocation-free tools to write data conversions as for loops.
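
To illustrate the index-based accessors, here is a minimal stand-alone sketch. FlatTags is a hypothetical stand-in, not Brave's MutableSpan. It only assumes that tags are kept as alternating keys and values in one array, so a conversion loop can read them by index without creating an iterator or map entries.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration only; this is not brave.handler.MutableSpan.
final class FlatTags {
  // Keys sit at even offsets, values at odd offsets: [k0, v0, k1, v1, ...]
  private String[] entries = new String[4];
  private int count; // number of key/value pairs in use

  void tag(String key, String value) {
    // Overwrite an existing key in place so each key appears once
    for (int i = 0; i < count; i++) {
      if (entries[i * 2].equals(key)) {
        entries[i * 2 + 1] = value;
        return;
      }
    }
    // Grow only when the backing array is full
    if (count * 2 == entries.length) entries = Arrays.copyOf(entries, entries.length * 2);
    entries[count * 2] = key;
    entries[count * 2 + 1] = value;
    count++;
  }

  int tagCount() { return count; }
  String tagKeyAt(int i) { return entries[i * 2]; }
  String tagValueAt(int i) { return entries[i * 2 + 1]; }

  // A data conversion written as a plain for loop over indexes.
  // JSON escaping is skipped to keep the sketch short.
  static String toJson(FlatTags tags) {
    StringBuilder out = new StringBuilder("{");
    for (int i = 0, length = tags.tagCount(); i < length; i++) {
      if (i > 0) out.append(',');
      out.append('"').append(tags.tagKeyAt(i)).append("\":\"")
         .append(tags.tagValueAt(i)).append('"');
    }
    return out.append('}').toString();
  }
}
```

The loop in toJson touches only ints and existing strings, which is the property the count and xxxAt(index) accessors are meant to give you.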

RPC abstraction is now complete!

We started an RPC abstraction about 9 months ago. Last October, we added RPC sampling support in Brave 5.8.

With a lot of thanks to our contributors @devinsba @jeqo @jcchavezs and especially weeks of effort by volunteer @anuraaga, we have a complete product. Those using gRPC or Dubbo can now uniformly sample and parse based on RPC metadata.

By default, the following are added to both RPC client and server spans:

  • Span.name is the RPC service/method. Ex. "zipkin.proto3.SpanService/Report"
    • If the service is absent, the method is the name and vice versa.
  • Tags:
    • "rpc.method", eg "Report"
    • "rpc.service", eg "zipkin.proto3.SpanService"
    • "rpc.error_code", eg "CANCELLED"
    • "error" the RPC error code if there is no exception
  • Remote IP and port information
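
The naming rule above can be sketched as a small function. RpcSpanNames is a hypothetical helper written for this note, not part of Brave's API, and returning null when both parts are absent is an assumption made for the sketch.

```java
// Hypothetical helper for illustration; not part of Brave's API.
final class RpcSpanNames {
  private RpcSpanNames() {}

  /** Returns "service/method", whichever part is present, or null when both are absent. */
  static String spanName(String service, String method) {
    boolean hasService = service != null && !service.isEmpty();
    boolean hasMethod = method != null && !method.isEmpty();
    if (hasService && hasMethod) return service + "/" + method;
    if (hasService) return service; // method absent: the service is the name
    if (hasMethod) return method;   // service absent: the method is the name
    return null; // assumption: leave the span name unset
  }
}
```

For example, spanName("zipkin.proto3.SpanService", "Report") yields the name shown in the list above.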

Users familiar with how HTTP works will love the familiarity. The APIs are similar and exactly the same features are supported, whether that's sampling, baggage, you name it. Those curious about our decision-making process can have a look at the RATIONALE, as we tried our best to make sound decisions and be transparent about them. Enjoy!

Zipkin dependency is dropped!

With the SpanHandler type finalized, we have deprecated support for zipkin2.Reporter<zipkin2.Span> in Brave and removed dependencies on Zipkin libraries.

This isn't to deprecate Zipkin support, of course, just to move the responsibility to the zipkin-reporter-brave project (even [XML beans](https://github.com/openzipkin/zipkin-reporter-java/tree/master/spring-beans) for those who need it!)

The end result is cleaner integrations for the various SaaS offerings that use Brave, but don't use Zipkin. Such use cases should be directly implemented as SpanHandler now, with no need to route through Zipkin format.

Zipkin users should simply replace AsyncReporter with AsyncZipkinSpanHandler to adjust, similar to what's in our README:

// Configure a reporter, which controls how often spans are sent
//   (this dependency is io.zipkin.reporter2:zipkin-sender-okhttp3)
sender = OkHttpSender.create("http://127.0.0.1:9411/api/v2/spans");
//   (this dependency is io.zipkin.reporter2:zipkin-reporter-brave)
zipkinSpanHandler = AsyncZipkinSpanHandler.create(sender);

tracing = Tracing.newBuilder()
                 .addSpanHandler(zipkinSpanHandler)
                 ...

Test infrastructure overhaul

As we no longer have a Zipkin dependency, we decided to make tools to help with common unit and integration tests. For example, vendors integrating with Brave should be able to assert on the data produced. Third party libraries should be able to avoid common bugs. Beyond our normal ITHttpServer and similar tests, we've extracted the following in the brave-tests package:

Rationale

We have updated and added many RATIONALE files, including the below, to better help people understand our thinking, and to help us remember our thinking!

Thanks to @jorgheymans @jeqo @jcchavezs @anuraaga and @NersesAM for the help adding content and reviewing.

brave
brave-instrumentation
brave-instrumentation-dubbo
brave-instrumentation-http
brave-instrumentation-grpc
brave-instrumentation-kafka-streams
brave-instrumentation-rpc

Other Notable Changes

Updates

  • Kafka 2.5 is now supported, thanks to @jeqo

Behavior

  • One-way RPC span modeling should no longer use span.start().flush() on one host and span.finish() (without start) on the other. This was implemented inconsistently and was not very compatible with most clones.

Additions

  • Tracing.Builder.clearSpanHandlers(), spanHandlers() - allows TracingCustomizer instances to re-order or prune span handlers. For example, to ensure Zipkin is last, or theirs is first.
  • Tracing.Builder.alwaysSampleLocal() - special hook for metrics aggregation and secondary-sampling that says the backend should always see recorded spans even if they weren't sampled in headers

Deprecations:

  • Tracing.propagationFactory() is deprecated in favor of the existing Tracing.propagation(), as we no longer rely on non-string keys (these were only used by gRPC, and we changed to hide this conversion).
  • brave.ErrorParser is deprecated as it was only used for Zipkin conversion. You can optionally specify Tag<Throwable> to affect the default "error" tag in zipkin-reporter-brave

Brave 5.11

08 Apr 04:45

Brave 5.11 adds new APIs for tags, baggage (propagated fields) and correlation fields (MDC). These APIs were designed over many weeks of hard effort, with the goal of making available features that would otherwise take a major release to accomplish. The result is you can ease into this with no code impact.

Brave 5.11 also adds MongoDB instrumentation, something requested for a long time, which obviates the custom code that sites formerly used to fill the gap.

As the bulk of the load is volunteer work, please thank people directly using any means you can, or chat on https://gitter.im/openzipkin/zipkin to say the same. If you rely on code here, make sure you star Brave.

Volunteers trade "couch time" to help make sure your tracing works. Stars are an easy way for volunteers to see their efforts are impactful and appreciated.

Note: Do not use Brave 5.11.0 or 5.11.1 as there were problems in these distributions. Use 5.11.2 or higher.

Tag, Tags and HttpTags

Brave 5.11 adds a long overdue feature to ease support of tagging spans. Tag bakes in all logic needed to add a tag to a span, leaving the user only to decide what the key is and how to parse the value. Many thanks to volunteer @anuraaga for design and review work on this.

This not only works for both in-flight and already finished spans, but also takes care of null checking and error handling.

Here's an example of a potentially expensive tag:

SUMMARY_TAG = new Tag<Summarizer>("summary") {
  @Override protected String parseValue(Summarizer input, TraceContext context) {
    return input.computeSummary();
  }
};

// This works for any variant of span
SUMMARY_TAG.tag(summarizer, span);

We also have constants in Tags and HttpTags that you can use to make type-safe updates on standard fields.

Ex.

httpTracing = httpTracing.toBuilder()
    .clientRequestParser((req, context, span) -> {
      HttpClientRequestParser.DEFAULT.parse(req, context, span);
      HttpTags.URL.tag(req, context, span); // add the url in addition to defaults
    })
    .build();

All these types have Javadoc and there are introductions in Markdown here:

History

We once had constants for tag names based on thrift definitions, but they were removed when Brave decoupled from the zipkin library.

The closest type we had recently is ErrorParser as that does a similar dispatch. Externally, the closest is OpenTracing Tag.

brave.Tag and OpenTracing's Tag share the ability to set tags on Spans before they start and while they are in progress. However, there are some differences:

  • brave.Tag has Javadoc explaining how and why you would use it.
  • brave.Tag integrates with FinishedSpanHandler, so it can change tags regardless of instrumentation policy, even after they complete.
  • brave.Tag is sealed except how to parse the value, which means error handling can be built in.

BaggagePropagation and BaggageField

Sometimes you need to propagate additional fields, such as a request ID or an alternate trace context. Thanks to many weeks of design and review from @anuraaga as well as input from site owner @jorgheymans, we now have formal support for "baggage".

For example, if you need a specific request's country code, you can propagate it through the trace as an HTTP header with the same name:

import brave.baggage.BaggagePropagationConfig.SingleBaggageField;

// Configure your baggage field
COUNTRY_CODE = BaggageField.create("country-code");

// When you initialize the builder, add the baggage you want to propagate
tracingBuilder.propagationFactory(
  BaggagePropagation.newFactoryBuilder(B3Propagation.FACTORY)
                    .add(SingleBaggageField.remote(COUNTRY_CODE))
                    .build()
);

// Later, you can retrieve that country code in any of the services handling the trace
// and add it as a span tag or do any other processing you want with it.
countryCode = COUNTRY_CODE.getValue(context);

This may look similar to ExtraFieldPropagation, as it includes all the features it had and more. BaggagePropagation can also integrate with logging contexts and cleanly encapsulate field configuration.

Currently, BaggagePropagationConfig only supports predefined fields. However, dynamic fields will be supported in a future version, with no API break to you. Dynamic fields must either be in-process only, or use single header encoding. We will likely default to W3C encoding once they decide on a header name that works with JMS.

All these types have Javadoc and there is an introduction in Markdown here:

History

The name Baggage was first introduced by Brown University in Pivot Tracing as maps, sets and tuples. They then spun baggage out as a standalone component, BaggageContext and considered some of the nuances of making it general purpose. The implementations proposed in these papers are different to the implementation here, but conceptually the goal is the same: to propagate "arbitrary stuff" with a request.

Even though OpenTracing named propagated fields Baggage initially, we decided not to, as the APIs were not safe for arbitrary usage. For example, there was no implementation which could allow control over which fields to propagate, how to set limits or how to redact them. We didn't want to call anything Baggage until we could do that safely.

Instead, Brave 4.9 introduced ExtraFieldPropagation as a way to push other fields, such as a country code or request ID, alongside the trace context. It had get() and set() methods to retrieve values anywhere a span is active, but the above issues remained: hard issues described in #577.

The current baggage APIs resolve the design problems that limited us in the past. It took many weeks of full-time effort from volunteer co-designer @anuraaga as well as site input from @jorgheymans to surmount these hurdles.

CorrelationScopeDecorator (MDC integration)

CorrelationScopeDecorator is an advanced implementation of correlation shared by all implementations (like log4j, log4j2, slf4j). It can map field names, and even allows you to flush updates of baggage synchronously to the underlying context. This integrates seamlessly with BaggagePropagation thanks to many volunteered weeks of design and review from @anuraaga as well as input from site owner @jorgheymans.

All context integrations extend CorrelationScopeDecorator.Builder which means you can make portable configuration.

Ex. this is the only part that has to do with the implementation:

CorrelationScopeDecorator.Builder builder = MDCScopeDecorator.newBuilder();

By default, if you call build(), only traceId and spanId integrate with the underlying context. This is great for performance (and even better if you customize to only include traceId!).

A common configuration would be to integrate a BaggageField as a correlation field in logs.

Assuming the above setup for COUNTRY_CODE, you can integrate like this:

import brave.baggage.CorrelationScopeConfig.SingleCorrelationField;

decorator = MDCScopeDecorator.newBuilder()
                             .add(SingleCorrelationField.create(COUNTRY_CODE))
                             .build();

tracingBuilder.currentTraceContext(ThreadLocalCurrentTraceContext.newBuilder()
    .addScopeDecorator(decorator)
    .build()
);

// Any scope operations (updates to the current span) apply the fields defined by the decorator.
ScopedSpan span = tracing.tracer().startScopedSpan("encode");
try {
  // The below log message will have %X{country-code} in the context!
  logger.info("Encoding the span, hope it works");
--snip--

All these types have Javadoc and there is an introduction in Markdown here:

History

Before, we had types like MDCScopeDecorator for integrating extra fields as correlation fields in logging contexts. However, they were not customizable. In order to satisfy any user that needs "parentId", all scope decorators set this. This meant overhead in all cases, which adds up especially in reactive code.

MongoDB instrumentation

brave-instrumentation-mongodb includes a TraceMongoCommandListener, a CommandListener for the Mongo Java driver that will report via Brave how long each command takes, along with relevant tags like the collection/view name and the command's name (insert, update, find, etc.).

Volunteer @csabakos spent a month developing this for you and is owed a lot of thanks, as are volunteers @anuraaga and @kojilin for review and advice.

https://github.com/openzipkin/brave/tree/master/instrumentation/mongodb

An application registers command listeners with a MongoClient by configuring MongoClientSettings as follows:

CommandListener listener = MongoDBTracing.create(Tracing.current())
        .commandListener();
MongoClientSettings settings = MongoClientSettings.builder()
        .addCommandListener(listener)
        .build();
MongoClient client = MongoClients.create(settings);

Support for asynchronous clients is unimplemented. To re...


DO NOT USE

03 Apr 05:20
Pre-release

This was a bad release, please use 5.11.2 or higher

Brave 5.10

28 Feb 05:53

Brave 5.10 completes migration to our new HTTP instrumentation types: HttpRequest and HttpResponse by introducing new parsers. It also makes it easier to access request and error details from a response. Finally, we lower overhead relating to scoping across the board.

This release was the sum of many contributors, but a special shout-out is given to @anuraaga, who reviewed all changes and provided a lot of the excellent feedback we need to keep documentation and design as clean and understandable as possible.

Let's get to it!

Introducing HttpRequestParser and HttpResponseParser

Those of you doing custom data policy are familiar with the HttpAdapter type introduced in Brave v4. This type allowed you to take a raw request, such as HttpServletRequest, and pull something portable out of it, such as the HTTP URL via adapter.url(request). In Brave 5.7, we replaced this with wrapper types to do the same thing. For example, request.url() would dispatch to the corresponding framework-specific implementation.

We decided to do wrappers as even if there is an extra allocation to instantiate one (performance hit), in practice we often needed to combine multiple types to achieve a single goal. Let's take the URL example. Sometimes, you need to access a route object and also an HTTP request object in order to build the actual URL called. In other words, the assumption that a single type could work as a raw request was faulty. We decided in Brave 5.7 to fix that, starting with sampling. Brave 5.10 completes the task by migrating all work to the new types, including data policy.

It was not easy to migrate while still remaining compatible with old code. The way we did it was by introducing new parsers which should be used instead of the former HttpParser: HttpRequestParser and HttpResponseParser. Those who made custom policy should be able to migrate easily.

Ex. To add the URL tag in addition to defaults:

 httpTracing = httpTracing.toBuilder()
-    .clientParser(new HttpClientParser() {
-      @Override
-      public <Req> void request(HttpAdapter<Req, ?> adapter, Req req, SpanCustomizer span) {
-        super.request(adapter, req, span);
-        span.tag("http.url", adapter.url(req)); // add the url in addition to defaults
-      }
+    .clientRequestParser((req, context, span) -> {
+      HttpClientRequestParser.DEFAULT.parse(req, context, span);
+      span.tag("http.url", req.url()); // add the url in addition to defaults
     })
     .build();

What's subtle about the design is that splitting request and response allowed us to use lambdas. Future functionality, such as composable units, can be built more easily with functional code such as the above.

The other subtlety is that we pass the context argument explicitly now. This was a choice due to performance overhead of the prior design. "Scoping" is when you use thread-locals (and anything hung off them) to make something implicit. We found users are doing a lot in scope functions, and that has a cost to it. This design moves implicit to explicit, reducing scope operations 2x per HTTP call.

Meanwhile, commands such as tagging extra fields are still possible, using explicit parameters, like so:

httpTracing = httpTracing.toBuilder()
    .clientRequestParser((req, context, span) -> {
      HttpClientRequestParser.DEFAULT.parse(req, context, span);
      String userName = ExtraFieldPropagation.get(context, "user-name");
      if (userName != null) span.tag("user-name", userName);
    })
    .build();

Introducing Response.request() and Response.error()

A great idea @anuraaga had was to make more properties available at response time than we currently do. We first felt this tension about the "http.route" tag. A matched route is usually unknown until late in processing. Hence, we formerly had HttpResponse.route() to expose this property late in the cycle. Over time, you could imagine other request properties being useful at response time. Instead of adding these piece by piece, we now have an optional Response.request() accessor. Building off that, we realised that the error associated with a response could be more convenient if available directly as Response.error(). This helps in functional code, as it allows a single argument, Response, to reach all parseable data.

Other changes

Brave 5.10 includes a lot of other work, less API-impacting, but still critical to things becoming better each time. We are lucky to have so much feedback and help continuing Brave's nearly 7 years of service to users.

  • Add Tracer.nextSpanWithParent() for explicit propagation
  • Lowers the minimum version of Apache HttpClient instrumentation from 4.4 to 4.3
  • Enforces sampled propagation field must be well formed
  • Handles special case when a JMS destination is both queue and topic. Thanks @nomeez for the investigation
  • Stops writing the gRPC "method" propagated tag
  • Ensures gRPC client response callbacks happen in invocation context
  • Makes HTTP request method mandatory when parsing

Brave 5.9.5

13 Feb 19:32

Brave 5.9.5 notably preserves the case of local and remote service names. This allows non-Zipkin destinations, such as cloud service providers, to see the raw data in their FinishedSpanHandler exporters.

Thanks @csabakos for the investigation and code to fix this!

Brave 5.9.2

30 Jan 02:32

Brave 5.9.2 is our first release in the year 2020, brought to you by the direct efforts of ten volunteers. Notably, @anuraaga reviewed nearly all changes in this release and deserves credit for keeping the fire burning. If you are happy with the teamwork, please star this repo or say hi on https://gitter.im/openzipkin/zipkin. There's plenty to do and we appreciate help if you can spare time.

Without further ado, here are the release notes:

Features:

  • Async HTTP client callbacks should be invoked in the invocation context of the caller (e.g. the call site of the initial HTTP client request). Otherwise, HTTP client spans can appear nested, which is confusing and affects statistics. Formerly, we were inconsistent on this, but now it is enforced across the board with integration tests. Thanks for the analysis and help by @simontoens (#1055, #1067)
  • kafka-clients users can now control whether or not to continue a trace or fork a new one with the flag singleRootSpanOnReceiveBatch. (#1033) thanks @jeqo for leading this and @jorgheymans, @anuraaga for the detailed review

Fixes:

  • accidental inheritance of shared flag (#1071) Thanks @narayaruna and @devinsba for the investigation!
  • work around non-compliant JAX-RS client libraries which return immutable views of headers when they shouldn't (#1046) Thanks @SimY4 for the analysis and code.
  • accidental wrapping of null JMS message listeners (#1065) Thanks @ohr for the fix!
  • spring-rabbit interceptor order was incorrect, so other interceptors couldn't see the trace context (#1051) Thanks @kubamarchwicki for getting to the bottom of this and volunteering to fix it
  • certain interceptor usage patterns could lead to a leaked scope in httpasyncclient. (#1050) thanks @andylintner for tracking this down and correcting the code
  • a fragment of code modified from guava was not properly attributed. It is now in the NOTICE file in brave's jar (#1056)

Notes:

  • Our gRPC instrumentation was written to be compatible with the Census project. Starting with gRPC v1.22, the Census integration stopped propagating the tag "method". While we still propagate this, the next minor version of Brave will also stop, and save needless in-process and wire overhead. As gRPC dropped Census as a core dependency in v1.27, we will similarly also de-prioritize interop with Census.

Brave 5.9.1

21 Nov 22:47

Brave 5.9.1 adds some minor fixes and additional features:

Features:

  • Kafka Streams Tracing flatMap operator (#1041)
  • Update Zipkin Reporter to 2.11.1 (#1038)

Fixes:

  • Fix path parsing of HttpClient to return the whole path (#1036)
  • Catch Throwable instead of RuntimeException or Error on Kafka Streams (#1030)
  • Remove duplicate mockito dependency (#1027)

Thanks to @jcchavezs, @anuraaga, @jorgheymans and @worldtiki for your contributions!

Brave 5.9

01 Nov 01:06

Brave 5.9 notably begins a messaging abstraction. It also allows customizing of which B3 format is used based on Span kinds (CLIENT, SERVER, etc.). If you enjoy this work, please star our repo or join gitter to thank folks!

New Messaging abstraction

In Brave's main repository, we have three messaging instrumentation libraries: Kafka, JMS, and RabbitMQ. We've started a messaging abstraction with sampling.

Ex. Here's a sampler that traces 100 consumer requests per second, except for the "alerts" channel. Other requests will use a global rate provided by the Tracing component.

import brave.sampler.Matchers;

import static brave.messaging.MessagingRequestMatchers.channelNameEquals;

messagingTracingBuilder.consumerSampler(MessagingRuleSampler.newBuilder()
  .putRule(channelNameEquals("alerts"), Sampler.NEVER_SAMPLE)
  .putRule(Matchers.alwaysMatch(), RateLimitingSampler.create(100))
  .build());

This code is 100% portable across traced libraries. In other words, JMS tracing uses exactly the same MessagingTracing component as Spring Rabbit. Rules can be mixed in the same way as they can with our HttpTracing component. We hope this can help you prune traces to the most impactful!

Thanks very much to @anuraaga and @jeqo for design and code review.

Propagation customization

It is already the case that folks can make custom propagation components to address different header formats, such as Amazon's. One repeated concern was to control which of the B3 formats should be used when sending headers down. For example, in a transition, you may want to send both our single and multi-header formats. However, new sites may choose to only send the single format as it is cheaper.

In a pragmatic move, we've retro-fitted the default B3Propagation implementation to consider the kind of span when choosing formats to write. This inherits the default formats used prior: "x-b3-" prefixed for client/server spans and the single "b3" format for producer/consumer spans.
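
The default policy described above can be pictured as a lookup keyed by span kind. The sketch below is a stand-alone model under that assumption. InjectFormats and its Kind and Format enums are hypothetical names for illustration, not Brave's types.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical model for illustration; these enums and this class are not Brave's types.
final class InjectFormats {
  enum Kind { CLIENT, SERVER, PRODUCER, CONSUMER }
  enum Format { MULTI, SINGLE }

  private final Map<Kind, Format> byKind = new EnumMap<>(Kind.class);

  InjectFormats() {
    // Defaults as described in the text: "x-b3-" prefixed headers for
    // client/server spans, the single "b3" header for producer/consumer spans
    byKind.put(Kind.CLIENT, Format.MULTI);
    byKind.put(Kind.SERVER, Format.MULTI);
    byKind.put(Kind.PRODUCER, Format.SINGLE);
    byKind.put(Kind.CONSUMER, Format.SINGLE);
  }

  // Overrides the format written for one span kind, leaving the others alone
  InjectFormats injectFormat(Kind kind, Format format) {
    byKind.put(kind, format);
    return this;
  }

  Format formatFor(Kind kind) {
    return byKind.get(kind);
  }
}
```

Only the write side is modeled here. Which headers are read is a separate concern.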

To override this policy, you can use a builder like below. The following makes RPC and HTTP instrumentation write the "b3" single format, as messaging spans do.

tracingBuilder.propagationFactory(B3Propagation.newFactoryBuilder()
      .injectFormat(Format.SINGLE) // things that don't extend `brave.Request`
      .injectFormat(Span.Kind.CLIENT, Format.SINGLE)
      .injectFormat(Span.Kind.SERVER, Format.SINGLE)
      .build());

Regardless of this policy, both "b3" and "x-b3-" headers are read, as if we changed that, it would break existing sites! Tolerant reads are best for compatibility and interop!

Note: Our HTTP, RPC, and Messaging abstractions employ these by default by extending brave.Request. If you are not using our abstractions, please consider updating to also use brave.Request for remote instrumentation so that users can easily tune header formats!

Minor changes

  • OSGi manifests are now present in all jars
  • @jeqo fixed where Kafka and JMS callbacks couldn't read no-op trace contexts
  • @simondean allowed kafka-streams to inherit policy from kafka-clients instrumentation
  • @rmichela fixed where SQS used with JMS could raise exceptions where it shouldn't

Brave 5.8

07 Oct 20:16

Brave 5.8 significantly improves sampling infrastructure and begins an RPC abstraction.

Functional sampling with SamplerFunction

Over time, we've accumulated a number of sampling tools. For example, we have DeclarativeSampler, which can be used to process Java annotations, and HttpRuleSampler for path and method rules. We've consolidated all parameterized samplers under a functional interface: SamplerFunction, which allows better re-use as we journey into more models such as RPC and Messaging.

Here's an example using functional expressions to declare "/health" should never be sampled, while we want POST /api requests limited to 100 traces per second:

import static brave.http.HttpRequestMatchers.*;
import static brave.sampler.Matchers.and;

httpTracing = httpTracingBuilder.serverSampler(HttpRuleSampler.newBuilder()
  .putRule(pathStartsWith("/health"), Sampler.NEVER_SAMPLE)
  .putRule(and(methodEquals("POST"), pathStartsWith("/api")), RateLimitingSampler.create(100))
  .build()).build();

As a side effect, we've deprecated Tracer.withSampler in favor of lighter methods that accomplish the same:

  • Tracer.startScopedSpan(name, sampler, param)
  • Tracer.nextSpan(sampler, param)

New RPC abstraction

In Brave's main repository, we have two different RPC libraries: Dubbo and gRPC. We've started an RPC abstraction with sampling.

For example, here's a sampler that traces 100 "GetUserToken" requests per second. This doesn't start new traces for requests to the health check service. Other requests will use a global rate provided by the tracing component.

import static brave.rpc.RpcRequestMatchers.*;

rpcTracing = rpcTracingBuilder.serverSampler(RpcRuleSampler.newBuilder()
  .putRule(serviceEquals("grpc.health.v1.Health"), Sampler.NEVER_SAMPLE)
  .putRule(methodEquals("GetUserToken"), RateLimitingSampler.create(100))
  .build()).build();

While the conventions above are gRPC centric, the code is 100% portable. In other words, Dubbo tracing uses exactly the same RpcTracing component and rules can be mixed in the same way as they can with our HttpTracing component.
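
To make the fall-through to the global rate concrete, here is a stand-alone model of rule evaluation. It assumes rules are checked in insertion order and that a null result means "no rule matched, defer to the global sampler". FirstMatchRules and Call are hypothetical names, not Brave's RpcRuleSampler, and a fixed boolean stands in for a rate-limited sampler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model for illustration; not Brave's RpcRuleSampler.
final class FirstMatchRules {
  // Minimal request shape: just the two properties the rules above inspect
  static final class Call {
    final String service;
    final String method;

    Call(String service, String method) {
      this.service = service;
      this.method = method;
    }
  }

  private static final class Rule {
    final Predicate<Call> matcher;
    final boolean sampled; // stand-in for a real sampler's decision

    Rule(Predicate<Call> matcher, boolean sampled) {
      this.matcher = matcher;
      this.sampled = sampled;
    }
  }

  private final List<Rule> rules = new ArrayList<>();

  FirstMatchRules putRule(Predicate<Call> matcher, boolean sampled) {
    rules.add(new Rule(matcher, sampled));
    return this;
  }

  /** Returns the decision of the first matching rule, or null to defer to the global sampler. */
  Boolean trySample(Call call) {
    for (Rule rule : rules) {
      if (rule.matcher.test(call)) return rule.sampled;
    }
    return null;
  }
}
```

With a health-check rule first and a "GetUserToken" rule second, a health check is never sampled, a token request gets its own decision, and any other call returns null so the global sampler decides.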

Thanks very much to @trustin @anuraaga and @jeqo for design and code review.

Tracing.Builder.alwaysReportSpans()

Most users will want to use defaults when deciding where to report trace data. Some advanced sites will have a trace forwarder, a proxy that sends data to one or more places. One open source example is https://github.com/HotelsDotCom/pitchfork. In cases like this, one proxy may want all data, regardless of whether it is B3 sampled or not. This change introduces Tracing.Builder.alwaysReportSpans(), primarily in support of the secondary sampling project.

Brave 5.7

07 Oct 19:57

Brave 5.7 introduces configuration customizers and revamps our HTTP abstraction.

XXXCustomizer

Users want to customize only certain parts of the tracing subsystem, letting frameworks configure the rest. For example, a user wants to override the sampler, but not affect the span reporter. This split of concerns works as long as it is possible to run multiple configurations before constructing a component.

Along with a dependency injection tool, customizers help decouple configuration and also ease testing of new features. Users provide XXXCustomizer instances, and frameworks like spring-cloud-sleuth call them at the right time.

In Brave 5.7, we expose several interfaces to customize configuration and use them in our spring-beans integration.

  • CurrentTraceContextCustomizer - called before invoking CurrentTraceContext.Builder.build()
  • ExtraFieldCustomizer - called before invoking ExtraFieldPropagation.FactoryBuilder.build()
  • HttpTracingCustomizer - called before invoking HttpTracing.Builder.build()
  • TracingCustomizer - called before invoking Tracing.Builder.build()

HTTP abstraction overhaul

Before, we used an adapter model to parse a framework type, such as HttpServletRequest. This was a partial function, and aimed to eliminate an extra object allocation otherwise needed during parsing. This worked well until we started to see propagation integration.

For example, Netflix wanted to inspect the http path to make a secondary sampling decision. They could see the request type as a parameter of Extractor.extract(). However, this function had no access to the adapter which would provide means to parse the path. In other words, without a known http request type, secondary sampling could not be accomplished portably.

Through quite a lot of effort, we overhauled every http integration so that they directly use new types: brave.http.HttpClientRequest and brave.http.HttpServerRequest. This opens up new integration possibilities, not just secondary sampling, but also deciding at runtime which headers to use on a per-request basis.

Thanks @basvanbeek for the design help and @anuraaga @devinsba and @jeqo for review

External timestamps in HTTP client and server handling

Armeria supports RequestLog.responseEndTimeNanos() for more precise timestamps, which line up with metrics. Our HTTP handlers now allow external timestamps, so that advanced frameworks can collaborate in timing decisions. Thanks for the help by @trustin and @kojilin

FinishedSpanHandler.supportsOrphans()

@lambcode noticed that FinishedSpanHandler instances were not called when data is orphaned by buggy instrumentation. While this is technically correct (the "finished" in FinishedSpanHandler is about the correct case), there are use cases to process orphaned data. You can now implement FinishedSpanHandler.supportsOrphans() to indicate that a handler wants all data, not just complete data. Thanks @lambcode for the code spike and @devinsba @anuraaga for review.