What is the best way of running an aggregation over a subset of the file? #333

Answered by Bluetopia
Bluetopia asked this question in Q&A

So if I understand the suggestion correctly, it would be:

  • Instantiate the aggregation that performs the date filtering, with the start/end parameters specified.
  • Load the aggregation using GCToolKit.loadAggregation.
  • Analyze the file using GCToolKit.analyze().
  • Retrieve the result using jvm.getAggregation() (see the sketch below).
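
A minimal sketch of those steps, assuming a hypothetical, user-defined DateRangeSummary aggregation whose constructor takes the start/end window (its Aggregation/Aggregator definition is omitted here). Note that jvm.getAggregation() is written as returning the aggregation directly; some GCToolKit versions return the result wrapped in an Optional instead:

```java
import java.nio.file.Path;
import java.time.ZonedDateTime;

import com.microsoft.gctoolkit.GCToolKit;
import com.microsoft.gctoolkit.io.SingleGCLogFile;
import com.microsoft.gctoolkit.jvm.JavaVirtualMachine;

public class DateRangeExample {
    public static void main(String[] args) throws Exception {
        // 1. Instantiate the aggregation with the date window baked in.
        //    DateRangeSummary is a hypothetical user-defined Aggregation (definition omitted).
        DateRangeSummary summary = new DateRangeSummary(
                ZonedDateTime.parse("2023-01-01T00:00:00Z"),
                ZonedDateTime.parse("2023-01-02T00:00:00Z"));

        // 2. Register the instance with the toolkit.
        GCToolKit gcToolKit = new GCToolKit();
        gcToolKit.loadAggregation(summary);

        // 3. Analyze the log; every loaded aggregation sees the event stream during this single pass.
        JavaVirtualMachine jvm = gcToolKit.analyze(new SingleGCLogFile(Path.of("gc.log")));

        // 4. Pull the populated aggregation back out of the analysis result.
        DateRangeSummary result = jvm.getAggregation(DateRangeSummary.class);
        System.out.println(result);
    }
}
```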

If this is the case then I may have misunderstood the way this library is used. I thought I'd be able to "replay" the full event sequence against an aggregator, such that I could perform the same operation against a smaller data set. Rather, it seems that all the aggregations get run once during GCToolKit.analyze(), so if I want to re-process data, I'll need to cache the values of interest in the…
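
For the cache-and-reprocess route, a rough sketch of the idea (not anything the library prescribes): have the aggregation record the raw values of interest during the single analyze() pass, then run any later queries over that cached data in memory. PauseSample and the sample values below are hypothetical stand-ins for whatever the aggregation cached:

```java
import java.time.ZonedDateTime;
import java.util.List;

public class CachedReplayExample {

    // Hypothetical shape of the values cached by the aggregation during the
    // single GCToolKit.analyze() pass: one entry per observed pause event.
    record PauseSample(ZonedDateTime timestamp, double durationSeconds) {}

    // "Replay" the cached samples against an arbitrary date window without
    // re-parsing the GC log.
    static double totalPauseSeconds(List<PauseSample> cached,
                                    ZonedDateTime start, ZonedDateTime end) {
        return cached.stream()
                .filter(s -> !s.timestamp().isBefore(start) && s.timestamp().isBefore(end))
                .mapToDouble(PauseSample::durationSeconds)
                .sum();
    }

    public static void main(String[] args) {
        List<PauseSample> cached = List.of(
                new PauseSample(ZonedDateTime.parse("2023-01-01T00:10:00Z"), 0.045),
                new PauseSample(ZonedDateTime.parse("2023-01-01T06:30:00Z"), 0.120),
                new PauseSample(ZonedDateTime.parse("2023-01-02T01:00:00Z"), 0.300));

        // Only the first two samples fall inside the window (0.045 s + 0.120 s).
        System.out.println(totalPauseSeconds(cached,
                ZonedDateTime.parse("2023-01-01T00:00:00Z"),
                ZonedDateTime.parse("2023-01-02T00:00:00Z")));
    }
}
```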

Answer selected by karianna