QueryGen - Duplicates #51

ljukas · 2019-08-23T08:30:15Z

Hello.

When we met at the mid-term review we talked about the test methodology. One thing we mentioned was that we wanted to run throughput tests with the different query-templates.

For some query-templates, only a few distinct queries can be generated. For example, query-template 2 yields only 22 different queries when the database is created with the regular settings.

One way to combat this was for the querygen to be able to generate duplicates. I'm ready, or very nearly ready, to start running the real tests now. Would it be possible to add an option to the querygen that makes it generate duplicates? Or should I just copy-paste the generated queries to get more of them?

hartig · 2019-08-23T12:42:08Z

I think in the particular case of your experiments it would be better to have the test driver reuse queries once it runs out of available queries (i.e., essentially, starting from the beginning again). This approach is easier to control and it allows for deterministic experiment runs, which is not the case if the query generator simply creates duplicates in a random fashion. Of course, instead of implementing this approach in your test driver, you may also simply copy-paste the generated queries to achieve the same result. In any case, you need to think a bit about how you want to adapt this approach for multi-client experiments. Please outline a strategy in an email to us and we can discuss it further.
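To make the wrap-around idea concrete, here is a minimal sketch of how a test driver might replay a fixed query pool. The function name `replay_queries` and its `start` parameter are hypothetical and not part of the existing test driver. The `start` offset is only one conceivable way to vary the sequence per client; the multi-client strategy itself is still open.

```python
def replay_queries(queries, total, start=0):
    """Return `total` queries from `queries`, starting from the beginning
    again whenever the pool is exhausted.

    `start` is a hypothetical offset into the pool (an assumption, e.g. to
    give each client a different starting point); it defaults to 0.
    """
    if not queries:
        raise ValueError("query pool is empty")
    n = len(queries)
    # Index modulo the pool size, so the order is fixed and repeatable.
    return [queries[(start + i) % n] for i in range(total)]
```

With a pool of 22 queries, as for query-template 2, a call such as `replay_queries(pool, 50)` would return the 22 queries in order, then the same 22 again, then the first 6. The same inputs always give the same sequence, so runs stay deterministic.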

Having said that, perhaps there are use cases in which we actually want workloads with duplicates. However, this requires a more systematic approach than just randomly generating queries without caring about duplicates. Instead, it should be possible to control the fraction of duplicates within the generated workloads. While developing (and implementing) such a more systematic approach is not a priority at the moment, we can leave this issue open as a reminder that we may come back to it later if needed.
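As a rough illustration of what controlling the fraction of duplicates could look like, the sketch below builds a workload in which a chosen share of entries repeat other entries. Everything in it is an assumption for discussion: the name `workload_with_duplicates`, the `duplicate_fraction` and `seed` parameters, and the definition of "fraction of duplicates" as the share of workload entries that are repeats. It is not an existing querygen feature.

```python
import random


def workload_with_duplicates(queries, total, duplicate_fraction, seed=0):
    """Build a workload of `total` queries in which a fixed share are repeats.

    Assumes `queries` contains only distinct queries. `duplicate_fraction`
    is the share of entries that repeat another entry (0.0 <= f < 1.0).
    A fixed `seed` keeps the generated workload reproducible.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 <= duplicate_fraction < 1.0:
        raise ValueError("duplicate_fraction must be in [0.0, 1.0)")

    num_duplicates = int(total * duplicate_fraction)  # rounded down
    num_distinct = total - num_duplicates
    if num_distinct > len(queries):
        raise ValueError("not enough distinct queries in the pool")

    rng = random.Random(seed)  # private generator, independent of global state
    distinct = rng.sample(queries, num_distinct)
    duplicates = [rng.choice(distinct) for _ in range(num_duplicates)]

    workload = distinct + duplicates
    rng.shuffle(workload)  # interleave repeats with first occurrences
    return workload
```

Because the generator is seeded, two calls with the same arguments return the same workload, which would keep experiment runs deterministic even though duplicates are chosen at random.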

hartig added the "wontfix at the moment" label (This will not be worked on at the moment) · Aug 23, 2019
