
Support for Machine-specific configuration and denoise configuration #257

Open
smarr opened this issue Aug 27, 2024 · 2 comments

Comments

@smarr
Owner

smarr commented Aug 27, 2024

With the latest machines coming into our benchmarking infrastructure being hybrid processors with efficiency and performance cores (big.LITTLE-style architectures), it becomes more desirable to configure which cores benchmarks execute on.

At the moment, denoise uses cset to enable shielding for a rather large number of cores, based on a simple heuristic that leaves some room for the system.

from math import floor, log


def _shield_lower_bound(num_cores):
    # reserve the first few cores for the OS; grows slowly with core count
    return int(floor(log(num_cores)))


def _shield_upper_bound(num_cores):
    # shield everything up to and including the last core
    return num_cores - 1


def _activate_shielding(num_cores):
    min_cores = _shield_lower_bound(num_cores)
    max_cores = _shield_upper_bound(num_cores)
    core_spec = "%d-%d" % (min_cores, max_cores)
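For illustration, the heuristic can be restated as a self-contained sketch that shows which core specs it produces for a few machine sizes (the function names mirror the snippet above; the demo itself is not part of denoise):

```python
from math import floor, log

def shield_core_spec(num_cores):
    # same heuristic as denoise: shield cores [floor(log(n)) .. n-1]
    lower = int(floor(log(num_cores)))
    upper = num_cores - 1
    return "%d-%d" % (lower, upper)

print(shield_core_spec(8))   # "2-7":  log(8) ~ 2.08, floor -> 2
print(shield_core_spec(64))  # "4-63": log(64) ~ 4.16, floor -> 4
```

On an 8-core machine this leaves only cores 0 and 1 for the system, which says nothing about whether cores 2-7 are performance or efficiency cores.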

With hybrid architectures, but also multi-socket systems, it is desirable to decide more proactively what is executed where.
One may even want to compare the different types of cores.

So, I think a purely automatic approach to selecting cores is not sufficient.
Instead, it would be good to be able to configure the settings explicitly.

denoise currently knows the following configuration parameters:

  • use_nice
  • use_shielding
  • for_profiling

And ReBench's configuration system has the following priority list of configurations:

  1. benchmark
  2. benchmark suites
  3. executor
  4. experiment
  5. experiments
  6. runs (as defined by the root element)

Here 1. overrides all other configurations, and 6. has the least priority.
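As an illustrative sketch (not ReBench's actual implementation), this kind of priority chain behaves like a dictionary merge where lower-priority levels are applied first, so higher-priority levels override them:

```python
def resolve(*levels):
    """Merge config levels ordered from highest priority (benchmark)
    to lowest (runs); higher-priority values win."""
    result = {}
    for level in reversed(levels):  # apply lowest priority first
        result.update(level)        # later (higher-priority) updates override
    return result

benchmark = {"invocations": 10}
executor = {"invocations": 3, "iterations": 5}
runs = {"iterations": 1, "warmup": 2}

# benchmark's invocations and executor's iterations win over runs
print(resolve(benchmark, executor, runs))
```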

Since #170, we also have the option to mark invocations, iterations, and warmup settings as "important" with the ! suffix. Though, that's not yet documented...

So, at this point, I am thinking of adding a new lowest level of priority to the list: machine.
Then we have the priority list:

  1. benchmark
  2. benchmark suites
  3. executor
  4. experiment
  5. experiments
  6. runs
  7. machine

#161 already introduced the notion of a machine to be able to filter by it so that we can run benchmarks easily on specific machines.

With a new type of setting for denoise in the configuration as part of the run details (rebench-schema.yml), we could then do something like:

runs:
  denoise:
    shield: 1-5  # using the cset syntax

As well as:

machines:
  yuria1:
    denoise:
      shield: 7, 8, 9
    invocations: 4
  yuria2:
    denoise:
      shield: 1-3,40-50
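For reference, a cset-style core spec such as "1-3,40-50" or "7, 8, 9" expands to a plain list of core ids. A minimal sketch of such a parser (illustrative only; cset itself does this parsing):

```python
def parse_core_spec(spec):
    """Expand a cset-style core spec like '1-3,40-50' into a sorted core list."""
    cores = set()
    for part in str(spec).split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            cores.update(range(int(lo), int(hi) + 1))  # inclusive range
        else:
            cores.add(int(part))
    return sorted(cores)

print(parse_core_spec("1-3,40-50"))  # [1, 2, 3, 40, 41, ..., 50]
print(parse_core_spec("7, 8, 9"))    # [7, 8, 9]
```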

Of course, this opens the possibility to also do:

benchmark_suites:
  ExampleSuite:
    invocations: 3
    denoise:
      nice: false

So, we may need to frequently change the denoise settings.
Though, because of #249, we should rework how denoise settings are applied anyway.
This should also consider #168.

@OctaveLarose
Contributor

That sounds sound to me. Just make the default all cores and denoise (like it currently is, correct?). You probably also want to add a warning to ReBench when it runs on an architecture that might not play nicely with the current default settings?

@smarr
Owner Author

smarr commented Sep 4, 2024

One of the issues I am not yet quite sure about is that we already have a notion of machine or rather machines.

Introduced with #161 and in the schema here:
https://github.com/smarr/ReBench/blob/master/rebench/rebench-schema.yml#L140-L145

Having two independent notions of machine seems like a great source for confusion.

So, I am currently thinking I would want to keep these two features separate. The machine feature introduced by #161 is really only used to filter the set of experiments based on a "tag". It's very convenient for splitting a configuration into something that can be run on multiple machines.

However, it does not really relate to the machine itself.

Of course another option would be to combine the two notions of machine. For the machine-based configuration, I am thinking we may want a rebench -m yuria1 command-line option so that rebench activates the configuration of the selected machine for the execution.

This could then also filter the benchmarks at the same time.
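To make the idea concrete, a hypothetical sketch of such a command-line option (the flag name and behavior are assumptions from this discussion, not ReBench's actual CLI):

```python
import argparse

# hypothetical "-m" flag as proposed above; ReBench's real CLI may differ
parser = argparse.ArgumentParser(prog="rebench")
parser.add_argument("-m", "--machine",
                    help="activate the configuration of the named machine "
                         "and filter experiments tagged with it")
parser.add_argument("config", nargs="?", default="rebench.conf")

args = parser.parse_args(["-m", "yuria1", "rebench.conf"])
print(args.machine)  # yuria1
```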

By keeping things separate, we'd have more flexibility. On the one hand, one would not need to "tag" benchmarks for a specific machine, and could simply run the same set on multiple machines, with the corresponding configuration applied.

On the other hand, we don't really have any use case for a separate tagging mechanism. While it could be useful, for instance, to tag fast or slow benchmarks, or things like latency vs throughput, even when they are in the same benchmark suite, we have not really needed it so far.

smarr added a commit that referenced this issue Nov 7, 2024
This is in preparation for the support of #257, which will add machine and denoise configuration support.
Though, it’s also useful without, because it allows us to distinguish RunIds that differ in their environment variables, for instance. Or more generally, RunIds that differ based on any property that is not strictly part of the command line, which was previously used to establish equality.

As a consequence, this means we have a much weaker ability to determine equality of RunIds than before, but I think that’s fine and less surprising/buggy.

The main changes are a proper implementation of the __eq__, __lt__, and __hash__ methods, as well as the serialization of RunIds into the data file, and deserialization when loading a data file.
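The shape of such an equality/ordering/hashing implementation can be sketched as follows (illustrative only, not ReBench's actual RunId class; the fields here are assumed for the example):

```python
from dataclasses import dataclass

# a frozen dataclass generates __eq__ and __hash__ over all fields,
# so equality covers more than just the command line
@dataclass(frozen=True)
class RunId:
    cmdline: str
    env: tuple  # sorted (name, value) pairs of environment variables

    def __lt__(self, other):
        # total order so RunIds can be sorted deterministically
        return (self.cmdline, self.env) < (other.cmdline, other.env)

a = RunId("bench --size 10", (("PATH", "/usr/bin"),))
b = RunId("bench --size 10", (("PATH", "/opt/bin"),))
print(a == b)  # False: same command line, different environment
```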

This change includes some minor refactorings, because it was split out of a patch that had become too huge and unmanageable.

Minor refactorings:
 - configurator: extract config validation and assembling of run details
 - executor: use exe name and suite name directly in indicate_build
 - executor/BuildCommand: location is in BuildCommand only for equality, but semantically, we should use the location/path of the suite/executor when processing the build command. That’s now made explicit, and asserted for correctness.
 - BuildCommand is now storing only the command, since the location is used from suite/exe

Signed-off-by: Stefan Marr <git@stefan-marr.de>