LDES Evaluation Runner

Setup

Retrieve the dataset

First, install the required dependencies and build the setup code:

cd setup
npm install
npm run build

Then, you can gather the data by running:

MAX=100000 npx js-runner setup/pipeline.ttl
# OR
npx js-runner setup/pipeline-6months.ttl

The MAX environment variable sets the number of triples to download. The default for setup/pipeline.ttl is 100000. To download the full LDES, set MAX to 0.

Build the client

After the data has been gathered, you can build the client by running:

cd client
npm install
npm run build

Build the orchestrator

After the client has been built, you can build the orchestrator by running:

cd orchestrator
npm install
npm run build

Configuration

The benchmark is configured through .env files. The following variables can be set:

  • TYPE: The benchmark type. Currently supported: UPDATING_LDES.
  • EXEC_FILE: The file to execute. This is the file that will be benchmarked.
  • WARMUP_FILE: The file to execute during the warmup phase.
  • WARMUP_ROUNDS: The number of warmup iterations to run.
  • ITERATIONS: The number of benchmark iterations to run.
  • INGEST_PIPELINE: The pipeline to use for ingesting the LDES.
  • PAGE_SIZE: The page size to use for the LDES.
  • K_SPLIT: In the case of a time-based LDES, the number of child buckets to split a full bucket into.
  • MIN_BUCKET_SPAN: In the case of a time-based LDES, the minimum span of a bucket.
  • INTERVAL: The interval at which the LDES is updated.
  • AMOUNT_PER_INTERVAL: The number of members to add per interval.
  • EXPECTED_AMOUNT: For UPDATING_LDES, the expected number of members in the LDES after the benchmark. Used to end the benchmark.
  • POLL_INTERVAL: For UPDATING_LDES, the poll interval used by the ldes-client during the benchmark.
  • NGINX_CONFIG: The path to the nginx configuration file to use.
  • NGINX_SITE: The path to the nginx site configuration file to use.
  • DATABASE_URL: The URL to the database to use. Supported: mongodb://... and redis://....
  • NUM_CLIENTS: The number of clients that should simultaneously run during the benchmark.
  • REPLICATION_DATA: The path on the host to the replication data file to use. This is the input to ingest the LDES with.
  • METADATA_FILE: The path in the container to the metadata file to use. This is the metadata input to ingest the LDES with.
  • TIMESTAMP_PATH: The timestamp property in the LDES.
  • UNORDERED_RELATIONS: For an HourBucketizer LDES, whether to use default tree:Relations instead of ordered tree:GreaterThanOrEqualRelations.
  • CLIENT_ORDER: The order in which the LDES clients are started: ascending, descending, or none.
  • LDES_PAGE: The LDES page to extract the members from in case of the EXTRACT_MEMBERS benchmark type.
  • CBD_SPECIFY_SHAPE: Whether a shape should be specified for CBD.
  • CBD_DEFAULT_GRAPH: Whether to use the default graph for CBD.

Preconfigured .env files can be found in the env directory.
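
As an illustration of the file format, a partial .env file might look like the sketch below. Only the variable names, the UPDATING_LDES type, the mongodb:// scheme, and the ascending order come from the list above. Every concrete value (host, database name, counts, page size) is a made-up placeholder. Variables whose units or formats are not stated here, such as the intervals and file paths, are left out. The preconfigured files in the env directory remain the reference for complete, working configurations.

```shell
# Hypothetical, partial .env sketch. All values are placeholders, not recommendations.
TYPE=UPDATING_LDES
# Assumed local MongoDB instance; host, port, and database name are invented.
DATABASE_URL=mongodb://localhost:27017/ldes
# Two clients running simultaneously, started in ascending order.
NUM_CLIENTS=2
CLIENT_ORDER=ascending
# One warmup round followed by five measured iterations.
WARMUP_ROUNDS=1
ITERATIONS=5
# Placeholder page size for the LDES.
PAGE_SIZE=50
```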

Running the benchmark

To run the benchmark, execute the following commands:

# Run as many client runners as you want, optionally on different machines.
node client <name> <server-hostname>

# Run the benchmark orchestrator. It starts the benchmark and uses the client runners.
node orchestrator <env-file> <output-file>

<env-file> should be the absolute path to the .env file you want to use.
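
Because <env-file> must be an absolute path, a small wrapper can resolve a relative path before calling the orchestrator. The following is a minimal sketch and not part of the repository. The function names abs_path and run_benchmark are hypothetical, and it assumes the command is run from the repository root, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: turn a possibly relative file path into an absolute one.
abs_path() {
  # Resolve the file's directory in a subshell, then append the file name.
  dir=$(cd "$(dirname "$1")" && pwd) || return 1
  printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Hypothetical wrapper around the orchestrator command shown above.
# $1 = env file (relative or absolute), $2 = output file.
run_benchmark() {
  env_file=$(abs_path "$1") || return 1
  node orchestrator "$env_file" "$2"
}
```

With the client runners already started, a call such as run_benchmark env/example.env results.out would then pass the absolute path of the env file to the orchestrator. Both file names are placeholders.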