Environment variables
There is a backend.env file in the conf/ directory that contains all environment variables passed into the backend container. These variables control the configuration of the backend.
Semi-required variables are required for the system to work, but are set correctly in any default setup. Optional variables can be set to change the system's behavior.
The following variables are available:
- API_HOST (semi-required) The name or address under which the executor scripts can reach the API. Leave this unset when the usual docker-compose files are used, the local Docker instance is used, and networking is enabled (see the variables below); in those cases it is determined automatically. When the execution environments run on a remote Docker instance, or when the backend's Docker network is not reused for the execution environments, set this to the IP/hostname under which the webserver is reachable from the outside world.
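For the remote-Docker case, API_HOST can be pinned explicitly in conf/backend.env. The hostname below is a placeholder, not a value from this project:

```shell
# conf/backend.env — hypothetical entry; only needed when the executors
# run on a remote Docker instance or outside the backend's Docker network.
API_HOST=backend.example.com
```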
- DB_TYPE (semi-required, default: postgresql) The database type used for the configuration database. This must be a DB type valid in SQLAlchemy connection strings (https://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.create_engine).
- DB_HOST (semi-required, default: database) The host on which the configuration DB is reachable.
- DB_DATABASE (semi-required, default: postgres) The name of the database on the DBMS running on DB_HOST.
- DB_USER (semi-required, default: postgres) The user used to log in to the DB.
- DB_PASSWORD (semi-required, default: postgres) The password for that user.
- SQLALCHEMY_DATABASE_URI (semi-required) Generated automatically from the settings above; can be set explicitly instead. Must be an SQLAlchemy-compatible connection string. If set, the settings above are ignored.
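The precedence described above can be sketched as follows. This is a hypothetical helper illustrating how the automatic generation might work, not the backend's actual code; the default values mirror the ones documented above:

```python
import os


def build_db_uri() -> str:
    """Assemble an SQLAlchemy connection string from the DB_* variables.

    An explicit SQLALCHEMY_DATABASE_URI takes precedence; otherwise the
    URI is built from DB_TYPE, DB_USER, DB_PASSWORD, DB_HOST and
    DB_DATABASE, falling back to the documented defaults.
    """
    explicit = os.environ.get("SQLALCHEMY_DATABASE_URI")
    if explicit:
        # If set, the individual DB_* settings are ignored.
        return explicit
    db_type = os.environ.get("DB_TYPE", "postgresql")
    user = os.environ.get("DB_USER", "postgres")
    password = os.environ.get("DB_PASSWORD", "postgres")
    host = os.environ.get("DB_HOST", "database")
    database = os.environ.get("DB_DATABASE", "postgres")
    return f"{db_type}://{user}:{password}@{host}/{database}"
```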
- DATA_SOURCE_CONNECTIONS (optional) If set, must be a JSON object mapping names to SQLAlchemy-compatible connection strings (see above). An example would be:
DATA_SOURCE_CONNECTIONS={"hana":"hana+pyhdb://user@host:port"}
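Parsing such a value amounts to decoding a JSON object of strings. The helper below is a hypothetical sketch of that validation, not code from this repository:

```python
import json
import os


def load_data_source_connections() -> dict:
    """Parse DATA_SOURCE_CONNECTIONS into a name -> connection-string dict.

    Returns an empty mapping when the variable is unset; rejects values
    whose entries are not plain strings.
    """
    raw = os.environ.get("DATA_SOURCE_CONNECTIONS", "{}")
    mapping = json.loads(raw)
    if not isinstance(mapping, dict) or not all(
        isinstance(v, str) for v in mapping.values()
    ):
        raise ValueError("expected a JSON object mapping names to strings")
    return mapping
```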
- DAEMON_CYCLE_TIME (semi-required, default: 5 seconds) The interval in seconds at which the background daemon checks for jobs that should be running but have stopped doing so. Decreasing it may increase the load on the Docker daemon running the experiments, but makes the application more responsive, as failed jobs are marked as such faster.
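The daemon's cadence boils down to a simple poll-and-sleep loop. A minimal sketch, where `check_jobs` stands in for the real reconciliation against the Docker daemon and `max_cycles` exists only to make the loop testable:

```python
import time


def daemon_loop(check_jobs, cycle_time: float = 5.0, max_cycles=None):
    """Poll for stalled jobs every DAEMON_CYCLE_TIME seconds.

    Hypothetical illustration of the background daemon's timing; the
    real daemon runs indefinitely (max_cycles=None).
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        check_jobs()  # mark jobs as failed if their container is gone
        time.sleep(cycle_time)
        cycles += 1
```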
- UWSGI_NUM_PROCESSES (semi-required, default: 4) Set this when the system runs in a non-default configuration where the uWSGI instance is configured elsewhere or not used at all. It is needed because only one daemon should run, while the app factory is executed once per worker. To decide which worker launches the daemon, every worker computes its process ID modulo the number of uWSGI processes; since the workers have consecutive PIDs (for example 4, 5, 6, 7), this yields 0 for exactly one of them, and that worker launches the daemon. Set this variable whenever the number of workers differs from the value in uwsgi.ini for any reason, to avoid launching the daemon multiple times.
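The PID-modulo election described above can be demonstrated in a few lines. This is an illustrative sketch of the trick, not the backend's actual implementation:

```python
def should_launch_daemon(pid: int, num_processes: int) -> bool:
    """Return True for exactly one worker in a block of consecutive PIDs.

    With consecutive PIDs, exactly one PID in any window of
    `num_processes` workers is divisible by `num_processes`.
    """
    return pid % num_processes == 0
```

With four workers at PIDs 4, 5, 6 and 7, only PID 4 satisfies `pid % 4 == 0`, so only that worker starts the daemon. If UWSGI_NUM_PROCESSES were wrong (say 3 while four workers run), zero or several workers could win the election, which is exactly the failure mode the variable guards against.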
- RESULT_READ_BUFF_SIZE (semi-required, default: 16384 kB) The buffer size used by the JSON parser when an experiment's result is posted. Set it according to the available RAM.
- RESULT_WRITE_BUFF_SIZE (semi-required, default: 1024 objects) The number of objects held in RAM before they are sent as one bulk insert to the configuration DB when a result is written. Decreasing it reduces the RAM required by the worker handling the request, but increases the time needed to process the result, as more DB operations are issued.
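The RAM-versus-round-trips trade-off comes from chunked bulk inserts. A hypothetical sketch of that buffering, with `bulk_insert` standing in for the real DB call:

```python
from typing import Callable, List


def write_results(rows, bulk_insert: Callable[[List[dict]], None],
                  buff_size: int = 1024):
    """Flush result rows to the DB in chunks of RESULT_WRITE_BUFF_SIZE.

    A larger buffer means fewer bulk inserts but more rows held in RAM
    at once; a smaller one means the opposite.
    """
    buffer: List[dict] = []
    for row in rows:
        buffer.append(row)
        if len(buffer) >= buff_size:
            bulk_insert(buffer)
            buffer = []
    if buffer:  # flush the remainder
        bulk_insert(buffer)
```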
- LOAD_SEPARATION_SET (semi-required, default: true) Set to 'true' to accept separation sets when results are sent back to the server. Setting it to 'false' (or any value other than 'true') disables their parsing on the server side and notifies the script via a command-line parameter that it should not send separation sets with the result.
- DOCKER_BASE_URL (semi-required, default: the local docker.sock) The socket of the Docker daemon that launches the experiments in their images. The execution-environment images must be available to this daemon.
- DOCKER_EXECUTION_NETWORK (semi-required, default: the network used by the server itself, so that the backend is directly reachable from the execution environments) The Docker network that experiment containers are attached to. Set it to an empty value to disable network attachment entirely.
- DOCKER_MOUNT_LOG_VOLUME (semi-required, default: true) If set to 'true', the system keeps a central volume that is mounted into every execution environment and serves as emergency storage, for example when an error occurred or the result cannot be sent. In the default environment, the dump of the failed request is stored in this volume.
- DOCKER_LOG_VOLUME_NAME (semi-required, default: 'mpci_worker_logs') The name of the volume used as emergency storage.
- DOCKER_LOG_VOLUME_MOUNT_PATH (semi-required, default: '/logs') The path under which the emergency storage is available inside the execution environments.