This is a short introduction to the configuration module. As mentioned in the main README, this module provides the flexibility to control parameters in the other modules.
The platform leverages a `global_config.yaml` to set a small number of parameters that apply widely across multiple models. In addition, each model has its own config file to enable custom adjustment.
`global_config.yaml` determines the following setups (a sketch of the file's overall shape follows this list):

- `all` - applying to all models
  - `prediction_tasks`: a list of all supported model prediction tasks
  - `ds_keys`: a list of dataset keys involved in the evaluation
  - `flag_more_feat_types`: whether to use additional feature types. Currently this can only be `True` when `ds_keys` contains only `INS-W`.
- `ml` - applying to traditional models
  - `save_and_reload`: a flag indicating whether to save and re-use features repeatedly (intermediate files are saved in the `tmp` folder). Default `False`. Be careful when turning this flag on, as the feature file will not be updated once it is saved. Set it to `True` only when re-running the exact same algorithm.
- `dl` - applying to deep models
  - `best_epoch_strategy`: a flag for choosing the best training epoch as the final prediction model: `direct` or `on_test`. When set to `direct`, the standard strategy is used: picking the best training epoch on the validation/training set. When set to `on_test`, another strategy is used that involves information leakage: it iterates through all training epochs, performs the same `direct` strategy at each epoch, and then compares the results on the testing set across all epochs to identify the best epoch. These results only indicate whether a model is overfitted, and reflect the theoretical upper-bound performance during training.
  - `skip_training`: similar to `save_and_reload` in `ml`, this is a flag to accelerate the deep model evaluation process. A model's intermediate training-epoch results are saved in the `tmp` folder. When this flag is turned on, the model can leverage the saved results to re-identify the best epoch. A typical usage case: (1) set `skip_training` to `False` and `best_epoch_strategy` to `direct` to go through the training; (2) set `skip_training` to `True` and `best_epoch_strategy` to `on_test` to find another epoch without the need to re-train the model.
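
To make the structure concrete, here is a minimal sketch of how these three sections fit together in `global_config.yaml`. The keys mirror the descriptions above, but the values are illustrative placeholders rather than the platform's actual defaults:

```yaml
# Sketch of global_config.yaml (placeholder values, not real defaults)
all:
  prediction_tasks: ["example_task"]  # all supported model prediction tasks
  ds_keys: ["INS-W"]                  # dataset keys involved in the evaluation
  flag_more_feat_types: True          # only valid when ds_keys contains only INS-W

ml:
  save_and_reload: False              # True only when re-running the exact same algorithm

dl:
  best_epoch_strategy: "direct"       # "direct" or "on_test"
  skip_training: False                # True reuses saved epoch results from the tmp folder
```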
It is worth noting that `global_config.yaml` will overwrite the individual config files on the same items (illustrated below). This can save the effort of changing individual parameters one by one.
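
As a hypothetical illustration of this precedence, suppose an individual config and `global_config.yaml` both set `save_and_reload` (the file layout and values here are purely illustrative):

```yaml
# --- some individual ml config (hypothetical content) ---
save_and_reload: False

# --- global_config.yaml ---
ml:
  save_and_reload: True   # this value takes effect for all traditional models
```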
Each algorithm can lead to one or more models, and each model is accompanied by a config yaml file with a unique name.
Here is a list of the currently supported models:
- Traditional Machine Learning Models
  - Canzian et al. - `ml_canzian.yaml`
  - Saeb et al. - `ml_saeb.yaml`
  - Farhan et al. - `ml_farhan.yaml`
  - Wahle et al. - `ml_wahle.yaml`
  - Lu et al. - `ml_lu.yaml`
  - Wang et al. - `ml_wang.yaml`
  - Xu et al. - Interpretable - `ml_xu_interpretable.yaml`
  - Xu et al. - Personalized - `ml_xu_personalized.yaml`
  - Chikersal et al. - `ml_chikersal.yaml`
- Deep-learning Models
  - ERM
  - Mixup - `dl_erm_mixup.yaml`
  - DANN
  - IRM - `dl_irm.yaml`
  - CSD
  - MLDG
  - MASF
  - Siamese - `dl_siamese.yaml`
  - Clustering - `dl_clustering.yaml`
  - Reorder - `dl_reorder.yaml`