Populating the Database
Sibyl currently offers three approaches for populating the database.
The command-line option does not currently support editing existing databases.
To populate the database with the command line, you will need to:
Prepare a config YAML (see below).
Prepare all required input CSVs and put them in a single directory.
Run the command-line script:
where config.yaml is the path to your config YAML and directory is the path to the directory containing your inputs.
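Before running the script, it can help to confirm that the input directory contains the files you expect. The following is a minimal sketch, not part of Sibyl: the helper name missing_inputs is hypothetical, and the file names are the defaults listed under "Data locations". Which of these files your application actually requires is an assumption you should adjust.

```python
from pathlib import Path

# Default file names from the "Data locations" options in this document.
# Assumption: adjust this tuple to the files your application really needs.
DEFAULT_INPUTS = (
    "entities.csv",
    "features.csv",
    "categories.csv",
    "context_config.yaml",
    "realapp.pkl",
)


def missing_inputs(directory, expected=DEFAULT_INPUTS):
    """Return the expected file names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means every expected file was found; otherwise the list names the files to add or rename in the config.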
Config YAML
See the data preparation config template for the most up-to-date set of configurations.
A minimal working database preparation config looks like:
The full set of currently supported configs includes:
General setup
database_name: Name of database to create.
drop_old: If True, drop any existing database with the selected name before creating the new one. Default: False
Data locations
entity_fn: Name of file with entity data. Default: "entities.csv"
feature_fn: Name of file with feature data. Default: "features.csv"
realapp_fn: Name of file with pickled RealApp object (for single-model applications). Default: "realapp.pkl"
realapp_directory_name: Name of directory with multiple pickled RealApp objects. The model IDs will be set to the filenames. Overrides realapp_fn.
context_config_fn: Name of context configuration YAML (see Configuring Applications). Default: "context_config.yaml"
category_fn: Name of the file with category data. Default: "categories.csv"
Model processing configurations
model_id: Model ID (for single-model applications)
label_column: Name of label column in training dataset (y-values). Default: "label"
fit_explainers: If True, fit all explainers in the provided RealApp(s) using the training dataset. Default: False
training_size: If fit_explainers is True, the number of entities to use to fit. Default: 1000
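To make the defaults and the override rule concrete, here is a small Python sketch of how a partial config could be resolved against the documented defaults. It illustrates the option list above and is not Sibyl's own implementation: resolve_config is a hypothetical helper, and the database name "housing" in the usage line is a made-up value.

```python
# Defaults exactly as documented in the option list above.
# Options with no documented default (database_name, model_id,
# realapp_directory_name) are left out.
DEFAULTS = {
    "drop_old": False,
    "entity_fn": "entities.csv",
    "feature_fn": "features.csv",
    "realapp_fn": "realapp.pkl",
    "context_config_fn": "context_config.yaml",
    "category_fn": "categories.csv",
    "label_column": "label",
    "fit_explainers": False,
    "training_size": 1000,
}


def resolve_config(user_config):
    """Merge user-supplied options over the documented defaults."""
    config = {**DEFAULTS, **user_config}
    if "realapp_directory_name" in config:
        # Per the option list, realapp_directory_name overrides realapp_fn.
        config.pop("realapp_fn", None)
    return config
```

For example, resolve_config({"database_name": "housing"}) keeps every default and only adds the database name, while also passing realapp_directory_name removes realapp_fn from the result.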