Project configuration

Cubonacci requires a project configuration so that it knows exactly what to do in certain situations. The configuration is stored in a YAML file named cubonacci.yaml, located in the root of your repository. New to YAML? You can find a quick introduction over here.

The first required value is cubonacciVersion, which refers to the version of the configuration format that you are using. At the moment, this is v1 or alpha1. The second value is description, a description of the project. The other fields are explained in detail in the following sections.

Runtime

The runtime section describes which language you are using in its name field; at the moment, only python is supported. The version field configures which exact version is required.

Optimization

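The optimization section specifies, in its metric field, which metric to optimize directly. This metric is used to guide the hyperparameter search and to determine the best trial of a full experiment. The type field sets the direction of optimization (maximize in the example at the end of this page).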

The includeTrainMetrics key takes a boolean value (true/false) and controls whether the model's performance is also measured on the training set itself. These training metrics are then shown on the experiment page.
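To make the role of the optimization metric concrete, the following is a minimal Python sketch of how a best trial could be picked from a finished experiment. It is an illustration only, not Cubonacci's implementation: the function best_trial, the shape of the trial dictionaries, and the "minimize" direction are assumptions made for this example (the example configuration on this page only shows type: maximize).

```python
from typing import Dict, List


def best_trial(trials: List[Dict], metric: str, direction: str = "maximize") -> Dict:
    """Return the trial with the best value for `metric`.

    `direction` mirrors the `type` field of the optimization section.
    "minimize" is an assumed counterpart of "maximize".
    """
    if direction not in ("maximize", "minimize"):
        raise ValueError(f"unknown direction: {direction!r}")
    pick = max if direction == "maximize" else min
    return pick(trials, key=lambda trial: trial["metrics"][metric])


# Hypothetical trial results from one experiment.
trials = [
    {"id": 1, "metrics": {"accuracy": 0.81}},
    {"id": 2, "metrics": {"accuracy": 0.87}},
    {"id": 3, "metrics": {"accuracy": 0.84}},
]
print(best_trial(trials, metric="accuracy")["id"])  # prints 2
```

A hyperparameter search can use the same comparison to decide which region of the search space to explore next.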

Validation

Validation schemes can be used to measure the performance of models more accurately. While it is possible to use a custom validation scheme, Cubonacci offers a number of built-in options that take care of it for you. The type field selects the option; each option is explained in the sections below.

The boolean shuffle option determines whether or not to randomly shuffle the data before splitting. Shuffling is important if, for example, the data is ordered by label, because otherwise the validation set may contain only some of the classes. Do not shuffle the data if it is ordered by time and the model depends on that ordering.

train-validate

This is the regular training and validation split. The validationSize field specifies the percentage of the data that will end up in the validation set.
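The split and the effect of the shuffle option can be sketched in a few lines of Python. This is an illustration under assumptions, not Cubonacci's own code: the function train_validate_split and its seed parameter are hypothetical, validation_size is assumed to be a number between 0 and 100, and the validation rows are assumed to be taken from the end of the data.

```python
import random
from typing import List, Sequence, Tuple


def train_validate_split(
    data: Sequence, validation_size: float, shuffle: bool, seed: int = 0
) -> Tuple[List, List]:
    """Split `data` into (train, validation).

    `validation_size` is a percentage between 0 and 100 (assumed unit).
    """
    if not 0 < validation_size < 100:
        raise ValueError("validation_size must be between 0 and 100")
    rows = list(data)
    if shuffle:
        # Seeded so that the same split can be reproduced.
        random.Random(seed).shuffle(rows)
    n_validation = round(len(rows) * validation_size / 100)
    split_at = len(rows) - n_validation
    return rows[:split_at], rows[split_at:]


# Ten rows ordered by label: five of class 0 followed by five of class 1.
labels = [0] * 5 + [1] * 5
train, validation = train_validate_split(labels, validation_size=20, shuffle=False)
print(validation)  # prints [1, 1]
```

Without shuffling, the validation set of this label-ordered data contains only class 1, which is the situation the shuffle option is meant to prevent.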

k-fold cross-validation

If more accurate performance measurements are required, a k-fold cross-validation scheme can help. It splits the data into k parts (folds) and repeats the experiment k times, each time using a different fold for validation and the remaining folds for training.

The folds field has an integer value that describes the number of folds to use in the validation scheme.
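As a rough illustration of what the folds value means, the sketch below generates the training and validation row indices for each fold. It is a simplified, assumed version of the scheme and not Cubonacci's implementation; the function name k_fold_indices is hypothetical, and shuffling is left out to keep the example short.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_rows: int, folds: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) once per fold."""
    if not 2 <= folds <= n_rows:
        raise ValueError("folds must be between 2 and n_rows")
    indices = list(range(n_rows))
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_rows, folds)
    start = 0
    for fold in range(folds):
        stop = start + base + (1 if fold < extra else 0)
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop


for train_idx, validation_idx in k_fold_indices(n_rows=10, folds=5):
    print(validation_idx)
```

With 10 rows and 5 folds, every row is used for validation exactly once and for training four times. The per-fold scores are commonly combined, for example by averaging, into a single estimate.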

Example

cubonacciVersion: v1
description: Example repository
runtime:
  name: python
  version: 3.6.4
optimization:
  metric: accuracy
  type: maximize
  includeTrainMetrics: false
validation:
  type: kfold-crossvalidation
  shuffle: true
  folds: 5