
Data snapshots

Data snapshots are based on the data provided by the DataLoader class. The two supported patterns, in-memory and iterative, are explained in the following sections.


Data provided by the in-memory pattern is serialized and stored in the Cubonacci object store. When it is needed for training, whether for an experiment or for a full model, the data is deserialized and passed to the fit method of the algorithm.
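The round trip described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `InMemoryDataLoader`, `load_data`, `MeanModel`, `create_snapshot` and `train_from_snapshot` are hypothetical, `pickle` stands in for whatever serialization Cubonacci actually uses, and a plain dictionary stands in for the object store.

```python
import pickle


class InMemoryDataLoader:
    """Hypothetical loader that returns the whole dataset as Python objects."""

    def load_data(self):
        X = [[0.0], [1.0], [2.0], [3.0]]
        y = [0.0, 2.0, 4.0, 6.0]
        return X, y


class MeanModel:
    """Toy algorithm with a fit method; it only memorizes the mean target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


# Stand-in for the object store (assumption: a simple key -> bytes mapping).
object_store = {}


def create_snapshot(loader, key):
    # Serialize the loaded data and keep the bytes under the snapshot key.
    object_store[key] = pickle.dumps(loader.load_data())


def train_from_snapshot(model, key):
    # Deserialize the stored data and pass it to the model's fit method.
    X, y = pickle.loads(object_store[key])
    return model.fit(X, y)


create_snapshot(InMemoryDataLoader(), "snapshot-1")
model = train_from_snapshot(MeanModel(), "snapshot-1")
```

Because training reads from the stored bytes rather than calling the loader again, every experiment or full model trained on the same snapshot sees identical data, even if the external source has changed in the meantime.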


The iterative approaches all take a path parameter that points to where the raw data is stored. The same path is used both when loading the data from the external source and when returning the generator, TensorFlow Dataset or PyTorch DataLoader. This raw data is stored in the Cubonacci object store. When a model needs to be trained, the raw data is made available again under that path so that the DataLoader class can iterate over it.
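The role of the path parameter can be sketched as follows, using the plain generator variant. This is an illustration under stated assumptions, not the Cubonacci API: the class name `IterativeDataLoader`, the method names `download` and `generator`, and the file name `data.csv` are all hypothetical, and a temporary directory stands in for the location backed by the object store.

```python
import csv
import tempfile
from pathlib import Path


class IterativeDataLoader:
    """Hypothetical loader whose methods all receive the same path."""

    def download(self, path):
        # Simulate fetching from an external source and write raw data to path.
        rows = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
        with open(Path(path) / "data.csv", "w", newline="") as f:
            csv.writer(f).writerows(rows)

    def generator(self, path):
        # Stream examples one at a time from the raw data found under path.
        with open(Path(path) / "data.csv", newline="") as f:
            for x, y in csv.reader(f):
                yield float(x), float(y)


# Stand-in for the directory that the platform would persist and restore.
snapshot_dir = tempfile.mkdtemp()

loader = IterativeDataLoader()
loader.download(snapshot_dir)
first_pass = list(loader.generator(snapshot_dir))

# Later, at training time, a fresh loader iterates over the same raw files
# without contacting the external source again.
second_pass = list(IterativeDataLoader().generator(snapshot_dir))
```

Keeping the download step separate from the iteration step means that only the raw files need to be snapshotted, and the generator can be rebuilt from them whenever a model is trained.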