Spaces and DataStore

This section explains MINTO’s internal structure and key components. This will help you understand how MINTO works and enable more advanced usage and customization.

QuickStart covered the basic usage of MINTO. This section starts with the DataStore object, which represents each record obtained from Experiment.runs.

To begin, let’s prepare and execute the same numerical experiment code as in QuickStart.

import minto
import ommx_pyscipopt_adapter as scip_ad
from ommx.dataset import miplib2017

instance_name = "reblock115"
instance = miplib2017(instance_name)

timelimit_list = [0.1, 0.5, 1, 2]

experiment = minto.Experiment(
    "quickstart_example",
    auto_saving=False,     # True is recommended, but set to False for demonstration
    verbose_logging=False  # True is recommended, but set to False for demonstration
)

adapter = scip_ad.OMMXPySCIPOptAdapter(instance)
scip_model = adapter.solver_input

for timelimit in timelimit_list:
    with experiment.run() as run:
        run.log_parameter("timelimit", timelimit)

        scip_model.setParam("limits/time", timelimit)
        scip_model.optimize()
        solution = adapter.decode(scip_model)

        run.log_solution(solution)

experiment.runs is a list of DataStore objects, one per run. Each DataStore holds the data logged during that specific run. You can inspect the list as follows:

experiment.runs

The DataStore object is defined as follows:

@dataclass
class DataStore:
    problems: dict[str, jm.Problem]
    instances: dict[str, ommx.v1.Instance]
    solutions: dict[str, ommx.v1.Solution]
    objects: dict[str, dict]
    parameters: dict[str, int | float | str]
    metadata: dict[str, Any]

Data saved with the run.log_* methods is stored in the corresponding attributes. For example, a problem saved with run.log_problem("my_problem", problem) can be accessed with DataStore.problems["my_problem"].
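
The mapping from log_* calls to attributes can be illustrated with a small, self-contained sketch. The MiniDataStore class and its two helper methods below are hypothetical stand-ins written for this illustration; they are not MINTO's implementation and only mimic the name-to-value layout of the attributes shown above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiniDataStore:
    """Toy stand-in for a DataStore (hypothetical, not MINTO's code)."""

    # Each attribute is a name -> value mapping, mirroring the layout above.
    solutions: dict[str, Any] = field(default_factory=dict)
    objects: dict[str, dict] = field(default_factory=dict)
    parameters: dict[str, int | float | str] = field(default_factory=dict)

    def log_parameter(self, name: str, value: int | float | str) -> None:
        # Hypothetical helper: store the value in the matching attribute.
        self.parameters[name] = value

    def log_object(self, name: str, value: dict) -> None:
        # Hypothetical helper: store an arbitrary dict under a name.
        self.objects[name] = value


store = MiniDataStore()
store.log_parameter("timelimit", 0.5)
store.log_object("solver_info", {"name": "scip"})

# Logged data is read back from the attribute that matches the log call.
timelimit = store.parameters["timelimit"]
solver_name = store.objects["solver_info"]["name"]
print(timelimit, solver_name)
```

The point of the sketch is only that each kind of logged data has its own dictionary, keyed by the name given at logging time.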

Two Spaces for Data Storage#

MINTO’s data storage consists of two spaces: the Experiment space, which is shared by the whole experiment, and the run space, of which there is one per run.

We have already seen the run space above: it is the list of DataStore objects accessible via .runs. The Experiment space is a single DataStore, accessible as follows:

experiment.dataspace.experiment_datastore

In mathematical optimization, experiments are often conducted by fixing an instance or model while changing solver parameters, or by sweeping parameters included in the model. In such cases, the mathematical models and instances that stay fixed belong to the experiment as a whole rather than to any single run. Instances can also be very large, so saving the same instance data in every run is inefficient. MINTO therefore provides a mechanism to save fixed data once at the Experiment level. Experiment-level data can be accessed via experiment.dataspace.experiment_datastore.
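
The storage argument above can be made concrete with a minimal sketch. MiniStore, MiniExperiment, and new_run are hypothetical names invented for this illustration, and the list of integers is only a placeholder for a large instance; the sketch assumes nothing about how MINTO implements its spaces internally.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiniStore:
    """Toy stand-in for a DataStore (only two attributes are modeled)."""

    instances: dict[str, Any] = field(default_factory=dict)
    parameters: dict[str, int | float | str] = field(default_factory=dict)


@dataclass
class MiniExperiment:
    """Toy model of the two spaces (hypothetical, not MINTO's code)."""

    experiment_store: MiniStore = field(default_factory=MiniStore)  # shared
    runs: list[MiniStore] = field(default_factory=list)  # one store per run

    def log_global_instance(self, name: str, instance: Any) -> None:
        # Fixed data is saved once, at the experiment level.
        self.experiment_store.instances[name] = instance

    def new_run(self) -> MiniStore:
        # Each run gets its own, initially empty, store.
        run = MiniStore()
        self.runs.append(run)
        return run


big_instance = list(range(1000))  # placeholder for a large instance

exp = MiniExperiment()
exp.log_global_instance("reblock115", big_instance)
for timelimit in [0.1, 0.5, 1, 2]:
    run = exp.new_run()
    run.parameters["timelimit"] = timelimit  # only the varying data per run

# Count how many times the instance is stored across both spaces.
stored_copies = len(exp.experiment_store.instances) + sum(
    len(r.instances) for r in exp.runs
)
print(stored_copies)
```

In this toy model the instance is held once regardless of the number of runs, while each run keeps only the parameter that changes.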

Data is saved at the Experiment level with the experiment.log_global_* methods. For example, to save an instance and then read it back:

experiment.log_global_instance(instance_name, instance)
experiment.dataspace.experiment_datastore.instances["reblock115"]

The Experiment-level DataStore can also be obtained in DataFrame form. Because one DataFrame is generated for each DataStore attribute, the return value of .get_experiment_tables() is dict[str, pandas.DataFrame]. For example:

experiment.get_experiment_tables()["instance"]
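
Why the return value is a dictionary of tables rather than a single table can be sketched without pandas. The store_to_tables function and the "name"/"value" column names below are assumptions made for this illustration; MINTO's actual tables are pandas DataFrames and their columns may differ.

```python
from __future__ import annotations

from typing import Any


def store_to_tables(
    store: dict[str, dict[str, Any]],
) -> dict[str, list[dict[str, Any]]]:
    """Build one table (a list of row dicts) per store attribute.

    Hypothetical helper: attributes hold different kinds of data, so each
    one gets its own table instead of being merged into a single one.
    """
    tables: dict[str, list[dict[str, Any]]] = {}
    for attribute, entries in store.items():
        tables[attribute] = [
            {"name": name, "value": value} for name, value in entries.items()
        ]
    return tables


# Toy experiment-level store with two attributes.
toy_store = {
    "instances": {"reblock115": "<instance data>"},
    "parameters": {"solver": "scip"},
}

tables = store_to_tables(toy_store)
print(sorted(tables))
print(tables["parameters"])
```

Each key of the result names one attribute, which is the same shape as a dict[str, pandas.DataFrame] with one entry per attribute.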

Summary#

MINTO manages data efficiently using two spaces: the Experiment space and the run space. These two spaces are the core of MINTO. MINTO does nothing more complex than save data to them, and it aims to make data management in mathematical optimization easy by providing this simple management function.

The following tutorials introduce utilities for manipulating the two spaces and for sharing data managed by MINTO with others, which make it even more user-friendly.