Sharing Various Types of Data in an OMMX Artifact

Mathematical optimization workflows generate a variety of data that must be managed. Handling this data properly ensures reproducible computational results and allows teams to share information efficiently.

OMMX provides a straightforward and efficient way to manage different data types. Specifically, it defines a data format called an OMMX Artifact, which lets you store, organize, and share various optimization data through the OMMX SDK.

Preparation: Data to Share

First, let’s prepare the data we want to share. We will create an ommx.v1.Instance representing the 0-1 knapsack problem and solve it using SCIP. We will also share the results of our optimization analysis. Details are omitted for brevity.

Variable Name	Description	Value
instance	ommx.v1.Instance object representing the 0-1 knapsack problem	Instance(raw=<builtins.Instance object at 0x7f6ccd688830>, annotations={})
solution	ommx.v1.Solution object containing the results of solving the 0-1 knapsack problem with SCIP	Solution(raw=<builtins.Solution object at 0x7f6ccd5593b0>, annotations={})
data	Input data for the 0-1 knapsack problem	{'v': [10, 13, 18, 31, 7, 15], 'w': [11, 15, 20, 35, 10, 33], 'W': 47, 'N': 6}
df	pandas.DataFrame object representing the optimal solution of the 0-1 knapsack problem	See the table below

Value of df:

id	Item Number	Put in Knapsack?
0	0	Yes
1	1	Yes
2	2	Yes
3	3	No
4	4	No
5	5	No
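As a sanity check on the data being shared, the sketch below recomputes the value and weight of the selection shown in df and compares it with a brute-force search over all selections. It uses only the standard library and the numbers from the table above. The names selected and evaluate are illustrative and are not part of the OMMX SDK, and reading v as item values, w as item weights, and W as the capacity is an assumption based on the usual knapsack notation.

```python
from itertools import product

# Input data copied from the `data` row above.
# Assumption: v = item values, w = item weights, W = capacity, N = item count.
data = {"v": [10, 13, 18, 31, 7, 15], "w": [11, 15, 20, 35, 10, 33], "W": 47, "N": 6}

# Selection copied from the `df` table above: items 0, 1, 2 are packed.
selected = [1, 1, 1, 0, 0, 0]


def evaluate(x, data):
    """Return (total value, total weight) of a 0/1 selection vector."""
    value = sum(v * xi for v, xi in zip(data["v"], x))
    weight = sum(w * xi for w, xi in zip(data["w"], x))
    return value, weight


# Enumerate all 2^N selections and keep the best value among feasible ones.
best_value = max(
    evaluate(x, data)[0]
    for x in product([0, 1], repeat=data["N"])
    if evaluate(x, data)[1] <= data["W"]
)

selected_value, selected_weight = evaluate(selected, data)
print(selected_value, selected_weight, best_value)  # 41 46 41
```

The selection fits within the capacity (46 of 47) and reaches the brute-force optimum of 41. This check is practical only for small N, since it enumerates every combination.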

Creating an OMMX Artifact as a File

OMMX Artifacts can be managed as files or by assigning them container-like names. Here, we’ll show how to save the data as a file. Using the OMMX SDK, we’ll store the data in a new file called my_instance.ommx. First, we need an ArtifactBuilder.

import os
from ommx.artifact import ArtifactBuilder

# Specify the name of the OMMX Artifact file
filename = "my_instance.ommx"

# If the file already exists, remove it
if os.path.exists(filename):
    os.remove(filename)

# 1. Create a builder for the OMMX Artifact file
builder = ArtifactBuilder.new_archive_unnamed(filename)

ArtifactBuilder has several constructors, which determine whether the artifact is managed by name, like a container image, or as an archive file. A name is required if you push to and pull from a container registry; it is not needed for an archive file. Here, we use ArtifactBuilder.new_archive_unnamed to manage the artifact as an archive file.

Constructor	Description
`ArtifactBuilder.new`	Manage by name like a container
`ArtifactBuilder.new_archive`	Manage as both an archive file and a container
`ArtifactBuilder.new_archive_unnamed`	Manage as an archive file
`ArtifactBuilder.for_github`	Determine the container name according to the GitHub Container Registry

Regardless of the initialization method, you can save ommx.v1.Instance and other data in the same way. Let’s add the data prepared above.

# 2. Add the data to the builder, one layer per object
# Add ommx.v1.Instance object
desc_instance = builder.add_instance(instance)

# Add ommx.v1.Solution object
desc_solution = builder.add_solution(solution)

# Add pandas.DataFrame object
desc_df = builder.add_dataframe(df, title="Optimal Solution of Knapsack Problem")

# Add an object that can be converted to JSON
desc_json = builder.add_json(data, title="Data of Knapsack Problem")

In OMMX Artifacts, data is stored in layers, each with a dedicated media type. Methods like add_instance set the media type automatically and add the layer. Each of these methods returns a Descriptor object with information about the layer it created.

desc_json.to_dict()

{'mediaType': 'application/json',
 'digest': 'sha256:6cbfaaa7f97e84d8b46da95b81cf4d5158df3a9bd439f8c60be26adaa16ab3cf',
 'size': 78,
 'annotations': {'org.ommx.user.title': 'Data of Knapsack Problem'}}
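To make the digest and size fields concrete, here is a minimal sketch of how such values can be derived from a layer's bytes. It assumes the JSON layer holds the UTF-8 bytes produced by Python's default json.dumps settings. That assumption is consistent with the 78-byte size shown above, but the exact serialization used by the SDK is not confirmed here, so the computed digest may differ from the one in the output.

```python
import hashlib
import json

# Same dictionary as the `data` variable prepared earlier.
data = {"v": [10, 13, 18, 31, 7, 15], "w": [11, 15, 20, 35, 10, 33], "W": 47, "N": 6}

# ASSUMPTION: the layer content is json.dumps(data) with default separators,
# encoded as UTF-8. The SDK's actual serialization may differ.
payload = json.dumps(data).encode("utf-8")

# Size is the byte length of the layer content.
size = len(payload)

# Digest is the SHA-256 hash of the same bytes, written as "sha256:<hex>".
digest = "sha256:" + hashlib.sha256(payload).hexdigest()

print(size)
print(digest)
```

Because the digest is computed from the content itself, two layers with identical bytes share a digest, and any change to the data changes it.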

The title="..." argument passed to add_json is saved as an annotation of the layer. Annotations are primarily meant for human readers, since an OMMX Artifact is designed to be inspected and shared by people. Every ArtifactBuilder.add_* method accepts optional keyword arguments and stores each one as an annotation under the org.ommx.user. namespace (for example, title becomes org.ommx.user.title).
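The keyword-to-annotation mapping can be pictured with the following sketch. The helper to_user_annotations is hypothetical and only illustrates the naming rule described above; it is not the SDK's implementation.

```python
def to_user_annotations(**kwargs: str) -> dict[str, str]:
    """Hypothetical helper: prefix each keyword with the user namespace."""
    prefix = "org.ommx.user."
    return {prefix + key: value for key, value in kwargs.items()}


annotations = to_user_annotations(title="Data of Knapsack Problem")
print(annotations)  # {'org.ommx.user.title': 'Data of Knapsack Problem'}
```

The result has the same shape as the annotations field in the descriptor output above.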

Finally, call build to save it to a file.

# 3. Create the OMMX Artifact file
artifact = builder.build()

The returned artifact is the same kind of Artifact object that the next section obtains by loading a file; here it refers to the archive you just saved. Let’s check that the file has been created:

! ls $filename

my_instance.ommx

Now you can share this my_instance.ommx with others using the usual file sharing methods.

Reading an OMMX Artifact File

Next, let’s read the OMMX Artifact we saved. When loading an OMMX Artifact in archive format, use Artifact.load_archive.

from ommx.artifact import Artifact

# Load the OMMX Artifact file locally
artifact = Artifact.load_archive(filename)

OMMX Artifacts store data in layers, with a manifest (catalog) that details their contents. You can check the Descriptor of each layer, including its Media Type and annotations, without reading the entire archive.

import pandas as pd

# Convert to pandas.DataFrame for better readability
pd.DataFrame(
    # One row per layer: fixed columns plus whatever annotations the layer has
    {"Media Type": desc.media_type, "Size (Bytes)": desc.size} | desc.annotations
    for desc in artifact.layers
)

	Media Type	Size (Bytes)	org.ommx.user.title
0	application/org.ommx.v1.instance	327	NaN
1	application/org.ommx.v1.solution	295	NaN
2	application/vnd.apache.parquet	2595	Optimal Solution of Knapsack Problem
3	application/json	78	Data of Knapsack Problem

For instance, to retrieve the JSON in layer 3, use Artifact.get_json. This method checks that the layer's Media Type is application/json and deserializes the bytes into a Python object.

artifact.get_json(artifact.layers[3])

{'v': [10, 13, 18, 31, 7, 15], 'w': [11, 15, 20, 35, 10, 33], 'W': 47, 'N': 6}
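Indexing layers by position breaks if the order of the layers changes. A more robust option is to select layers by media type, as in the sketch below. The helper find_layers is hypothetical, and the SimpleNamespace objects are stand-ins that mimic only the media_type, size, and annotations attributes used earlier, so that the example runs without an artifact file.

```python
from types import SimpleNamespace


def find_layers(layers, media_type):
    """Hypothetical helper: return the descriptors with the given media type."""
    return [desc for desc in layers if desc.media_type == media_type]


# Stand-ins for artifact.layers, mirroring the table above.
layers = [
    SimpleNamespace(media_type="application/org.ommx.v1.instance", size=327, annotations={}),
    SimpleNamespace(media_type="application/org.ommx.v1.solution", size=295, annotations={}),
    SimpleNamespace(
        media_type="application/vnd.apache.parquet",
        size=2595,
        annotations={"org.ommx.user.title": "Optimal Solution of Knapsack Problem"},
    ),
    SimpleNamespace(
        media_type="application/json",
        size=78,
        annotations={"org.ommx.user.title": "Data of Knapsack Problem"},
    ),
]

json_layers = find_layers(layers, "application/json")
print(json_layers[0].annotations["org.ommx.user.title"])  # Data of Knapsack Problem
```

With a loaded artifact, the same helper could be called as find_layers(artifact.layers, "application/json") and the resulting descriptor passed to artifact.get_json.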
