Sharing Various Types of Data in an OMMX Artifact

In mathematical optimization workflows, it is important to generate and manage a variety of data. Properly handling these data ensures reproducible computational results and allows teams to share information efficiently.

OMMX provides a straightforward and efficient way to manage different data types. Specifically, it defines a data format called an OMMX Artifact, which lets you store, organize, and share various optimization data through the OMMX SDK.

Preparation: Data to Share#

First, let’s prepare the data we want to share. We will create an ommx.v1.Instance representing the 0-1 knapsack problem and solve it using SCIP. We will also share the results of our optimization analysis. Details are omitted for brevity.

from ommx.v1 import Instance, DecisionVariable, Constraint
from ommx_pyscipopt_adapter.adapter import OMMXPySCIPOptAdapter
import pandas as pd

# Prepare data for the 0-1 knapsack problem
data = {
    # Values of each item
    "v": [10, 13, 18, 31, 7, 15],
    # Weights of each item
    "w": [11, 15, 20, 35, 10, 33],
    # Knapsack capacity
    "W": 47,
    # Total number of items
    "N": 6,
}

# Define decision variables
x = [
    # Define binary variable x_i
    DecisionVariable.binary(
        # Specify the ID of the decision variable
        id=i,
        # Specify the name of the decision variable
        name="x",
        # Specify the subscript of the decision variable
        subscripts=[i],
    )
    # Prepare data["N"] binary variables, one per item
    for i in range(data["N"])
]

# Define the objective function
objective = sum(data["v"][i] * x[i] for i in range(data["N"]))

# Define constraints
constraint = Constraint(
    # Name of the constraint
    name="Weight Limit",
    # Specify the left-hand side of the constraint
    function=sum(data["w"][i] * x[i] for i in range(data["N"])) - data["W"],
    # Specify equality constraint (==0) or inequality constraint (<=0)
    equality=Constraint.LESS_THAN_OR_EQUAL_TO_ZERO,
)

# Create an instance
instance = Instance.from_components(
    # Register all decision variables included in the instance
    decision_variables=x,
    # Register the objective function
    objective=objective,
    # Register all constraints
    constraints=[constraint],
    # Specify that it is a maximization problem
    sense=Instance.MAXIMIZE,
)

# Solve with SCIP
solution = OMMXPySCIPOptAdapter.solve(instance)

# Analyze the optimal solution
df_vars = solution.decision_variables
df = pd.DataFrame.from_dict(
    {
        "Item Number": df_vars.index,
        # Use a distinct lambda argument name so it does not shadow the variable list x
        "Put in Knapsack?": df_vars["value"].apply(lambda value: "Yes" if value == 1.0 else "No"),
    }
)

Variable Name

Description

Value

instance

ommx.v1.Instance object representing the 0-1 knapsack problem

Instance(raw=decision_variables {
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 0
}
decision_variables {
  id: 1
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 1
}
decision_variables {
  id: 2
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 2
}
decision_variables {
  id: 3
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 3
}
decision_variables {
  id: 4
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 4
}
decision_variables {
  id: 5
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 5
}
objective {
  linear {
    terms {
      coefficient: 10
    }
    terms {
      id: 1
      coefficient: 13
    }
    terms {
      id: 2
      coefficient: 18
    }
    terms {
      id: 3
      coefficient: 31
    }
    terms {
      id: 4
      coefficient: 7
    }
    terms {
      id: 5
      coefficient: 15
    }
  }
}
constraints {
  equality: EQUALITY_LESS_THAN_OR_EQUAL_TO_ZERO
  function {
    linear {
      terms {
        coefficient: 11
      }
      terms {
        id: 1
        coefficient: 15
      }
      terms {
        id: 2
        coefficient: 20
      }
      terms {
        id: 3
        coefficient: 35
      }
      terms {
        id: 4
        coefficient: 10
      }
      terms {
        id: 5
        coefficient: 33
      }
      constant: -47
    }
  }
  name: "Weight Limit"
}
sense: SENSE_MAXIMIZE
, annotations={})

solution

ommx.v1.Solution object containing the results of solving the 0-1 knapsack problem with SCIP

Solution(raw=state {
  entries {
    key: 0
    value: 1
  }
  entries {
    key: 1
    value: 1
  }
  entries {
    key: 2
    value: 1
  }
  entries {
    key: 3
    value: 0
  }
  entries {
    key: 4
    value: 0
  }
  entries {
    key: 5
    value: 0
  }
}
objective: 41
decision_variables {
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 0
}
decision_variables {
  id: 1
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 1
}
decision_variables {
  id: 2
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 2
}
decision_variables {
  id: 3
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 3
}
decision_variables {
  id: 4
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 4
}
decision_variables {
  id: 5
  kind: KIND_BINARY
  bound {
    upper: 1
  }
  name: "x"
  subscripts: 5
}
evaluated_constraints {
  equality: EQUALITY_LESS_THAN_OR_EQUAL_TO_ZERO
  evaluated_value: -1
  used_decision_variable_ids: 0
  used_decision_variable_ids: 1
  used_decision_variable_ids: 2
  used_decision_variable_ids: 3
  used_decision_variable_ids: 4
  used_decision_variable_ids: 5
  name: "Weight Limit"
}
feasible: true
optimality: OPTIMALITY_OPTIMAL
feasible_relaxed: true
, annotations={})

data

Input data for the 0-1 knapsack problem

{'v': [10, 13, 18, 31, 7, 15], 'w': [11, 15, 20, 35, 10, 33], 'W': 47, 'N': 6}

df

pandas.DataFrame object representing the optimal solution of the 0-1 knapsack problem

Item Number Put in Knapsack?
id
0 0 Yes
1 1 Yes
2 2 Yes
3 3 No
4 4 No
5 5 No
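
As a quick sanity check on the values in the table above, the following sketch enumerates all 2^6 item subsets using only the Python standard library. This is not how SCIP solves the problem; it only confirms that an objective of 41 is optimal for this data, and that the reported selection (items 0, 1, and 2) is feasible with a constraint value of -1. The names `evaluate`, `best_value`, and `reported` are introduced here for illustration.

```python
from itertools import product

data = {
    "v": [10, 13, 18, 31, 7, 15],   # values
    "w": [11, 15, 20, 35, 10, 33],  # weights
    "W": 47,                        # capacity
    "N": 6,                         # number of items
}


def evaluate(selection):
    """Return (objective value, constraint value) for a 0/1 selection.

    The constraint value is total weight minus capacity, so the
    selection is feasible when it is <= 0.
    """
    value = sum(v * s for v, s in zip(data["v"], selection))
    constraint_value = sum(w * s for w, s in zip(data["w"], selection)) - data["W"]
    return value, constraint_value


# Brute force: collect the objective value of every feasible subset
feasible_values = []
for selection in product((0, 1), repeat=data["N"]):
    value, constraint_value = evaluate(selection)
    if constraint_value <= 0:
        feasible_values.append(value)
best_value = max(feasible_values)

# The selection reported in the solution above: items 0, 1, 2
reported = (1, 1, 1, 0, 0, 0)
reported_value, reported_constraint = evaluate(reported)

print(best_value, reported_value, reported_constraint)  # 41 41 -1
```

Note that the optimum is not unique for this data: selecting items 0 and 3 also gives a value of 41 at a weight of 46, so a solver may legitimately return either selection.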

Creating an OMMX Artifact as a File#

OMMX Artifacts can be managed as files or by assigning them container-like names. Here, we’ll show how to save the data as a file. Using the OMMX SDK, we’ll store the data in a new file called my_instance.ommx. First, we need an ArtifactBuilder.

import os
from ommx.artifact import ArtifactBuilder

# Specify the name of the OMMX Artifact file
filename = "my_instance.ommx"

# If the file already exists, remove it
if os.path.exists(filename):
    os.remove(filename)

# 1. Create a builder to create the OMMX Artifact file
builder = ArtifactBuilder.new_archive_unnamed(filename)

ArtifactBuilder has several constructors, which determine whether the artifact is managed by name, like a container, or as an archive file. A name is required if you want to push to and pull from a container registry; an archive file does not need one. Here, we use ArtifactBuilder.new_archive_unnamed to manage the artifact as an archive file.

Constructor                          Description
ArtifactBuilder.new                  Manage by name like a container
ArtifactBuilder.new_archive          Manage as both an archive file and a container
ArtifactBuilder.new_archive_unnamed  Manage as an archive file
ArtifactBuilder.for_github           Determine the container name according to the GitHub Container Registry

Regardless of the initialization method, you can save ommx.v1.Instance and other data in the same way. Let’s add the data prepared above.

# 2. Add the data to the builder
# Add ommx.v1.Instance object
desc_instance = builder.add_instance(instance)

# Add ommx.v1.Solution object
desc_solution = builder.add_solution(solution)

# Add pandas.DataFrame object
desc_df = builder.add_dataframe(df, title="Optimal Solution of Knapsack Problem")

# Add an object that can be converted to JSON
desc_json = builder.add_json(data, title="Data of Knapsack Problem")

In OMMX Artifacts, data is stored in layers, each with a dedicated media type. Functions like add_instance automatically set the media type and add a layer. Each of these functions returns a Descriptor object with information about the layer it created.

desc_json.to_dict()
{'mediaType': 'application/json',
 'digest': 'sha256:6cbfaaa7f97e84d8b46da95b81cf4d5158df3a9bd439f8c60be26adaa16ab3cf',
 'size': 78,
 'annotations': {'org.ommx.user.title': 'Data of Knapsack Problem'}}

The title="..." argument passed to add_json is saved as an annotation of the layer. An OMMX Artifact is a data format intended for people, so annotations are primarily information for humans to read. All ArtifactBuilder.add_* functions accept optional keyword arguments and store them as annotations under the org.ommx.user. namespace.
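
To make the fields of the descriptor concrete, here is a minimal sketch that builds a descriptor-like dictionary for a JSON layer using only the standard library. `describe_json_layer` is a hypothetical helper, not the OMMX implementation. It assumes the layer bytes are the default `json.dumps` output encoded as UTF-8; that assumption is consistent with the size of 78 shown above, but the source does not confirm it, so the digest is not claimed to match.

```python
import hashlib
import json


def describe_json_layer(obj, **annotations):
    """Build a descriptor-like dict for a JSON layer (illustrative only)."""
    # Assumption: the layer content is the default json.dumps output as UTF-8
    blob = json.dumps(obj).encode("utf-8")
    return {
        "mediaType": "application/json",
        # The digest identifies the content: a hash of the layer bytes
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        # The size is the number of bytes in the layer
        "size": len(blob),
        # Keyword arguments become annotations under the org.ommx.user. prefix
        "annotations": {
            f"org.ommx.user.{key}": value for key, value in annotations.items()
        },
    }


data = {"v": [10, 13, 18, 31, 7, 15], "w": [11, 15, 20, 35, 10, 33], "W": 47, "N": 6}
desc = describe_json_layer(data, title="Data of Knapsack Problem")

print(desc["size"])
print(desc["annotations"])
```

Because the digest is derived from the content, two layers with identical bytes get the same digest, while the annotations can differ freely without changing it.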

Finally, call build to save it to a file.

# 3. Create the OMMX Artifact file
artifact = builder.build()

The artifact returned by build is equivalent to the one you get by loading the saved file, which the next section explains. Let’s check that the file has been created:

! ls $filename
my_instance.ommx

Now you can share this my_instance.ommx with others using the usual file sharing methods.

Reading an OMMX Artifact File#

Next, let’s read the OMMX Artifact we saved. When loading an OMMX Artifact in archive format, use Artifact.load_archive.

from ommx.artifact import Artifact

# Load the OMMX Artifact file locally
artifact = Artifact.load_archive(filename)

OMMX Artifacts store data in layers, with a manifest (catalog) that details their contents. You can check the Descriptor of each layer, including its Media Type and annotations, without reading the entire archive.

import pandas as pd

# Convert to pandas.DataFrame for better readability
pd.DataFrame({
    "Media Type": desc.media_type,
    "Size (Bytes)": desc.size
  } | desc.annotations
  for desc in artifact.layers
)
Media Type Size (Bytes) org.ommx.user.title
0 application/org.ommx.v1.instance 325 NaN
1 application/org.ommx.v1.solution 266 NaN
2 application/vnd.apache.parquet 2595 Optimal Solution of Knapsack Problem
3 application/json 78 Data of Knapsack Problem

For instance, to retrieve the JSON in layer 3, use Artifact.get_json. This function checks that the Media Type is application/json and deserializes the bytes into a Python object.

artifact.get_json(artifact.layers[3])
{'v': [10, 13, 18, 31, 7, 15], 'w': [11, 15, 20, 35, 10, 33], 'W': 47, 'N': 6}
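
Indexing with `artifact.layers[3]` depends on the order in which the layers were added. A more robust pattern is to select a layer by its media type and annotations. The sketch below shows the idea on plain dictionaries that mirror the layer table above; `find_layers` is a hypothetical helper, not part of the OMMX SDK. With real descriptors, you would compare `desc.media_type` and `desc.annotations` instead of dictionary keys.

```python
# Plain-dict stand-ins for the four layer descriptors listed above
layers = [
    {"media_type": "application/org.ommx.v1.instance", "size": 325, "annotations": {}},
    {"media_type": "application/org.ommx.v1.solution", "size": 266, "annotations": {}},
    {
        "media_type": "application/vnd.apache.parquet",
        "size": 2595,
        "annotations": {"org.ommx.user.title": "Optimal Solution of Knapsack Problem"},
    },
    {
        "media_type": "application/json",
        "size": 78,
        "annotations": {"org.ommx.user.title": "Data of Knapsack Problem"},
    },
]


def find_layers(layers, media_type, **user_annotations):
    """Return layers with the given media type whose user annotations all match."""
    wanted = {f"org.ommx.user.{key}": value for key, value in user_annotations.items()}
    return [
        layer
        for layer in layers
        if layer["media_type"] == media_type
        and all(layer["annotations"].get(key) == value for key, value in wanted.items())
    ]


matches = find_layers(layers, "application/json", title="Data of Knapsack Problem")
print(len(matches), matches[0]["size"])  # 1 78
```

Selecting by media type first also mirrors the check that get_json performs before decoding, so a mismatch surfaces as an empty result rather than as a decoding error.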