An optimization & machine learning toolkit

The Optimization and Machine Learning Toolkit (OMLT) is an open-source software package that incorporates machine-learning-trained neural network and gradient-boosted tree surrogate models into larger optimization problems. In this article, we will discuss this library, its main functions, and its design. Finally, we will walk through a practical demonstration of loading a standard Keras model into the framework and using it for optimization. The major points to be discussed in this article are listed below.

Table of contents

  1. What is OMLT?
  2. Design of library
  3. Implementation of OMLT

Let’s first understand this toolkit.

What is OMLT?

The Optimization and Machine Learning Toolkit is an open-source software package for representing high-level, machine-learning-trained models, namely neural networks (NNs) and gradient-boosted trees (GBTs), within optimization problems. By optimizing over trained surrogate models, NNs and GBTs can be integrated into larger decision-making problems.

Optimizing a neural acquisition function and verifying neural networks are two examples of computer science applications. In engineering, gray-box optimization combines mechanistic, model-based optimization with surrogate models learned from data.

In OMLT, GBTs are supported through an ONNX interface (ONNX is an intermediary machine learning format used to convert models between machine learning frameworks), while NNs are supported through both ONNX and Keras interfaces.
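As a sketch of these two NN entry points, the loaders below turn an existing model object into OMLT’s internal network representation (here onnx_model and keras_model are assumed to be a loaded ONNX model and a trained Keras Sequential model, respectively):

from omlt.io import load_onnx_neural_network, load_keras_sequential

# both loaders return OMLT's internal network representation
net_from_onnx = load_onnx_neural_network(onnx_model)   # any ONNX-exported NN
net_from_keras = load_keras_sequential(keras_model)    # Keras Sequential NN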

To encode the optimization formulations, OMLT translates these pre-trained machine learning models into the algebraic modelling language Pyomo (Pyomo allows users to define optimization problems in Python in a style that is close to the notation frequently used in mathematical optimization).
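For readers new to Pyomo, the minimal model below illustrates the notation OMLT targets; it is a generic Pyomo example, not OMLT-specific:

import pyomo.environ as pyo

# a tiny linear program in Pyomo's algebraic style
m = pyo.ConcreteModel()
m.x = pyo.Var(bounds=(0, 10))
m.y = pyo.Var(bounds=(0, 10))
m.obj = pyo.Objective(expr=m.x + 2 * m.y, sense=pyo.maximize)
m.con = pyo.Constraint(expr=m.x + m.y <= 8)
# pyo.SolverFactory('glpk').solve(m)  # any Pyomo-enabled LP solver works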

A formulation containing the decision variables, objectives, constraints, and any parameters is required as input to mathematical optimization solver software. OMLT streamlines the normally time-consuming and error-prone procedure of converting previously trained NN and GBT models into optimization formulations suitable for solver software. OMLT uses ONNX as an input interface because ONNX’s interoperability capabilities allow OMLT to support packages such as Keras, PyTorch, and TensorFlow.

Compared with earlier, more specialized tools, OMLT is a generic tool that supports both NNs and GBTs, a large number of input models via ONNX interoperability, fully-dense and convolutional layers, many activation functions, and a variety of optimization formulations.

Design of library

Model surrogation

For optimization, OMLT uses Pyomo, a Python-based algebraic modelling language. Because most machine learning frameworks use Python as their primary interface, Python is a natural starting point for OMLT. Pyomo provides a flexible modelling interface to Pyomo-enabled solvers: switching solvers enables OMLT users to choose the appropriate optimization solver for an application without dealing with each solver directly. OMLT relies on Pyomo in several ways.

First, Pyomo’s efficient auto-differentiation of nonlinear functions via the AMPL solver library makes NN nonlinear activation functions tractable. Second, Pyomo’s numerous extensions, such as decomposition for large-scale problems, enable OMLT to interact with cutting-edge optimization methodologies.

The most crucial aspect is that OMLT employs Pyomo blocks. Pyomo blocks in OMLT are used to encapsulate the GBT or NN components of a larger optimization formulation. When linking to the inputs and outputs of a Pyomo block, users only need to understand the input/output structure of the NN or GBT, as sketched below.
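A minimal sketch of this encapsulation (the connecting constraint assumes a hypothetical decision variable m.patient_feature from the wider problem):

import pyomo.environ as pyo
from omlt import OmltBlock

m = pyo.ConcreteModel()
m.nn = OmltBlock()  # encapsulates the NN or GBT surrogate
# after a formulation is built on the block (see the next section), its
# inputs/outputs are plain Pyomo variables that constraints can link to:
# m.connect = pyo.Constraint(expr=m.nn.inputs[0] == m.patient_feature)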

Optimization formulation

OmltBlock is a Pyomo block that delegates the generation of the surrogate model’s optimization formulation to a PyomoFormulation object. OMLT users then build the input/output objects, such as constraints, that connect the surrogate model to the wider optimization problem and its user-defined variables.

The surrogate’s optimization formulation is derived automatically from its higher-level (ONNX or Keras) representation. OMLT users can also define a scaling object for the variables as well as a dictionary of variable bounds. Some optimization formulations require scaling and variable-bound information, which may not be present in ONNX or Keras representations; a sketch of attaching this information follows.
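OMLT’s OffsetScaling object captures such scaling information; in this sketch, the means and standard deviations (x_mean, x_std, y_mean, y_std) are assumed to have been computed from the training data:

from omlt import OffsetScaling

# scaled value = (raw value - offset) / factor, for inputs and outputs
scaler = OffsetScaling(
    offset_inputs=list(x_mean), factor_inputs=list(x_std),
    offset_outputs=[y_mean], factor_outputs=[y_std],
)
# loaders accept the scaler and bounds directly, e.g.:
# net = load_onnx_neural_network(onnx_model, scaler, input_bounds)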

Network definition

For GBTs, OMLT automatically creates the optimization formulation from the higher-level representation, such as ONNX. Because neural networks are substantially more difficult to optimize, OMLT instead generates an intermediate representation (NetworkDefinition) of the neural network that serves as a gateway to several alternative mathematical optimization formulations of neural networks.
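A sketch of this two-step flow for neural networks, assuming onnx_model holds an ONNX-exported network:

from omlt.io import load_onnx_neural_network
from omlt.neuralnet import FullSpaceNNFormulation, ReluBigMFormulation

net = load_onnx_neural_network(onnx_model)   # intermediate NetworkDefinition
formulation = ReluBigMFormulation(net)       # big-M MIP encoding of ReLU layers
# formulation = FullSpaceNNFormulation(net)  # alternative full-space encoding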

Implementation of OMLT

In this section, we’ll go over how to import your neural networks into OMLT. OMLT includes an importer for neural networks saved in ONNX format, and models from a variety of popular machine learning frameworks can be exported to ONNX. Finally, we demonstrate how to open an ONNX file with OMLT.

Throughout this section, we’ll use the Pima Indians Diabetes dataset to train a neural network that predicts whether a patient has diabetes from a set of medical measurements.

The implementation below is adapted from the official OMLT examples.

First, install the packages.

! pip install omlt
! pip install onnx
! pip install tf2onnx

Let’s first load and prepare the data. 

import pandas as pd
import numpy as np

# load the Pima Indians Diabetes dataset (path assumes a Colab upload)
data = pd.read_csv('/content/diabetes.csv')
X = data.drop(['Outcome'], axis=1).values  # eight medical input features
Y = data['Outcome'].values                 # binary diabetes label

When developing optimization models, it’s crucial to establish variable bounds. If the neural network contains ReLU activation functions, OMLT requires all input variables to be bounded in order to obtain a tighter MIP formulation of the ReLU activations.

OMLT offers a function for writing the input bounds in a format that can be read back later together with the ONNX model. We start by computing bounds for the eight input variables of our small neural network; we’ll use these input bounds when saving the ONNX model.

# compute elementwise bounds from the training data; cast to plain floats
# so the bounds serialize cleanly alongside the ONNX model
lb = np.min(X, axis=0)
ub = np.max(X, axis=0)
input_bounds = [(float(l), float(u)) for l, u in zip(lb, ub)]
input_bounds

Now we’ll build and train a Keras model that we will later load into OMLT.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
 
# model: two ReLU hidden layers and a linear output (a logit), which keeps
# the network in a form OMLT's ReLU formulations can encode
model = Sequential()
model.add(Dense(15, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='linear'))
# compile: the output is a logit, so use binary cross-entropy from logits
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer="adam", metrics=['accuracy'])
 
# train
model.fit(X, Y, epochs=20, batch_size=10)

Now we’ll import the network into OMLT. For that, we import the function used to write the ONNX model together with its bounds. After this, we can simply export the Keras model to ONNX.

import tempfile
from omlt.io import write_onnx_model_with_bounds
import tf2onnx

# convert the trained Keras model to ONNX
onnx_model, _ = tf2onnx.convert.from_keras(model)

# write the ONNX model together with the input bounds computed earlier
with tempfile.NamedTemporaryFile(suffix='.onnx', delete=False) as f:
    write_onnx_model_with_bounds(f.name, onnx_model, input_bounds)
    print(f"Wrote ONNX model with bounds at {f.name}")

That’s how we import a trained model into OMLT and embed it in an optimization problem.

Final words

This article demonstrated the core features OMLT makes available to us. Higher-level representations, such as those provided by Keras and PyTorch, are invaluable for modelling neural networks and gradient-boosted trees. OMLT extends the applicability of these representations to larger decision-making problems by automating the conversion of pre-trained models into variables and constraints suitable for optimization solvers.
