Hadamard Matrices #4: Constraint Programming
(with Google's OR-Tools)

Rúben Dhanaraju (Author)
David Carvalho (Editor)
Fábio Cruz (Reviewer)
Luís Sarmento (Reviewer)

Constraint programming represents one of the closest approaches Computer Science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it.

At Inductiva, we are trying to solve very hard combinatorial optimization problems, and one of the finest specimens is the problem of finding Hadamard matrices.

To recap, a Hadamard matrix of order \(n\) is a square matrix of size \(n \times n\) that only contains the values \(\{1, -1\}\), and all of its rows (and columns) are pairwise orthogonal. Very simple to define, yet very hard to find for all \(n\).

In the previous blog post of this series, we saw mathematical constructive methods that allow us to build Hadamard matrices.

Constructive methods rely on theoretical proofs that guarantee the existence of Hadamard matrices satisfying certain properties. These methods allow for explicitly building Hadamard matrices of specific orders (for example, powers of 2 using Sylvester’s method). However, they only get us so far, as no such method is known for some orders. At the time of this writing, the smallest orders for which no Hadamard matrix is known are \(668\), \(716\) and \(892\)!

We would like to try claiming that prize with the aid of Machine Learning. But before that, how can we solve this using classic computational methods?


There are numerous ways of solving combinatorial optimization problems! A classic method well suited to this one is Constraint Programming (CP).

Constraint Programming allows modelling and solving a problem by defining three key components: variables, domains and constraints.

The variables represent the decisions of our problem, and each one has a set of possible values it can take - the domain. An assignment corresponds to fixing one or more variables with specific value(s) from their respective domains.

What makes CP so powerful, however, are the constraints: each constraint restricts the valid simultaneous assignments of all the variables of our problem. Consequently, a solution to our problem corresponds to a complete assignment of the variables that obeys all of the constraints.

Let’s look at a simple example. We want to find, using CP, numbers from the set \(\{1, 2, \dots, 20\}\) that form a Pythagorean triple, i.e. three numbers \(a\), \(b\), and \(c\) that satisfy:

\[a^2 + b^2 = c^2\]

For this example, we have three variables (\(a\), \(b\), and \(c\)) each with the domain \(D_a = D_b = D_c = \{1, 2, \dots, 20\}\).

We only have one constraint, which is the equation above. Nevertheless, if we wanted to find primitive Pythagorean triples, we would add another constraint requiring \(a\), \(b\), and \(c\) to be coprime.

A solution to our problem is the complete assignment \(a = 3\), \(b=4\), \(c=5\).

Traditional CP solvers normally use a combination of backtracking search and constraint propagation. During backtracking search, an assignment is built step by step: at each step, the solver selects a variable and a value from its domain, until either a solution is found or no more selections can be performed.

When the latter occurs, it means an infeasible state was reached, and the solver undoes its last decision (it backtracks), and keeps trying other possibilities. Every time a selection is made, the solver propagates the constraints and removes values from the domains to keep the assignment consistent.

Here is a small video to get a more visual explanation:

Video 1: “Solving Combinatorial Optimization Problems with Constraint Programming and OscaR”. Credits: UCLouvain - Université catholique de Louvain

For a more detailed explanation on CP, take a look at [1].

How to model and solve the Hadamard problem using CP?

Regarding the Hadamard problem, each element of the matrix represents a CP variable. For a matrix of order \(n\), we’ll have the variables \(X_{1,1}, X_{1,2}, \dots, X_{n, n}\), each with the domain \(D_{i,j} = \{1, -1\}\).

We add a constraint for each pair of rows \(i\) and \(j\), stating that they need to be orthogonal, i.e. that their dot product needs to equal \(0\).

\[\forall i \neq j, \sum_k X_{i, k} X_{j, k} = 0\]

There are several very efficient CP solvers out there (Google OR-Tools, Choco-solver, Yuck, Gecode). We chose Google’s OR-Tools for our experiments due to its award-winning history and its Python API.

OR-Tools provides two CP solvers: the original CP-Solver and the CP-SAT Solver, both award winners in the field of CP. The former is a classic backtracking search method, like the ones we described before, and the latter is based on satisfiability methods [2]. The CP-SAT solver is the one recommended by Google because it is more efficient and actively maintained, as opposed to the original solver.

It would be great if either of these solvers allowed parallelizing the search, resulting in faster search times. While the search of the original CP-Solver cannot easily be parallelized, the CP-SAT solver uses a configurable number of threads, each running an independent search with a different strategy (with some communication between them), and the first one to reach a solution wins (you can find more details about this mechanism here). However, no internal customization of the search itself is allowed.

We’re interested in benchmarking these solvers both in terms of time and success rate on the task of finding a Hadamard matrix starting from an empty matrix (the cells of the matrix start empty, and values from \(\{-1, 1\}\) are successively chosen until a Hadamard matrix is found).

We’ll do this benchmark for orders \(n = 12, 16, 20, \dots, 40\). Notice that while these values of \(n\) seem quite small, the size of the search space is HUGE. To give you an idea, for \(n = 20\), the size of the state space is \(2^{20^2} \approx 2.58 \times 10^{120}\). The estimated number of particles in the observable universe is about \(10^{80}\), which is \(10^{40}\) times smaller! Moreover, for the smallest missing order \(n = 668\), there are \(2^{446224}\) possibilities!

Every little increase on the order means an exponential increase in the search space, since we have \(2^{n^2}\) possibilities for a matrix of order \(n\).

We will stick to the following variable and value selection strategy for both solvers: at every step, pick the first unassigned variable found when traversing the matrix top-down, left-to-right, and assign it the value \(-1\).

If you ever find yourself needing to find Hadamard matrices, here is a Python snippet showing how the skeleton of the CP-SAT code could look.

# Import the CP-SAT model from or-tools library
from ortools.sat.python import cp_model

def find_hadamard_matrix_cp_sat(size, max_time_in_seconds=3600):

  # Instantiate the model.
  cp_sat_model = cp_model.CpModel()

  # Create the variables.
  variables = create_variables(cp_sat_model, size)

  # Add constraint to enforce pairwise orthogonality.
  enforce_all_rows_orthogonal(cp_sat_model, variables, size)

  # Use the variable and value selection strategies of
  # selecting the first unbound variable and the value -1.
  cp_sat_model.AddDecisionStrategy(flatten(variables),
                                   cp_model.CHOOSE_FIRST,
                                   cp_model.SELECT_MIN_VALUE)

  # Create a solver.
  solver = cp_model.CpSolver()

  # Set the time limit.
  solver.parameters.max_time_in_seconds = max_time_in_seconds

  # Run the solver!
  solver.Solve(cp_sat_model)

The original solver follows a similar structure, with some slight differences.

# Import the original CP solver from or-tools library
from ortools.constraint_solver import pywrapcp

def find_hadamard_matrix_original_solver(size, max_time_in_seconds=3600):

  # Instantiate the solver (it takes a name as argument).
  solver = pywrapcp.Solver('hadamard')

  # Set the time limit.
  # The time limit constructor receives as argument the
  # limit in ms, thus we scale it by 1000.
  time_limit = solver.TimeLimit(max_time_in_seconds * 1000)

  # Create the variables
  variables = create_variables(solver, size)

  # Add constraint to enforce pairwise orthogonality.
  enforce_all_rows_orthogonal(solver, variables, size)

  # Use the variable and value selection strategies of
  # selecting the first unbound variable and the value -1.
  decision_builder = solver.Phase(flatten(variables),
                                  solver.CHOOSE_FIRST_UNBOUND,
                                  solver.ASSIGN_MIN_VALUE)

  # Run the solver!
  solver.Solve(decision_builder, [time_limit])

Here is how the orthogonality constraint can be implemented in the original CP-Solver (easily translatable to the CP-SAT code).

def enforce_all_rows_orthogonal(solver, variables, size):
    # All n*(n-1)/2 pairwise combinations of row vectors need to be
    # orthogonal.

    # For each row.
    for i in range(size - 1):

        # And the rows below it.
        for j in range(i + 1, size):

            # Accumulate product variables.
            dot_product_terms = []

            # Perform a dot product between the two rows.
            for k in range(size):

                # prod_var will represent the product
                # between variables[i][k] and variables[j][k].
                prod_var = new_one_minus_one_var(
                    solver, f'p{i}_{j}_{k}')

                # Append the product variable to the accumulator.
                dot_product_terms.append(prod_var)

                # Enforce p{i}_{j}_{k} = v_{i}_{k} * v_{j}_{k}.
                add_multiplication_equality(solver, prod_var,
                                            variables[i][k],
                                            variables[j][k])

            # Enforce the sum of the terms to add up to 0.
            solver.Add(sum(dot_product_terms) == 0)

Let us run some experiments, shall we?

Run, Forrest, run!

The experiments were executed on a 12th Gen Intel(R) Core(TM) i5-12400F CPU (released in Q1 2022, 4.4 GHz, 6 cores, 12 threads) with 64 GB of RAM.

To deal with internal non-deterministic mechanisms of the solvers (for example, random restarts in the CP-SAT solver), we ran each solver 10 times with different seeds. We benchmark the “success” of the solvers under the same time limit: in each run, a solver has at most 1 hour to find a Hadamard matrix.

Additionally, as previously mentioned, the CP-SAT solver has a configurable number of threads that run different independent search strategies in parallel until one of them finds a solution. We set this number to the maximum the CPU allows (in this case, 12).

solver.parameters.num_search_workers = multiprocessing.cpu_count()

We thus expect this solver to be the fastest one.

In Figures 1 and 2, we present a comparison between the solvers, both on the mean search time over the 10 runs and the percentage of successful runs.

Fig. 1: Mean search time for finding a Hadamard matrix as a function of the order.

Fig. 2: Percentage of Hadamard matrices found as a function of the order.

Looking at Figure 2, the intractability of the problem is already evident: the original CP-Solver is only able to find matrices of orders \(n = 12\) and \(n = 16\) within an hour. Yikes!

OR-Tools’ CP-SAT solver is able to find matrices of order \(n = 20\) in all \(10\) runs, before dropping to a \(40\%\) success rate at order \(n = 24\) and plummeting to \(0\%\) for the subsequent orders.

Nevertheless, the original solver could only get to order \(16\), and the difference in search space size between these orders is enormous (\(2^{24^2} / 2^{16^2} = 2^{320}\}\)) is wrong to leave implicit: \(2^{24^2} / 2^{16^2} = 2^{320}\). This is a drastic improvement!

With regards to Figure 1, the exponential growth of the search space is noticeable in the skyrocketing impact that increasing the matrix order has on the total search time.

Here are two examples of matrices found by the original solver and the CP-SAT solver (dark elements represent \(-1\) and white elements represent \(+1\)).

Fig. 3: Hadamard matrix of order \(n = 16\) found by the Original CP-Solver.

Fig. 4: Hadamard matrix of order \(n = 24\) found by the CP-SAT Solver.

Although constraint propagation does seem like a good candidate for solving this problem, the pairwise orthogonality constraint is clearly not enough to prune the search space and effectively tackle the problem.

Can we do better?

Decreasing the search space

Can we use any property of Hadamard matrices to our advantage, such that we can impose more restricting constraints that allow us to navigate the search space more efficiently? The answer is yes! 

In the second blog post of the series, we introduced equivalence classes for Hadamard matrices. Two Hadamard matrices are said to be equivalent if they differ by any sequence of the following operations:

  • Transposing

  • Column and row permutations

  • Multiplying rows and columns by \(-1\)

Fig. 5: Visualization of row (or column) permutations and negations to a Hadamard matrix. Credits: [Hadamard #2](/blog/article/hadamard-matrices-2-an-introduction)

Furthermore, we have presented the concept of a normalized Hadamard matrix (a Hadamard matrix with only \(+1\)s in its first row and column) and showed that every Hadamard matrix is equivalent to a normalized counterpart. Here is an example of a normalized Hadamard matrix:

\[\begin{align*} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \end{align*}\]
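Both properties are easy to check numerically: a \(\pm 1\) matrix \(H\) of order \(n\) is Hadamard iff \(H H^T = n I\). A small NumPy check (our own helper, not from the post):

```python
import numpy as np

def is_hadamard(matrix):
    # A +-1 matrix H of order n is Hadamard iff H @ H.T == n * I.
    h = np.asarray(matrix)
    n = h.shape[0]
    return (h.shape == (n, n)
            and bool(np.all(np.abs(h) == 1))
            and np.array_equal(h @ h.T, n * np.eye(n, dtype=int)))

normalized_h4 = [[1,  1,  1,  1],
                 [1, -1,  1, -1],
                 [1,  1, -1, -1],
                 [1, -1, -1,  1]]
```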

Not only does this mean that if we find a single normalized Hadamard matrix, we find a representative for a whole equivalence class, but it also means that we can add extra constraints based on some properties of normalized Hadamard matrices.

Let us constrain ourselves, no pun intended, to thinking only about normalized Hadamard matrices!

Notice that for any row to be orthogonal to the first one, it needs to contain exactly \(n/2\) \(+1\)s and \(n/2\) \(-1\)s, because we need to cancel out exactly half of the \(+1\)s. Consequently, for any pair of rows other than the first, exactly half of the \(+1\)s of one row overlap with \(+1\)s of the other. This also holds for columns.

That being said, we can already start filling in a matrix of generic order using the implications above.

For the second row, we can choose the one whose first half is filled with \(+1\)s, followed by \(-1\)s. Since the first row contains only \(+1\)s, any normalized Hadamard matrix can be put in this form simply by permuting its columns, without affecting the first row.

To be orthogonal to both of these choices, the third row must have its \(+1\)s and \(-1\)s equally distributed among its quarters; we choose the third row with \(n/4\) \(+1\)s in its first and last quarters, and its second and third quarters filled with \(-1\)s.

So that’s three rows already filled!
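These fixed rows can be written down for any order \(n\) divisible by 4 (a sketch; the helper name is ours):

```python
def prefilled_rows(n):
    # First three rows of a normalized Hadamard matrix, fixed
    # up to column permutations (n must be divisible by 4).
    assert n % 4 == 0
    first = [1] * n
    second = [1] * (n // 2) + [-1] * (n // 2)
    third = [1] * (n // 4) + [-1] * (n // 2) + [1] * (n // 4)
    return first, second, third
```

By construction, all three rows are pairwise orthogonal.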

To further reduce the search space, we will also introduce a constraint requiring the matrix to be symmetric, restricting the search to this sub-class of Hadamard matrices. This addition should roughly halve the number of decisions we need to make. However, for a given \(n\), it may be that only non-symmetric Hadamard matrices exist. As of this writing, the smallest order for which no symmetric Hadamard matrix is known is \(n = 236\)!

Will CP allow us to beat that record? Let’s see how far we can go!

Formally, our new set of constraints would look like this:

\[\begin{align} \begin{array}{lr} \forall j, \space X_{1j} = 1 & \textrm{(First row filled with +1)} \\ \forall j \leq \frac{n}{2}, \space X_{2j} = 1 & \textrm{(First half of second row filled with +1)} \\ \forall j > \frac{n}{2}, \space X_{2j} = -1 & \textrm{(Second half of second row filled with -1)} \\ \forall j \leq \frac{n}{4} \lor j > \frac{3n}{4}, \space X_{3j} = 1 & \textrm{(First and last quarters of third row filled with +1)} \\ \forall j, \frac{n}{4} < j \leq \frac{3n}{4}, \space X_{3j} = -1 & \textrm{(Second and third quarters of third row filled with -1)} \\ \forall i, \sum_j X_{ij} = 0 & \textrm{(Equal number of +1s and -1s in each row)} \\ \forall i, j, \space X_{ij} = X_{ji} & \textrm{(Symmetric matrix)} \\ \end{array} \end{align}\]

In code, this could translate into something like the following.

def enforce_first_row_equal_to_one(solver, variables, size):
    # Each variable of the first row must be equal to 1.

    for i in range(size):
        solver.Add(variables[0][i] == 1)

def enforce_second_row_first_half_equals_to_one(solver, variables, size):
    for i in range(size):

        # If this variable is in the first half of the
        # second row, it must be equal to 1, otherwise to -1.

        if i < int(size / 2):
            solver.Add(variables[1][i] == 1)
        else:
            solver.Add(variables[1][i] == -1)

def enforce_third_row(solver, variables, size):
    for i in range(size):

        # If the variable is in the first or last quarter of
        # the third row, it must be equal to 1, otherwise to -1.

        if i < int(size / 4) or i >= int(3 * size / 4):
            solver.Add(variables[2][i] == 1)
        else:
            solver.Add(variables[2][i] == -1)

Regarding the symmetry constraints, we enforce them directly when creating the variables, by creating a single shared variable for each pair of symmetric entries.

Let’s see what happens when we introduce this new set of constraints and compare them with the previous results! We used the same experimental setup as before.

The experiments which use the new constraints are marked with a (+) in the plots.

Fig. 6: Mean search time for finding a Hadamard matrix (symmetric and normalized) as a function of the order.

Fig. 7: Percentage of Hadamard matrices (symmetric and normalized) found as a function of the order.

Oh wow! It is evident that introducing these new constraints significantly improves convergence for both solvers: we are now able to find matrices of orders up to \(n = 24\) with the original CP-Solver and \(n = 28\) with CP-SAT, whereas before we only got up to \(n = 16\) and \(n = 24\), respectively. Once again, to show how significant the gain is, the respective differences in search space size are \(2^{24^2} / 2^{16^2} = 2^{320}\) and \(2^{28^2} / 2^{24^2} = 2^{208}\)!

Here are two more examples of matrices found by each solver but with the added set of constraints (like before, dark squares represent \(-1\) and white squares represent \(+1\)).

Fig. 8: Symmetric and normalized Hadamard matrix of order \(n = 16\) found by the Original CP-Solver.

Fig. 9: Symmetric and normalized Hadamard matrix of order \(n = 24\) found by the CP-SAT Solver.

While the gain in performance from these constraints is noticeable, 1 hour is still not enough to find matrices of orders above \(28\). Note that coming up with better, more refined constraints is a hard task! More constraints could be formulated to further prune the search space, if we further decomposed the problem or considered a fixed matrix order. However, finding these constraints manually is not trivial, as it requires deeply understanding the structure and caveats of the problem. In the future, we will be looking at automated methods to either find constraints or prune the search space.

Fighting the giant of Computational Complexity is no easy task. But that is the main course here at Inductiva!

In the next blog post, we shall look at a more modest version of the problem: If we already have a Hadamard matrix, but with some of its entries hidden, can we recover them and reconstruct the original matrix? Let’s find out!

References & Notes

[1] To further learn about Constraint Satisfaction.

[2] For some reading on how OR-Tools’ CP-SAT Solver works.
