Benchmarking Quantum Annealing Against “Hard” Instances of the Bipartite Matching Problem

Concrete Implementation on a D-Wave

In this section, we detail the steps we followed to concretely map and solve the QUBO instances associated with \(G_n\), \(n \in \{1,2,3,4\}\), on a DW2X operated by the University of Southern California. Unfortunately (yet unsurprisingly), the QUBO matrices defined in the previous section are not directly mappable onto the Chimera interconnection topology and we thus need to resort to qubit duplication, i.e., using several physical qubits to represent one problem variable (or “logical qubit”). Fortunately, the D-Wave software pipeline automates this duplication process. Yet, this need for duplication (or, equivalently, the sparsity of the Chimera interconnection topology) severely limits the size of the instances we were able to map on the device, and we had to stop at \(G_4\), whose 125 variables required 951 of the 1098 available qubits. Table 1 provides the number of qubits required for each of our four instances. For \(G_1\) and \(G_2\) the maximum duplication is 6 qubits per variable; for \(G_3\) and \(G_4\) it is 18 qubits.

Table 1 Number of qubits required to handle the QUBO instances associated with \(G_1\), \(G_2\), \(G_3\) and \(G_4\). See text

Ultimately, qubit duplication leads to an expanded QUBO with more variables and an economic function that includes an additional set of penalty terms to favor solutions in which the qubits representing the same variable indeed end up with the same value. More precisely, each pair of distinct qubits \(q\) and \(q'\) (associated with the same QUBO variable) adds a penalty term of the form \(\varphi\, q(1-q')\), where the penalty constant \(\varphi\) is chosen (by the user) as minus the cost of the worst possible solution to the initial QUBO, which is obtained for a vector filled with ones (i.e., a solution that selects all edges of the graph and thereby maximizes the heavily penalized violations of the cardinality constraints). This guarantees that a solution which violates at least one of these consistency constraints cannot be optimal (note that we have switched from a maximization problem in Sect. 4.2 to a minimization problem, as required by the machine). Lastly, since qubit duplication leads to an expanded QUBO whose support graph is trivially isomorphic to (a subgraph of) the Chimera topology, it can be mapped on the device after a renormalization of its coefficients ensuring that the diagonal terms of \(Q\) lie in \([-2,2]\) and the off-diagonal terms in \([-1,1]\).
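For illustration, the sketch below (in Python, with helper names of our own choosing; this is not the D-Wave pipeline itself) shows how such consistency penalties and the final coefficient rescaling could be applied to an expanded QUBO held as a dictionary mapping qubit pairs to coefficients.

```python
# Illustrative sketch (assumed helpers, not the D-Wave pipeline): add
# duplication-consistency penalties and rescale an expanded QUBO stored
# as {(i, j): coefficient}.

def add_chain_penalties(Q, chains, phi):
    """For every ordered pair of distinct qubits (q, q') representing the same
    logical variable, add the penalty phi * q * (1 - q'), i.e. +phi on the
    diagonal of q and -phi on the coupler (q, q')."""
    for chain in chains:                      # chain = list of physical qubits
        for a in chain:
            for b in chain:
                if a == b:
                    continue
                Q[(a, a)] = Q.get((a, a), 0.0) + phi
                Q[(a, b)] = Q.get((a, b), 0.0) - phi
    return Q


def rescale(Q):
    """Scale all coefficients so that diagonal terms fall in [-2, 2] and
    off-diagonal terms in [-1, 1], as required by the device."""
    max_diag = max((abs(v) for (i, j), v in Q.items() if i == j), default=0.0)
    max_off = max((abs(v) for (i, j), v in Q.items() if i != j), default=0.0)
    scale = max(max_diag / 2.0, max_off, 1.0)  # never scale coefficients up
    return {k: v / scale for k, v in Q.items()}
```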

Results Summary

This section reports on the experiments we were able to perform on instances of the previous QUBO problems. As already emphasized, due to the sparsity of the qubit interconnection topology, our QUBO instances were not directly mappable on the D-Wave machine and we had to resort to qubit duplication (whereby one problem variable is represented by several qubits on the D-Wave, bound together so that they end up with the same value at the end of the annealing process). This need for qubit duplication limited us to \(G_4\) which, with 125 binary variables, already leads to a combinatorial problem of non-trivial size. Yet, to solve it, we had to mobilize about 87% of the 1098 qubits of the machine. The results below were obtained by running the quantum annealer 10,000 times with a 20 \(\mu\)s annealing time (we also experimented with 200 and 2000 \(\mu\)s, which did not appear to affect the results significantly).
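For reference, a run with these parameters can be expressed with today's D-Wave Ocean SDK roughly as follows. This is a minimal sketch only: the experiments reported here were performed on a DW2X with the vendor pipeline available at the time, and the toy QUBO below merely stands in for the instance matrices.

```python
# Minimal sketch using the current D-Wave Ocean SDK (illustrative only).
from dwave.system import DWaveSampler, EmbeddingComposite

# Stand-in QUBO; in the experiments this would be the matrix built for G_n.
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}

sampler = EmbeddingComposite(DWaveSampler())   # handles the qubit duplication
sampleset = sampler.sample_qubo(
    Q,
    num_reads=10000,       # 10,000 annealing runs
    annealing_time=20,     # 20 microsecond anneal per run
)
print(sampleset.first.energy, dict(sampleset.first.sample))
```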

Additionally, to improve the quality of the results obtained in our experiments, we used different gauges (spin-reversal transformations). The principle of a gauge is to apply a Boolean inversion to some of the operators \(\sigma_i\) in the Hamiltonian (in QUBO terms, after qubit duplication, this simply means replacing a variable \(x_i\) by \(1-y_i\), with \(y_i=1-x_i\), and updating the final QUBO matrix accordingly). This transformation does not change the optimal solution of the problem, and it limits the effect of local qubit biases as well as machine accuracy errors [6]. Following common practice (e.g., [2]), we randomly selected 10% of the physical qubits in use to be gauged for each \(G_n\) instance mapped to the D-Wave. This preprocessing does improve the results obtained, but not by much: for example, on \(G_4\), it leads to a 2.5% improvement in the mean solution cost output by the D-Wave and only a 1.2% improvement in the mean solution cost after correction of the duplication inconsistencies by majority voting. Our overall results are given in Table 2.
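For concreteness, a minimal sketch of this spin-reversal transformation at the QUBO level is given below (assuming a QUBO stored as a dictionary of coefficients; the function and variable names are ours, for illustration only). After annealing, the values of the gauged variables are simply flipped back via \(x_i = 1 - y_i\).

```python
# Illustrative gauge (spin-reversal) transformation of a QUBO: substitute
# x_i -> 1 - y_i for every variable i in `gauge` and return the transformed
# QUBO together with the constant offset, so that energies are preserved.

def spin_reversal(Q, gauge):
    new_Q, offset = {}, 0.0

    def add(i, j, v):
        new_Q[(i, j)] = new_Q.get((i, j), 0.0) + v

    for (i, j), q in Q.items():
        fi, fj = i in gauge, j in gauge
        if i == j:                      # linear (diagonal) term
            if fi:
                offset += q
                add(i, i, -q)
            else:
                add(i, i, q)
        elif fi and fj:                 # both endpoints gauged
            offset += q
            add(i, i, -q)
            add(j, j, -q)
            add(i, j, q)
        elif fi:                        # only i gauged
            add(j, j, q)
            add(i, j, -q)
        elif fj:                        # only j gauged
            add(i, i, q)
            add(i, j, -q)
        else:                           # neither gauged
            add(i, j, q)
    return new_Q, offset
```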

Table 2 Experimental results summary without (top) and with (bottom) majority voting to fix qubit duplication issues on \(G_1\), \(G_2\), \(G_3\), \(G_4\). See text
Fig. 3

Histograms on the left show the distribution of the economic function over 10,000 annealing runs on \(G_3\) and \(G_4\); histograms on the right show the same distributions after duplication inconsistencies have been fixed by majority voting.

Instance Solutions

\(G_1\). This instance leads to a graph with 8 vertices and 8 edges, and hence (before duplication) to a QUBO with 8 variables and 12 nonzero off-diagonal coefficients (Footnote 6); 16 qubits are ultimately required. Over 10,000 runs, the optimal solution (with a cost of \(-68\)) was obtained 9265 times (9284 times with correction). Interestingly, the worst solution obtained (with a cost of \(-9\)) violates duplication consistency, as the 6 qubits representing variable 6 do not all have the same value (4 of them are 0, so in that particular case rounding the solution by majority voting yields the optimal solution).
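The majority-voting correction used throughout this section can be sketched as follows (an illustrative helper, assuming the solver returns one value per physical qubit and that the chains, i.e., the groups of qubits representing each logical variable, are known).

```python
# Illustrative majority-vote correction of duplication inconsistencies:
# each logical variable takes the value held by the majority of the
# physical qubits that represent it (ties broken toward 0 here).

def majority_vote(sample, chains):
    """sample: {physical_qubit: 0 or 1}; chains: {logical_var: [physical qubits]}."""
    fixed = {}
    for var, qubits in chains.items():
        ones = sum(sample[q] for q in qubits)
        fixed[var] = 1 if 2 * ones > len(qubits) else 0
    return fixed
```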

\(G_2\). This instance leads to a graph with 18 vertices and 27 edges, and hence to a QUBO with 27 variables and 72 nonzero off-diagonal coefficients. Overall, 100 qubits are required. Over 10,000 runs the optimal solution (with cost \(-495\)) was obtained only 510 times (i.e., a hitting probability of about 5%). Although the best solution obtained is optimal, the median solution (with cost \(-388\)) does not lead to a valid matching, since four vertices are covered 3 times (Footnote 7). As for \(G_1\), we also observe that the worst solution (with cost \(-277\)) has duplication consistency issues. Fixing these issues by majority voting results only in a marginal left shift of the average solution cost from \(-398.2\) to \(-400.4\), the median being unchanged.
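The validity check alluded to above is straightforward: a set of selected edges is a valid matching if and only if no vertex is covered more than once. A possible sketch (hypothetical helper, for illustration only):

```python
# Check whether a set of selected edges forms a valid matching, i.e.,
# whether every vertex is covered at most once.
from collections import Counter

def is_valid_matching(selected_edges):
    """selected_edges: iterable of (u, v) pairs chosen by the solver."""
    coverage = Counter(v for edge in selected_edges for v in edge)
    return all(count <= 1 for count in coverage.values())
```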

\(G_3\). This instance leads to a graph with 32 vertices and 64 edges, and hence to a QUBO with 64 variables and 240 nonzero off-diagonal coefficients. Post-duplication, 431 qubits were required (39% of the machine capacity). Over 10,000 runs the optimal solution was never obtained. For \(G_3\), the optimum value is \(-2064\), so the best solution obtained (with cost \(-1810\)) is about 15% off the optimum (and the median cost of \(-1548\) is about 25% off). Furthermore, neither the best nor the median solution yields a valid matching, since in both some vertices are covered several times. We also observe that the worst solution has duplication consistency issues. Figure 3a shows the (renormalized) histogram of the economic function as output by the D-Wave for the 10,000 annealing runs we performed. Additionally, since some of these solutions are inconsistent with respect to duplication, Fig. 3b shows the histogram of the economic function for the solutions in which duplication inconsistencies were fixed by majority voting (a marginal left shift of the average cost from \(-1454.8\) to \(-1496.5\) and of the median cost from \(-1548\) to \(-1550\)).

\(G_4\). This instance leads to a graph with 50 vertices and 125 edges, and hence to a QUBO with 125 variables and 600 nonzero off-diagonal coefficients. Post-duplication, 951 qubits were required (i.e., 87% of the machine capacity). Over 10,000 runs the optimal solution was never obtained. Still, Fig. 4 provides a graphic representation of the best solution obtained, with cost \(-5527\) (the median and worst solutions obtained had costs \(-4675\) and \(-2507\), respectively). For \(G_4\), the optimum value is \(-6075\), so the best solution obtained is about 10% off the optimum (a better ratio than for \(G_3\)) and the median cost about 25% off. Furthermore, neither the best nor the median solution yields a valid matching, since in both some vertices are covered several times. We also observe that the worst solution (as well as many others) has duplication consistency issues. Figure 3c shows the (renormalized) histogram of the economic function as output by the D-Wave for the 10,000 annealing runs we performed. Additionally, since some of these solutions are inconsistent with respect to duplication, Fig. 3d shows the histogram of the economic function for the solutions in which duplication inconsistencies were fixed by majority voting (resulting, in this case, in a slight right shift of the average solution cost from \(-4609.9\) to \(-4579.2\) and of the median cost from \(-4675\) to \(-4527\), which is also marginal; Footnote 8).

Fig. 4

Graphic representation of the best solution obtained for \(G_4\). See text

Resolution by Simulated Annealing

Simulated annealing was introduced in the mid-1980s [9, 19] and its countless practical successes quickly established it as a mainstream method for approximately solving computationally hard combinatorial optimization problems. As simulated annealing has been around for so long, we do not introduce the general method and only specify our key parameter choices. In our case we used a standard geometric cooling schedule of the form \(T_{k+1}=0.95\,T_k\), starting at \(T_0=|c_0|\) (where \(c_0\) is the cost of the initial random solution, which is typically high) and stopping when \(T<10^{-3}\). The key parameter of our implementation, however, is the number of iterations of the Metropolis algorithm performed at each constant-temperature plateau \(T_k\), which we set to \(n\), \(n^{1.5}\) or \(n^2\) (where \(n\) denotes the number of variables in the QUBO). With \(n\) iterations per temperature plateau, the algorithm is very fast but the Metropolis chain has fewer iterations to approach its stationary distribution and the algorithm is therefore expected to provide lower-quality results. At the other end of the spectrum, \(n^2\) iterations per plateau means that one can expect high-quality results, but the computation time is then much larger.
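For illustration, the core of such an annealing loop can be sketched as follows (single-bit-flip Metropolis moves; for readability the full cost is re-evaluated at each move, whereas an optimized implementation would use incremental cost updates).

```python
# Sketch of simulated annealing on a dense QUBO (minimization), with the
# cooling schedule T_{k+1} = 0.95 * T_k, T_0 = |c_0| and stopping at T < 1e-3.
import math
import random


def qubo_cost(Q, x):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def simulated_annealing(Q, iters_per_plateau):
    n = len(Q)
    x = [random.randint(0, 1) for _ in range(n)]
    cost = qubo_cost(Q, x)
    T = abs(cost) or 1.0                      # T_0 = |c_0|
    best_x, best_cost = x[:], cost
    while T >= 1e-3:
        for _ in range(iters_per_plateau):    # Metropolis moves at fixed T
            i = random.randrange(n)
            x[i] ^= 1                         # flip one bit
            new_cost = qubo_cost(Q, x)        # (incremental updates in practice)
            delta = new_cost - cost
            if delta <= 0 or random.random() < math.exp(-delta / T):
                cost = new_cost
                if cost < best_cost:
                    best_x, best_cost = x[:], cost
            else:
                x[i] ^= 1                     # reject: undo the flip
        T *= 0.95                             # geometric cooling
    return best_x, best_cost


# Example usage: n, n**1.5 or n**2 Metropolis iterations per plateau, e.g.
# best, cost = simulated_annealing(Q, iters_per_plateau=int(len(Q) ** 2))
```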

Table 3 Experimental results obtained when solving the raw QUBO instances for \(G_1\), \(G_2\), \(G_3\) and \(G_4\) by means of simulated annealing, for several numbers of iterations per temperature plateau

Table 3 presents the results obtained when solving the raw QUBOs for \(G_1\), \(G_2\), \(G_3\) and \(G_4\) with simulated annealing and several numbers of iterations for the Metropolis algorithm (over just 30 runs). Needless to say, simulated annealing performs extremely well compared to the D-Wave: its worst solution (over only 30 runs) almost always beats the best solution obtained by the D-Wave over 10,000 runs. Moreover, the fact that simulated annealing finds the optimal solution with high probability, even with only \(n\) iterations per plateau, suggests that instances up to \(G_4\) are too small to reach the (asymptotic) exponential-number-of-iterations regime of the Sasaki and Hajek theorem, i.e., these instances are small enough to remain relatively easy for classical annealing (although we can observe a shift in the number of iterations per plateau needed to achieve optimality with near certainty, e.g., for \(G_4\) this occurs only with \(n^2\) iterations per plateau). Yet, as shown in the previous section, quantum annealing was not able to solve them satisfactorily (for \(G_3\) and \(G_4\)). Also note that computing time is not an issue when solving these instances: simulated annealing runs natively in less than 5 s (\(G_4\) with \(n^2\) iterations per plateau) on an average laptop PC with only moderately optimized code.

Studying the Topology Bias

Let us emphasize that this comparison between our simulated annealing and the results obtained on the D-Wave 2X is perfectly fair, as we compare the optimization capabilities of two devices, each with its own operational constraints. Yet, it should also be emphasized that, for example on \(G_4\), simulated annealing solved a 125-variable QUBO problem while quantum annealing had to solve an (artificially) much larger 958-variable QUBO. So, although the larger QUBO is equivalent to the smaller one, it is worth investigating whether these expanded QUBOs are somehow harder to solve by simulated annealing.

To do so, we considered the QUBO instances obtained after mapping the original QUBO for \(G_4\) (Footnote 9) onto both the Chimera and Pegasus topologies and attempted to solve them, this time, by simulated annealing.
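A possible way of producing such expanded instances with current open-source tooling is sketched below, using dwave-networkx to generate the target topologies and minorminer to compute the embedding. The exact tools and graph sizes used for our experiments are assumptions here, and the toy QUBO merely stands in for the real \(G_4\) instance; the chain penalties and rescaling of the resulting expanded QUBO would then be added as described in the implementation section above.

```python
# Illustrative construction of an "expanded" QUBO support graph on the
# Chimera and Pegasus topologies (sizes chosen for illustration).
import dwave_networkx as dnx
import minorminer
import networkx as nx

# Stand-in for the raw G_4 QUBO (the real instance has 125 variables).
Q = {(0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0, (0, 1): 2.0, (1, 2): 2.0}
source = nx.Graph((i, j) for (i, j) in Q if i != j)

targets = {
    "Chimera": dnx.chimera_graph(12),   # a DW2X-class Chimera graph
    "Pegasus": dnx.pegasus_graph(6),    # a small Pegasus graph for comparison
}

for name, target in targets.items():
    embedding = minorminer.find_embedding(source.edges, target.edges)
    n_physical = sum(len(chain) for chain in embedding.values())
    print(f"{name}: {n_physical} physical qubits for "
          f"{source.number_of_nodes()} logical variables")
```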

Table 4 Experimental results obtained when solving the expanded QUBO instances for \(G_4\) on both the Chimera (top rows) and Pegasus (bottom rows) topologies by means of simulated annealing (30 runs), for several numbers of iterations per temperature plateau. Note that the “D-Wave” line results from a random selection of 30 outputs from the 10,000 runs that led to Table 2

Thus, Table 4 provides the results obtained when solving the expanded QUBO instances for \(G_4\) on both the Chimera and Pegasus topologies by means of classical annealing (again considering several numbers of iterations per plateau, as in the previous section). This time, the results obtained on the D-Wave are competitive with those obtained by simulated annealing, which means that the expanded instances are much harder to solve (by simulated annealing) than the raw ones, despite being equivalent to them and despite the larger number of iterations per plateau (since the expanded QUBO has more variables), i.e., the additional computing time, invested to solve them. In addition, and probably unsurprisingly, the denser Pegasus topology leads to smaller expanded QUBO instances than the Chimera one and provides better results (with simulated rather than quantum annealing, as the first machines with that topology were only just being released). Yet, although this topology is better, the results obtained remain very far from those obtained by simulated annealing on the raw, non-expanded QUBO instances in the “Resolution by Simulated Annealing” section. In terms of “computing” time, however, the D-Wave is several orders of magnitude faster: while it takes less than a second to perform 10,000 quantum annealing runs, solving the expanded (\(\approx 1000\)-variable) \(G_4\) QUBO by simulated annealing (with \(n^2\) iterations per plateau) takes several minutes on an average laptop computer (Footnote 10).

So, as a more general conclusion to this section, it appears that the D-Wave machine is competitive with a (heavyweight) simulated annealing algorithm using \(n^2\) iterations per plateau in terms of optimization quality, while being inherently several orders of magnitude faster. However (and that is a “big” however), it also appears that having to embed QUBO instances in either the Chimera or the Pegasus topology tends to produce larger, obfuscated QUBOs which are much harder to solve by simulated annealing. This hints that the embedding is also counterproductive for quantum annealing and that these qubit interconnection topologies should be blamed, at least in part, for the relatively disappointing results reported in the “Results Summary” section.

