# «Abstract. Semantical expansion aims at boosting minimisation heuris- tics for Boolean formulas. It provides an easy way of deciding whether or not a ...»

Semantical Expansion of Two-Level Boolean

Formulas for Minimisation

Martin Lange

Department of Computer Science

University of Aarhus, Denmark

Abstract. Semantical expansion aims at boosting minimisation heuris-

tics for Boolean formulas. It provides an easy way of deciding whether

or not a given minterm, resp. clause, can be added safely to a formula

in DNF, resp. CNF. We suggest to use incremental and complete SAT

solvers and instances of UNSAT for these decisions, and report on some preliminary empirical results.

1 Introduction Minimisation of Boolean formulas – the problem for turning such a formula into an equivalent, but somewhat smaller one – plays an important role in circuit design for example. A digital circuit is speciﬁed as as a truth table which is simply a multi-input multi-output Boolean function. This is, for each output, a propositional formula usually given in disjunctive normal form (DNF). This may be the most intuitive way of specifying the behaviour of the circuit to be built but not necessarily the smallest and, thus, cheapest. Minimisation of Boolean formulas is therefore useful for the reduction of hardware ressources and costs.

However, it is also said to have important applications in “design of control systems, design of built-in self-test equipment, during solutions of problems in the area of artiﬁcial intelligence, software engineering, etc.” [5].

The most famous procedure for minimising formulas in DNF are Karnaugh maps [7] and the Quine-McCluskey algorithm [10, 3]. Na¨ implementations of ıve these run into diﬃculties on large inputs which is not surprising given that (the decision problem variant of) minimisation of Boolean formulas is co-NP-hard: the unsatisﬁability test for a formula reduces directly to the question of whether or not it is equivalent to the minimal propositional formula false. Furthermore, both Karnaugh maps and the Quine-McCluskey algorithm need to solve instances of a minimal set cover problem which is known to be NP-hard. Note that this does not entail NP=co-NP because in Quine-McCluskey for example, the minimal cover to be chosen has to be among a possibly exponentially large set of prime implicants.

Boolean minimisers based on Quine-McCluskey minimisation like Espresso [11, 8, 12] and variants thereof for instance, thus implement this procedure in op- timised ways. Another natural choice of approach is to use heuristics, especially when speed is more important than ﬁnding an exact solution, or ﬁnding any solution is better than non-termination. This is done in the minimiser Boom for example [6].

Here we do not present another minimiser but introduce another step which can be integrated into minimisers in order to help these heuristics. It consists of adding arbitrary (but chosen according to some strategy) minterms – or clauses in case of formulas in conjunctive normal form (CNF) – that hopefully can be used to reduce others. This addition of course has to be sound, i.e. the resulting formula must be equivalent to the original one. We propose to check for equivalence (in these restricted cases) using an incremental SAT solver. Since this idea refers to the semantics of a propositional formula through the equivalence test we call this approach semantical expansion.

The use of SAT solvers for the minimisation problem has been proposed before [4]. There, a truth table of size k that is equivalent to the input formula is encoded in propositional logic and a variable assignment returned by the SAT solver contains the values in this table. The minimisation problem is then solved in the usual way using iterative calls with varying parameter k to the decision problem. Note, however, that this is completely diﬀerent to our approach. First of all, we do not encode the entire minimisation problem as a SAT instance, but only a restricted equivalence problem as an instance of UNSAT. We therefore need the SAT solver to be complete but not necessarily sound, whereas the other approach needs it to be sound, and completeness is only necessary if the minimisation problem should be solved exactly. Another main diﬀerence is the problem size: while the entire minimisation problem results in a SAT instance of quadratic size we get away with linear size. In fact, almost the entire formula to be minimised later can be passed directly to the SAT solver without any encoding.

Another approach that is related to boosting a minimisation heuristic by adding minterms uses mutations [5]. It is similar to the one presented here in the sense that it also suggests to generate a set of prime implicants from a wider range of minterms. However, the selection of minterms there is purely based on occurrences of literals in the formula. It is therefore a syntactic approach. It also only uses heuristics to determine whether or not certain literals can be included in or excluded from minterms whereas our approach using a complete SAT solver is exact in this respect.

The rest of the paper is organised as follows. Sect. 2 provides the technicalities of the minimisation framework. Sect. 3 introduces semantical expansion formally, discusses the use of incremental and complete SAT solvers, presents a small algorithm that expands formulas in either DNF or CNF, and ﬁnally discusses strategies for the generation of minterms. Sect. 4 then contains an empirical evaluation of the simplest way to use semantical expansion. Sect. 5 ﬁnishes the paper with ideas of how to improve and further evaluate this approach.

2 Minimisation of Two-Level Formulas The approach presented here works for Boolean formulas to be given in two-level normal form, i.e. either DNF i j lij or CNF i j lij where the lij are literals over some set of Boolean variables. For the sake of representational simplicity we stick to DNFs but stress the fact that, due to duality between these two kinds of normal form, the approach for CNFs is entirely analogous – c.f. the corresponding subsection below. We will therefore speak of minterms rather than clauses as the inner parts of the two-level formula, i.e. a conjunction of literals s.t. no two literals in it share the same variable.

Most minimisation procedures for Boolean formulas are based on a few basic equivalences that are used like rewriting rules from the bigger to the smaller side. For example, Quine-McCluskey minimisation is implemented in virtually every Boolean minimiser. It uses the equivalence (ϕ ∧ ψ) ∨ (ϕ ∧ ¬ψ) ≡ ϕ (1) for arbitrary formulas ϕ and ψ in order to reduce two minterms ϕ ∧ x and ϕ ∧ ¬x to the single minterm ϕ.

Another example of such an application is unit propagation which is one of the majorly used subroutines in modern SAT solvers. It is justiﬁed by the two equivalences (ϕ ∧ ¬ψ) ∨ ψ ≡ ϕ ∨ ψ (2) (ϕ ∧ ψ) ∨ ψ ≡ ψ (3) and used in particular when ψ is a literal in order to discard or reduce other minterms.

3 Semantical Expansion Minimisation of two-level Boolean formulas is well-known to be a hard problem, namely co-NP-hard since the minimal formula equivalent to valid DNF, resp.

unsatisﬁable CNF, is true, resp. false. It is therefore not surprising that there is no minimisation algorithm which simply applies rewrite-rules according to equivalences like those above. This would yield a polynomial time procedure if equivalences are always applied in a way that reduces the size of the formula. It would in fact be a greedy algorithm.

In Quine-McCluskey minimisation, which ﬁrst uses such a technique, the hard part is of course the second: ﬁnding a minimal set of prime implicants. In other words, Quine-McCluskey minimisation simply suggests an exponential reduction to the problem of ﬁnding a minimal set cover. Modern implementations of this procedure use heuristics to solve the latter problem which is well-known to be NP-hard.

There are types of heuristics other than greedy algorithms that have been shown to be useful for approximating NP-hard problems. In order to be successful, these heuristics need to be able to overcome local extrema. In simulated annealing for instance, steps that would conﬂict with a greedy strategy at some random choice.

In the framework of minimisation of Boolean formulas, such a step would correspond to an introduction of minterms rather than a removal. The following example shows the use of introducing minterms in a simple way.

Example 1. Consider the DNF

which is equivalent to x1 but this cannot be establish using the equivalences (1)– (3) from left to right only. However, if we add the minterm (x1 ∧ x2 ∧ x3 ∧ x4 ) then this can be shown using three applications of (1) and one application of (3). It is also easy to see that adding this minterm is sound: it logically implies the last minterm in the DNF above.

Surely this simple example is not a challenge for modern Boolean minimisers which can do reducing steps other than the three mentioned above. But remember that co-NP-hardness of the minimisation problem means that no ﬁxed set of reducing rewrite-rules can yield minimal formulas in every case. This not only shows that adding minterms may be considerable. It also raises the question after a generic justiﬁcation for adding minterms, i.e. one that does not only do occasional backwards application of an equivalence like the ones above.

As an answer to this we propose the following schema which is really a proof rule in propositional logic rather than an axiom like the equivalences above. The name semantical expansion is chosen because its application refers to validity, a semantical concept, whereas applications of the equivalences above can be seen as purely syntactic rewrite rules.

** if |= ψ → ϕ then ϕ ≡ ϕ ∨ ψ (4)**

Its soundness should be obvious.

This rule provides the justiﬁcation for the following procedure. Given a DNF i ϕi, one can add any minterm ψ for as long as ψ → i ϕi is a theorem in (classical) propositional logic. This raises three questions.

– How do we prove theorems in propositional logic? This is easily answered in the next subsection on incremental SAT solvers.

– How do we choose the minterms to be added? There is no simple and unique answer to this. Instead we can only devise various heuristic strategies. This will also be discussed below.

– How does this approach perform? There are various issues to consider. Since the strategies that decide which minterms to add are only heuristics there is naturally no guarantee that this leads to better minimisation results. Moreover, it is not even clear that in terms of minimal formulas, the expanded formulas do not behave worse. Finally, proving propositional theorems is also a co-NP-hard problem. Therefore there is a trade-oﬀ between time and minimal results to be considered. See Sect. 4 below for a starting discussion on this issues.

**3.1 Using Incremental SAT Solvers**

The SAT problem is to decide, given a propositional formula, whether or not it has a model. A SAT solver is a tool for deciding this problem. Since SAT is computationally hard, some are based on heuristics, others are randomised, etc.

There are two kinds of SAT solvers: complete and incomplete ones. A complete SAT solver reports unsatisﬁable formulas as such, unlike incomplete ones which do not necessarily ﬁnd unsatisﬁability.

It is well-known that a formula is valid iﬀ its negation is unsatisﬁable. Hence, complete SAT solvers can in principle be used in semantical expansion. Luckily, most SAT solvers expect the input to be in CNF. Note that rule (4) above requires the formula ¬ψ ∨ i ϕi to be tested for validity where ψ and all ϕi are minterms. Hence, ¬ψ is a maxterm or clause – the disjunction of the complemented literals occurring in ψ – which makes the entire formula a DNF. Its negation in normal form is naturally a CNF by swapping ∨ and ∧ and complementing every literal. However, the last step is unnecessary as the following lemma shows.

** Lemma 1. Let lij be literals from some doubly-indexed set over some set of propositional variables.**

Then i j lij is valid iﬀ i j lij is unsatisﬁable.

Proof. The ﬁrst step is the well-known connection between validity and unsatisability: i j lij is valid iﬀ i j ¬lij is unsatisﬁable. This is because the latter is just the negation of the former in negation normal form.

Now note that ϕ := i j ¬lij is unsatisﬁable iﬀ ϕ := i j lij is unsatisable. An assignment β of the variables in ϕ to values in {0, 1} that satisﬁes ϕ immediately yields a model β of ϕ too. It is deﬁned as β (x) = 1 − β(x). By symmetry, this also holds for the other direction.

Putting these two equivalences together yields the statement of the lemma.

Now note that it is usually meta-information that makes a Boolean formula in two-level normal form a DNF or a CNF. Both would simply be implemented as an array of arrays of integers for example, cf. the DIMACS format or the PLA format. Hence, in order to test a DNF for validity it can be passed on to a CNF-input and complete SAT solver without reversing the literals – provided that the data structures or interfaces are compatible.

Semantical expansion can of course be iterated, and adding more minterms may increase the chances of success in the minimisation process. However, it is advisable to test each minterm separately for addition because it is negated ﬁrst. A single minterm then simply blends into the DNF whereas a disjunction of minterms would be transformed into a conjunction which would manually have to be brought into DNF again at the possible expense of an exponential blow-up.

This is why incremental SAT solving is desirable.

An incremental SAT solver allows clauses to be added to and deleted from a CNF after solving, and uses information gathered in the ﬁrst run like learned clauses in order to speed up the solving of the amended formula. Incremental solving is supported by many complete modern SAT solvers like zChaff [9], MiniSAT [2], etc. This gives rise to a simple algorithm that handles semantical expansion of a DNF Φ given a set M of minterms to be tested for addition.

We assume both Φ and M to be given as a set of sets of literals interpreted as a DNF, i.e. a minterm is only a set of literals. We implicitly assume that it is non-contradictory, i.e. no two literals in this set have the same variable.