
# Instance-Based Selection of Policies for SAT Solvers

Mladen Nikolić, Filip Marić, Predrag Janičić

Faculty of Mathematics, University of Belgrade, Studentski Trg 16, Belgrade, Serbia

{nikolic, filip, janicic}@matf.bg.ac.rs

Abstract. Execution of most of the modern DPLL-based SAT solvers is guided by a number of heuristics. Decisions made during the search process are usually driven by some fixed heuristic policies. Despite the outstanding progress in SAT solving in recent years, there is still a striking lack of techniques for selecting policies appropriate for solving specific input formulae. In this paper we present a methodology for instance-based selection of solver's policies that uses a data-mining classification technique. The methodology also relies on analysis of relationships between formulae, their families, and their suitable solving strategies. The evaluation results are very good, demonstrate practical usability of the methodology, and encourage further efforts in this direction.

**1 Introduction**

The propositional satisfiability problem (SAT) is one of the fundamental problems in computer science. It is the problem of deciding if there is a truth assignment under which a given propositional formula (in conjunctive normal form) evaluates to true. SAT was the first problem proved to be NP-complete [Coo71] and it still has a central position in the field of computational complexity. The SAT problem is also very important in many practical domains such as electronic design automation, software and hardware verification, artificial intelligence, and operations research. Thanks to recent advances in propositional solving technology, SAT solvers (procedures that solve the SAT problem) are becoming a tool for attacking more and more practical problems.

A number of SAT solvers have been developed. The majority of state-of-the-art complete SAT solvers are based on the branch and backtrack algorithm called Davis-Putnam-Logemann-Loveland or the DPLL algorithm [DP60,DLL62]. Spectacular improvements in the performance of DPLL-based SAT solvers achieved in the last few years are due to (i) several conceptual enhancements of the original DPLL procedure, aimed at reducing the amount of explored search space (e.g., backjumping, conflict-driven lemma learning, restarts), (ii) better implementation techniques (e.g., the two-watched-literals scheme for unit propagation), and (iii) smart heuristic components (which we focus on in this work). These advances make it possible to decide the satisfiability of industrial SAT problems with tens of thousands of variables and millions of clauses.

This work was partially supported by Serbian Ministry of Science grant 144030.

Complex policies, heuristics that guide the search process, represent important parts of modern SAT solvers and are crucial for a solver's efficiency. These include policies for literal selection, for determining the clause database size, for choosing restart points, etc. Specific policies are usually parametrized by a number of numerical and categorical parameters. A single policy with different parameter values can be treated as different policies. The SAT solving process is completely determined only when all its heuristic policies are set. A selected combination of policies specifies the solving strategy (or simply strategy).

Typically, every SAT solver uses a predetermined, hard-coded strategy and applies it to all its input formulae. However, SAT solvers increasingly implement multiple policies, and the question arises as to how to choose a strategy that would give good performance on a specific SAT instance. Addressing this question is of crucial importance because the solving time for the same input formula can vary by several orders of magnitude depending on the solving strategy used. The problem of adapting a SAT solver to the input formula has been addressed for the first time only recently. Our approach significantly differs from the only existing related approach we are aware of (as discussed in Sect. 5).

Propositional formulae can be clustered into families by their origin — hardware verification problems (e.g., FPGA routing), manually crafted problems (e.g., graph coloring, Hanoi towers), or randomly generated problems (e.g., k-SAT). It is interesting to explore the behaviour of different policies and solving strategies on families of formulae. The important question is whether one strategy shows the same or similar behaviour on similar formulae. If this is the case, and if one can automatically guess the family to which a given formula belongs, then this could be used for selecting an appropriate strategy for that particular formula. To implement this approach, one needs (i) a technique for classifying formulae based only on their syntax, and (ii) information about the behaviour of different policies on various families of formulae.

The main message of this work is that intelligent selection of solving policies, based on the syntax of the input formula, can significantly improve the efficiency of a SAT solver. The proposed methodology will not lead to optimal performance on each input formula, but the solving performance will be significantly improved on average over multiple input formulae. Here, by improving the efficiency of a SAT solver we mean increasing the number of formulae solvable within some time limit and decreasing the solving time.

**The proposed methodology relies on several hypotheses that will be investigated in the rest of the paper:**

(H1) Formulae of the same family (i.e., of similar origin) share some syntactical properties that can be used for automated formula classification;

(H2) For each family of formulae there is only a small number of solving strategies that are appropriate — that show better performance on formulae belonging to that family than all other available strategies;

(H3) For formulae that are syntactically similar, the best strategies are also (in some sense) similar.

If the above hypotheses hold, then our methodology is practically applicable. Namely, if a formula is correctly classified, then it will be solved by a solving strategy suitable for the family that the formula belongs to. Moreover, even if the formula is misclassified, it will be solved using a strategy similar to the optimal one.

The rest of the paper is organized as follows: in Sect. 2, brief background information on the SAT problem, SAT solvers, and their heuristic components is given. In Sect. 3, the proposed methodology is described. The experimental results are given in Sect. 4. In Sect. 5, related work is discussed. In Sect. 6, final conclusions are drawn and some directions of possible further work are discussed.

**2 Background**

Most of today's state-of-the-art solvers are complex variations of the DPLL procedure. In the rest of the paper, we shall assume that the reader is familiar with modern SAT solving techniques. More on these topics can be found, for example, in [NOT06,KG07,Mar08,GKSS07]. Although modern SAT solvers share common underlying algorithms and implementation techniques, their operation is guided by a number of heuristic policies that have to be selected in order to obtain deterministic solving strategies. The most important heuristic policies determine: (i) which literals to choose for branching, (ii) when to apply restarting, and (iii) when to forget some clauses learnt during the solving process.

In the rest of this section, policies that were used in our experiments will be described.

Literal selection policies. During the DPLL backtrack-search, literals for branching have to be selected somehow. This is the role of literal selection policies.

Most literal selection policies separately select a variable v (by using a variable selection policy) and only then choose its polarity, i.e., choose if it should be negated or not (by using a polarity selection policy).

**Some variable selection policies are the following (policy names are written in subscripts and their parameters in superscripts):**

VS_random — This policy randomly chooses a variable among all variables of the initial formula that are not defined in the current valuation, i.e., the assertion trail.

VS_minisat^{b,d,init} — The goal of this policy (introduced in the solver MiniSAT [ES04]) is to select a variable that was active in recent conflicts. In order to implement this, an activity score is assigned to each variable. On every conflict, the scores of the variables that occur in the conflict clause are bumped, i.e., increased by a bump factor given by the parameter b. Also, during the conflict analysis process, on each resolution step all variables that occur in the explanation clause are bumped. To stimulate recent conflicts, on each conflict all the scores are decayed, i.e., decreased by a decay factor given by the parameter d.

An important aspect of the MiniSAT variable selection policy is how initial scores are assigned to variables. If the parameter init has the value zero, the scores of all variables are set to zero, hence all variables have the same chance to be selected. If the parameter init has the value freq, the initial score of each variable is set to its number of occurrences in the initial formula F_0.

VS_random^p ∘ VS_x — This compound policy chooses a random variable with probability p and with probability 1 − p uses a given policy, here denoted by VS_x.
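The bump-and-decay mechanism and the compound random policy described above can be sketched as follows. This is an illustrative Python sketch under our own naming (`MinisatVarSelection`, `compound_pick` and their signatures are assumptions), not the actual MiniSAT or ArgoSAT implementation:

```python
import random

class MinisatVarSelection:
    """Sketch of the VS_minisat^{b,d,init} policy: bump variables seen
    in conflicts by b, decay all scores by a factor d on every conflict."""

    def __init__(self, variables, b=1.0, d=0.95, init="zero", freqs=None):
        if init == "freq":
            # initial score = number of occurrences in the initial formula
            self.activity = dict(freqs)
        else:
            self.activity = {v: 0.0 for v in variables}
        self.b = b  # bump factor
        self.d = d  # decay factor

    def on_conflict(self, conflict_vars):
        for v in conflict_vars:          # variables in the conflict clause
            self.activity[v] += self.b   # bump
        for v in self.activity:          # stimulate recent conflicts:
            self.activity[v] *= self.d   # decay all scores

    def pick(self, unassigned):
        # select the undefined variable with the highest activity score
        return max(unassigned, key=lambda v: self.activity[v])

def compound_pick(p, vs_x_pick, unassigned):
    """VS_random^p ∘ VS_x: a random undefined variable with probability p,
    otherwise defer to the given policy VS_x."""
    if random.random() < p:
        return random.choice(sorted(unassigned))
    return vs_x_pick(unassigned)
```

(Resolution-step bumps during conflict analysis would call the same bump code with the variables of each explanation clause.)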

**Some polarity selection policies are the following:**

PS_positive — Always selects a non-negated literal.

PS_negative — Always selects a negated literal.

PS_random^p — A random selection which chooses a non-negated literal with probability p.

PS_polarity-caching^init — When using this policy (introduced in the solver RSAT [PD07] as phase caching), a preferred polarity is assigned to each variable and is used for polarity selection. Whenever a literal is asserted to the current assertion trail (either as a decision or as a propagated literal), its polarity defines the future preferred polarity of its variable. When a literal is removed from the trail (during backjumping or restarting), its preferred polarity is not changed. If the parameter init has the value pos, then the initial polarities of all variables are positive; if it has the value neg, then they are set to negative; and if it has the value freq, then the preferred polarity of each variable is set to the polarity which is more frequent of the two in the initial formula.
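The phase-caching behaviour just described can be sketched as follows (an illustrative Python sketch with hypothetical method names, not the RSAT or ArgoSAT code):

```python
class PolarityCaching:
    """Sketch of PS_polarity-caching^init: remember the polarity of the
    last assertion of each variable and reuse it as the preferred one."""

    def __init__(self, variables, init="pos", freq_polarity=None):
        if init == "freq":
            # preferred polarity = the more frequent one in the formula
            self.preferred = dict(freq_polarity)
        else:
            self.preferred = {v: (init == "pos") for v in variables}

    def on_assert(self, var, polarity):
        # every asserted literal (decision or propagation) updates the cache
        self.preferred[var] = polarity

    def on_unassign(self, var):
        # backjumping/restarting leaves the preferred polarity unchanged
        pass

    def choose(self, var):
        return self.preferred[var]  # True means the non-negated literal
```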

Restart policies. Restart policies determine when to apply restarting. Most restart policies are based on conflict counting. On each conflict, the counter is increased. Restarting is applied whenever the counter reaches a certain threshold value. When this happens, the counter is reset and a new threshold is selected according to some specific policy. The sequence of threshold values must contain an increasing subsequence in order to ensure termination and completeness of the procedure.

**Some possible restart policies are the following:**

R_no-restart — Restarting is not applied.

R_minisat^{c0,q} — The initial threshold value is set to c0 and the threshold values form a geometric sequence with quotient q [ES04].

R_luby^k — Threshold values are positive integer multiples (with factor k) of the Luby series 1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, … [LSZ93].
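The Luby-based thresholds can be generated as follows; this is a small Python sketch of the standard recurrence for the series, with function names of our own choosing:

```python
def luby(i):
    """i-th element (1-indexed) of the Luby series:
    1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, ..."""
    k = 1
    while (1 << k) - 1 < i:  # find k with 2^(k-1) <= i <= 2^k - 1
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)          # i = 2^k - 1: value is 2^(k-1)
    return luby(i - (1 << (k - 1)) + 1)  # otherwise the series repeats

def restart_thresholds(k_factor, n):
    """First n thresholds of R_luby^k: multiples of the Luby series."""
    return [k_factor * luby(i) for i in range(1, n + 1)]
```

For example, with factor 32 the first thresholds are 32, 32, 64, 32, 32, 64, 128, …; the sequence contains an increasing subsequence, as required for completeness.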

Forget policies. Forget policies determine (i) when to remove some clauses learnt during the solving process and (ii) which clauses to forget. In the rest of the paper, only the following policy will be used.

F_minisat^{c0,q,p} — When using this policy (introduced in the solver MiniSAT [ES04]), forgetting is done when the current number of clauses in the clause database reaches a threshold value. Threshold values form a geometric sequence whose initial value is c0 percent of the number of clauses in the initial formula and whose quotient is q. It is interesting that the threshold value is not updated after the forgetting itself; instead, it is updated on each restart.

Another important aspect of a forget policy is which clauses to forget. The MiniSAT policy forgets p percent of the learnt clauses that are allowed to be forgotten. Clauses are assigned activity scores which reflect their participation in conflicts, very similarly to the variable activity scores used in the MiniSAT variable selection policy. The clauses with the smallest activity scores are the ones that are forgotten.
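The threshold growth and the activity-based deletion step can be sketched as follows (an illustrative Python sketch; function names and data shapes are our assumptions, not the MiniSAT API):

```python
def forget_thresholds(c0_percent, q, n_initial_clauses, n):
    """First n forget thresholds of F_minisat^{c0,q,p}: a geometric
    sequence starting at c0 percent of the initial clause count,
    with quotient q (advanced on each restart, not on each forget)."""
    t = n_initial_clauses * c0_percent / 100.0
    out = []
    for _ in range(n):
        out.append(t)
        t *= q
    return out

def forget(learnt, activity, p):
    """Drop the p percent of forgettable learnt clauses with the
    lowest activity scores; keep the rest in their original order."""
    by_activity = sorted(learnt, key=lambda c: activity[c])
    n_forget = len(learnt) * p // 100
    keep = set(by_activity[n_forget:])
    return [c for c in learnt if c in keep]
```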

**3 Methodology**

Our methodology, when applied to a given SAT solver, selects an appropriate solving strategy (i.e., a combination of policies) for each input formula. The proposed methodology can be applied to any DPLL-based SAT solver, provided it supports multiple policies. In our experiments, the ArgoSAT solver² was used, since it implements a large number of policies and since its modular architecture allows easy modification of existing policies and implementation of new ones [Mar09].

The overall methodology consists of two phases.

Training phase. This phase consists of systematically solving all formulae from a representative corpus using all candidate solving strategies. This allows selecting the best solving strategy (the one that solves the most formulae) for each family of formulae from the corpus. Profiles of all formulae from the corpus (i.e., their representations suitable for classification) are also computed in this phase.

Exploitation phase. In this phase, the family of a given formula is guessed and the strategy that showed the best results on that family during the training phase is used for solving it. After the training, the system implementing the methodology can be applied both to the formulae from the training corpus and to other formulae.

² http://argo.matf.bg.ac.yu/software/
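The exploitation phase can be sketched as a nearest-neighbour lookup over instance profiles. All names and the distance function below are illustrative assumptions; the actual classifier and profile distance used by the methodology are defined later in this section.

```python
def select_strategy(profile, training_profiles, family_best, distance):
    """Guess the family of a formula as the family of its nearest
    training profile, then return the strategy that performed best
    on that family during the training phase."""
    nearest_family = min(
        training_profiles,
        key=lambda fam: min(distance(profile, p)
                            for p in training_profiles[fam]))
    return family_best[nearest_family]
```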

Several issues still need to be addressed, as discussed below.

The choice of candidate solving strategies. In our case, candidate strategies are defined as all (60 = 3 · 5 · 4 · 1) possible combinations of the given policies. They are listed in Table 1.


**4 Experiments and Evaluation**

Experiments described in this section test the hypotheses that our approach relies on (given in Sect. 1), and also demonstrate the good overall quality of our methodology.