# «Abstract. The general intractability of the constraint satisfaction prob- lem (CSP) has motivated the study of the complexity of restricted cases of ...»

Constraint Satisfaction with Succinctly Speciﬁed

Relations

Hubie Chen and Martin Grohe

1

Departament de Tecnologia

Universitat Pompeu Fabra

Barcelona, Spain

hubie.chen@upf.edu

2

Institut f¨r Informatik

u

Humboldt-Universit¨t Berlin

a

Unter den Linden 6, D-10099 Berlin, Germany

grohe@informatik.hu-berlin.de

Abstract. The general intractability of the constraint satisfaction prob-

lem (CSP) has motivated the study of the complexity of restricted cases

of this problem. Thus far, the literature has primarily considered the for- mulation of the CSP where constraint relations are given explicitly. We initiate the systematic study of CSP complexity with succinctly speciﬁed constraint relations.

1. Introduction Constraint satisfaction problems give a uniform framework for a large number of algorithmic problems in many diﬀerent areas of computer science, for example, artiﬁcial intelligence, database systems, or programming languages. While in- tractable in general, many restricted constraint satisfaction problems are known to be eﬃciently solvable. Considerable eﬀort went into analysing the precise con- ditions that lead to tractable problems; recent results include [2, 8, 3, 16, 7, 6, 1, 5, 17].

An instance of a constraint satisfaction problem (CSP) is a triple (V, D, C) consisting of a set V of variables, a domain D, and a set C of constraints. The objective is to ﬁnd an assignment to the variables, of values from D, such that all constraints in C are satisﬁed. The constraints are expressions of the form Rx1... xk, where R is a k-ary relation on D and x1,..., xk are variables. A constraint is satisﬁed if the k-tuple of values assigned to the variables x1,..., xk belongs to the relation R. As a running example for this introduction, let us view SAT, the satisﬁability problem for CNF-formulas, as a constraint satisfaction over the domain {0, 1}. Constraints are given by the clauses of the input formula.

For example, the clause (x∨¬y∨¬z) corresponds to a constraint Rxyz, where R is the ternary relation {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)} on the domain {0, 1}.

Two types of restriction on CSP-instances are commonly studied in the lit- erature: restrictions of the constraint language and structural restrictions. The Dagstuhl Seminar Proceedings 06401 Complexity of Constraints http://drops.dagstuhl.de/opus/volltexte/2006/802 former restrict the relations on the domain that are permitted in the constraints.

For example, HORN-SAT is the restriction of SAT where all constraint relations are speciﬁed by Horn clauses, that is, clauses with at most one negative literal.

It is known that HORN-SAT can be solved in polynomial time. SAT itself has a restricted constraint language where all constraint relations are speciﬁed by disjunctions of literals. The main open problem is the dichotomy conjecture by Feder and Vardi [11], which states that for each constraint language the restricted CSP is either in polynomial time or NP-complete. Currently, this problem still seems to be wide open.

Structural restrictions on CSP-instances are restrictions on the structure induced by the constraints on the variables. A well-known example is the restriction to instances of bounded tree-width. Here a graph on the variables is deﬁned by letting two variables be adjacent if they occur together in some constraint. It is known that for every k the restriction of the general CSP to instances where this graph has tree width at most k, is in polynomial time [10, 13]. The complexity of structural restrictions is better understood than that of constraint language restrictions. If the maximum arity of the constraint relations is bounded, a complete complexity theoretic classiﬁcation is known [16]; we will state it later in this paper. If the arity is unbounded, interesting classes of tractable problems are known [15, 14, 6, 1, 5, 17], but no complete complexity classiﬁcation is.

In this paper, we study both restrictions of the constraint language and structural restrictions. We focus on the case of constraint relations of unbounded arity. What is new here is that we pay attention to the way the constraint relations are speciﬁed in the problem instances.

In the complexity-theoretic investigations of constraint satisfaction problems it is usually assumed that the constraint relations occurring in a CSP-instance are speciﬁed simply by listing all the tuples in the relation, as we did above for the relation speciﬁed by the clause (x ∨ ¬y ∨ ¬z). We call this the explicit representation. In practice, the constraint relations are often represented implicitly.

For example, in SAT-instances, the clauses and not the relations they represent are given. Obviously, the implicit clausal representation is exponentially more succinct than the explicit representation, and this may aﬀect the complexity. As long as the arity of the constraint relations is bounded a priori, as in 3-SAT, it does not make much of diﬀerence, because the size of the explicit and any implicit representations diﬀer only in a polynomial factor in terms of the overall instance size. If the domain is ﬁxed, they even diﬀer only by a constant factor.

However, for CSPs of unbounded arity, it can make a big diﬀerence. What this means in the complexity theoretic context is that algorithms whose running time is polynomial in the size of the explicitly represented instances may become exponential if the instances are represented implicitly. In particular, this is the case for all recent algorithms that exploit a structural restriction called bounded hypertree width and related restrictions [6, 1, 5, 17]. Indeed, these algorithms have been criticised for precisely the reason that they are only polynomial relative to the explicit representation, which is perceived as unrealistic by some researchers.

While we do not share this criticism in general, we agree that there are many 2 examples of CSPs where implicit representations are more natural, such as SAT and systems of equalities or inequalities over some numerical domain. This paper initiates a systematic study of the complexity of CSPs with succinctly speciﬁed constraint relations.

Before we can state our main results, we have to get a bit more technical.

1.1. Succinctly speciﬁed constraint relations. How can we specify constraints implicitly, and how does this aﬀect the complexity of the CSPs? It will be convenient to consider the Boolean domain {0, 1} ﬁrst. An abstract implicit representation is to not specify the constraint relations at all, but just assume a membership oracle for each relation. That is, an algorithm may ask if a speciﬁc tuple of values belongs to the relation and get an answer in the next step. However, this may lead to CSPs being highly intractable just because their constraint relations are. Consider the family of CSP-instances In := ({v1,..., vn }, {0, 1}, {Rn v1... vn }). To solve such instances, the best an algorithm that knows nothing about Rn and only has access to a membership oracle can do is enumerate all tuples in {0, 1}n and query the oracle for each of them. Thus the running time will be exponential in the worst case, even though the instances In, having just one constraint, are very simple. This type of complexity is clearly not what we are interested in here. Therefore, specifying the constraint relations by membership oracles is “too implicit”; our implicit representation has to be a bit more explicit. A natural and somewhat generic representation of constraint relations over the Boolean domain is by Boolean circuits.

Now consider the family of instances IC := ({v1,..., vn }, {0, 1}, {RC v1... vn }), where RC is the n-ary relation speciﬁed by the Boolean circuit C with n inputs.

Again, this is a family of very simple instances with just one constraint. However, solving the instances in this family amounts to solving the Boolean satisﬁability problem, which is NP-complete. Therefore, it seems reasonable that an implicit representation has a tractable nonemptiness problem. (The nonemptiness problem for relations speciﬁed by circuits is the circuit satisﬁability problem.) This not only rules out the representation by arbitrary Boolean circuits, but actually the representation by every class of circuits that contains all CNF-formulas.

Thus, in some sense, the generic representation not ruled out by these considerations is the representation by DNF-formulas. This is the representation we shall study on this paper. Of course there are other natural representations that deserve further study, for example, the representation of the constraint relations by ordered binary decision diagrams, but we defer this to future work.

The DNF representation of constraint relations on the Boolean domain has a natural generalisation to arbitrary domains D: We say that a generalised DNF (GDNF) representation of a relation R ⊆ Dk is an expression of the form m (Pi1 × · · · × Pik ) () i=1 where Pij ⊆ D for 1 ≤ i ≤ m, 1 ≤ j ≤ k. Note that the GDNF enables us to represent relations of size Ω(Dk ) by expressions of size O(m · |D| · k). As the GDNF-representation is the only succinct representation that we study in this 3 paper, from now on we refer to CSPs with constraint relations represented in GDNF as succinctly represented CSPs (sCSPs).

Related previous work has studied restrictions on the SAT problem that lead to tractability [19], which in this discussion corresponds to the case where the domain is boolean and the constraint relations are simply given as a disjunction of literals. In contrast, this paper studies a more general representation and does not impose any size restrictions, other than ﬁniteness, on the domain.

1.2. Main Results. We study the complexity of both structural and constraint language restrictions of succinctly represented CSPs.

We give a complete complexity theoretic classiﬁcation for structural restriction, which generalises the classiﬁcation for the bounded arity case obtained in [16]. A structural restriction can be described by a class A of relational structures; we denote the corresponding restricted succinctly represented CSP by sCSP(A, −). We prove that, under the complexity theoretic assumption FPT = W[1], that sCSP(A, −) is in polynomial time if and only if the structures in A are homomorphically equivalent to structures whose incidence graph has bounded tree width (Theorem 8). We refer to this condition by saying that the class A has bounded incidence width modulo homomorphic equivalence.

Constraint language restrictions can also be described by a class B of structures, and we denote them by sCSP(−, B). We prove that two general tractability results can be generalised from the explicitly represented to succinctly represented CSPs. These results are formulated in the algebraic language of polymorphisms of the constraint language. We prove that sCSP(−, B) is in polynomial time if B is a class of relational structures having a near unanimity polymorphism (Theorem 12), or if B is a class of relational structures invariant under a set function (Theorem 13); the corresponding results for explicitly represented constraint relations are from [18, 9].

2. Preliminaries, Deﬁnitions, and Basic Facts We use [n] to denote the set containing the ﬁrst n positive integers, {1,..., n}.

2.1. Relational structures and homomorphisms. As observed by Feder and Vardi [11], constraint satisfaction problems may be viewed as homomorphism problems for relational structures. For the rest of this paper, it will be convenient for us to take this point of view. We review the relevant deﬁnitions.

A relational signature is a ﬁnite set of relation symbols, each of which has an associated arity. A relational structure A (over signature σ, for short: σ-structure) consists of a universe A and a relation RA over A for each relation symbol R (of σ), such that the arity of RA matches the arity associated to R. When A is a σ-structure and R ∈ σ, the elements of RA are called A-tuples. Throughout this paper, we assume that all relational structures under discussion are ﬁnite, that is, have a ﬁnite universe. We use boldface letters A, B,... to denote relational structures, and the corresponding non-boldface letters A, B,... to denote their universes. The arity of a vocabulary σ is the maximum of the arities of the relation symbols in σ, and the arity of a relational structure is the arity of its 4 vocabulary. A class A of relational structures has bounded arity if there is a k such that every structure in A has arity at most k.

A substructure of a relational structure A is a relational structure B over the same signature σ as A where B ⊆ A and RB ⊆ RA for all R ∈ σ. A homomorphism from a relational structure A to another relational structure B is a mapping h from the universe of A to the universe of B such that for every relation symbol R and every tuple (a1,..., ak ) ∈ RA, it holds that (h(a1 ),..., h(ak )) ∈ RB. (Here, k denotes the arity of R.)

2.2. Explicitly and Succinctly Represented Constraint Satisfaction Problems. With each CSP-instance I = (V, D, C) we associate two relational structures AI and BI as follows: The signature σI of AI and BI consists of a k-ary relation symbol R for each k-ary constraint relation RI ⊆ Dk of I. The universe of BI is D, and for each relation symbol R ∈ σI we let RB = RI. The universe of AI is V, for each k-ary relation symbol R ∈ σI we let RA = {(x1,..., xk ) | Rx1... xk ∈ C}. Then a mapping f from V = AI to D = BI is a satisfying assignment for I if and only if it is a homomorphism from AI to BI. Thus instance I is satisﬁable if and only if there is a homomorphism from AI to BI. Conversely, with every pair (A, B) of σ-structures we can associate a CSP-instance I such that A = AI and B = BI. From now on, we will view CSP-instances as pairs (A, B) of relational structures of the same signature. For succinctly represented instances, the relations of the structure B are represented in GDNF.