# «On Continuous Nondeterminism and State Minimality Jiˇ´ Ad´mek, Robert S. R. Myers, Henning Urbat rı a Institut f¨r Theoretische Informatik u ...»

MFPS 2014

On Continuous Nondeterminism and

State Minimality

Jiˇ´ Ad´mek, Robert S. R. Myers, Henning Urbat

rı a

Institut f¨r Theoretische Informatik

u

Technische Universit¨t Braunschweig

a

Germany

Stefan Milius

Lehrstuhl f¨r Theoretische Informatik

u

Friedrich-Alexander-Universit¨t Erlangen-N¨rnberg

a u

Germany

Abstract

This paper is devoted to the study of nondeterministic closure automata, that is, nondeterministic ﬁnite

automata (nfas) equipped with a strict closure operator on the set of states and continuous transition structure. We prove that for each regular language L there is a unique minimal nondeterministic closure automaton whose underlying nfa accepts L. Here minimality means no proper sub or quotient automata exist, just as it does in the case of minimal dfas. Moreover, in the important case where the closure operator of this machine is topological, its underlying nfa is shown to be state-minimal. The basis of these results is an equivalence between the categories of ﬁnite semilattices and ﬁnite strict closure spaces.

Keywords: Canonical Nondeterministic Automata, State Minimality, Closure Spaces, Semilattices 1 Introduction Why are state-minimal deterministic ﬁnite automata (dfas) easy to construct, whilst no eﬃcient minimization procedure for nondeterministic ﬁnite automata (nfas) is known? Let us start with the observation that minimal dfas are built inside the category Setf of ﬁnite sets and functions and are characterized by having no proper subautomata (reachability) and no proper quotient automata (simplicity). Nfas can be regarded as dfas interpreted in the category Relf of ﬁnite sets and relations, and so one might hope to build minimal nfas in the same way as minimal dfas, but now in Relf. However, there is a signiﬁcant diﬀerence: Setf is both ﬁnitely complete and cocomplete, yet Relf does not have coequalizers, i.e., canonical quotients. The lack of such canonical constructions provides evidence for the lack of canonical state- minimal nfas.

This suggests the following strategy: form the cocompletion of Relf obtained by freely adding canonical quotients, which turns out to be the category JSLf of ﬁnite This paper is electronically published in Electronic Notes in Theoretical Computer Science URL: ww

join-semilattices (see Appendix), and build minimal automata in this larger category. Every nfa may be viewed as a dfa in JSLf via the usual subset construction.

In order to obtain more eﬃcient presentations of nfas, avoiding the full power set of states, we make use of a categorical equivalence between JSLf and the category Clf of ﬁnite strict closure spaces [2]. The objects of the latter are ﬁnite sets Z equipped with a strict closure operator (i.e., an extensive, monotone and idempotent map clZ : PZ → PZ preserving the empty set), and the morphisms are continuous relations, see Deﬁnitions 2.9 and 2.10 below. For example, every ﬁnite topological space induces a ﬁnite strict closure space; these closures are called topological.

Just as nfas may be viewed as deterministic automata interpreted in Relf, nondeterministic closure automata (ncas) are deterministic automata interpreted in Clf : an nca is an nfa with a strict closure operator on its set of states, continuous transition relations, an open set of ﬁnal states and a closed set of initial states.

Since the category Clf has the same relevant properties as Setf, we derive for each regular language L ⊆ Σ∗ the existence of a unique minimal nca N (L) whose underlying nfa (forgetting the closure operator) accepts L. It is minimal in the sense that it has no proper subautomata (reachability) and no proper quotient automata (simplicity), and can be constructed in a way very much analogous to Brzozowski’s classical construction of the minimal dfa [6]: starting with any nca N accepting L, one has N (L) = reach ◦ rev ◦ reach ◦ rev(N ) where reach and rev are continuous versions of the reachable subset construction and the reversal operation for nfas, respectively.

The states of N (L) are the prime derivatives of L, i.e., those non-empty derivatives w−1 L = {v ∈ Σ∗ : wv ∈ L} of L that do not arise as a union of other derivatives. The underlying nfa of N (L) accepts L, thus it is natural to ask when

**this nfa is state-minimal. Our main result is:**

If the closure of N (L) is topological then the underlying nfa is state-minimal.

In other words, we identify a natural class of regular languages for which canonical state-minimal nondeterministic acceptors exist.

Related Work. Our paper is inspired by the work of Denis, Lemay and Terlutte [7] who deﬁne a canonical nondeterministic acceptor for each regular language L. In fact, the underlying nfa of our minimal nca N (L) is precisely their ‘canonical residual ﬁnite state automaton’, and our Brzozowski construction of N (L) in Section 3 generalizes their construction in [7, Theorem 5.2]. The main conceptual diﬀerence is that the latter works on the level of nfas, while our construction takes the continuous structure of nondeterministic closure automata into account. We hope to convince the reader that ncas provide the proper setting in which to study these canonical nfas and their construction.

We have introduced nondeterministic closure automata in [2] where we demonstrated that ncas – as well as related machines like the ´tomaton of Brzozowski and a Tamm [5] – are instances of a uniform coalgebraic construction. There we also gave various simple criteria for nondeterministic state minimality. The present paper can be understood as an in-depth study of ncas, extending the results from [2] and [7] 2 ´ Adamek, Milius, Myers, Urbat in two ways: ﬁrstly, we provide a richer and more conceptual way of constructing the minimal nca N (L) (compared to [7]) by working with closures. Secondly, we prove that the underlying nfa of N (L) is state-minimal provided that N (L) has topological closure, thereby generalizing a much weaker criterion from [2].

2 From Deterministic JSL-Automata to Nondeterministic Closure Automata In this section we consider deterministic automata interpreted in the category of join-semilattices, and explain how they induce nondeterministic closure automata.

We shall assume familiarity with basic concepts from category theory (categories, functors, duality and equivalence).

Notation 2.1 Throughout this paper we ﬁx an alphabet Σ. The composition of relations R ⊆ A×B and S ⊆ B×C is S◦R = {(a, c) : ∃b ∈ B.(a, b) ∈ R∧(b, c) ∈ S}.

Moreover R[A ] = {b ∈ B : ∃a ∈ A.(a, b) ∈ R} denotes the R-image of a subset A ⊆ A, and we write R[a] instead of R[{a}].

Let us ﬁrst recall deterministic and nondeterministic ﬁnite automata and provide them with suitable morphisms.

Deﬁnition 2.2 (1) A nondeterministic ﬁnite automaton (nfa) N = (Z, Ra, F ) consists of a ﬁnite set Z of states, transition relations Ra ⊆ Z × Z for every a ∈ Σ, and a set F ⊆ Z of ﬁnal states. A pointed nfa (N, I) is additionally equipped with a set of initial states I ⊆ Z.

(2) Nfas form a category Nfa whose morphisms B : (Z, Ra, F ) → (Z,, Ra, F ) are relations B ⊆ Z × Z that preserve and reﬂect transitions (i.e. Ra ◦ B = B ◦ Ra ) and moreover z ∈ F iﬀ B[z] ∩ F = ∅. Likewise we have the category Nfa∗ of pointed nfas, whose morphisms B : (N, I) → (N, I ) are additionally required to satisfy B[I] = I.

** Remark 2.3 Our choice of Nfa-morphisms B is sound: for each z ∈ Z the pointed nfas (Z, Ra, F, {z}) and (Z, Ra, F, B[z]) accept the same language.**

Deﬁnition 2.4 (1) A deterministic ﬁnite automaton (dfa) is an nfa D = (Q, δa, F ) whose transition relations δa : Q → Q are functions. A pointed dfa (D, q0 ) is a dfa equipped with a single initial state q0 ∈ Q. Morphisms of (pointed) dfas are precisely the Nfa- (resp. Nfa∗ -)morphisms that are functions.

(2) A deterministic automaton (da) is deﬁned in analogy to to (1), except that the set of states is not required to be ﬁnite.

We are mainly interested in d(f)as carrying a semilattice structure.

Notation 2.5 Let JSL denote the category of (join-)semilattices with a bottom element ⊥, whose morphisms are ⊥-preserving semilattice homomorphisms. JSLf is the full subcategory of ﬁnite semilattices.

One can view the ﬁnal states of a da (Q, δa, F ) as a predicate f : Q → {0, 1} with F = f −1 ({1}). This suggests the following deﬁnition: a JSL-da is a da whose state set Q carries a semilattice structure, such that the transitions δa : Q → Q and 3 ´ Adamek, Milius, Myers, Urbat the ﬁnal state predicate f : Q → 2 are semilattice morphisms. Here 2 denotes the 2-chain 0 1. Note that to give a morphism f : Q → 2 means precisely to give a prime ﬁlter F ⊆ Q, i.e., ⊥Q ∈ F and q ∨Q q ∈ F iﬀ q ∈ F or q ∈ F. Indeed, / given f, the set F = f −1 ({1}) is a prime ﬁlter, and conversely every prime ﬁlter of

**Q arises in this way. Therefore:**

Deﬁnition 2.6 (a) A JSL-da is a triple D = (Q, δa, F ) where Q is a semilattice of states, the a-transitions δa : Q → Q are semilattice homomorphisms for a ∈ Σ, and the ﬁnal states F ⊆ Q form a prime ﬁlter. Given another JSLda D = (Q, δa, F ), a morphism h : D → D is a semilattice homomorphism h : Q → Q such that δa ◦ h = h ◦ δa and q ∈ F iﬀ h(q) ∈ F. We denote by Jda the category of all JSL-das, and by Jdfa the full subcategory of JSL-dfas, that is, JSL-das with a ﬁnite set of states.

(b) A pointed JSL-da (D, q0 ) is a JSL-da D = (Q, δa, F ) with an initial state q0 ∈ Q.

Morphisms of pointed JSL-das must additionally preserve the initial state. Jda∗ denotes the category of pointed JSL-das, and Jdfa∗ the full subcategory of pointed JSL-dfas.

(c) The language accepted by a pointed JSL-da (D, q0 ) is the language accepted by its underlying da, that is, the set

where δw : Q → Q is the usual inductive extension of the transition function given by δε = idQ and δwa = δa ◦ δw.

** Example 2.7 (1) Take any nfa N = (Z, Ra, F ).**

The usual determinization via the subset construction is a JSL-dfa. Indeed, the states PZ form a semilattice w.r.t. union, the transitions preserve binary unions and the empty set, and the ﬁnal states {A ⊆ Z : A ∩ F = ∅} form a prime ﬁlter.

(2) Let PΣ∗ be the semilattice (w.r.t. union) of all languages over Σ. It carries the structure of a JSL-da whose transitions are given by L → a−1 L (a ∈ Σ) and whose ﬁnal states F are precisely the languages containing the empty word.

This automaton DPΣ∗ = (PΣ∗, a−1 (−), F ) is easily seen to be the ﬁnal JSLda: every JSL-da has a unique Jda-morphism into DPΣ∗, namely the function mapping each state to the language it accepts.

(3) Let DReg(Σ) be the subautomaton of DPΣ∗ whose states are the regular languages over Σ. It can be characterized up to isomorphism by the property that every JSL-dfa has a unique Jda-morphism into DReg(Σ).

** Remark 2.8 Readers familiar with the theory of coalgebras will immediately notice that JSL-das correspond to coalgebras for the functor T = 2×IdΣ : JSL → JSL.**

The examples (2) and (3) above then state precisely that DPΣ∗ is the ﬁnal T -coalgebra and DReg(Σ) is the ﬁnal locally ﬁnite T -coalgebra, see [9].

In [2] it was proved that the category JSLf of ﬁnite semilattices is equivalent to the category of ﬁnite strict closure spaces. We recall the necessary concepts.

A closure space (Z, clZ ) is a set with a closure deﬁned on it. It is ﬁnite if Z is ﬁnite and strict if clZ (∅) = ∅. A subset A ⊆ Z is closed if clZ (A) = A and open if ¯ its complement A ⊆ Z is closed. We write clZ (z) for clZ ({z}).

Deﬁnition 2.10 Let Z1 and Z2 be ﬁnite strict closure spaces. Then a relation

**B ⊆ Z1 × Z2 is said to be continuous if:**

(i) For each z ∈ Z1, the image B[z] ⊆ Z2 is closed in Z2.

(ii) B[clZ1 (A)] ⊆ clZ2 (B[A]) for all subsets A ⊆ Z1.

Given two continuous relations B1 : Z1 → Z2 and B2 : Z2 → Z3 we deﬁne their

**continuous composition as follows:**

** B2 • B1 = {(z1, z3 ) ∈ Z1 × Z3 : z3 ∈ clZ3 (B2 ◦ B1 [z1 ])} : Z1 → Z3.**

That is, one forms the usual relational composition and takes the closure in Z3. The continuous identity is deﬁned by idZ = {(z, z ) ∈ Z × Z : z ∈ clZ (z)}.

Deﬁnition 2.11 Let Clf denote the category of ﬁnite strict closure spaces and continuous relations, with continuous composition and identities.

** Remark 2.12 Strict closure spaces can be regarded as generalized topological spaces.**

Indeed, every topological space Z induces a strict closure space (Z, clZ ) where clZ is the usual topological closure operator. It preserves ﬁnite unions, i.e., clZ (A ∪ B) = clZ (A) ∪ clZ (B) for all A, B ⊆ Z. (∗) Conversely, every strict closure space Z satisfying (∗) arises from a topological space.

Moreover, if B : Z1 → Z2 is a function between topological spaces and Z2 is a T1 space, then B is continuous in the sense of topology iﬀ it is continuous in the sense of Deﬁnition 2.10.

Deﬁnition 2.13 A closure clZ satisfying (∗) is called topological.

Deﬁnition 2.14 Let Q be a ﬁnite semilattice, so that it is a lattice with meet

** Lemma 2.15 (see [2]) The categories of ﬁnite semilattices and ﬁnite strict closure spaces are equivalent.**

Indeed, the following functor G : JSLf → Clf is an

**equivalence:**

for any semilattice homomorphism f : Q → Q.

** Remark 2.16 The associated equivalence C : Clf → JSLf maps a ﬁnite strict closure space Z to the semilattice CZ of all closed subsets of Z w.**

r.t. inclusion order. Its bottom is ∅ and it has joins A ∨CZ B = clZ (A ∪ B) (the meet being intersection). A continuous relation B : Z1 → Z2 is mapped to