# «System Fi a Higher-Order Polymorphic λ-Calculus with Erasable Term-Indices Ki Yung Ahn1, Tim Sheard1, Marcelo Fiore2, and Andrew M. Pitts2 ...»

System Fi

a Higher-Order Polymorphic λ-Calculus

with Erasable Term-Indices

Ki Yung Ahn1, Tim Sheard1, Marcelo Fiore2, and Andrew M. Pitts2

Portland State University, Portland, Oregon, USA

{kya,sheard}@cs.pdx.edu

University of Cambridge, Cambridge, UK

{Marcelo.Fiore,Andrew.Pitts}@cl.cam.ac.uk

Abstract. We introduce a foundational lambda calculus, System Fi, for

studying programming languages with term-indexed datatypes – higherkinded datatypes whose indices range over data such as natural numbers or lists. System Fi is an extension of System Fω that introduces the minimal features needed to support term-indexing. We show that System Fi provides a theory for analysing programs with term-indexed types and also argue that it constitutes a basis for the design of logicallysound light-weight dependent programming languages. We establish erasure properties of Fi -types that capture the idea that term-indices are discardable in that they are irrelevant for computation. Index erasure projects typing in System Fi to typing in System Fω. So, System Fi inherits strong normalization and logical consistency from System Fω.

Keywords: term-indexed data types, generalized algebraic data types, higher-order polymorphism, type-constructor polymorphism, higher-kinded types, impredicative encoding, strong normalization, logical consistency 1 Introduction We are interested in the use of indexed types to state and maintain program properties. A type parameter (like Int in (List Int)) usually tells us something about data stored in values of that type. A type-index (like 3 in (Vector Int 3)) states an inductive property of values with that type. For example, values of type (Vector Int 3) have three elements.

Indexed types come in two ﬂavors: type-indexed and term-indexed types.

An example of type-indexing is a deﬁnition of a representation type [8] using

**GADTs in Haskell:**

data TypeRep t where RepInt :: TypeRep Int RepBool :: TypeRep Bool RepPair :: TypeRep a - TypeRep b - TypeRep (a, b ) supported by NSF grant 0910500.

Here, a value of type (TypeRep t) is isomorphic in shape with the type-index t.

For example, (RepPair RepInt RepBool) :: TypeRep (Int,Bool).

An example of Term-indices are datatypes with indices ranging over data structures, such as natural numbers (like Z, (S Z)) or lists (like Nil or (Cons Z Nil)). A classic example of a term-index is the second parameter to the lengthindexed list type Vec (as in (Vec Int (S Z))).

In languages such as Haskell3 or OCaml [10], which support GADTs with only type-indexing, term-indices are simulated (or faked) by reﬂecting data at the type-level with uninhabited type constructors. For example, data S n data Z data Vec t n where Cons :: a - Vec a n - Vec a ( S n ) Nil :: Vec a Z This simulation comes with a number of problems. First, there is no way to say that types such as (S Int) are ill-formed, and second the costs associated with duplicating the constructors of data to be used as term-indices. Nevertheless, GADTs with “faked” term-indices have become extremely popular as a lightweight, type-based mechanism to raise the conﬁdence of users that software systems maintain important properties.

Our approach in this direction is to design a new foundational calculus, System Fi, for functional programming languages with term-indexed datatypes. In a nutshell, System Fi is obtained by minimally extending System Fω with typeindexed kinds. Notably, this yields a logical calculus that is expressive enough to embed non-dependent term-indexed datatypes and their eliminators. Our contributions in this development are as follows.

– Identifying the features that are needed in a higher-order polymorphic λcalculus to embed term-indexed datatypes (Sect. 2), in isolation from other features normally associated with such calculi (e.g., general recursion, large elimination, dependent types).

– The design of the calculus, System Fi (Sect. 4), and its use to study properties of languages with term-indexed datatypes, including the embedding of term-indexed datatypes into the calculus (Sect. 6) using Church or Mendler style encodings, and proofs about these encodings. For instance, one can use System Fi to prove that the Mendler-style eliminators for GADTs [3] are normalizing.

– Showing that System Fi enjoys a simple erasure property (Sect. 5.2) and inherits meta-theoretic results, strong normalization and logical consistency, from Fω (Sect. 5.3).

2 Motivation: from System Fω to System Fi, and back It is well known that datatypes can be embedded into polymorphic lambda calculi by means of functional encodings [5].

see Sect. 7 for a very recent GHC extension, which enable true term-indices.

**In System F, one can embed regular datatypes, like homogeneous lists:**

Haskell: data List a = Cons a (List a) | Nil ∀X.(A → X → X) → X → X System F: List A Cons λw.λx.λy.λz. y w (x y z), Nil λy.λz.z In such regular datatypes, constructors have algebraic structure that directly translates into polymorphic operations on

**Abstract**

types as encapsulated by universal quantiﬁcation over types (of kind ∗).

In the more expressive System Fω (where one can abstract over type constructors of any kind), one can encode more general type-indexed datatypes that go beyond the regular datatypes. For example, one can embed powerlists with heterogeneous elements in which an element of type a is followed by an element

**of the product type (a,a):**

Haskell: data Powl a = PCons a (Powl(a,a)) | PNil

-- PCons 1 (PCons (2,3) (PCons ((3,4),(1,2)) PNil)) :: Powl Int λA∗.∀X ∗→∗.(A → X(A × A) → XA) → XA → XA System Fω : Powl Note the non-regular occurrence (Powl(a,a)) in the deﬁnition of (Powl a), and the use of universal quantiﬁcation over higher-order kinds (∀X ∗→∗ ). The term encodings for PCons and PNil are exactly the same as the term encodings for Cons and Nil, but have diﬀerent types.

What about term-indexed datatypes? What extensions to System Fω are needed to embed term-indices as well as type-indices? Our answer is System Fi.

In a functional language supporting term-indexed datatypes, we envisage that the classic example of homogeneous length-indexed lists would be deﬁned

**along the following lines (in Nax4 -like syntax):**

data Nat = S Nat | Z data Vec : * - Nat - * where VCons : a - Vec a { i } - Vec a { S i } VNil : Vec a { Z } Here the type constructor Vec is deﬁned to admit parameterisation by both type and term-indices. For instance, the type (Vec (List Nat) {S (S Z)}) is that of two-dimensional vectors of natural numbers. By design, our syntax directly reﬂects the diﬀerence between type and term-indexing by enclosing the latter in curly braces. We also make this distinction in System Fi, where it is useful within the type system to guarantee the static nature of term-indexing.

**The encoding of the vector datatype in System Fi is as follows:**

λA∗.λiNat.∀X Nat→∗.(∀j Nat.A → X{j} → X{S j}) → X{Z} → X{i} Vec where Nat, Z, and S respectively encode the natural number type and its two constructors, zero and successor. Again, the term encodings for VCons and VNil are exactly the same as the encodings for Cons and Nil, but have diﬀerent types.

Without going into the details of the formalism, which are given in the next section, one sees that such a calculus incorporating term-indexing structure needs four additional constructs (see Fig. 1 for the highlighted extended syntax).

We are developing a language called Nax whose theory is based on System Fi.

1. Type-indexed kinding (A → κ), as in (Nat→*) in the example above, where the compile-time nature of term-indexing will be reﬂected in the typing rules, enforcing that A be a closed type (rule (Ri) in Fig. 2).

2. Term-index abstraction λiA.F (as λiNat. · · · in the example above) for constructing (or introducing) term-indexed kinds (rule (λi) in Fig. 2).

3. Term-index application F {s} (as X{Z}, X{j}, and X{S j} in the example above) for destructing (or eliminating) term-indexed kinds, where the compile-time nature of indexing will be reﬂected in the typing rules, enforceing that the index be statically typed (rule (@i) in Fig. 2).

4. Term-index polymorphism ∀iA.B (as ∀j Nat. · · · in the example above) where the compile-time nature of polymorphic term-indexing will be reﬂected in the typing rules enforcing that the variable i be static of closed type A (rule (∀Ii) in Fig. 2).

As described above, System Fi maintains a clear-cut separation between typeindexing and term-indexing. This adds a level of abstraction to System Fω and yields types that in addition to parametric polymorphism also keep track of inductive invariants using term-indices. All term-index information can be erased, since it is only used at compile-time. It is possible to project any well-typed System Fi term into a well-typed System Fω term. For instance, the erasure of the Fi -type Vec is the Fω -type List. This is established in Sect. 5 and used to deduce the strong normalization of System Fi.

**3 Why Term-Indexed Calculi? (rather than dependent types)**

We claim that a moderate extension to the polymorphic calculus (Fω ) is a better candidate than a dependently typed calculus for the basis of a practical programming system. We hope to design a uniﬁed system for programming as well as reasoning. Language designs based on indexed types can beneﬁt from existing compiler technology and type inference algorithms for functional programming languages. In addition, theories for term-indexd datatypes are simpler than theories for full-ﬂedged dependent datatypes, because term-indexd datatypes can be encoded as functions (using Church-like encodings).

The implementation technology for functional programming languages based on polymorphic calculi is quite mature. The industrial strength Glasgow Haskell Compiler (GHC), whose intermediate core language is an extension of Fω, is used by thousands every day. Our term-indexed calculus Fi is closely related to Fω by an index-erasure property. The hope is that a language implementation based on Fi can beneﬁt from these technologies. We have built a language implementation of these ideas, which we call Nax.

Type inference algorithms for functional programming languages are often based on certain restrictions of the Curry-style polymorphic lambda calculi.

These restrictions are designed to avoid higher-order uniﬁcation during type inference. We have developed a conservative extension of Hindley–Milner type inference for Nax. This was possible because Nax is based on a restricted Fi. Dependently typed languages, on the other hand, are often based on bidirectional type checking, which requires annotations on top level deﬁnitions, rather than Hindley–Milner-style type inference.

In dependent type theories, datatypes are usually introduced as primitive constructs (with axioms), rather than as functional encodings (e.g., Church encodings). One can give functional encodings for datatypes in a dependent type theory, but one soon realizes that the induction principles (or, dependent eliminators) for those datatypes cannot be derived within the pure dependent calculi [11]. So, dependently typed reasoning systems support datatypes as primitives.

For instance, Coq is based on Calculus of Inductive Constructions, which extends Calculus of Constructions [7] with dependent datatypes and their induction principles.

In contrast, in polymorphic type theories, all imaginable datatypes within the calculi have functional encodings (e.g., Church encodings). For instance, Fω need not introduce datatypes as primitive constructs, since Fω can embed all these datatypes, including non-regular recursive datatypes with type indices.

Another reason to use Fi is to extend the application of Mendler-style recursion schemes, which are well-understood in the context of polymorphic lambda calculi like Fω. Researchers have thought about (though not published)5 Mendlerstyle primitive recursion over dependently-typed functions over positive datatypes (i.e., datatypes that have a map), but not for negative (or, mixed-variant) datatypes. In System Fi, we can embed Mendler-style recursion schemes, (just as we embedded them in Fω ) that are also well-deﬁned for negative datatypes.

4 System Fi System Fi is a higher-order polymorphic lambda calculus designed to extend System Fω by the inclusion of term-indices. The syntax and rules of System Fi are described in Figs. 1, 2 and 3. The extensions new to System Fi, which are not originally part of System Fω, are highlighted by grey boxes. Eliding all the grey boxes from Figs. 1, 2 and 3, one obtains a version of System Fω with Curry-style terms and the typing context separated into two parts (type-level context ∆ and term-level context Γ ).

We assume readers to be familiar with System Fω and focus on describing the new constructs of Fi, which appear in grey boxes.

Kinds (Fig. 1). The key extension to Fω is the addition of term-indexed arrow kinds of the form A → κ. This allows type constructors to have terms as indices.

The rest of the development of Fi ﬂows naturally from this single extension.

dependent kinds (i.e., kinds depending on type-level or value-level bindings).

That is, A should be a closed type of kind ∗, which does not contain any free type variables or index variables. For example, (List X → ∗) is not a well-sorted kind since X appears free, while ((∀X ∗. List X) → ∗) is a well-sorted kind.

Typing contexts (Fig. 1). Typing contexts are split into two parts. Type level contexts (∆) for type-level (static) bindings, and term-level contexts (Γ ) for term-level (dynamic) bindings. A new form of index variable binding (iA ) can appear in type-level contexts in addition to the traditional type variable bindings (X κ ). There is only one form of term-level binding (x : A) that appears in termlevel contexts. Note, both x and i represent the same syntactic category of “Type Variables”. The distinction between x and i is only a convention for the sake of readability.