# The LRSX Unification Tool

Synopsis
The LRSX Unification Tool implements a new algorithm for unification of meta-expression in lambda calculi with recursive and shared bindings (i.e. so-called letrec-expressions).

The underlying meta-language and the unification algorithm are designed to capture usual call-by-need lambda-calculi included their reduction strategies. Besides the core utility to unify expressions of the meta-language, the tool can overlap reduction rules with program transformations on a given calculus description as input, and compute the corresponding critical pairs,

The LRSX Unification Tool is implemented in Haskell and packaged via Cabal and comes with a 3-clause BSD-like LICENSE.

• The source-distribution is available here: lrsx-0.3.0.4.tar.gz
• To build lrsx unzip the archive, then run cabal, i.e.:

``` tar -xzf lrsx-0.3.0.4.tar.gz cd lrsx-0.3.0.4/ cabal configure cabal build ```
This will create the executable `lrsx` in `dist/build/lrsx/`, e.g. type

``` dist/build/lrsx/lrsx help ```

to see the help-text of the LRSX Unification tool.
• The executable `lrsx` can be installed via typing `cabal install` If this completes successfully you will have the lrsx-binary in `~/.cabal/bin`.
Documentation

Using LSRX

The LRSX Unification Tool prints several help-texts by calling `lrsx help` or more command specific help by typing `lrsx command help`.

LRSX mainly supports two kinds of input files:

• Unification equations with constraints:
For those input files call `lrsx unify `file, where file contains the input. LRSX will unify the input and print the output to stdout.
• Program calculus descriptions and overlap commands:
• `lrsx overlap `file computes the critical pairs, requested by the commands in file.
Both formats share a common syntax for expressions and constraints as described below.

Unification equations with constraints

The allowed syntax can be represented by the following grammar in ENBF, where nonterminals are in bold font and the side condition that in derivations from EXPRESSION the symbol `HOLE` does not occur, except for deriviation starting with CONTEXT where the symbol `HOLE` occurs exactly once.

 CONSTRAINED_EQUATIONS ::= EQUATIONS | EQUATIONS `where` CONSTRAINTS EQUATIONS ::= EQUATION | EQUATION `,`EQUATIONS EQUATION ::= EXPRESSION `=?=` EXPRESSION EXPRESSION ::= `letrec` ENVIRONMENT in EXPRESSION | `letrec` `{` ENVIRONMENT `}` `in` EXPRESSION | SEXPRESSION SEXPRESSION ::= FUNSYMWITHINFO BINDEXPR1...BINDEXPRN | ATOM FUNSYMWITHINFO ::= FUNSYM | FUNSYM`[`STRICTPOSITIONS`]` STRICTPOSITIONS ::= INTEGER,...,INTEGER BINDEXPR ::= ATOM | VARIABLE`.`BINDEXPR | `*`VARIABLE | `(*`VARIABLE`)` ATOM ::= `var` VARIABLE | `(` EXPRESSION `)` | METAVARIABLE | CONTEXTVAR `[` EXPRESSION `]` | `HOLE` ENVIRONMENT ::= ENVITEM | ENVITEM `;` ENVIRONMENT ENVITEM ::= METAVARIABLE | METAVARIABLE `[` VARIABLE `,` EXPRESSION `]` | METAVARIABLE `|` VARIABLE `,` VARIABLE `|` | VARIABLE `=` EXPRESSION VARIABLE ::= CONCRETEVAR | METAVARIABLE CONCRETEVAR ::= (`a`|...|`z`)(`a`|...|`z`|`A`|...|`Z`|`0`|...|`9`)* FUNSYM ::= CONCRETEVAR METAVARIABLE ::= (`B`|`D`|...|`S`|`U`|...|`Z`)(`a`|...|`z`|`A`|...|`Z`|`0`|...|`9`)* CONTEXTVAR ::= ACONTEXT | TCONTEXT | CCONTEXT ACONTEXT ::= `A`(`a`|...|`z`|`A`|...|`Z`|`0`|...|`9`)* TCONTEXT ::= `T`(`a`|...|`z`|`A`|...|`Z`|`0`|...|`9`)* CCONTEXT ::= `C`(`a`|...|`z`|`A`|...|`Z`|`0`|...|`9`)* CONSTRAINTS ::= CONSTRAINT | CONSTRAINT `,` CONSTRAINTS CONSTRAINT ::= METAVAR `/= {}` | CONTEXTVAR `/= []` | `(` EXPRESSION `,` CONTEXT `)` | `(` ENVIRONMENT `,` CONTEXT `)` CONTEXT ::= EXPRESSION

An informal explanation of the syntax is

• A valid input is a sequence of equations where equations are separated by `,` and after the equations a sequence of constraints may occur which is introduced by the keyword `where`
• Equations are written with `=?=` as equation symbols, and both sides of one equation are LRSX-expressions.
• LRSX-expressions are meta-expressions which comprise concrete syntax and meta-variables, symbols beginning with upper-case letters are unification variables: unification variables beginning with
• `A` are interpreted as application context variables
• `T` are interpreted as top context variables
• `C` are interpreted as arbitrary context variables
Other unification variables are
• expression variables (usually denoted by `S`)
• lambda- or let-variable variables (usually denoted by `X` or `Y`)
• letrec-environment variables (usually denoted by `E`)
• environment chain variables with special syntax
• Varname`[y,s]` represents a (non-empty) application context chain of the form `y=A[y1];y2=A[y2];...;yn=A[s]` (usually denoted by `EE[y,s]`)
• Varname`|y1,yn|` represents a (non-empty) variable-to-variable chain of the form `y1 = (var y2);y2=(var y3);...;yn-1=(var yn)` (usually denoted by `VV|y1,yn|`)
The syntax of expressions includes:
• the special function symbol `var` to lift let- and lambda-variables to expressions,
• other function symbols are recognized by the input, where function symbols must begin with a lower case letter. The algorithm assumes the first argument of a function symbol to be strict, while all other positions are nonstrict. An example is `(app S1 S2)` which introduces the function symbol `app` with arity 2 However, strictness can be added by providing a list of strict positions, i.e. `app[1,2]` defined `app` to be strict in both arguments.
• Higher-order expressions are written as `x1.x2....xn.s`, i.e. lambda abstraction is a function symbol applied to a higher-order expression with on binder, which can be written as `\(var.body)` but also without parentheses as `\var.body`.
• Variable-positions are denoted by the special symbol `*`, e.g. in `app S *X` the function symbol `app` only allows variables as second argument.
• `letrec` Environment `in` body where body is the in-expression and Environment is a meta-letrec-environment: environments may be enclosed by `{` and `}`. Parts of the environment are separated by `;` and allowed parts are
• a meta-variable for an environment (usually denoted by `E`)
• an environment chain (see above, usually denoted by `EE[y,s]` or `VV|y1,yn|`)
• single bindings, written as `var = expr`

Expressions must fulfill the weak distinct variable convention which requires that names of bound variables are different from names of free variables, However, it is allowed that bound (let- or lambda-) variables occur several times, where it is forbidden, that the same let-variable is bound twice in the same letrec-environment.

The following constraints can be added to the input equations (separated by `,`):
• non-emptiness constraints for contexts written as `CtxtName /= []` with the semantics that the context represented by `CtxtName` must not be the empty context.
• non-emptiness constraints for environment variables written as `EnvName /= {}` with the the semantics that the part of the environment represented by `EnvName` is not an empty environment
• non-capture constraints written as `(expr,ctxt)` or `[env,ctxt]` with the meaning that the free variables of expression `expr` (environment `env`, resp.) is not captured by the hole of the context `ctxt`. The syntax for contexts is the same as for expressions, where additionally the terminal `HOLE` can (and must be) used to denote the context hole. An example is `(var Y,\X.HOLE)` which ensures that the variable `Y` may not be instantiated with the same variable as `X` is instantiated, since in this case `Y` is captured by the hole of `\X.HOLE`.

Program calculus descriptions and overlap commands

The grammar for the syntax of program calculus descriptions and commands extends the above grammar as follows, where the start symbol is RC

 RC ::= PRULES COMMANDS PRULES ::= RULES | RULES `TRANSFORMATIONS` RULES RULES ::= RULE | RULES RULE ::= `{` RULENAME `}` EXPRESSION `==>` EXPRESSION | `{` RULENAME `}` EXPRESSION `==>` EXPRESSION `where` CONSTRAINTS RULENAME ::= `SR,`RNAME`,`Number | RNAME`,`Number RNAME ::= (`a`|...|`z`)(`a`|...|`z`|`A`|...|`Z`|`0`...`9`|`_`)* COMMANDS ::= COMMAND | COMMAND COMMANDS COMMAND ::= `overlap` `(`RULENAME`)`.DIR `and` `(`RULENAME`)`.DIR | `overlap` `(`RULENAME`)`.DIR `all` DIR ::= `l` | `r`

The calculus description begins with a sequence of rules followed by a sequence of commands.

The sequence of rules may be partitioned into two sets of rules where the first set are standard reductions and the second set are program transformations. The sets are separated by the keyword `TRANSFORMATIONS`.

Rules are written as `{rulename} left hand side ==> right hand side` followed by an optional `where`-clause to add constraints to the rule. Both sides of the rule are expressions. The `rulename` should be unique for every rule and it must start with a lower-case letter. If it starts with `SR,` then it denotes a standard reduction, otherwise it denotes a transformation rule. The rulename also have a number. If no number is given, then the number is always 1.
Commands are of the form `overlap` `(`rulename1,Number`)`.dir1 `and` `(`rulename2,Number`)`.dir2 where `dir` is either `l` or `r`, to express which side of the rule should be used for overlapping. For computing forking diagrams the left hand side of a transformation has to be overlapped with the left hand sides of standard reductions, while for computing commuting diagrams, the right hand side of a transformation hat to be overlapped with the left hand sides of standard reductions.
As a short form also the command `overlap (rulename1,Number).dir1 all` is available which computes all overlaps of rulename1.dir1 with all left hand sides of all standard reductions (i.e. the rules which are named with `SR,...` ).

Examples

The source distribution comes with several examples which can be found in `examples`.

Examples for Unification

The following table lists the example-input files and the corresponding output generated by the LRSX Unification Tool:

 Example Output example1.inp Output example2.inp Output example3.inp Output example4.inp Output example5.inp Output example6.inp Output example7.inp Output example8.inp Output example9.inp Output example10.inp Output example11.inp Output example12.inp Output example13.inp Output example14.inp Output

Examples for Computing Overlaps

We provide two examples of program calculi with recursive let:
• The calculus LNEED is a minimal lambda-calculus with letrec. Its syntax comprises abstractions, applications, variables and letrec.

The description of the calculus and several program transformations can be found in the file LNEED.inp.

Computing forking overlaps can be done with the input file lneed_fork.inp which requires LNEED.inp. in the same directory. Running
`lrsx overlap lneed_fork.inp`
produces the critical pairs which can be found in the following text-file .

Computing commuting overlaps can be done with the input file lneed_commute.inp which requires LNEED.inp. in the same directory. Running
``` lrsx overlap lneed_commute.inp```
produces the critical pairs which can be found in the following text-file .
• The calculus LR extends the calculus LNEED by data constructors, case-expressions and Haskell's seq-operator (LR can be seen as an untyped core language of Haskell) The description of the calculus and several program transformations can be found in the file LR.inp.

Computing forking overlaps can be done with the input file lr_fork.inp which requires LR.inp. in the same directory. Running
``` lrsx overlap LR_fork.inp```
produces the critical pairs which can be found in the following text-file (~389 MB!).

Computing commuting overlaps can be done with the input file lr_commute.inp which requires lr.inp. in the same directory. Running
``` lrsx overlap lr_commute.inp```
produces the critical pairs which can be found in the following text-file (~455 MB!).
Contact
Dr. David Sabel
Computer Science Institute
Computer Science and Mathematics Department
Johann Wolfgang Goethe-University Frankfurt am Main
(c) 2016 PD Dr. David Sabel, Imprint, Last modified: \$Author: sabel \$, \$Date: 2016/07/17 11:03:32 \$