
 Synopsis
 The LRSX Unification Tool implements a new algorithm for unification of metaexpression in lambda calculi with recursive and shared bindings (i.e. socalled letrecexpressions).
The underlying metalanguage and the unification algorithm are designed to capture usual callbyneed lambdacalculi included their reduction strategies.
Besides the core utility to unify expressions of the metalanguage, the tool can overlap reduction rules with program transformations on a given calculus description as input,
and compute the corresponding critical pairs,
The LRSX Unification Tool is implemented in Haskell and packaged via Cabal and comes with
a 3clause BSDlike LICENSE.
 Download and Building


The sourcedistribution is available here:
lrsx0.3.0.4.tar.gz
 A recent installation of the Glasgow Haskell Compiler, the base libraries, and CabalInstall is required (see https://www.haskell.org/downloads)

To build lrsx unzip the archive, then run cabal, i.e.:
tar xzf lrsx0.3.0.4.tar.gz
cd lrsx0.3.0.4/
cabal configure
cabal build
This will create the executable lrsx in dist/build/lrsx/ , e.g. type
dist/build/lrsx/lrsx help
to see the helptext of the LRSX Unification tool.
 The executable
lrsx can be installed via typing cabal install
If this completes successfully you will have the lrsxbinary in ~/.cabal/bin .
 Documentation

Using LSRX
The LRSX Unification Tool prints several helptexts by calling lrsx help or more command specific help by typing lrsx command help .
LRSX mainly supports two kinds of input files:
 Unification equations with constraints:
For those input files call lrsx unify file, where file contains the input.
LRSX will unify the input and print the output to stdout.
 Program calculus descriptions and overlap commands:
lrsx overlap file computes the critical pairs, requested by the commands in file.
Both formats share a common syntax for expressions and constraints as described below.
Unification equations with constraints
The allowed syntax can be represented by the following grammar in ENBF,
where nonterminals are in bold font and the side condition that
in derivations from EXPRESSION the symbol HOLE does not occur, except
for deriviation starting with CONTEXT where the symbol HOLE occurs exactly once.
CONSTRAINED_EQUATIONS  ::=  EQUATIONS 
   EQUATIONS where CONSTRAINTS 
EQUATIONS  ::=  EQUATION 
   EQUATION , EQUATIONS 
EQUATION  ::=  EXPRESSION =?= EXPRESSION 
EXPRESSION  ::=  letrec ENVIRONMENT in EXPRESSION 
   letrec { ENVIRONMENT } in EXPRESSION 
   SEXPRESSION 
SEXPRESSION  ::=  FUNSYMWITHINFO BINDEXPR_{1}...BINDEXPR_{N} 
   ATOM 
FUNSYMWITHINFO  ::=  FUNSYM  FUNSYM[ STRICTPOSITIONS] 
STRICTPOSITIONS  ::=  INTEGER,...,INTEGER 
BINDEXPR  ::=  ATOM  VARIABLE. BINDEXPR 
   * VARIABLE  (* VARIABLE) 
ATOM  ::=  var VARIABLE 
   ( EXPRESSION ) 
   METAVARIABLE 
   CONTEXTVAR [ EXPRESSION ] 
   HOLE 
ENVIRONMENT  ::=  ENVITEM 
   ENVITEM ; ENVIRONMENT 
ENVITEM  ::=  METAVARIABLE 
   METAVARIABLE [ VARIABLE , EXPRESSION ] 
   METAVARIABLE  VARIABLE , VARIABLE  
   VARIABLE = EXPRESSION 
VARIABLE  ::=  CONCRETEVAR  METAVARIABLE 
CONCRETEVAR  ::=  (a ...z )(a ...z A ...Z 0 ...9 )* 
FUNSYM  ::=  CONCRETEVAR 
METAVARIABLE  ::=  (B D ...S U ...Z )(a ...z A ...Z 0 ...9 )* 
CONTEXTVAR  ::=  ACONTEXT  TCONTEXT  CCONTEXT 
ACONTEXT  ::=  A (a ...z A ...Z 0 ...9 )* 
TCONTEXT  ::=  T (a ...z A ...Z 0 ...9 )* 
CCONTEXT  ::=  C (a ...z A ...Z 0 ...9 )* 
CONSTRAINTS  ::=  CONSTRAINT 
   CONSTRAINT , CONSTRAINTS 
CONSTRAINT  ::=  METAVAR /= {} 
   CONTEXTVAR /= [] 
   ( EXPRESSION , CONTEXT ) 
   ( ENVIRONMENT , CONTEXT ) 
CONTEXT  ::=  EXPRESSION 
An informal explanation of the syntax is
 A valid input is a sequence of equations where equations are separated by
,
and after the equations a sequence of constraints may occur which is
introduced by the keyword where

Equations are written with
=?= as equation symbols, and both sides of one
equation are LRSXexpressions.

LRSXexpressions are metaexpressions which comprise concrete syntax and metavariables,
symbols beginning with uppercase letters are unification variables:
unification variables beginning with
A are interpreted as application context variables
T are interpreted as top context variables
C are interpreted as arbitrary context variables
Other unification variables are
 expression variables (usually denoted by
S )
 lambda or letvariable variables (usually denoted by
X or Y )
 letrecenvironment variables (usually denoted by
E )
 environment chain variables with special syntax
 Varname
[y,s]
represents a (nonempty) application context chain of
the form y=A[y1];y2=A[y2];...;yn=A[s] (usually denoted by EE[y,s] )

Varname
y1,yn represents a (nonempty) variabletovariable chain of
the form y1 = (var y2);y2=(var y3);...;yn1=(var yn) (usually denoted by VVy1,yn )
The syntax of expressions includes:
 the special function symbol
var to lift
let and lambdavariables to expressions,
 other function symbols are recognized by the input,
where function symbols must begin with a lower case letter.
The algorithm assumes the first argument of a function symbol to be strict, while all
other positions are nonstrict.
An example is
(app S1 S2) which introduces the function symbol app with arity 2
However, strictness can be added by providing a list of strict positions, i.e. app[1,2] defined app
to be strict in both arguments.
 Higherorder expressions are written as
x1.x2....xn.s , i.e. lambda abstraction is a function symbol applied to
a higherorder expression with on binder, which can be written as \(var.body) but also without parentheses as \var.body .
 Variablepositions are denoted by the special symbol
* , e.g. in app S *X the function symbol
app only allows variables as second argument.
letrec Environment in body where body is the inexpression and Environment is a metaletrecenvironment:
environments may be enclosed by { and } . Parts of the environment are separated by ; and allowed parts are
 a metavariable for an environment (usually denoted by
E )
 an environment chain (see above, usually denoted by
EE[y,s] or VVy1,yn )
 single bindings, written as
var = expr
Expressions must fulfill the weak distinct variable convention which requires that
names of bound variables are different from names of free variables, However, it is allowed that bound (let or lambda) variables occur several times,
where it is forbidden, that the same letvariable is bound twice in the same letrecenvironment.
The following constraints can be added to the input equations (separated by , ):

nonemptiness constraints for contexts written as
CtxtName /= [] with the semantics
that the context represented by CtxtName must not be the empty context.

nonemptiness constraints for environment variables written as
EnvName /= {} with the
the semantics that the part of the environment represented by EnvName is not an empty environment

noncapture constraints written as
(expr,ctxt) or [env,ctxt] with the meaning that
the free variables of expression expr (environment env , resp.) is not captured by the hole of
the context ctxt . The syntax for contexts is the same as for expressions, where additionally
the terminal HOLE can (and must be) used to denote the context hole.
An example is (var Y,\X.HOLE) which ensures that the variable Y may not be instantiated with the
same variable as X is instantiated, since in this case Y is captured by the hole of \X.HOLE .
Program calculus descriptions and overlap commands
The grammar for the syntax of program calculus descriptions and
commands extends the above grammar as follows, where the start
symbol is RC
RC  ::=  PRULES COMMANDS 
PRULES  ::=  RULES  RULES TRANSFORMATIONS RULES 
RULES  ::=  RULE  RULES 
RULE  ::=  { RULENAME } EXPRESSION ==> EXPRESSION 
   { RULENAME } EXPRESSION ==> EXPRESSION where CONSTRAINTS 
RULENAME  ::=  SR, RNAME, Number  RNAME, Number

RNAME  ::=  (a ...z )(a ...z A ...Z 0 ...9 _ )* 
COMMANDS  ::=  COMMAND  COMMAND COMMANDS 
COMMAND  ::=  overlap ( RULENAME) .DIR and ( RULENAME) .DIR 
   overlap ( RULENAME) .DIR all 
DIR  ::=  l  r 
The calculus description begins with a sequence of rules followed
by a sequence of commands.
The sequence of rules may be partitioned into two sets of rules
where the first set are standard reductions and the second set are
program transformations. The sets are separated by the keyword TRANSFORMATIONS .
Rules are written as {rulename} left hand side ==> right hand side
followed by an optional where clause to add constraints to the rule.
Both sides of the rule are expressions. The rulename should be unique
for every rule and it must start with a lowercase letter. If it starts with SR, then it denotes a standard reduction, otherwise it denotes a transformation rule. The rulename also have a number. If no number is given, then the number is always 1.
Commands are of the form
overlap ( rulename1,Number) .dir1
and
( rulename2,Number) .dir2
where dir is either l or r , to express which side of the rule should
be used for overlapping.
For computing forking diagrams the left hand side of a transformation
has to be overlapped with the left hand sides of standard reductions,
while for computing commuting diagrams, the right hand side of a
transformation hat to be overlapped with the left hand sides of
standard reductions.
As a short form also the command
overlap (rulename1,Number).dir1 all
is available which computes all overlaps of rulename1.dir1 with
all left hand sides of all standard reductions (i.e. the rules which are named with SR,...
).
 Examples
The source distribution comes with several examples which can be found in examples .
Examples for Unification
The following table lists the exampleinput files and the corresponding output generated by the LRSX Unification Tool:
Examples for Computing Overlaps
We provide two examples of program calculi with recursive let:
 The calculus LNEED is a minimal lambdacalculus with letrec.
Its syntax comprises abstractions, applications, variables and letrec.
The description of the calculus and several program transformations
can be found in the file
LNEED.inp.
Computing forking overlaps can be done with the input file
lneed_fork.inp
which requires
LNEED.inp.
in the same directory.
Running
lrsx overlap lneed_fork.inp
produces the critical pairs which can be found
in the following textfile
.
Computing commuting overlaps can be done with the input file
lneed_commute.inp
which requires
LNEED.inp.
in the same directory.
Running
lrsx overlap lneed_commute.inp
produces the critical pairs which can be found
in the following textfile
.

The calculus LR extends the calculus LNEED by data constructors,
caseexpressions and Haskell's seqoperator (LR can be seen as an untyped core language of Haskell)
The description of the calculus and several program transformations
can be found in the file
LR.inp.
Computing forking overlaps can be done with the input file
lr_fork.inp
which requires
LR.inp.
in the same directory.
Running
lrsx overlap LR_fork.inp
produces the critical pairs which can be found
in the following textfile
(~389 MB!).
Computing commuting overlaps can be done with the input file
lr_commute.inp
which requires
lr.inp.
in the same directory.
Running
lrsx overlap lr_commute.inp
produces the critical pairs which can be found
in the following textfile
(~455 MB!).
 Contact

Dr. David Sabel
Computer Science Institute
Computer Science and Mathematics Department
Johann Wolfgang GoetheUniversity Frankfurt am Main
