Reprinted, with permission, from the February 1992 issue of BYTE magazine. (c) by McGraw-Hill, Inc, New York, NY. All rights reserved.

Constraint Logic Programming

Dick Pountain

Were you to ask me which programming paradigm is likely to gain most in commercial significance over the next 5 years I'd have to pick Constraint Logic Programming (CLP), even though it's perhaps currently one of the least known and understood. That's because CLP has the power to tackle those difficult combinatorial problems encountered for instance in job scheduling, timetabling, and routing which stretch conventional programming techniques beyond their breaking point. Though CLP is still the subject of intensive research, it's already being used by large corporations such as manufacturers Michelin and Dassault, the French railway authority SNCF, airlines Swissair, SAS and Cathay Pacific, and Hong Kong International Terminals, the world's largest privately-owned container terminal.

Children Of Prolog

As the name suggests, CLP is descended from logic programming, which shot to fame via the Prolog language as a consequence of the Japanese '5th Generation' project and the expert systems boom of the mid-'80s. Commercial acceptance of Prolog was hindered by its relatively poor efficiency compared to procedural languages like C, and its use has declined in recent years. Now CLP languages make logic programs execute very efficiently by focusing on a particular problem domain.

Prolog is based on first-order predicate logic, and the objects that it manipulates are pure symbols with no intrinsic meaning: for example in the Prolog proposition 'likes(jim, baseball)' the constants 'jim' and 'baseball' have no deeper interpretation beyond syntactic identity, ie. jim=jim. Execution of a Prolog program proceeds by searching a database of such facts to find those values that will satisfy a user's query, using a process called 'unification' based on syntactic identity. Since Prolog tries to find the set of all solutions to a query, during this search many dead-ends may get explored and then abandoned by backtracking to an earlier state and trying a different branch. For complex problems this search process can become very greedy in both space and time, which is the root of Prolog's inefficiency.
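The flavour of this search is easy to sketch outside Prolog. Here is a toy rendition in Python (the extra facts about 'mary' and 'chess' are invented for illustration, and real unification is much richer than this): variables unify with constants by syntactic matching, and a generator backtracks through the fact base to enumerate every solution.

```python
# A toy Prolog-flavoured search: facts are tuples, variables are strings
# beginning with an upper-case letter, and 'unification' is plain
# syntactic matching, as the article describes.

FACTS = [
    ("likes", "jim", "baseball"),
    ("likes", "jim", "chess"),
    ("likes", "mary", "chess"),
]

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def unify(pattern, fact, bindings):
    """Match a query pattern against one fact; return extended bindings or None."""
    env = dict(bindings)
    for p, f in zip(pattern, fact):
        p = env.get(p, p) if is_var(p) else p   # use an existing binding if any
        if is_var(p):
            env[p] = f                          # bind the variable to the constant
        elif p != f:
            return None                         # syntactic mismatch: dead end
    return env

def query(pattern, bindings=None):
    """Generate every set of bindings that satisfies the pattern."""
    for fact in FACTS:
        if len(fact) == len(pattern):
            env = unify(pattern, fact, bindings or {})
            if env is not None:
                yield env                       # backtracking = resuming the loop

print([env["X"] for env in query(("likes", "X", "chess"))])
```

Each `yield` corresponds to one solution; asking for the next answer resumes the loop, which is exactly the backtracking behaviour described above.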

In a CLP language this purely abstract logical framework is supplemented by objects that have meaning in an application domain - for example the integers or the real numbers, along with their associated algebraic operations (eg. addition and multiplication) and predicates (eg. =, <, and >). Hence there isn't a single CLP language, but a whole family of them defined for different application domains. A CLP programmer introduces arithmetic expressions called 'constraints' (eg. X > 0 or Y+Z < 15) into programs, which have to be satisfied for succesful execution of the program. [For a more formal explanation of how CLP works see BYTE, August 1987]

In such a CLP system the simple unification algorithm that lies at the heart of Prolog must be augmented by a dedicated 'solver' for the particular domain of application, which can decide at any moment whether the remaining constraints are solvable. For efficiency's sake, solvers for CLP systems need to be incremental, so that adding a new constraint to an already solved set does not force them all to be re-solved. Constraint solving algorithms are quite well understood from other branches of computing; you'll have used one if you've ever done 'goal-seeking' in your Excel spreadsheet. For example a useful solver for linear rational constraints is the well-known simplex method.
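A miniature solver of this kind can be sketched in Python. The sketch below does interval narrowing over integer bounds (not the simplex method, and far simpler than any production solver); the variable names and constraints are invented for illustration. The point to notice is incrementality: 'add' appends one new constraint and re-propagates to a fixpoint rather than starting from scratch, and reports whether the constraint set is still solvable.

```python
# A toy incremental bounds solver: each variable has an integer interval,
# each constraint is a rule that narrows intervals, and adding a constraint
# re-propagates all rules to a fixpoint.

class Solver:
    def __init__(self):
        self.dom = {}           # variable name -> (lo, hi)
        self.cons = []          # list of narrowing rules

    def var(self, name, lo, hi):
        self.dom[name] = (lo, hi)

    def add(self, rule):
        """Add one constraint and incrementally re-propagate to a fixpoint."""
        self.cons.append(rule)
        changed = True
        while changed:
            changed = any(rule(self.dom) for rule in self.cons)
        # still solvable as long as no interval has become empty
        return all(lo <= hi for lo, hi in self.dom.values())

def gt(x, k):
    """Constraint x > k: raise the lower bound of x."""
    def rule(dom):
        lo, hi = dom[x]
        if lo <= k:
            dom[x] = (k + 1, hi)
            return True
        return False
    return rule

def sum_lt(x, y, k):
    """Constraint x + y < k: shrink both upper bounds."""
    def rule(dom):
        changed = False
        for a, b in ((x, y), (y, x)):
            lo_a, hi_a = dom[a]
            lo_b, hi_b = dom[b]
            new_hi = min(hi_a, k - 1 - lo_b)
            if new_hi < hi_a:
                dom[a] = (lo_a, new_hi)
                changed = True
        return changed
    return rule

s = Solver()
s.var("X", 0, 20); s.var("Y", 0, 20)
print(s.add(gt("X", 0)))             # solvable: X becomes 1..20
print(s.add(sum_lt("X", "Y", 15)))   # solvable: bounds narrow further
print(s.add(gt("Y", 13)))            # unsolvable: X's interval empties
```

Because intervals only ever shrink, each added constraint costs at most one short round of re-propagation, which is what makes the solver incremental.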

Another significant way in which CLP differs from Prolog is that it's perfectly happy to do mathematics with uninstantiated variables, so that in the absence of complete information the answer might be a symbolic expression like 10 - X, or even a constraint like X > 23.

Constrained Search

A CLP program still needs to search a database of facts, but it can use constraints to rule out many possible outcomes (ie. to prune away large parts of the search tree) resulting in enormously improved efficiency, comparable to custom solutions written in C.

We all use facts as constraints to guide reasoning as a key part of everyday 'common sense'. For example a few minutes ago I was phoned by a PR person asking if I'm interested in document management, because if so there's a press briefing next Wednesday in London. A glance at my calendar reveals I'm in Cambridge all next Wednesday - end of conversation. We no longer needed to explore the space of my interest (or lack thereof) in document management because an absolute geographical constraint had lopped off that branch. Without such constraints every little decision might set off an avalanche of philosophical speculation.

Herbert A. Simon, Nobel laureate and theorist of heuristic problem-solving, has used popular word-for-number puzzles to illustrate this pruning process. For example, in the puzzle DONALD + GERALD = ROBERT there are 3,628,800 possible assignments of digits to letters, and it would take you several years to solve it by unconstrained search. Yet most of us can solve it in just minutes by incrementally applying constraints (eg. T must be even) to rule out more and more options. Listing 1 shows a typical CLP program to solve this puzzle, written in Eclipse by Mark Wallace of IC Parc (see below).
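The same prune-as-you-go idea can be sketched in plain Python for readers without a CLP system. The column table and letter ordering below are one hand-derived arrangement, not part of Listing 1: digits are assigned letter by letter, and every addition column whose addends are already known is checked at once, so most of the 3,628,800 blind permutations are never generated.

```python
# DONALD + GERALD = ROBERT by backtracking with constraint checks.
# Columns run right to left; carries are recomputed from the columns
# already decided, so a bad partial assignment is pruned immediately.

COLS = [("D", "D", "T"), ("L", "L", "R"), ("A", "A", "E"),
        ("N", "R", "B"), ("O", "E", "O"), ("D", "G", "R")]
ORDER = ["D", "T", "L", "R", "A", "E", "N", "B", "O", "G"]

def consistent(env):
    """Check every column whose addends are known; prune on contradiction."""
    carry = 0
    for a, b, r in COLS:
        if a not in env or b not in env:
            return True                  # cannot check any further yet
        total = env[a] + env[b] + carry
        digit, carry = total % 10, total // 10
        if r in env and env[r] != digit:
            return False                 # column contradicts the sum
    return carry == 0                    # ROBERT has only six digits

def search(env, used):
    if len(env) == len(ORDER):
        return dict(env)
    letter = ORDER[len(env)]
    for d in range(10):
        if d not in used:
            env[letter] = d
            used.add(d)
            if consistent(env):
                found = search(env, used)
                if found:
                    return found
            del env[letter]              # backtrack and try another digit
            used.discard(d)
    return None

sol = search({}, set())

def word(w):
    return int("".join(str(sol[c]) for c in w))

print(word("DONALD"), "+", word("GERALD"), "=", word("ROBERT"))
```

Interleaving the checks with the assignments is what Listing 1's finite-domain constraints do automatically, and far more thoroughly.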

Slaying NP-Hard Dragons

This constrained search ability makes CLP languages good at precisely those problems that conventional programming techniques find hardest: NP-hard search problems, where the time needed for an unconstrained search increases exponentially (or worse) with the problem size.

Consider the simple problem of a harbour which needs to schedule the loading and unloading of 10 ships using only 5 berths. There are many criteria for choosing the berth for a particular ship: some berths are too small for some ships; some ships need to be turned round faster than others; some berths cost more than others; ships' intended cargoes are stacked nearer to certain berths, etc, etc. You can find the optimal schedule by trying all permutations of ships in berths and calculating the cost of each, which means considering 5^10 (or around 10 million) alternatives. Assuming that your computer can try an alternative every millisecond, the whole problem is solved in under 3 hours. A decade later, business has been good and the harbour has expanded to 10 berths, with 20 ships to unload. Determining the optimal schedule now means trying 10^20 alternatives, which will take 3000 million years on the same computer - of course you can stump up for an accelerator card and cut that to 300 million years if you're worried about the lifetime of the solar system.

There are many other problems in planning, scheduling, packing etc. that exhibit this kind of unreasonable scaling behaviour and so cannot be solved optimally by an exhaustive search. So how do you solve them? A naive but tempting approach is to divide the harbour in two and schedule each half using the old program, taking under 6 hours in all. Unfortunately such a schedule is unlikely to be anywhere near optimal, and worse, you won't even know how far from optimal it is and how much money you are wasting. Actually running the 3000 million year program for six hours and choosing the cheapest alternative so far would give just as good (or bad) a result.
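The scaling arithmetic is easy to reproduce, at one alternative per millisecond:

```python
# Exhaustive-search running times for the two harbour problems,
# at one alternative tried per millisecond.

MS_PER_HOUR = 1000 * 60 * 60
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365

small = 5 ** 10      # 10 ships in 5 berths: 9,765,625 alternatives
large = 10 ** 20     # 20 ships in 10 berths

print(round(small / MS_PER_HOUR, 1), "hours")                # 2.7 hours
print(round(large / MS_PER_YEAR / 1e9, 1), "billion years")  # 3.2 billion years
```

Doubling the harbour multiplied the running time by roughly ten trillion, which is exactly the exponential blow-up that makes these problems NP-hard in practice.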

Where CLP languages score for this class of problem is that you can explicitly employ all the real-world constraints that are particular to the problem, and so reduce the search space enormously - in our harbour example, adding a constraint like 'shipLength < berthLength' might immediately remove millions of possibilities.

Languages like CHIP and Eclipse offer direct control over the search strategy (eg. via the 'deleteff' predicate in Listing 1). If this still doesn't yield an optimal solution in reasonable time you must then deploy approximation algorithms to reach a solution that lies close to the optimum with a high degree of probability. Researchers are working hard to integrate algorithms like hill-climbing, simulated annealing, and genetic algorithms into the newer CLP languages.

Don't get the idea that CLP can perform magic; you need a great deal of experience before you can choose the correct algorithms and correct expression of the constraints to get a good solution for big problems. Nevertheless the interactive nature and very high expressive power of CLP languages makes it easy to experiment with different combinations, and results in much shorter and more maintainable programs than using a procedural language.

Some CLP Implementations

The founding work on the CLP scheme was done at Monash University in Melbourne, Australia by J. Jaffar and J.-L. Lassez around 1987. They created the CLP(R) system, which works on the domain of real linear arithmetic. This system, extended as CLP(X), is still being developed at Monash, and at IBM Yorktown Heights and Carnegie Mellon University in the USA.

In Europe CLP research was originally concentrated at the ECRC (European Computer-Industry Research Centre) in Munich, and it led to what is probably the most widely known CLP system at present, CHIP (Constraint Handling in Prolog). CHIP is marketed commercially by the French company COSYTEC, and CHIP derivatives are sold by two of ECRC's industrial sponsors, Bull (as Charme) and ICL (as Decision Power). CHIP provides constraint solvers over finite arithmetic, linear rational and boolean domains.

In 1990 in Marseille, France, Alain Colmerauer (one of the founding fathers of Prolog and CLP) created Prolog III, a CLP language over the domains of linear rational arithmetic, booleans and finite strings or lists.

More recently London's Imperial College has set up IC Parc, a university-industry collaboration that tackles hard planning problems (eg. work-force management, routing and resource allocation) using CLP techniques. IC Parc's language Eclipse began life at ECRC and shares many features with CHIP, but where CHIP's constraint solvers are hard-coded in C, Eclipse's are written in itself for easier modification. Eclipse also employs the powerful generalized propagation technique (see Figure 2) and a parallel version of Eclipse is currently under development.

Interesting non-Prolog based CLP languages include Trilogy from the Vancouver-based Complete Logic Systems and Oz, an object-oriented concurrent CLP language being developed at DFKI (German Research Center for Artificial Intelligence) in Kaiserslautern.


Listing 1


/* An Eclipse program to solve the DONALD+GERALD=ROBERT word puzzle */

lib(fd).   /* load the finite-domain constraint library */

/* add two numbers written as D1*D2*...*Dn digit structures */
addNum(Num1,Num2,Result) :-
   addwithcarries(Num1,Num2,0,Result).

/* peel off the rightmost (least significant) digits and add them */
addwithcarries(Num1*Digit1,Num2*Digit2,Carryin,Num3*Sum) :-
   Carry::0..1,
   addDigit(Carryin,Digit1,Digit2,Sum,Carry),
   addwithcarries(Num1,Num2,Carry,Num3).

/* base case: the leftmost digits must produce no final carry */
addwithcarries(Digit1,Digit2,Carryin,Digit3) :-
   is_domain(Digit1), is_domain(Digit2),
   addDigit(Carryin,Digit1,Digit2,Digit3,0).

/* one column of the addition, with carry in C and carry out CIn */
addDigit(C,N1,N2,Sum,CIn) :- Sum #= C+N1+N2-10*CIn.

/* set up domains and constraints, then search for an assignment */
solve :-
   Letters=[D,O,N,A,L,G,E,R,B,T],
   Letters::0..9,                 /* each letter is a digit 0..9 */
   alldistinct(Letters),          /* no two letters share a digit */
   addNum(D*O*N*A*L*D,G*E*R*A*L*D,R*O*B*E*R*T),
   label(Letters),
   writeln([D,O,N,A,L,D]),
   writeln([G,E,R,A,L,D]),
   writeln([R,O,B,E,R,T]).

/* label the variable with the fewest remaining values first */
label([]).
label(List) :-
   deleteff(Digit,List,Rest),
   indomain(Digit),
   label(Rest).

Figure 2 - Constraint Propagation

A technique that further boosts the power of constraint solving, called 'generalized constraint propagation', has been recently introduced by Thierry Le Provost and Mark Wallace of ECRC. Generalized propagation is a way of extracting all the possible information from a set of constraints.

As a small example let's consider a truth table definition of the boolean predicate 'and' in Prolog :-

and(true,true,true).
and(true,false,false).
and(false,true,false).
and(false,false,false).
Whenever 'and(X,Y,Z)' is encountered in a logic program, this table may be searched repeatedly to try different values for X,Y and Z. But in a system using generalized propagation, the expression 'propagate and(X,Y,Z)' executes as though it were defined by :-
if      X = false then Z = false
elseif  X = true  then Z = Y
elseif  Y = false then Z = false
elseif  Y = true  then Z = X
elseif  Z = true  then X = true, Y = true
elseif  X = Y     then Z = X
endif
The propagation system exploits all the implied information in the truth table to provide a more efficient, though less purely logical, route to a solution. What's more, if none of X,Y,Z has a value yet, 'propagate and(X,Y,Z)' doesn't fail but instead suspends itself until one of them gets a value; it is data-driven.
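The propagation rules above can be mimicked in a few lines of Python; here None stands for an as-yet-unbound variable, and the last rule (X = Y for two unbound variables, which needs real unification) is omitted since plain values cannot express it:

```python
# Data-driven propagation for and(X, Y, Z), following the rule table
# in Figure 2. None means 'not yet bound'; returning the arguments
# unchanged corresponds to the suspended, data-driven state.

def propagate_and(x, y, z):
    """Return (x, y, z) with every value the rules can infer filled in."""
    if x is False or y is False:
        return x, y, False             # Z = false
    if x is True:                      # Z = Y: copy whichever side is known
        v = y if y is not None else z
        return x, v, v
    if y is True:                      # Z = X
        v = x if x is not None else z
        return v, y, v
    if z is True:
        return True, True, True        # X = true, Y = true
    return x, y, z                     # nothing known yet: suspend
```

Notice that propagate_and(False, None, None) concludes Z = false without ever searching the truth table, and propagate_and(None, None, None) simply returns its arguments unchanged, which is the suspension behaviour described above.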


HTML version (c) Michael Jampel, February 1995.
