Prolog union for A U B U C

Question

union(A, B, C, U) :-
   union(A, B, V),
   union(C, V, U).

Your definition of union/3 can be improved by replacing

... not(element(X,L)), ...

by

... maplist(dif(X),L), ...

or

... non_member(X, L), ....

non_member(_X, []).
non_member(X, [E|Es]) :-
   dif(X, E),
   non_member(X, Es).

Here is a case where the difference shows:

?- union([A],[B],[C,D]).
A = C,
B = D,
dif(C, D).

How must [A] and [B] look like such that their union contains 2 elements?

The answer is: they must be different.

Your original version fails for this query, yet, it succeeds for a specialized instance like:

?- A = 1, B = 2, union([A],[B],[C,D]).

So it succeeds for this, but fails for a generalization of it. Therefore it is not a pure, logical relation.

So is everything fine and perfect with dif/2? Unfortunately not. @TudorBerariu has good reason to go for a cut, since it reflects some of the intention we have about the relation. The cut effectively reflects two key intentions

that the alternative of not being a member is now excluded, which is true for certain modes, like Arg1 and Arg2 being both sufficiently instantiated terms. A safe approximation would be ground terms.
that there is no need to look at further elements in the list Arg2, which again is only true if Arg1 and Arg2 are sufficiently instantiated.

Problems only show when terms are not sufficiently instantiated..

The drawback of OP’s definition and the one above, is that both are unnecessarily too general which can be observed with repeated elements in Arg2:

?- union([a,a],[a,a],Zs).
Zs = [a, a] ;
Zs = [a, a] ;
Zs = [a, a] ;
Zs = [a, a] ;
false.

In fact, we get |Arg2|^|Arg1|-1 redundant answers. So the cut had some good reason to be there.

Another reason why union/3 as it stands is not very efficient is that for the (intended) ground case it leaves open unnecessary choice points. Again, @TudorBerariu’s solution does not have this problem:

?- union([a],[a],Zs).
Zs = [a] ;    %    <--- Prolog does not know that there is nothing left.
false.

Eliminating redundancy

The actual culprit for that many redundant answers is the first rule. element(a,[a,a]) (commonly called member/2) will succeed twice.

union([X|Y],L,S) :- element(X,L), union(Y,L,S).
                    ^^^^^^^^^^^^

Here is an improved definition:

memberd(X, [X|_Ys]).
memberd(X, [Y|Ys]) :-
   dif(X,Y),          % new!
   memberd(X, Ys).

The recursive rule, reading it right-to-left, reads as follows:

Assume memberd(X, Ys) is true already for some X and Ys. Given that, and given that we have a fitting Y which is different from X. Then

we can conclude that also memberd(X, [Y|Ys]) is true.

So this has eliminated the redundant solutions. But our definition is still not very efficient: it still has to visit Arg2 twice for each element, and then it is unable to conclude that no alternatives are left. In any case: resist to place a cut to remove this.

Introducing determinism via reification.

Compare the definitions of memberd/2 and non_member/2. Although they describe “the opposite” of each other, they look very similar:

non_member(_X, []).
non_member(X, [Y|Ys]) :-
   dif(X,Y),
   non_member(X, Ys).

memberd(X, [X|_Ys]).
memberd(X, [Y|Ys]) :-
   dif(X,Y),         
   memberd(X, Ys).

The recursive rule is the same! Only the fact is a different one. Let’s merge them into one definition – with an additional argument telling whether we mean memberd (true) or non_member (false):

memberd_t(_X, [], false).
memberd_t(X, [X|_Ys], true).
memberd_t(X, [Y|Ys], Truth) :-
   dif(X, Y),
   memberd_t(X, Ys, Truth).

Now, our definition gets a bit more compact:

unionp([], Ys, Ys).
unionp([X|Xs], Ys, Zs0) :-
  if_( memberd_t(X, Ys), Zs0 = Zs, Zs0 = [X|Zs] ),
  unionp(Xs, Ys, Zs).

memberd_t(_X, [], false).          % see below
memberd_t(X, [Y|Ys], Truth) :-
   if_( X = Y, Truth=true, memberd_t(X, Ys, Truth) ).

Note the difference between if_(If_1, Then_0, Else_0) and the if-then-else control construct ( If_0 -> Then_0 ; Else_0 ). While If_1 may succeed several times with different truth values (that is, it can be both true and false), the control construct makes If_0 succeed only once for being true only.

if_(If_1, Then_0, Else_0) :-
   call(If_1, T),
   (  T == true -> call(Then_0)
   ;  T == false -> call(Else_0)
   ;  nonvar(T) -> throw(error(type_error(boolean,T),_))
   ;  /* var(T) */ throw(error(instantiation_error,_))
   ).

=(X, Y, T) :-
   (  X == Y -> T = true
   ;  X \= Y -> T = false
   ;  T = true, X = Y
   ;  T = false,
      dif(X, Y)                             % ISO extension
      % throw(error(instantiation_error,_)) % ISO strict
   ).

equal_t(X, Y, T) :-
   =(X, Y, T).

To ensure that memberd_t/3 will always profit from first-argument indexing, rather use the following definition (thanks to @WillNess):

memberd_t(E, Xs, T) :-
   i_memberd_t(Xs, E, T).

i_memberd_t([], _E, false).
i_memberd_t([X|Xs], E, T) :-
   if_( X = E, T = true, i_memberd_t(Xs, E, T) ).

Eliminating redundancy

Introducing determinism via reification.

Leave a Comment Cancel reply