As a general comment, you will find more information about it under the name library(pio) . It is also useful to use it cleanly:
:- use_module(library(pio)).
Your example is too complicated, so I will only consider a slightly simpler case - a list of numbers separated by a newline:
nats ([]) -> [].
nats ([N | Ns]) -> nat (N), newline, nats (Ns).
So how can you test this effectively? (What is your question 3) The main point of library(pio) is that you can use regular DCGs to process files. But for testing in small, you can still use simple phrase/2 . Therefore I:
? - phrase (nats (Ns), "1 \ n").
Ns = [1];
false
You see a hint ; ? This means: Prolog was unable to decide whether further answers could be calculated, so it leaves one or more selection points open. And this is only for one digit . You can imagine how things accumulate.
Let dig deeper:
? - phrase (digit (D), "1").
D = 1;
false
Again evil ; ! To make this work, everything had to be determined. There are three ways:
Use cuts (and lose your soul)
I wish you good luck - best after repeated element:
trace_file_phrase ([]) -> [].
trace_file_phrase ([T | Ts]) ->
trace_phrase (T),
!,% ugly, but ...
trace_file_phrase (Ts).
(This should answer question 1)
But wait a minute! What's bad about it? ! ? As long as there is exactly one answer to the question trace_phrase//1 , everything is perfect. It is only if there are more answers (or actually solutions) that the cut can remove precious answers. How do you know if there are more solutions? Oh no. And you will not see them, as they have already been cut off.
call_semidet/1
Here's what you can do to prevent this from happening. This only works for free purposes without side effects that can be caused twice without any effects:
call_semidet (Goal): -
(call_nth (Goal, 2)
-> throw (error (mode_error (semidet, Goal), _))
; once (Goal)
)
call_nth/2 used call_nth/2 , as defined in another post. (As an optimization, an implementation could avoid calling Goal twice when there is no pick-point open ...) Just to understand how this works:
? - phrase (nat (N), "1234").
N = 1234;
false
? - call_semidet (phrase (nat (N), "1234")).
N = 1234.
? - call_semidet ((X = 1; X = 2)).
ERROR: Unknown error term: mode_error (semidet, (2 = 1; 2 = 2))
So, this makes your little grammar an effective determination! Therefore, there is no need to reformulate anything!
There is currently no integration of this into grammar. You can do this very low level or rather pure use of library(lambda) .
phrase_semidet (NT) ->
call (S0 ^ S ^ call_semidet (phrase (NT, S0, S))).
Please note that in this particular case we do not use \ to rename.
trace_file_phrase ([]) -> [].
trace_file_phrase ([T | Ts]) ->
phrase_semidet (trace_phrase (T)),
trace_file_phrase (Ts).
Indexing index
Finally, a very laborious but clean way would be to rewrite everything in order to better profit from indexing (and perhaps help improve indexing in general ...) But this is a long way to go. To start:
digit (D) -> [C],
{c_digit (C, D)}.
c_digit (0'0.0).
c_digit (0'1,1).
c_digit (0'2,2).
c_digit (0'3,3).
c_digit (0'4,4).
c_digit (0'5.5).
c_digit (0'6.6).
c_digit (0'7.7).
c_digit (0'8.8).
c_digit (0'9.9).
This gives you now:
? - phrase (digit (D), "1").
D = 1.
But you have another source of non-determinism, which is more likely due to how you define grammar. In nat//2 you see:
nat (N, N) -> [].
nat (A, N) -> digit (D), ....
The first rule always applies, i.e. "1234\n" will be parsed as "1" "12" "123" "1234" only the next newline//0 understands that it will be enough for the last - and then stick to it.
You can rewrite something for this, but then the code is no longer a pure small spectrum that you liked, can you? Perhaps this may change in the future.
eg. indexing in SWI is much better than before, maybe things are developing here too ....
The goal of library(pio) was to start this process. Compare this to Haskell - we are far from interact efficiency! But there is no inherent value:
... -> [] | [_], ....
? - phrase_from_file ((..., "searchstring", ...), fichier).
as effective as grep, space-wise. That is, he works in a constant space. Therefore, I hope the code will work more in the future.
Edit: BTW, library(pio) already has impact efficiency: GC phases have been greatly improved, very much like Wadler. Elimination of some space leaks; ndash paper a quarter century ago. Things are evolving ...