Prolog - DCG binary processing

It seems to me that I need to process binary data with DCG in a list of bytes. However, to make it work as a whole, you need to use bitwise operations, which means is/2 , which means that the order of instantiation is a problem that can confuse the use of DCG for analysis and generation. The idea here is to serialize / deserialize binary data, but I think this example is simple enough to illustrate the problem.

Let me illustrate with some code. Suppose I have a binary protocol. I want to read two 4-bit integers from a byte. My naive self tries this:

 two_four_bit_ints(High, Low) --> [B], { High is B >> 4, Low is B /\ 0xF }. 

This seems to work for parsing:

 ?- phrase(two_four_bit_ints(H,L), [255]). H = L, L = 15. ?- phrase(two_four_bit_ints(H,L), [0]). H = L, L = 0. ?- phrase(two_four_bit_ints(H,L), [15]). H = 0, L = 15. ?- phrase(two_four_bit_ints(H,L), [240]). H = 15, L = 0. 

But this will not generate:

 ?- phrase(two_four_bit_ints(15,15), [X]). ERROR: is/2: Arguments are not sufficiently instantiated ?- phrase(two_four_bit_ints(15,15), X). ERROR: is/2: Arguments are not sufficiently instantiated 

Not sure what to do. I support someone shouting "Use clpfd ", but it doesn’t seem to support bit change operations, and I will be concerned about the performance effect of calling such a powerful system in the middle of low-level code.

Since I don’t see many helpers for binary code, is there any other more preferable way to do binary extraction / coding in Prolog? I use only SWI, so I am happy to accept offers that are not ISO portable, but if they are portable, that’s good too. I rather hoped to find that someone was porting something like Erlang's bit syntax, but had no search for luck.

+6
source share
4 answers

Better binary data support will be very nice to have a feature in Prolog. However, the relational nature of Prolog makes the overall solution rather complex. Thus, you are faced with a serious decision: either you map some library of other languages ​​directly to Prolog, thereby ignoring the relational nature of Prolog (and ideally protecting all borders with pure creation errors) or you choose a more relational approach.

When choosing a more relational solution, you can use existing libraries like library(clfd) or implement the entire restriction mechanism yourself. With some clever limitations, you can get away with a much simpler approach, but I doubt it will work. Trade-offs in the areas of accuracy and efficiency. Please note that clpfd systems in SICStus or SWI needed literally decades to reach their quality level.

Regardless of how you accept, some notes:

Efficiency library(clpfd)

library(clpfd) in SWI-Prolog has been specially optimized to be (in some cases) comparable in performance to traditional (is)/2 . To see this, compile the rule:

 list_len([_|Es], N0) :- N0 #> 0, N1 #= N0-1, list_len(Es, N1). 

and look at the generated code using listing(list_len) :

 list_len([_|C], A) :- ( integer(A) -> A>=0+1 ; clpfd:clpfd_geq(A, 1) ), ( integer(A) -> B is A+ -1 ; clpfd:clpfd_equal(B, A-1) ), list_len(C, B). 

Effectively, inline elements for parsed expressions such as (is)/2 and (>=)/2 are used for those cases that directly correspond to these primitive operations.

To fully simulate bitrate operations, you will need (div)/2 , which is currently only supported by SICStus library(clpfd) , but not SWI. So an additional headache awaits you here. But as long as you use unsigned non-negative values, the problem does not occur. For general shifts you will need (^)/2 , which is supported by SWI but not SICStus.

And here is the CLPFD version:

 two_four_bit_ints(High, Low) --> [B], { B in 0..255, Low in 0..15, High in 0..15, B #= Low + High*16 }. 

Please note that your original program inadvertently determines behavior in rather unintended situations, for example B = -1234 , B = 1+1 . You could add between(0, 255, B) , but then you could easily get combinatorial enumerations (read: explosions).

Existing library(clpfd) implementations can be significantly improved for such cases even more, but to improve them you must use them!

I / O and pio

ISO Prolog supports elementary I / O operations in

  • bytes ( get_byte/1 ),
  • codes ( get_code/1 ) and
  • characters ( get_char/1 ).

If you want to use DCG, you will definitely want to use library(pio) . Currently, the SWI library(pio) only supports codes .

+4
source

In SWI-Prolog, many bitwise operations are now supported in CLP (FD).

Please try the following with the latest version of git. It is enough to replace (is)/2 with (#=)/2 in your code:

  two_four_bit_ints (High, Low) ->
   [B]
   {
     High # = B >> 4 ,
     Low # = B / \ 0xF
   }.

Your first 4 sample queries work exactly the same as before, and should be reasonably efficient:

  ? - phrase (two_four_bit_ints (H, L), [255]).
 H = L, L = 15.

 ? - phrase (two_four_bit_ints (H, L), [0]).
 H = L, L = 0.

 ? - phrase (two_four_bit_ints (H, L), [15]).
 H = 0
 L = 15.

 ? - phrase (two_four_bit_ints (H, L), [240]).
 H = 15
 L = 0.

Note that CLP (FD) constraints are directly compiled into low-level predicates if they are used in modes that are also supported by primitive arithmetic.

One of the benefits of using CLP (FD) constraints is that the other 2 queries now work:

  ? - phrase (two_four_bit_ints (15,15), [X]).
 15 # = X / \ 15, 
  15 # = X >> 4 .

 ? - phrase (two_four_bit_ints (15,15), X).
 X = [_G1048], 
  15 # = _ G1048 / \ 15, 
  15 # = _ G1048 >> 4 .

At least you can use this for further reasoning.

In fact, now the most common query also works:

  ? - phrase (two_four_bit_ints (A, B), X).
 X = [_G1270], 
  A # = _ G1270 >> 4, 
  B # = _ G1270 / \ 15 .

In some cases, a more powerful distribution is possible. I will consider this as soon as such a need arises.

+3
source

To be more explicit than I can in a comment, taking as an example parsing an integer and converting an integer to a string, you can say:

 foo_parse(Number) --> digits(Ds), { number_codes(Number, Ds) }. foo_generate(Number) --> { number_codes(Number, Ds) }, Ds. 

You could have avoided this by having var(Number) "guard" for the first of two sentences, complete with its own cut, etc., but I'm not sure it becomes much easier to write, read or use. In any case, the two DCGs will probably be called from different contexts.

So, for your case, the generation will look something like this:

 fourbit_fourbit_generate(High, Low) --> { D is (High << 4) + Low }, [D]. 

Just my opinion.

+1
source

it can work

 two_four_bit_ints(High, Low) --> [B], { integer(B) % suggestion by @false, instead of nonvar(B) -> High is B >> 4, Low is B /\ 0xF ; B is (High << 4) \/ Low }. 

remember that DCG is just a prolog, but you can see them as executable semantic grammars.

+1
source

Source: https://habr.com/ru/post/977568/


All Articles