
Parsing lines with a schema

I am trying to write a simple parser that creates an SXML expression from a string, e.g.

"This is a [Test]" ===> (item "This is a" (subitem "Test")) 

Anyone curious about the square brackets in this example can look up the so-called Leiden conventions.

This is the code that I have written so far:

 (define my-sequence '("this" "[" "is" "a" "]" "test"))

 (define (left-square-bracket? item)
   (or (equal? item "[") (eq? item #\x005b)))

 (define (right-square-bracket? item)
   (or (equal? item "]") (eq? item #\x005d)))

 (define (parse-sequence sequence)
   (cond ((null? sequence) '())
         ((left-square-bracket? (car sequence))
          (let ((subsequence (get-subsequence (cdr sequence))))
            (list subsequence)))
         (else
          (cons (car sequence)
                (parse-sequence (cdr sequence))))))

 (define (get-subsequence sequence)
   (if (right-square-bracket? (car sequence))
       '()
       (cons (car sequence)
             (get-subsequence (cdr sequence)))))

Evaluating (parse-sequence my-sequence) gives ("this" ("is" "a")). A nested expression is created, but the program terminates without processing the last element, "test". The question is: how do I get back from get-subsequence to parse-sequence?

Any help is appreciated, thanks a lot in advance! :)

2 answers

To answer your initial question of how to return multiple values: use the "values" form. The following example implementation uses an internal procedure that returns both the remaining list to be processed and the result so far. It recurses whenever it encounters an opening bracket.

 (define (parse-sequence lst)
   (define (parse-seq lst)
     (let loop ((lst lst) (res null))
       (cond
         ((null? lst) (values null res))
         ((string=? (car lst) "[")
          (let-values ([(lst2 res2) (parse-seq (cdr lst))])
            (loop lst2 (append res (list res2)))))
         ((string=? (car lst) "]")
          (values (cdr lst) res))
         (else
          (loop (cdr lst) (append res (list (car lst))))))))
   (let-values ([(lst res) (parse-seq lst)])
     res))

then

 (parse-sequence '("this" "is" "a" "test"))
 (parse-sequence '("this" "[" "is" "a" "]" "test"))
 (parse-sequence '("this" "[" "is" "[" "a" "]" "]" "test"))

will give

 '("this" "is" "a" "test")
 '("this" ("is" "a") "test")
 '("this" ("is" ("a")) "test")
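The output above is a plain nested list, while the question asks for item / subitem tags. The following is a minimal sketch of how the same values-based recursion could produce that shape. The name parse-items is hypothetical, and it is assumed that the input is already a list of string tokens, as in my-sequence.

```racket
#lang racket

;; Sketch only: parse-items is a hypothetical name, not part of the answer above.
;; Same idea as parse-sequence, but the results are tagged with item / subitem.
(define (parse-items tokens)
  ;; Returns two values: the tokens not yet consumed and the parsed children.
  ;; Children are accumulated in reverse and reversed once on return.
  (define (parse-seq tokens)
    (let loop ((tokens tokens) (acc '()))
      (cond
        ((null? tokens) (values '() (reverse acc)))
        ((string=? (car tokens) "[")
         ;; Recurse for the bracketed part, then continue with what is left.
         (let-values (((remaining children) (parse-seq (cdr tokens))))
           (loop remaining (cons (cons 'subitem children) acc))))
        ((string=? (car tokens) "]")
         (values (cdr tokens) (reverse acc)))
        (else (loop (cdr tokens) (cons (car tokens) acc))))))
  (let-values (((remaining children) (parse-seq tokens)))
    (cons 'item children)))

(parse-items '("this" "[" "is" "a" "]" "test"))
;; => '(item "this" (subitem "is" "a") "test")
```

Accumulating with cons and reversing once avoids the repeated append calls, which copy the result list on every step.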

I made some progress using open-input-string in combination with read-char:

  (define my-sequence (open-input-string "this [is a] test"))

 (define (parse-sequence sequence)
   `(item
     ,@(let loop ((next-char (read-char sequence)))
         (cond ((eof-object? next-char) '())
               ((left-square-bracket? next-char)
                (let ((subsequence (get-subsequence sequence)))
                  (cons subsequence
                        (loop (read-char sequence)))))
               (else
                (cons next-char
                      (loop (read-char sequence))))))))

 (define (get-subsequence sequence)
   `(subitem
     ,@(let loop ((next-char (read-char sequence)))
         ;; Also stop at end of input, so an unclosed "[" cannot loop forever.
         (if (or (eof-object? next-char)
                 (right-square-bracket? next-char))
             '()
             (cons next-char
                   (loop (read-char sequence)))))))

 (parse-sequence my-sequence)
 ===> (item #\t #\h #\i #\s #\space (subitem #\i #\s #\space #\a) #\space #\t #\e #\s #\t)
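One possible next step is to collect runs of characters into strings, so that the result gets closer to the form shown in the question. The following is only a sketch: read-chunk and parse-port are hypothetical helper names, and it assumes that whitespace should be kept as-is and that unmatched brackets need no error reporting.

```racket
#lang racket

;; Sketch only: read-chunk and parse-port are hypothetical helper names.

;; Consume characters up to (but not including) the next bracket or
;; end of input, and return them as one string.
(define (read-chunk port)
  (let loop ((chars '()))
    (let ((c (peek-char port)))
      (if (or (eof-object? c) (char=? c #\[) (char=? c #\]))
          (list->string (reverse chars))
          (loop (cons (read-char port) chars))))))

;; Parse the port into (item ...) with nested (subitem ...) forms.
(define (parse-port port)
  (define (parse-children)
    (let loop ((acc '()))
      (let ((c (peek-char port)))
        (cond
          ((eof-object? c) (reverse acc))
          ((char=? c #\[)
           (read-char port)                  ; drop the "["
           (loop (cons (cons 'subitem (parse-children)) acc)))
          ((char=? c #\])
           (read-char port)                  ; drop the "]"
           (reverse acc))                    ; return to the enclosing level
          (else (loop (cons (read-chunk port) acc)))))))
  (cons 'item (parse-children)))

(parse-port (open-input-string "this [is a] test"))
;; => '(item "this " (subitem "is a") " test")
```

Using peek-char lets each procedure decide whether to consume a character, so returning from the nested call automatically resumes the outer loop at the right position. Trimming the surrounding spaces, as in the example from the question, is left out here.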

Now the work continues, step by step. :)

Any comments and suggestions are still welcome. :)


Source: https://habr.com/ru/post/1446524/

