Treetop SGF Parsing

I'm writing a Treetop grammar to parse Smart Game Format (SGF) files, and it basically works. However, a few issues have come up.

  • I'm not sure how to access the tree that Treetop generates after parsing.
  • Is there a better way to capture arbitrary characters than my chars rule?
  • There is a case for comments that I cannot write correctly.

    C [player1 [4k \]: hi player2 [3k \]: hi!]

I can't wrap my head around how to handle a C[] node whose value itself contains [ ] pairs.

Below is my current progress.

SGF-grammar.treetop

grammar SgfGrammar
rule node
    '(' chunk* ')' {
        def value
            text_value
        end
    }
end

rule chunk
    ';' property_set* {
        def value
            text_value
        end
    }
end

rule property_set
    property ('[' property_data ']')* / property '[' property_data ']' {
        def value
            text_value
        end
    }
end

rule property_data
    chars '[' (!'\]' . )* '\]' chars / chars / empty {
        def value
            text_value
        end
    }
end

rule property
    [A-Z]+ / [A-Z] {
        def value
            text_value
        end
    }
end

rule chars
    [a-zA-Z0-9_/\-:;|'"\\<>(){}!@#$%^&\*\+\-,\.\?!= \r\n\t]*
end

rule empty
    ''
end
end

And here is my test case, which for now excludes C[] nodes with the nested bracket problem above:

example.rb

require 'rubygems'
require 'treetop'
require 'sgf-grammar'

parser = SgfGrammarParser.new
parser.parse("(;GM[1]FF[4]CA[UTF-8]AP[CGoban:3]ST[2]
RU[Japanese]SZ[19]KM[0.50]TM[1800]OT[5x30 byo-yomi]
PW[stoic]PB[bojo]WR[3k]BR[4k]DT[2008-11-30]RE[B+2.50])")
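  • The parse result comes back as a tree of SyntaxNodes (if the result is nil, check parser.failure_reason). You can walk this tree yourself or, as is usually recommended, add methods to the nodes that do what you need and call one of them on the root.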

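If you mean "how do I access the components from within a node method?", there are a few ways. You can use the elements[x] notation or the rule name: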

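rule url_prefix
    protocol "://" host_name {
        def example
            # elements[x] and the rule-name accessors refer to the same child nodes
            # (assert is illustrative; it is not defined on syntax nodes by default)
            assert elements[0] == protocol
            assert elements[2] == host_name
            unless protocol.text_value == "http"
                print "#{protocol.text_value} not supported"
            end
        end
    }
end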

You can also label sub-expressions, like this:

rule phone_number
    "(" area_code:( digit digit digit ) ")" ...

and then refer to them by that label (here, area_code).
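As a rough sketch, this is how labels might look when applied to the property_set rule from the question. The label values and the method names ident and value_texts are made up for illustration; they are not part of Treetop or SGF, and the fragment assumes the property and property_data rules from the question's grammar.

```
rule property_set
    property values:( '[' property_data ']' )+ {
        # 'property' is reachable by its rule name, 'values' by its label
        def ident
            property.text_value
        end

        def value_texts
            # each repetition is a sequence node: '[' property_data ']'
            values.elements.map { |v| v.elements[1].text_value }
        end
    }
end
```

With something like this in place, calling ident or value_texts on a property_set node would return plain strings instead of requiring you to index into elements by hand from the calling code.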

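  1. Your chars rule looks fine if you only want to match those characters. If you want to match any character, you can use a dot (.), as in a regular expression.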

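  2. I'm not familiar with the format you are parsing, but the rule you are looking for is probably something like this: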

rule comment
    "C" balanced_square_bracket_string
end

rule balanced_square_bracket_string
    "[" ( [^\[\]] / balanced_square_bracket_string )* "]"
end

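That is, everything between the outer brackets is either a non-bracket character or a nested balanced_square_bracket_string.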
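To see what the recursive rule accepts, here is a minimal pure-Ruby mirror of it, checked against the comment value from the question. The function name balanced_end is hypothetical and this is not how Treetop implements the rule; it is only a sketch of the same recursion.

```ruby
# Pure-Ruby mirror of the recursive rule, for illustration only:
#   "[" ( [^\[\]] / balanced_square_bracket_string )* "]"
# Returns the index just past the matching "]", or nil if nothing matches at pos.
def balanced_end(text, pos = 0)
  return nil unless text[pos] == '['

  pos += 1
  while pos < text.length
    case text[pos]
    when ']'
      return pos + 1                 # closing bracket of this level
    when '['
      pos = balanced_end(text, pos)  # nested group: recurse
      return nil if pos.nil?
    else
      pos += 1                       # any non-bracket character
    end
  end
  nil                                # input ended before the closing "]"
end

value = '[player1 [4k \]: hi player2 [3k \]: hi!]'
puts balanced_end(value) == value.length  # => true
```

Note that the backslash is treated as an ordinary character here, so this only works while every escaped \] has a matching [ before it. A value such as [a \] b] stops at the escaped bracket, so if escaped brackets can appear on their own the rule would need to handle the backslash explicitly.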

P.S. There is a Treetop discussion group on Google Groups, which is a good place for follow-up questions.


Source: https://habr.com/ru/post/1703989/
