Regular expression match using cl-ppcre?

I am trying to parse the following text file:

prefix1 prefix2 name1( type1 name1, type2 name2 ); 

with the following regular expression:

 \\s*prefix1\\s*prefix2\\s*(\\w[\\w\\d_]*).*\\(\\s*([^\\)]*\\))\\s*;\\s*

As a result, I get the following two groups (registers):

 "name1( " 

and

 "( type1 name1, type2 name2 )" 

(the quotes are only string delimiters; the newlines, \n, are included)

I cannot understand why the first group (\w[\w\d_]*) also captures text that the following .* should match. Moreover, I cannot get rid of the unnecessary tail!

What's my mistake?

ADD: Regular expression parsed:

 (cl-ppcre::parse-string "\\s*prefix1\\s*prefix2\\s*(\\w[\\w\\d_]*).*\\(\\s*([^\\)]*\\))\\s*;\\s*")

 (:SEQUENCE
  (:GREEDY-REPETITION 0 NIL :WHITESPACE-CHAR-CLASS)
  "prefix1"
  (:GREEDY-REPETITION 0 NIL :WHITESPACE-CHAR-CLASS)
  "prefix2"
  (:GREEDY-REPETITION 0 NIL :WHITESPACE-CHAR-CLASS)
  (:REGISTER
   (:SEQUENCE :WORD-CHAR-CLASS
    (:GREEDY-REPETITION 0 NIL
     (:CHAR-CLASS :WORD-CHAR-CLASS :DIGIT-CLASS #\_))))
  (:GREEDY-REPETITION 0 NIL :EVERYTHING)
  #\(
  (:GREEDY-REPETITION 0 NIL :WHITESPACE-CHAR-CLASS)
  (:REGISTER
   (:SEQUENCE
    (:GREEDY-REPETITION 0 NIL (:INVERTED-CHAR-CLASS #\)))
    #\)))
  (:GREEDY-REPETITION 0 NIL :WHITESPACE-CHAR-CLASS)
  #\;
  (:GREEDY-REPETITION 0 NIL :WHITESPACE-CHAR-CLASS))

ADD 2: Full source:

 ;; Requirements:
 ;;   cl-ppcre

 (defparameter *name-and-parameters-list*
   (cl-ppcre::create-scanner "\\s*prefix1\\s*prefix2\\s*(\\w[\\w\\d_]*)\\s*\\(\\s*([^\\)]*\\))\\s*;\\s*"))

 (defparameter *filename* "c:/pva/home/test.txt")

 (defun read-txt-without-comments (file-name)
   "Would epically fail in case the file format changes, because currently it expects the \"/*\" and \"*/\" sequences to be on the separate line."
   (let ((fstr (make-array '(0) :element-type 'base-char :fill-pointer 0 :adjustable t)))
     (with-output-to-string (s fstr)
       (let ((comment nil))
         (with-open-file (input-stream file-name :direction :input)
           (do ((line (read-line input-stream nil 'eof)
                      (read-line input-stream nil 'eof)))
               ((eql line 'eof))
             (multiple-value-bind (start-comment-from) (cl-ppcre:scan ".*/\\*" line)
               (multiple-value-bind (end-comment-from) (cl-ppcre:scan ".*\\*/" line)
                 (if start-comment-from (setf comment t))
                 (if (not comment) (format s "~A~%" line))
                 (if end-comment-from (setf comment nil))))))))
     fstr))

 (let* ((string (read-txt-without-comments "c:/pva/home/test.txt")))
   (multiple-value-bind (a b c d) (cl-ppcre::scan *name-and-parameters-list* string)
     (format t "~a ~a ~a ~a~%|~a|~%|~a|~%"
             a b c d
             (subseq string (svref c 0) (svref c 1))
             (subseq string (svref d 0) (svref d 1)))))

ADD 3: Full input:

 prefix1 prefix2 name1( type1 name1, type2 name2 );
 prefix1 prefix2 name2( type3 name1, type2 name2 );
1 answer

This works for me with a recent cl-ppcre, just as you expected:

 (cl-ppcre:register-groups-bind (name argument)
     ("\\s*prefix1\\s*prefix2\\s*(\\w[\\w\\d_]*).*\\(\\s*([^\\)]*\\))\\s*;\\s*"
      "prefix1 prefix2 name1( type1 name1, type2 name2 );"
      :sharedp t)
   (list name argument))

 ("name1" "type1 name1, type2 name2 )")
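The second register in that result still ends with the closing parenthesis, which may be part of the "unnecessary tail" mentioned in the question. One possible variation, shown here only as a sketch, moves the `\)` and the whitespace before it outside the register and makes the register non-greedy. The function name `parse-declaration` is hypothetical, and the snippet assumes cl-ppcre is already loaded.

```lisp
;; Assumes cl-ppcre is already loaded, e.g. with (ql:quickload :cl-ppcre).
;; PARSE-DECLARATION is a hypothetical helper, not part of cl-ppcre.
(defun parse-declaration (line)
  "Return (name arguments) for LINE, or NIL if LINE does not match."
  (cl-ppcre:register-groups-bind (name arguments)
      ;; The second register is non-greedy and stops before the
      ;; optional whitespace and the closing parenthesis, so neither
      ;; ends up in the captured text.
      ("\\s*prefix1\\s*prefix2\\s*(\\w[\\w\\d_]*)\\s*\\(\\s*([^)]*?)\\s*\\)\\s*;"
       line)
    (list name arguments)))

(parse-declaration "prefix1 prefix2 name1( type1 name1, type2 name2 );")
;; => ("name1" "type1 name1, type2 name2")
```

Whether the trailing parenthesis should be kept depends on what the caller does with the argument list, so treat this as one option and not a correction of the pattern.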

Perhaps show a little more code?
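Looking at the code under ADD 2, the reported output seems consistent with how the values of `cl-ppcre:scan` are indexed and not with the regular expression itself. `scan` returns four values: the match start, the match end, a vector of register starts, and a vector of register ends. Taking `(svref c 0)` to `(svref c 1)` therefore spans from the start of register 1 to the start of register 2, which would give "name1( ". A minimal sketch that pairs each start with its own end follows; `register-strings` is a hypothetical helper name, and cl-ppcre is assumed to be loaded.

```lisp
;; Assumes cl-ppcre is already loaded, e.g. with (ql:quickload :cl-ppcre).
;; REGISTER-STRINGS is a hypothetical helper, not part of cl-ppcre.
(defun register-strings (regex string)
  "Return the text of each register of the first match of REGEX in STRING."
  (multiple-value-bind (match-start match-end reg-starts reg-ends)
      (cl-ppcre:scan regex string)
    (declare (ignore match-end))
    (when match-start
      ;; Register i spans from (aref reg-starts i) to (aref reg-ends i).
      ;; A register that did not take part in the match has NIL entries.
      (loop for start across reg-starts
            for end across reg-ends
            collect (and start (subseq string start end))))))

(register-strings
 "\\s*prefix1\\s*prefix2\\s*(\\w[\\w\\d_]*)\\s*\\(\\s*([^\\)]*\\))\\s*;\\s*"
 " prefix1 prefix2 name1( type1 name1, type2 name2 );")
;; => ("name1" "type1 name1, type2 name2 )")
```

`cl-ppcre:scan-to-strings` returns the register substrings directly as its second value, which avoids the index bookkeeping altogether.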


Source: https://habr.com/ru/post/1485582/

