Regex that can split a string whose opening and closing brackets are the same character

I know that a regex can be written to match pairs of opening and closing bracket characters:

e.g. a.[b.[cd]].e yields the values a, [b.[cd]] and e

How can I write a regex that recognizes opening and closing brackets when both are the same character?

e.g. a.|b.|cd||.e should yield the values a, |b.|cd|| and e

Update

Thanks for all the comments. Let me give some context for this problem. I basically want to emulate JavaScript property-access syntax:

    a.hello               is  a["hello"] or a.hello
    a.|hello|             is  a[hello]
    a.|b.c.|d.e||.f.|g|   is  a[b.c[d.e]].f[g]

So what I would like to do is split the string into:

    [`a`, `|b.c.|d.e||`, `f`, `|g|`]

and then recurse into the pieces that are quoted with pipes.

I have an implementation of the syntax without pipes here:

https://github.com/zcaudate/purnam

I would really prefer not to use a parser, mainly because I don't know how to write one and I don't think the problem justifies the extra complexity. But if a regex can't cut it, I may have to.

1 answer

Thanks to @m.buettner and @rafal, here is my code in Clojure:

There are two modes, normal mode and pipe mode, following m.buettner's suggestion:

Helpers:

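    ;; Append s to arr unless s is empty.
    (defn conj-if-str [arr s]
      (if (empty? s) arr (conj arr s)))

    ;; Bind the value of `bound` to `var`, then dispatch on it with `case`.
    (defmacro case-let [[var bound] & body]
      `(let [~var ~bound]
         (case ~var ~@body)))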

Pipe Mode:

    (declare split-dotted) ;; normal mode, defined below

    (defn split-dotted-pipe ;; pipe mode
      ([output current ss] (split-dotted-pipe output current ss 0))
      ([output current ss level]
       (case-let [ch (first ss)]
         nil (throw (Exception. "Cannot have an unpaired pipe"))
         \|  (case level
               ;; level 0: this pipe closes the outermost section, back to normal mode
               0 (trampoline split-dotted (conj output (str current "|")) "" (next ss))
               (recur output (str current "|") (next ss) (dec level)))
         \.  (case-let [nch (second ss)]
               nil (throw (Exception. "Incomplete dotted symbol"))
               ;; ".|" opens a nested section
               \|  (recur output (str current ".|") (nnext ss) (inc level))
               (recur output (str current "." nch) (nnext ss) level))
         (recur output (str current ch) (next ss) level))))

Normal mode:

    (defn split-dotted
      ([ss] (split-dotted [] "" ss))
      ([output current ss]
       (case-let [ch (first ss)]
         nil (conj-if-str output current)
         \.  (case-let [nch (second ss)]
               nil (throw (Exception. "Cannot have . at the end of a dotted symbol"))
               ;; ".|" starts a pipe-quoted section: switch to pipe mode
               \|  (trampoline split-dotted-pipe (conj-if-str output current) "|" (nnext ss))
               (recur (conj-if-str output current) (str nch) (nnext ss)))
         \|  (throw (Exception. "Cannot have | during split mode"))
         (recur output (str current ch) (next ss)))))
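The rule both modes rely on is that a pipe directly preceded by a dot opens a section, while any other pipe closes one. That is what makes a single delimiter character workable here. A minimal sketch of just that rule, using a hypothetical helper `classify-pipes` that is not part of the code above and that ignores edge cases such as consecutive dots:

```clojure
;; Hypothetical helper, for illustration only.
(defn classify-pipes
  "Returns a vector of [index kind] pairs, one per pipe in s.
  A pipe directly preceded by a dot is tagged :open, any other pipe :close."
  [s]
  (vec
   (keep-indexed
    (fn [i ch]
      (when (= ch \|)
        [i (if (and (pos? i) (= (nth s (dec i)) \.)) :open :close)]))
    s)))

(classify-pipes "a.|b.|c||.|d|")
;; => [[2 :open] [5 :open] [7 :close] [8 :close] [10 :open] [12 :close]]
```

Once every pipe can be labelled this way, the nesting level is simply a counter, which is the job of `level` in pipe mode.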

Tests:

    (fact "split-dotted"
      (js/split-dotted "a") => ["a"]
      (js/split-dotted "a.b") => ["a" "b"]
      (js/split-dotted "a.b.c") => ["a" "b" "c"]
      (js/split-dotted "a.||") => ["a" "||"]
      (js/split-dotted "a.|b|.c") => ["a" "|b|" "c"]
      (js/split-dotted "a.|b|.|c|") => ["a" "|b|" "|c|"]
      (js/split-dotted "a.|bc|.|d|") => ["a" "|bc|" "|d|"]
      (js/split-dotted "a.|b.|c||.|d|") => ["a" "|b.|c||" "|d|"]
      (js/split-dotted "a.|b.|c.d.|e|||.|d|") => ["a" "|b.|c.d.|e|||" "|d|"])

    (fact "split-dotted exceptions"
      (js/split-dotted "|a|") => (throws Exception)
      (js/split-dotted "a.") => (throws Exception)
      (js/split-dotted "a.|||") => (throws Exception)
      (js/split-dotted "a.|b.||") => (throws Exception))
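The question also asks for the pipe-quoted pieces to be processed recursively once the string is split. The following is only a sketch of that step under stated assumptions. To stay runnable on its own it uses a small loop-based splitter, `split-top-level`, rather than the functions above. `split-top-level`, `quoted?` and `parse-dotted` are hypothetical names, and the error checks are looser than in the original code (a trailing dot is accepted, for example).

```clojure
;; All names below are hypothetical and exist for illustration only.
(defn split-top-level
  "Splits s on dots that sit outside any pipe-quoted section."
  [s]
  (loop [out [] cur "" cs (seq s) depth 0]
    (let [ch (first cs) nch (second cs)]
      (cond
        ;; end of input: emit the last piece, or fail on an open section
        (nil? ch)
        (if (zero? depth)
          (if (empty? cur) out (conj out cur))
          (throw (ex-info "Unpaired pipe" {:input s})))

        ;; ".|" opens a section; at depth 0 it also ends the current piece
        (and (= ch \.) (= nch \|))
        (if (zero? depth)
          (recur (if (empty? cur) out (conj out cur)) "|" (nnext cs) 1)
          (recur out (str cur ".|") (nnext cs) (inc depth)))

        ;; any other pipe closes the innermost open section
        (= ch \|)
        (if (zero? depth)
          (throw (ex-info "Pipe outside a quoted section" {:input s}))
          (recur out (str cur "|") (next cs) (dec depth)))

        ;; a dot at depth 0 separates two pieces
        (and (= ch \.) (zero? depth))
        (recur (if (empty? cur) out (conj out cur)) "" (next cs) depth)

        :else
        (recur out (str cur ch) (next cs) depth)))))

(defn quoted?
  "True when part starts and ends with a pipe."
  [part]
  (and (> (count part) 1) (= (first part) \|) (= (last part) \|)))

(defn parse-dotted
  "Returns a nested vector; each pipe-quoted piece becomes a sub-vector."
  [s]
  (mapv (fn [part]
          (if (quoted? part)
            (parse-dotted (subs part 1 (dec (count part))))
            part))
        (split-top-level s)))

(parse-dotted "a.|b.|c||.|d|")
;; => ["a" ["b" ["c"]] ["d"]]
```

Each sub-vector corresponds to one bracketed lookup, so the result mirrors the shape of `a[b[c]][d]`.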

Source: https://habr.com/ru/post/1480224/

