How to parse simple inline markup (e.g. *bold*) in Python?

How can I implement a parser (in Python) for a subset of wikitext that styles text, namely:

*bold*, /italics/, _underline_ 

I am converting it to LaTeX, so the conversion goes from:

Hello, *world*! Let /go/.

to

Hello \textbf{world}! Let \textit{go}.

There is nothing specific to LaTeX about the conversion, though. In particular, nested cases such as "*bold /italics/ whatami*" => "\textbf{bold \textit{italics} whatami}" should be handled.

I have looked at existing markup libraries, but they (a) are not quite the wiki language I would like, and (b) seem like overkill for this problem.

I have considered reverse-engineering Creoleparser, but I would like to hear what others suggest before I take on that effort.

Thanks!


A simple approach with regular expressions:

>>> import re
>>> text = "Hello, *world*! Let /go/."
>>> # Double the backslash in the replacement; a bare "\t" would be read as a tab.
>>> text = re.sub(r"\*([^*]*)\*", r"\\textbf{\1}", text)
>>> text = re.sub(r"/([^/]*)/", r"\\textit{\1}", text)
>>> print(text)
Hello, \textbf{world}! Let \textit{go}.
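
The same idea can be extended to cover _underline_ and the nested case from the question. What follows is only a minimal sketch: the names MARKUP and to_latex are made up for illustration, and mapping _underline_ to \underline is my assumption, since the question does not name a LaTeX command for it.

```python
import re

# Hypothetical marker table. The \underline mapping is an assumption.
MARKUP = {
    "*": "textbf",
    "/": "textit",
    "_": "underline",
}


def to_latex(text):
    """Replace *bold*, /italics/ and _underline_ spans with LaTeX commands."""
    for marker, command in MARKUP.items():
        delim = re.escape(marker)
        # A span is: marker, a run of characters other than the marker, marker.
        pattern = delim + "([^" + delim + "]*)" + delim
        # A function replacement inserts the backslash literally, so "\t"
        # in "\textbf" is never interpreted as a tab.
        text = re.sub(
            pattern,
            lambda m, command=command: "\\" + command + "{" + m.group(1) + "}",
            text,
        )
    return text


print(to_latex("Hello, *world*! Let /go/."))  # Hello, \textbf{world}! Let \textit{go}.
print(to_latex("*bold /italics/ whatami*"))   # \textbf{bold \textit{italics} whatami}
```

Nesting works here only because each marker type is rewritten in its own pass. The sketch does not check that spans are properly nested, and a literal slash, asterisk or underscore in ordinary text (a URL or a file path, for example) would be taken for markup, so a real parser would need some escaping rule.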

Source: https://habr.com/ru/post/1702829/

