Remove non-letter characters from the beginning and end of a string

I need to remove all non-letter characters from the beginning and the end of a string, but keep them if they appear between two letters.

For instance:

 '123foo456' --> 'foo'
 '2foo1c#BAR' --> 'foo1c#BAR'

I tried using re.sub(), but I could not come up with a working regex.

6 answers

Like this?

 re.sub(r'^[^a-zA-Z]*|[^a-zA-Z]*$', '', s)

where s is the input string.
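As a minimal sketch of how this substitution behaves on the two examples from the question, the pattern can be wrapped in a small function. The names EDGE_NON_LETTERS and strip_edges are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical names; the pattern itself is the one from the answer above.
# First alternative: a run of non-letters anchored at the start.
# Second alternative: a run of non-letters anchored at the end.
EDGE_NON_LETTERS = re.compile(r'^[^a-zA-Z]*|[^a-zA-Z]*$')

def strip_edges(s):
    """Remove characters other than ASCII letters from both ends of s."""
    return EDGE_NON_LETTERS.sub('', s)

print(strip_edges('123foo456'))   # foo
print(strip_edges('2foo1c#BAR'))  # foo1c#BAR
```

Non-letters between two letters are left alone because neither alternative can match there: a run in the middle is not at the start and does not reach the end.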


You can use str.strip for this:

 In [1]: import string
 In [4]: '123foo456'.strip(string.digits)
 Out[4]: 'foo'
 In [5]: '2foo1c#BAR'.strip(string.digits)
 Out[5]: 'foo1c#BAR'

As Matt points out in the comments (thanks, Matt), this only removes digits. To remove any non-letter character, first define what you mean by a non-letter:

 # Python 2 only: string.letters and the two-argument str.translate were removed in Python 3
 In [22]: allchars = string.maketrans('', '')
 In [23]: nonletter = allchars.translate(allchars, string.letters)

and then strip:

 In [18]: '2foo1c#BAR'.strip(nonletter)
 Out[18]: 'foo1c#BAR'
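
The string.maketrans / string.letters calls above exist only in Python 2. A rough Python 3 sketch of the same idea follows, assuming only ASCII input matters (the original table covered all 256 byte values); it reuses the name nonletter from the session above.

```python
import string

# Assumption: ASCII only. Collect every ASCII character that is not a letter.
nonletter = ''.join(
    chr(i) for i in range(128) if chr(i) not in string.ascii_letters
)

# str.strip removes any of the given characters from both ends only.
print('123foo456'.strip(nonletter))   # foo
print('2foo1c#BAR'.strip(nonletter))  # foo1c#BAR
```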

Working from your two examples, I was able to write a regular expression using Python's non-greedy syntax. It breaks the input into three parts: the leading non-letters, a middle part that starts and ends with a letter, and the trailing non-letters. Here is a test run:

 1:[123] 2:[foo] 3:[456]
 1:[2] 2:[foo1c#BAR] 3:[]

Here's the regex:

 ^([^A-Za-z]*)(.*?)([^A-Za-z]*)$ 

mo.group(2) is what you want, where mo is the match object returned for this pattern.
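
A short sketch of how the test run above could be reproduced with this regex. The variable name pattern is an assumption made for illustration; mo follows the naming used in the answer.

```python
import re

# Group 1: leading non-letters (greedy).
# Group 2: the part to keep (lazy, so it stops before trailing non-letters).
# Group 3: trailing non-letters (greedy).
pattern = re.compile(r'^([^A-Za-z]*)(.*?)([^A-Za-z]*)$')

for text in ('123foo456', '2foo1c#BAR'):
    mo = pattern.match(text)
    print('1:[%s] 2:[%s] 3:[%s]' % mo.groups())
    print(mo.group(2))
```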


For Unicode Compatibility:

 ^\PL+|\PL+$ 

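\PL means "not a letter", that is, any character outside the Unicode category L. Note that Python's built-in re module does not support \p / \P property classes; the third-party regex module does, where the class is written \P{L}.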
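
If only the standard library is available, one possible alternative is to skip regular expressions and scan inward from both ends with str.isalpha(), which is Unicode-aware in Python 3. This is a sketch that swaps in a different technique, not the regex from this answer; the function name strip_non_letters is hypothetical.

```python
def strip_non_letters(s):
    """Trim characters for which str.isalpha() is false from both ends of s."""
    start, end = 0, len(s)
    # Advance past leading non-letters.
    while start < end and not s[start].isalpha():
        start += 1
    # Back up past trailing non-letters.
    while end > start and not s[end - 1].isalpha():
        end -= 1
    return s[start:end]

print(strip_non_letters('123étude456'))  # étude
print(strip_non_letters('2foo1c#BAR'))   # foo1c#BAR
```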


Try the following:

 re.sub(r'^[^a-zA-Z]*(.*?)[^a-zA-Z]*$', r'\1', string)

The parentheses capture everything between the runs of non-letters at the beginning and end of the string. The ? makes .* lazy, so . does not also consume the trailing non-letters. The replacement then simply emits the captured group.
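
One detail worth checking in a call like this is how the replacement string is written. In an ordinary Python string literal, '\1' is an octal escape for the control character U+0001, so the backreference only reaches re.sub when the replacement is a raw string (or written as '\\1'). A small sketch, with pattern as an illustrative variable name:

```python
import re

pattern = r'^[^a-zA-Z]*(.*?)[^a-zA-Z]*$'

# Without the r prefix, '\1' is a single control character, not a backreference.
assert '\1' == '\x01'

# With a raw string, \1 is passed through and refers to capture group 1.
print(re.sub(pattern, r'\1', '123foo456'))   # foo
print(re.sub(pattern, r'\1', '2foo1c#BAR'))  # foo1c#BAR
```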


 result = re.sub(r'(.*?)([a-z].*[a-z])(.*)', r'\2', '23WERT#3T67', flags=re.IGNORECASE)


Source: https://habr.com/ru/post/1439435/

