Using Unicode Regular Expression Word Character in Python

The following matches in IDLE, but does not match when run from a method in a module file:

import re
re.search('\\bשלום\\b','שלום עולם',re.UNICODE)

while the following matches in both cases:

import re
re.search('שלום','שלום עולם',re.UNICODE)

(Please note that Stack Overflow mistakenly displays the first and second arguments in the line above in swapped order, because Hebrew is a right-to-left language.)

How can I get the first snippet to match when it is run from a .py file?

Update: What I should have written about the first snippet is that it matches in IDLE, but does not match when run in the Eclipse console with PyDev.

1 answer

It seems to work for me when I use unicode strings:

# -*- coding: utf-8 -*-

import re
match = re.search(u'\\bשלום\\b', u'שלום עולם', re.U)

Demo: http://codepad.org/xWz5cZj5
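
For comparison, here is a minimal sketch of why the string type matters. It assumes Python 3, where a plain string literal is already a unicode string; the answer above appears to target Python 2. The bytes pattern is only my stand-in for a non-unicode string, not something taken from the question, and the variable names are illustrative.

```python
import re

text = 'שלום עולם'
word = 'שלום'

# Text (str) pattern: \b follows Unicode word-character rules,
# so Hebrew letters count as word characters.
str_match = re.search(r'\b' + word + r'\b', text)

# Bytes pattern: \b follows ASCII-only rules, so the UTF-8 bytes of
# Hebrew letters are not word characters and no boundary is found.
bytes_match = re.search(rb'\b' + word.encode('utf-8') + rb'\b',
                        text.encode('utf-8'))

print(str_match is not None)
print(bytes_match is not None)
```

If the text pattern matches and the bytes pattern does not, the difference comes from how \b classifies characters, not from the Hebrew text itself.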


Source: https://habr.com/ru/post/1750144/

