Regex replacement in Python

Consider a Python snippet:

import re
str = 'that that kitty is cute'

# Anchor at beginning of string
rgexp_start = r'^(.*) \1'
print(re.sub(rgexp_start, r'\1', str))    

# Do NOT anchor at beginning of string
rgexp = r'(.*) \1'
print(re.sub(rgexp, r'\1', str))   

This prints:

that kitty is cute
thatkittyiscute

Why does the second regular expression remove all spaces? As an additional question, consider a piece of JavaScript code:

var str = 'that that kitty is cute';
var rgexp_start = /^(.*) \1/;
alert(str.replace(rgexp_start, '$1'));

var rgexp = /(.*) \1/;
alert(str.replace(rgexp, '$1'));

This shows the same result twice:

that kitty is cute

Why does JavaScript behave differently from Python when handling the same regular expression?

+4
2 answers

JavaScript behaves differently because you did not set the global flag (g) on the JavaScript regex. Python's re.sub replaces all matches by default.

If you use the same regular expression with the g flag:

var rgexp = /(.*) \1/g;
console.log(str.replace(rgexp, '$1'));

Then it will print:

thatkittyiscute

which is the same as the Python behavior.
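The same comparison can be made from the Python side. This is a minimal sketch, assuming the `count` argument of `re.sub` is used to imitate a JavaScript regex without the g flag; the variable names are illustrative and not from the question.

```python
import re

text = 'that that kitty is cute'
pattern = r'(.*) \1'

# Default (count=0): every non-overlapping match is replaced,
# which corresponds to JavaScript's replace() with the g flag.
all_replaced = re.sub(pattern, r'\1', text)

# count=1: only the first match is replaced,
# which corresponds to JavaScript's replace() without the g flag.
first_only = re.sub(pattern, r'\1', text, count=1)

print(all_replaced)  # thatkittyiscute
print(first_only)    # that kitty is cute
```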

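By the way, if you use this regex instead:

(\S+) \1

then the result is:

that kitty is cute

because \S+ requires at least one non-whitespace character, so the group can never match an empty string.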
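A quick way to check the `(\S+) \1` variant in Python. This is only a sketch against the sample string from the question; the names `text` and `deduped` are illustrative.

```python
import re

text = 'that that kitty is cute'

# \S+ needs at least one non-whitespace character, so the captured
# group cannot be empty and a lone space can no longer match.
deduped = re.sub(r'(\S+) \1', r'\1', text)

print(deduped)  # that kitty is cute
```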

+2

In Python, re.sub replaces every non-overlapping match, not only the first one.

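With r'^(.*) \1', the pattern is anchored to the start of the string, so it can match only once. The engine backtracks until the group captures 'that', so the pattern effectively becomes '^that that', which is replaced with 'that'.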

In[]: 'that that kitty is cute'

'^that that' -> 'that'

Out[]: 'that kitty is cute'
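The single replacement for the anchored pattern can be confirmed with `re.subn`, which returns the new string together with the number of substitutions made. This is a small sketch; the names `result` and `n_subs` are illustrative.

```python
import re

text = 'that that kitty is cute'

# Without re.MULTILINE, ^ matches only at the very start of the string,
# so the anchored pattern can succeed at most once.
result, n_subs = re.subn(r'^(.*) \1', r'\1', text)

print(result)  # that kitty is cute
print(n_subs)  # 1
```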

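With r'(.*) \1', note that .* can also match 0 characters. The first match is again 'that that', but the search then continues through the rest of the string. When the group captures the empty string '', the pattern reduces to a single space, and that happens 3 more times. Each ' ' (an empty group, a space, and an empty backreference) is replaced with ''.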

In[]: 'that that kitty is cute'

'that that' -> 'that'
' '         -> ''
' '         -> ''
' '         -> ''

Out[]: 'thatkittyiscute'
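The four replacements listed above can be observed directly with `re.finditer`, which yields each non-overlapping match in the order `re.sub` would process it. This is a sketch for inspection only; the `matches` list and its tuple layout are illustrative.

```python
import re

text = 'that that kitty is cute'

# Collect (span, whole match, captured group) for every match of the
# unanchored pattern, in the order re.sub would replace them.
matches = [
    (m.span(), m.group(0), m.group(1))
    for m in re.finditer(r'(.*) \1', text)
]

for span, whole, captured in matches:
    print(span, repr(whole), repr(captured))
# (0, 9) 'that that' 'that'
# (9, 10) ' ' ''
# (15, 16) ' ' ''
# (18, 19) ' ' ''
```

Only the first match has a non-empty group. The other three are single spaces with an empty capture, which is why every remaining space disappears.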

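As for the difference between Python and JS, as anubhava explained, JS replaces only the first match by default; with the g flag, it behaves the same way as Python.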

+3

Source: https://habr.com/ru/post/1681547/

