Replacing the entire numeric value with a formatted string

I am trying to do the following:

Find all the numeric values in a string:

import re

input_string = "高露潔光感白輕悅薄荷牙膏100   79.80"

numbers = re.finditer(r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?', input_string)

for number in numbers:
    print("{}    start > {}, end > {}".format(number.group(), number.start(0), number.end(0)))

'''Output'''
>>100    start > 12, end > 15
>>79.80    start > 18, end > 23

Then I want to replace each integer and float value with a specific format:

INT_(number of digits) and FLT_(number of decimal places)

e.g. 100 -> INT_3 // 79.80 -> FLT_2

So the expected output looks like this:

"高露潔光感白輕悅薄荷牙膏INT_3   FLT_2"

But replacing a substring of a Python string seems awkward, and I can't achieve what I want.

So I tried building the result by concatenating slices around each match:

string[:number.start(0)] + "INT_%s"%len(number.group()) +.....

which looks stupid and, most importantly, I still can't get it to work.

Can someone give me some advice on this issue?

+4
3 answers

You can use re.sub with a function as the replacement argument:

import re
def repl(match):
    chunks = match.group(1).split(".")
    if len(chunks) == 2:
        return "FLT_{}".format(len(chunks[1]))
    else:
        return "INT_{}".format(len(chunks[0]))

input_string = "高露潔光感白輕悅薄荷牙膏100   79.80"
result = re.sub(r'[-+]?([0-9]*\.?[0-9]+)(?:[eE][-+]?[0-9]+)?',repl,input_string)
print(result)

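Notes:

  • The main part of the regex is wrapped in a capturing group (([0-9]*\.?[0-9]+)) so that it can be inspected inside repl.
  • In repl, group 1 is split on the dot. If that gives two chunks, the number is a float/double and the length of the fractional part is returned; else the length of the integer is returned.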
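To see how the callback behaves on a few other inputs, here is a small sketch. It reuses the same pattern and repl, wrapped in a helper I've called mask_numbers (that name is mine, not from the answer). Note that a leading sign and an exponent are consumed by the match but sit outside group 1, so they are not counted.

```python
import re

# Same pattern as above: only the mantissa is captured in group 1.
NUMBER = re.compile(r'[-+]?([0-9]*\.?[0-9]+)(?:[eE][-+]?[0-9]+)?')

def repl(match):
    chunks = match.group(1).split(".")
    if len(chunks) == 2:
        # There was a dot: report the number of decimal places.
        return "FLT_{}".format(len(chunks[1]))
    # No dot: report the number of digits.
    return "INT_{}".format(len(chunks[0]))

def mask_numbers(text):  # hypothetical helper name
    return NUMBER.sub(repl, text)

print(mask_numbers("高露潔光感白輕悅薄荷牙膏100   79.80"))
# 高露潔光感白輕悅薄荷牙膏INT_3   FLT_2
print(mask_numbers("-7 and .5 and 1.25e3"))
# INT_1 and FLT_1 and FLT_2
```

If you need the sign or exponent reflected in the output, you would have to capture them in their own groups.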
+4

Here is a variant that uses separate capture groups to tell ints and floats apart:

import re

def repl(m):
    if m.group(1) is None:  # no "digits." prefix, so this is an int
        return "INT_%i" % len(m.group(2))
    else:  # float: group 2 holds the digits after the dot
        return "FLT_%i" % len(m.group(2))

input_string = "高露潔光感白輕悅薄荷牙膏100   79.80"

numbers = re.sub(r'[-+]?([0-9]*\.)?([0-9]+)([eE][-+]?[0-9]+)?',repl,input_string)        

print(numbers)
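  • group 0 is the whole match (the float or int)
  • group 1 is the digits before the dot together with the dot, if there is one, else None
  • group 2 is the digits after the dot, or all the digits if there is no dot
  • group 3 is the exponent part if there is one, else None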

If you want the matched text as a Python number, you can convert it like this:

def parse(m):
    s=m.group(0)
    if m.group(1) is not None or m.group(3) is not None: # if there is a dot or an exponential part it must be a float
        return float(s)
    else:
        return int(s)
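
As a usage sketch, parse can be combined with re.finditer and the same three-group pattern to pull the numbers out of the string as Python values. The helper name extract_numbers is my own, only for illustration.

```python
import re

# Same pattern as in this answer: group 1 = "digits.", group 3 = exponent.
NUMBER = re.compile(r'[-+]?([0-9]*\.)?([0-9]+)([eE][-+]?[0-9]+)?')

def parse(m):
    s = m.group(0)
    # A dot or an exponent part means the text must be read as a float.
    if m.group(1) is not None or m.group(3) is not None:
        return float(s)
    return int(s)

def extract_numbers(text):  # hypothetical helper name
    return [parse(m) for m in NUMBER.finditer(text)]

print(extract_numbers("高露潔光感白輕悅薄荷牙膏100   79.80"))  # [100, 79.8]
print(extract_numbers("-3 or 2e3"))  # [-3, 2000.0]
```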
+2

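Another option is to keep your finditer approach and splice the replacements into the string yourself, going from the last match to the first: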

import re
input_string = u"高露潔光感白輕悅薄荷牙膏100   79.80"

numbers = re.finditer(r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?',input_string)

s = input_string
for m in list(numbers)[::-1]:
    num = m.group(0)
    if '.' in num:
        s = "%sFLT_%s%s" % (s[:m.start(0)],str(len(num)-num.index('.')-1),s[m.end(0):])
    else:
        s = "%sINT_%s%s" % (s[:m.start(0)],str(len(num)), s[m.end(0):])
print(s)

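The matches are processed in reverse order so that each replacement leaves the start and end offsets of the earlier matches unchanged.

A match that contains a dot is treated as a float, and the number of characters after the dot is counted. Anything else is treated as an int, and the length of the whole match is counted.

Strings are immutable in Python, so each step builds a new string from slices of the previous one instead of modifying it in place.

The same result can be obtained with re.sub() and a replacement function.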
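
If you would rather walk the matches front to back, one way (a sketch, not the code from this answer) is to collect the untouched slices and the replacement tokens in a list and join them at the end. The function name mask_numbers is made up for the example. Like the loop above, it measures the whole match, so a leading sign would be counted as a digit for ints.

```python
import re

NUMBER = re.compile(r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?')

def mask_numbers(text):  # hypothetical helper name
    pieces = []
    last = 0  # end offset of the previous match
    for m in NUMBER.finditer(text):
        num = m.group(0)
        # Keep the text between the previous match and this one.
        pieces.append(text[last:m.start()])
        if '.' in num:
            pieces.append("FLT_%d" % (len(num) - num.index('.') - 1))
        else:
            pieces.append("INT_%d" % len(num))
        last = m.end()
    # Keep whatever follows the final match.
    pieces.append(text[last:])
    return "".join(pieces)

print(mask_numbers(u"高露潔光感白輕悅薄荷牙膏100   79.80"))
# 高露潔光感白輕悅薄荷牙膏INT_3   FLT_2
```

Because offsets are always taken from the original text, there is no need to reverse the matches.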

+1

Source: https://habr.com/ru/post/1652321/

