
What does [^.]* mean in a regular expression?

I am trying to get 482.75 from the following text: <span id="yfs_l84_aapl">482.75</span>

I used regex = '<span id="yfs_l84_[^.]*">(.+?)</span>' and it worked.

But I don't understand why [^.]* would match aapl here. My understanding is that . means any character except a newline, and ^ means negation. So [^.] should be a newline, and [^.]* should be any number of newlines. However, this theory contradicts what actually happens.

Any help is appreciated and thanks in advance.


The Python code I use is:

    import urllib
    import re

    htmlfile = urllib.urlopen("http://finance.yahoo.com/q?s=AAPL&ql=0")
    htmltext = htmlfile.read()

    regex = '<span id="yfs_l84_[^.]*">(.+?)</span>'
    pattern = re.compile(regex)
    price = re.findall(pattern, htmltext)

    print "the price of aapl is", price[0]
2 answers

Inside [], . means just a literal dot. And the leading ^ means "anything but ...".

So [^.]* matches zero or more non-dots.
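
A quick way to check this is to run the pattern on the span from the question. The sketch below is only an illustration, written for Python 3 (the question's code is Python 2); the extra capture group around [^.]* and the variable names are my additions, there only to show what that part of the pattern consumes.

```python
import re

html = '<span id="yfs_l84_aapl">482.75</span>'

# Inside [...] the dot is literal, so [^.] means "any character except '.'".
# The first group is an addition for illustration: it exposes what [^.]* matched.
pattern = re.compile(r'<span id="yfs_l84_([^.]*)">(.+?)</span>')
ticker, price = pattern.search(html).groups()
print(ticker, price)

# [^.] rejects a literal dot but happily accepts a newline.
dot_rejected = re.fullmatch(r'[^.]', '.') is None
newline_accepted = re.fullmatch(r'[^.]', '\n') is not None
print(dot_rejected, newline_accepted)
```

Since "aapl" contains no dot, [^.]* can consume it; the regex engine then backtracks until the following "> fits.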


Within a character class, a dot means literally a dot.

A character class specification uses its own syntax and special characters (- for a range, ^ for negation). The rest of the pattern syntax does not apply inside it.
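
To make the difference concrete, here is a small Python 3 sketch (the sample strings and variable names are made up for illustration) showing which characters keep a special meaning inside [...] and which become plain literals.

```python
import re

# '.', '*', '+' and '?' are ordinary characters inside a character class.
literal_hits = re.findall(r'[.*+?]', 'a.b*c+d?')

# '-' between two characters defines a range.
range_hits = re.findall(r'[a-c]', 'abcdef')

# '^' negates the class only when it is the first character...
negated_hits = re.findall(r'[^a-c]', 'abcdef')

# ...anywhere else it is just a literal caret.
caret_hits = re.findall(r'[a^]', 'a^b')

print(literal_hits, range_hits, negated_hits, caret_hits)
```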


Source: https://habr.com/ru/post/954924/

