Interesting behavior when there is "LF" in a Python script: a bug in the interpreter?

Question

When a Python script contains U+000A (the so-called "LF"), some funny things happen.

On Linux, I tried several cases (see below) with both Python2 and Python3, and got the following result:

'LF' causes the tokens that follow it on the same visible line to be ignored, but the next line is still executed.

Is this covered by the Python specification, or is it a bug in the interpreter?

To me, this is at least a problem in the parser, since "LF" should not carry the semantics described above; for daily use, though, it is not a big problem.

Since "LF" is not printable, I have attached a screenshot in which ^@ represents "LF" (U+000A). For those interested in trying it, I have provided a gist (you need git to clone it).

Update, as per the comments:

import test_2 works as described in the python2 REPL, but raises "ValueError: source code string cannot contain null bytes" in python3, while running the script directly behaves as described.

python

Hongxu Chen Jun 16 '17 at 15:49

1 answer

Dietrich Epp · Accepted Answer · 2017-06-16T16:14:12+0000

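The LF character is U+000A. What you have there is NUL, U+0000, which is displayed as ^@ in vim, emacs and less. LF would be displayed as ^J, except that LF is never shown that way because it is the line terminator. It appears that Python treats the NUL character as a line terminator, or something similar, which probably reflects the fact that NUL is the string terminator in C.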
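To check which control character a given caret sequence stands for, the mapping can be sketched in a few lines. The helper name caret_notation is made up for this illustration; it relies only on the common convention that caret notation flips bit 6 of the ASCII code.

```python
def caret_notation(code_point):
    """Return the caret notation (e.g. ^@, ^J) for an ASCII control character."""
    if not (0 <= code_point < 0x20 or code_point == 0x7F):
        raise ValueError("not an ASCII control character")
    # Caret notation flips bit 6: 0x00 -> '@', 0x0A -> 'J', 0x7F -> '?'
    return "^" + chr(code_point ^ 0x40)


print(caret_notation(0x00))  # NUL, U+0000
print(caret_notation(0x0A))  # LF, U+000A
```

So a ^@ in the screenshot points at U+0000, not U+000A.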

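As far as I can tell, this is a bug in the interpreter, or at least a discrepancy with Python the language. If you ask Python itself,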

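# The parser module exists in Python 2 and in Python 3.9 or earlier; it was removed in Python 3.10.
import parser
# Try to parse an expression that has a NUL between two string literals.
parser.st2list(parser.expr('"hello"\0"goodbye"'))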

This gives an error:

TypeError: expr() argument 1 must be str without null characters, not str

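So no, a string containing NUL characters is not a valid Python expression, at least as far as Python's own parser interface is concerned.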
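On interpreters where the parser module is not available, a similar check can be sketched with the built-in compile(). This is a stand-in, not the call used above, and the helper name is_rejected is made up for this illustration. The exception type has varied between Python versions, so the sketch catches several.

```python
def is_rejected(source):
    """Return True if compile() refuses the source string as an expression."""
    try:
        compile(source, "<string>", "eval")
    except (ValueError, SyntaxError, TypeError):
        # Which exception is raised for an embedded NUL depends on the Python version.
        return True
    return False


print(is_rejected('"hello"\0"goodbye"'))  # embedded NUL between the literals
print(is_rejected('"hello" "goodbye"'))   # plain adjacent string literals
```

The first call should report a rejection, while the second is ordinary implicit string concatenation and compiles.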

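However, the parser for files seems to be sloppier. The import statement may reject files that contain NUL characters, but running such a file directly does not. As far as I can tell, Python reads source files line by line using fgets() (see tokenizer.c:1022), which produces NUL-terminated strings. With NUL-terminated strings, anything after an embedded NUL on a line is cut off, while the following line is read normally.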
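The effect of treating each line as a C string can be imitated in pure Python. This is only a simulation of NUL-terminated string semantics, not the tokenizer's actual code, and the function name simulate_c_string_lines is made up for this sketch.

```python
def simulate_c_string_lines(data):
    """Split bytes on LF, then cut each line at its first NUL, as C string code would."""
    visible = []
    for line in data.split(b"\n"):
        # A C string ends at the first NUL, so the rest of the line is never seen.
        visible.append(line.split(b"\0", 1)[0])
    return visible


source = b"print('a')\0 print('ignored')\nprint('b')\n"
for line in simulate_c_string_lines(source):
    print(line)
```

Under this model the text after the NUL on the first line disappears and the second line survives intact, which matches the behavior described in the question.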
