"integer required" when opening a file with open() as utf-8?

I have a file that I am trying to open in Python with the following line:

f = open("C:/data/lastfm-dataset-360k/test_data.tsv", "r", "utf-8") 

Calling this gives me an error

TypeError: integer required

I deleted all the other code except this one line and still get the error. What have I done wrong, and how can I open the file correctly?

+6
5 answers

From the documentation for open():

open(name[, mode[, buffering]])

[...]

The optional buffering argument specifies the file's desired buffer size: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size. A negative buffering means to use the system default, which is usually line buffered for tty devices and fully buffered for other files. If omitted, the system default is used.

It seems you are trying to pass open() a string describing the file's encoding as the third argument. Don't do that: the third positional argument is buffering, and it must be an integer.
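
To make the point above concrete, here is a minimal sketch of how the third positional argument is interpreted. The temporary file and its contents are made up for illustration and stand in for the asker's test_data.tsv; the exact wording of the TypeError message varies between Python versions.

```python
import os
import tempfile

# Hypothetical sample file standing in for the asker's test_data.tsv.
path = os.path.join(tempfile.mkdtemp(), "sample.tsv")
with open(path, "w") as f:
    f.write("artist\tplays\n")

# A string in the third positional slot is taken as `buffering`,
# so open() rejects it with a TypeError.
try:
    open(path, "r", "utf-8")
    positional_error = None
except TypeError as exc:
    positional_error = exc

# An integer in that slot is accepted: 1 requests line buffering.
with open(path, "r", 1) as f:
    first_line = f.readline()
```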

+11

You are using the wrong open.

 >>> help(open)
 Help on built-in function open in module __builtin__:
 open(...)
     open(name[, mode[, buffering]]) -> file object
     Open a file using the file() type, returns a file object.  This is the
     preferred way to open a file.  See file.__doc__ for further information.

As you can see, it expects the buffering parameter to be an integer.

What you might want is codecs.open:

 open(filename, mode='rb', encoding=None, errors='strict', buffering=1) 
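
As a rough illustration of the codecs.open suggestion, the sketch below writes and then reads a small UTF-8 file. The path and the sample row are assumptions made up for the example, not the asker's data. On Python 3 the built-in open() accepts encoding directly, so codecs.open is mainly of interest on Python 2.

```python
import codecs
import os
import tempfile

# Hypothetical sample file; the real dataset path is not used here.
path = os.path.join(tempfile.mkdtemp(), "sample.tsv")

# Write one tab-separated row containing a non-ASCII character.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(u"Bj\u00f6rk\t42\n")

# Read it back; each line is decoded from UTF-8 into a unicode string.
with codecs.open(path, "r", encoding="utf-8") as f:
    rows = [line.rstrip(u"\n").split(u"\t") for line in f]
```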
+7

From the reference documentation (this is the Python 3 signature):

 open(...)
     open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True) -> file object

You need encoding='utf-8'; Python thinks you're passing an argument for buffering.
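
A short sketch of the keyword form, assuming Python 3 (on Python 2, io.open has the same signature). The file name and contents are invented for the example.

```python
import os
import tempfile

# Hypothetical sample file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.tsv")

with open(path, "w", encoding="utf-8") as f:
    f.write("Sigur R\u00f3s\t7\n")

# Passing encoding by keyword leaves buffering at its default of -1.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()
```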

+2

The last parameter of open() is the buffer size, not the file's encoding.

File streams are more or less encoding-agnostic (except for newline translation on files that are not opened in binary mode), so you must handle the encoding elsewhere. For example, when you get data from read(), you can interpret it as utf-8 using the decode method.
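
The read-then-decode approach described above might look like the following sketch. The file is opened in binary mode so that no newline translation or implicit decoding takes place; the path and contents are placeholders.

```python
import os
import tempfile

# Hypothetical sample file written as raw UTF-8 bytes.
path = os.path.join(tempfile.mkdtemp(), "sample.tsv")
with open(path, "wb") as f:
    f.write(u"Mot\u00f6rhead\t3\n".encode("utf-8"))

# Read the raw bytes, then decode them explicitly.
with open(path, "rb") as f:
    raw = f.read()
text = raw.decode("utf-8")
```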

+1

This resolved my issue, i.e. providing the encoding (utf-8) when opening the file:

 with open('tomorrow.txt', mode='w', encoding='UTF-8', errors='strict', buffering=1) as file:
     file.write(result)
0

Source: https://habr.com/ru/post/912195/
