Do I need to close files that have no references to them?

As a complete beginner in programming, I am trying to understand the basic concepts of opening and closing files. One exercise I'm doing is creating a script that allows me to copy contents from one file to another.

    in_file = open(from_file)
    indata = in_file.read()

    out_file = open(to_file, 'w')
    out_file.write(indata)

    out_file.close()
    in_file.close()

I tried to shorten this code and came up with the following:

    indata = open(from_file).read()
    open(to_file, 'w').write(indata)

This works and looks a little more efficient to me. However, this is also where I get confused. I think I left out the references to the opened files; there was no need for the in_file and out_file variables. However, does this leave me with two open files that have nothing referring to them? How do I close them, or is it not necessary?

Any help that sheds some light on this topic is greatly appreciated.

+49
python file
Mar 16 '16 at 20:20
6 answers

You asked about the "basic concepts", so let's take it from the top: when you open a file, your program gains access to a system resource, that is, to something outside the program's own memory space. This is basically a bit of magic provided by the operating system (a system call, in Unix terminology). Hidden inside the file object is a reference to a "file descriptor", the actual OS resource associated with the open file. Closing the file tells the system to release this resource.

Because it is an OS resource, the number of files a process can keep open is limited. Long ago, the per-process limit was about 20 on Unix. Right now, my OS X box imposes a limit of 256 open files (although this is an imposed limit and can be raised). Other systems may set limits of a few thousand, or in the tens of thousands (per user, not per process in this case). When your program ends, all resources are automatically released. So if your program opens a few files, does something with them and exits, you can be sloppy and you will never know the difference. But if your program will be opening thousands of files, you will do well to release open files to avoid exceeding OS limits.
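To see the limit discussed above on your own machine, here is a minimal sketch. It assumes a Unix-like system, because the standard-library resource module is not available on Windows, and the numbers it prints will differ from system to system.

```python
import resource  # standard library, Unix-only

# Soft limit: the number of descriptors the process is held to right now.
# Hard limit: the ceiling up to which the soft limit may be raised.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)

print(f"soft limit on open files: {soft_limit}")
print(f"hard limit on open files: {hard_limit}")
```

A program that keeps opening files without closing them will eventually fail with an OSError once it reaches the soft limit.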

There is another benefit to closing files before your process exits: if you opened a file for writing, closing it will first "flush its output buffer". I/O libraries optimize disk use by collecting ("buffering") what you write out and saving it to disk in batches. If you write text to a file and immediately try to reopen and read it without first closing the output handle, you will find that not everything has been written out. Also, if your program is terminated too abruptly (by a signal, or occasionally even through a normal exit), the output might never be flushed.
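The buffering effect described above can be observed with a small experiment. This is only a sketch: the scratch file name is made up for the demonstration, and the exact moment at which buffered data reaches the disk depends on the buffer size and the Python implementation.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

out_file = open(path, "w")
out_file.write("hello")            # a small write stays in the in-memory buffer

with open(path) as reader:
    before_close = reader.read()   # typically empty: nothing has been flushed yet

out_file.close()                   # flushes the buffer, then releases the descriptor

with open(path) as reader:
    after_close = reader.read()    # the complete text is on disk now

print(repr(before_close), repr(after_close))
```

Calling out_file.flush() instead of close() would also push the data out while keeping the file open.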

There are already many other answers to the question of how to release files, so here is just a short list of approaches:

  • Explicitly with close(). (Note for Python newbies: don't forget the parentheses! My students love to write in_file.close, which does nothing.)

  • Recommended: Implicitly, opening files using the with statement. The close() method is called when the end of the with block is reached, even in case of abnormal termination (from an exception).

      with open("data.txt") as in_file:
          data = in_file.read()

  • Implicitly, by the reference manager or garbage collector, if your Python engine implements it. This is not recommended because it is not entirely portable; see the other answers for details. That is why the with statement was added to Python.

  • Implicitly, when your program ends. If a file is open for output, this runs the risk of the program exiting before everything has been flushed to disk.
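The pitfall mentioned in the first item of the list (writing close without parentheses) is easy to verify through the file object's closed attribute. A small sketch, using a made-up scratch file:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "parens_demo.txt")
with open(path, "w") as setup_file:
    setup_file.write("data")

in_file = open(path)

in_file.close                        # only looks up the method, never calls it
still_open = not in_file.closed      # True: the file is still open

in_file.close()                      # actually calls the method
now_closed = in_file.closed          # True: the file is closed

print(still_open, now_closed)
```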

+35
Mar 17 '16 at 14:16

The pythonic way to handle this is to use the with context manager:

    with open(from_file) as in_file, open(to_file, 'w') as out_file:
        indata = in_file.read()
        out_file.write(indata)

Used with files like this, with ensures that all the necessary cleanup is done for you, even if read() or write() raises an error.
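To check the claim about errors, the following sketch raises an exception inside the with block on purpose. The file name and the RuntimeError are invented for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

try:
    with open(path, "w") as out_file:
        out_file.write("partial")
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass  # the with statement has already closed the file at this point

closed_after_error = out_file.closed   # True: closed despite the exception

with open(path) as in_file:
    content = in_file.read()           # what was written before the error was flushed

print(closed_after_error, repr(content))
```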

+55
Mar 16 '16 at 20:25

The default Python interpreter, CPython, uses reference counting. This means that once there are no references left to an object, it gets garbage collected, that is, cleaned up.

In your case, doing

    open(to_file, 'w').write(indata)

will create a file object for to_file, but will not assign it a name, which means there is no reference to it. You cannot manipulate the object after this line.

CPython will detect this and clean up the object after it has been used. In the case of a file, this means closing it automatically. In principle, this is fine, and your program will not leak memory.

The "problem" is that this mechanism is an implementation detail of the CPython interpreter. The language standard explicitly gives no guarantee for it! If you use an alternative interpreter such as PyPy, automatic closing of the file may be delayed indefinitely. This includes other implicit actions, such as flushing writes on close.
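One way to make this implicit cleanup visible is to count ResourceWarning messages, which Python 3 can emit when a file object is destroyed without having been closed. The sketch below is an illustration only: the helper names are hypothetical, and whether and when the warning appears depends on the interpreter.

```python
import gc
import os
import tempfile
import warnings

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "warn_demo.txt")
with open(path, "w") as setup_file:
    setup_file.write("some text")

def read_unclosed():
    # The file object is never closed explicitly.
    return open(path).read()

def read_with():
    # The with statement closes the file when the block ends.
    with open(path) as in_file:
        return in_file.read()

def count_resource_warnings(func):
    # Record every warning raised while func runs and count ResourceWarnings.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
        gc.collect()  # nudge collectors that do not free objects immediately
    return sum(issubclass(w.category, ResourceWarning) for w in caught)

print("one-liner: ", count_resource_warnings(read_unclosed))
print("with block:", count_resource_warnings(read_with))
```

Running a script with python -W always (or -X dev) surfaces the same warnings without any extra code.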

This problem also applies to other resources, for example network sockets. It is good practice to always handle such external resources explicitly. Since Python 2.6, the with statement makes this elegant:

    with open(to_file, 'w') as out_file:
        out_file.write(indata)



TL;DR: it works, but please do not do this.

+33
Mar 16 '16 at 21:16

The answers so far are absolutely correct when working in Python. You should use the with open() context manager. It is a great built-in feature and shortcuts a common programming task (opening and closing a file).

However, since you are a beginner and will not have access to context managers and automatic reference counting for your entire career, I will address the question from a general programming perspective.

The first version of your code is perfectly fine. You open the file, save the reference, read from the file, and then close it. This is how a lot of code is written when the language does not provide a shortcut for the task. The only thing I would improve is to move close() to right after you open and read the file. Once you have opened and read the file, you have the contents in memory and no longer need the file to be open.

    in_file = open(from_file)
    indata = in_file.read()
    in_file.close()      # the contents are in memory; the input file can go

    out_file = open(to_file, 'w')
    out_file.write(indata)
    out_file.close()
+8
Mar 17 '16 at 13:36

It is good practice to use the with keyword when working with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised along the way. It is also much shorter than writing equivalent try-finally blocks:

    >>> with open('workfile', 'r') as f:
    ...     read_data = f.read()
    >>> f.closed
    True
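For comparison, this is roughly what the try-finally form mentioned above looks like when written out by hand. It is a sketch: the scratch file stands in for 'workfile' and is created first so that the example runs on its own.

```python
import os
import tempfile

# Hypothetical stand-in for 'workfile', created so the example is runnable.
path = os.path.join(tempfile.mkdtemp(), "workfile")
with open(path, "w") as setup_file:
    setup_file.write("example contents")

f = open(path, "r")
try:
    read_data = f.read()
finally:
    f.close()          # runs whether or not read() raised an exception

print(f.closed, repr(read_data))
```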
+7
Mar 16 '16 at 20:26

A safe way to open files without having to worry about forgetting to close them is as follows:

    with open(from_file, 'r') as in_file:
        in_data = in_file.read()

    with open(to_file, 'w') as out_file:
        out_file.write(in_data)
+5
Mar 16 '16 at 20:26


