sed -i touches files that it does not change

Question

Someone on our server ran sed -i 's/$var >> $var2/$var > $var2/' * to change appends to overwrites in some bash scripts in a shared directory. No big deal: it was tested first with grep, which returned the expected result that only his files would be affected.

He ran the command, and now 1200 of the 1400 files in the folder have a new modification date, yet as far as we can tell only a small handful of files were actually changed.

Why would sed "touch" a file that it does not change?
Why would it "touch" only some of the files and not all of them?
Did it actually change something (perhaps some trailing whitespace, or something totally unexpected because of the $ in the sed regex)?

+5

ksh sed

Jnevill Nov 21 '14 at 21:54


2 answers

I use the following workaround: loop over the files individually, use grep to check whether the file contains the string, and only then run sed. Not pretty, but it works ...

 for i in *; do grep -q mytext "$i" && sed -i -e 's/mytext/replacement/g' "$i"; done
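A variation on the same idea, as a rough sketch: let grep -l pick out the files that contain the text and hand only those to sed. This assumes the GNU versions of grep, xargs, touch and date (for -Z, -0/-r, -d and -r respectively); mytext and replacement are the placeholders from above, and the scratch directory and the files a.txt and b.txt are made up for the demo.

```shell
# Demo in a scratch directory (hypothetical files a.txt and b.txt).
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'has mytext here\n' > a.txt
printf 'nothing to see\n' > b.txt
touch -d '2000-06-15 12:00:00' a.txt b.txt   # GNU touch: give both files an old mtime

# Only files that grep reports as matching are handed to sed.
# -l lists matching file names, -Z ends each name with NUL,
# -0 reads NUL-separated names, -r skips sed when the list is empty.
grep -lZ -- 'mytext' * | xargs -0 -r sed -i -e 's/mytext/replacement/g'

a_content=$(cat a.txt)
b_year=$(date -r b.txt +%Y)   # b.txt was never passed to sed, so its mtime is still old
echo "a.txt: $a_content"
echo "b.txt last modified in: $b_year"
```

Because the names are NUL-separated, file names containing spaces should survive the pipeline.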

+5

centic Jul 24 '15 at 11:36


John1024 · Accepted Answer · 2014-11-22T08:00:10+0000

When GNU sed successfully edits the file in place, its timestamp is updated. To understand why, let's see how in-place editing is done:

1. A temporary file is created to store the output.
2. sed processes the input file, sending the output to the temporary file.
3. If a backup file extension was specified, the input file is renamed to the backup file.
4. Whether a backup is created or not, the temporary output is moved (rename) to the input file.
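The steps above can be imitated by hand. The following is only a sketch of the idea, not what sed literally executes: the scratch directory, the sedXXXXXX template and the variable names are made up, and the sketch makes no attempt to carry over file permissions or ownership.

```shell
# Scratch directory and input file (hypothetical names).
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'this is\na test.\n' > inputfile
inode_before=$(ls -i inputfile | awk '{print $1}')

# 1. Create a temporary file in the same directory as the input.
tmp=$(mktemp ./sedXXXXXX)
# 2. Run sed without -i, sending the output to the temporary file.
sed 's/XXX/YYY/' inputfile > "$tmp"
# 3. (Optional) keep a backup first, as -i.bak would:
#    mv inputfile inputfile.bak
# 4. Rename the temporary file over the input file.
mv -f "$tmp" inputfile

inode_after=$(ls -i inputfile | awk '{print $1}')
content_after=$(cat inputfile)
echo "inode before: $inode_before, after: $inode_after"
```

The inode number changes even though the content is identical, because inputfile is now a different file that happens to carry the same name.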

GNU sed does not keep track of whether any changes were made to the file. Whatever is in the temporary output file is moved to the input file via rename.

There is a nice advantage to this procedure: POSIX requires that rename be atomic. Therefore, the input file is never in a mangled state: it is either the original file or the modified file, never something in between.

As a result of this procedure, any file that is processed successfully gets a new timestamp.

Example

Consider this inputfile :

 $ cat inputfile
 this is
 a test.

Now, under the supervision of strace, let's run sed -i on it in a way that is guaranteed to cause no changes:

 $ strace sed -i 's/XXX/YYY/' inputfile

The abridged output looks like this:

 execve("/bin/sed", ["sed", "-i", "s/XXX/YYY/", "inputfile"], [/* 55 vars */]) = 0
 [...snip...]
 open("inputfile", O_RDONLY) = 4
 [...snip...]
 open("./sediWWqLI", O_RDWR|O_CREAT|O_EXCL, 0600) = 6
 [...snip...]
 read(4, "this is\na test.\n", 4096) = 16
 write(6, "this is\n", 8) = 8
 write(6, "a test.\n", 8) = 8
 read(4, "", 4096) = 0
 [...snip...]
 close(4) = 0
 [...snip...]
 close(6) = 0
 [...snip...]
 rename("./sediWWqLI", "inputfile") = 0

As you can see, sed opens the input file inputfile on file descriptor 4. It then creates a temporary file, ./sediWWqLI, on file descriptor 6 to hold the output. It reads from the input file and writes the data unchanged to the output file. When this is done, rename is called to overwrite inputfile, which changes its timestamp.
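If strace is not at hand, the same effect can be observed by comparing the file's checksum and inode number before and after. This is a small sketch that assumes GNU sed (BSD sed wants an explicit argument after -i); the scratch directory and variable names are made up.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'this is\na test.\n' > inputfile

sum_before=$(cksum < inputfile)
inode_before=$(ls -i inputfile | awk '{print $1}')

sed -i 's/XXX/YYY/' inputfile   # GNU sed; the pattern matches nothing

sum_after=$(cksum < inputfile)
inode_after=$(ls -i inputfile | awk '{print $1}')

echo "checksum: $sum_before -> $sum_after"
echo "inode:    $inode_before -> $inode_after"
```

The checksum should stay the same while the inode number differs, which matches the rename seen in the trace.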

GNU `sed` source code

The relevant source code is in the file execute.c in the sed directory of the source. From version 4.2.1:

  ck_fclose (input->fp);
  ck_fclose (output_file.fp);
  if (strcmp(in_place_extension, "*") != 0)
    {
      char *backup_file_name = get_backup_file_name(target_name);
      ck_rename (target_name, backup_file_name, input->out_file_name);
      free (backup_file_name);
    }

  ck_rename (input->out_file_name, target_name, input->out_file_name);
  free (input->out_file_name);

ck_rename is a wrapper around the stdio rename function. The source for ck_rename is in sed/utils.c.

As you can see, no flag is kept to record whether the file was actually modified. rename is called regardless.
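For comparison, here is a sketch of what "only replace the file if something changed" could look like as a shell helper. sed_if_changed is a hypothetical function, not part of sed; it compares the output with the input using cmp and renames only when they differ. As written, it does not preserve the original file's permissions or ownership.

```shell
# Hypothetical helper: apply a sed script to one file, but replace the
# file only if the output differs from the input.
sed_if_changed() {
    script=$1
    file=$2
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if sed "$script" "$file" > "$tmp" && ! cmp -s "$file" "$tmp"; then
        mv -f "$tmp" "$file"      # content changed: rename over the original
    else
        rm -f "$tmp"              # unchanged (or sed failed): leave the file alone
    fi
}

demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'this is\na test.\n' > inputfile
inode_before=$(ls -i inputfile | awk '{print $1}')

sed_if_changed 's/XXX/YYY/' inputfile      # no match, so nothing is replaced
inode_unchanged=$(ls -i inputfile | awk '{print $1}')

sed_if_changed 's/test/trial/' inputfile   # match, so the file is replaced
content=$(cat inputfile)
echo "$content"
```

The trade-off is an extra read of both files for the comparison, in exchange for leaving untouched files (and their timestamps) alone.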

Files whose timestamps have not been updated

As for the 200 or so of the 1,400 files whose timestamps did not change, that would mean that sed somehow failed to process those files. One possibility is a permissions issue.
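One way to look for such files is to check each entry that the * glob would have matched. This is only a sketch of some plausible causes, not a diagnosis: it flags entries that are not regular files (GNU sed refuses to edit directories in place) and files the current user cannot read, and it checks that the directory is writable, since the temporary file is created there. The scratch directory, script1.sh and subdir are made up for the demo.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'text\n' > script1.sh
mkdir subdir

# Report entries that an in-place edit would likely skip or fail on.
report=$(
    [ -w . ] || echo "directory is not writable: no temporary file can be created here"
    for f in *; do
        if [ ! -f "$f" ]; then
            echo "$f: not a regular file"
        elif [ ! -r "$f" ]; then
            echo "$f: not readable by the current user"
        fi
    done
)
echo "$report"
```

Note that the checks reflect the user running them, so they are most useful when run as the same user who ran sed.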

`sed -i` and symbolic links

As mklement0 noted, applying sed -i to a symbolic link produces a surprising result. sed -i does not update the file that the symlink points to. Instead, sed -i overwrites the symlink with a new regular file.

This is a consequence of the rename call that sed makes. As described in man 2 rename:

If newpath refers to a symbolic link, the link will be overwritten.

mklement0 reports that this also applies to (BSD) sed on Mac OSX 10.10.
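A quick way to see this, and one way around it with GNU sed, is sketched below. It assumes GNU sed, whose --follow-symlinks option makes in-place editing operate on the file the link points to; the scratch directory and the names target, link1 and link2 are made up.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'this is\na test.\n' > target
ln -s target link1
ln -s target link2

sed -i 's/XXX/YYY/' link1                     # plain -i: link1 itself is replaced
sed -i --follow-symlinks 's/XXX/YYY/' link2   # GNU option: edits target, keeps link2

if [ -L link1 ]; then link1_kind=symlink; else link1_kind=regular; fi
if [ -L link2 ]; then link2_kind=symlink; else link2_kind=regular; fi
echo "link1 is now: $link1_kind"
echo "link2 is now: $link2_kind"
```

With GNU sed, link1 should end up as a regular file while link2 remains a symlink.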
