Got a great response from one of the Apple developers. I'll be migrating my existing model to Core Data over the next few weeks. (StackOverflow got confused by some of the lists/formatting, but it's still very readable for the most part.)
I'll begin my answer with a word on HFS Plus journalling. Since journalling was introduced (in Mac OS X 10.2.x), the guarantee provided by the Mac OS X file system has been that, regardless of kernel panics, power failures, and so on, a file system operation will lead to one of two results:
o either the operation will be replayed from the journal, in which case the operation will have completed successfully
o or the operation will be rolled back, in which case it will be as if the operation had never been performed
This guarantee has two critical limitations:
o It applies to individual file system operations (creates, deletes, moves, etc.), not to groups of operations.
o It applies only to the logical structure of the file system, not to the data inside the files.
In short, the purpose of the journal is to prevent gross corruption of the file system, not damage to the data of any specific file.
With this in mind, let's look at the behavior of -[NSData writeToFile:options:error:]. Its behavior can be quite complex, but in the typical case it is quite simple. One way to learn about it is to write some code and watch its effect on the file system. For example, here is some test code:
    #include <assert.h>
    #include <sys/stat.h>

    - (IBAction)testAction:(id)sender
    {
        BOOL        success;
        NSData *    d;
        struct stat sb;

        d = [@"Hello Cruel World!" dataUsingEncoding:NSUTF8StringEncoding];
        assert(d != nil);

        (void) stat("/foo", &sb);
        success = [d writeToFile:@"/tmp/WriteTest.txt" options:NSDataWritingAtomic error:NULL];
        (void) stat("/foo", &sb);
        assert(success);
    }
The two stat calls are just markers; they make it easy to see which file system operations are caused by -writeToFile:options:error:.
You can see the behavior of the file system using:
$ sudo fs_usage -f filesys -w WriteTest
where "WriteTest" is the name of my test program.
Here is an excerpt from the result of fs_usage:
    14:33:10.317  stat       [  2]          /foo
    14:33:10.317  lstat64                   private/tmp/WriteTest.txt
    14:33:10.317  open       F=5 (RWC__E)   private/tmp/.dat2f56.000
    14:33:10.317  write      F=5 B=0x12
    14:33:10.317  fsync      F=5
    14:33:10.317  close      F=5
    14:33:10.318  rename                    private/tmp/.dat2f56.000
    14:33:10.318  chmod                     private/tmp/WriteTest.txt
    14:33:10.318  stat       [  2]          /foo
You can clearly see the stat calls that bracket the -writeToFile:options:error: call, which means that everything between them was caused by -writeToFile:options:error:.
What is it doing? Well, it's actually quite simple:
1. It creates, writes to, fsyncs, and closes a temporary file containing the data.
2. It renames the temporary file on top of the file you are writing.
3. It resets the permissions on the destination file.
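To make that sequence concrete, here is a rough C sketch of the same pattern; the temporary path template, the 0644 permissions, and the SafeSave name are illustrative assumptions, not what Foundation actually uses internally:

    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* A rough sketch of the create / write / fsync / close / rename /
       chmod sequence seen in the trace above. Short writes are treated
       as failures for simplicity. Returns 0 on success. */
    static int SafeSave(const char *path, const void *buf, size_t len)
    {
        char tmp[] = "/tmp/.SafeSave.XXXXXX";
        int  fd = mkstemp(tmp);                       /* 1. create the temporary file */
        if (fd < 0) return -1;
        if (write(fd, buf, len) != (ssize_t) len      /* ...write the data... */
                || fsync(fd) < 0) {                   /* ...push it toward the disk... */
            close(fd);
            unlink(tmp);
            return -1;
        }
        if (close(fd) < 0 || rename(tmp, path) < 0) { /* 2. rename over the real file */
            unlink(tmp);
            return -1;
        }
        return chmod(path, 0644);                     /* 3. reset the permissions */
    }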
All in all, this is a pretty standard UNIX-style safe save. But the question is, how does this interact with data integrity? The critical thing to realize is that fsync does not guarantee that all data has made it to disk before it returns. This problem has a long and complex history, but the summary is that fsync is called too often, from too many performance-sensitive places, for it to make that guarantee. This means that all of the file corruption issues you are seeing are possible, as described below:
o "iProcrastinate_Bad_2.ipr" and "iProcrastinate_Bad_3.ipr" simply contain incorrect data. This can happen as follows:
1. The application creates the temporary file.
2. The application writes the data to it; within the kernel this:
   a. allocates a set of blocks on disk
   b. attaches them to the file, extending its length
   c. copies the written data into the buffer cache
3. The application fsyncs and closes the file. The kernel responds by scheduling the data blocks to be written out as soon as possible.
4. The application renames the temporary file on top of the real file.
5. The system kernel panics.
When the system reboots, the metadata changes from steps 1, 2a, 2b, 3, and 4 are replayed from the journal, but the data copied in step 2c may never have made it to disk, which means that you have a valid file containing invalid data.
o "iProcrastinate_Bad_1.ipr" is just a small change above. If you open the file with a hex editor, you will find that it looks good, except for the data range with an offset of 0x6000..0x61ff, which seems to contain data that is completely unrelated to your application. It is noteworthy that the length of this data, 0x200 bytes, is exactly one block of the disk. Thus, it seems that the kernel was able to write all user data to disk, with the exception of this one block.
So where does this leave you? It is unlikely that -[NSData writeToFile:options:error:] will ever become more reliable than it is; as I mentioned earlier, changes like that tend to hurt overall system performance. This means that your application will have to deal with this problem itself.
In this regard, there are three common ways to harden your application:
a. F_FULLFSYNC. You can force a file's data to persistent storage by calling fcntl with the F_FULLFSYNC selector. You could use this in your application by replacing -[NSData writeToFile:options:error:] with your own code that calls fcntl with F_FULLFSYNC instead of plain fsync (see the sketch after this item).
The most obvious drawback of this approach is that F_FULLFSYNC is very slow.
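In the SafeSave sketch above, that would mean replacing the plain fsync call with something like the following; the fallback branch is my own assumption, for file systems that don't support F_FULLFSYNC:

    /* Force the data all the way to the disk, not just to the drive's
       track cache. If the file system doesn't support F_FULLFSYNC,
       fall back to plain fsync, which is a weaker guarantee. */
    if (fcntl(fd, F_FULLFSYNC) < 0) {
        if (fsync(fd) < 0) {
            /* handle the error as before */
        }
    }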
b. Journalling. Another option is to adopt a more robust file format, one that supports journalling itself. A good example of this is SQLite, which can be used directly or through Core Data (a sketch of the direct approach follows this item).
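For illustration, here is a minimal sketch of the direct SQLite approach; the tasks table and the SaveTask function are hypothetical, not part of any existing API. SQLite's own journal makes each transaction atomic across crashes and power failures, so a reader can never observe a half-written save:

    #import <Foundation/Foundation.h>
    #include <sqlite3.h>

    static BOOL SaveTask(const char *dbPath, const char *title)
    {
        sqlite3      *db = NULL;
        sqlite3_stmt *stmt = NULL;
        BOOL          ok = NO;

        if (sqlite3_open(dbPath, &db) != SQLITE_OK) goto done;
        if (sqlite3_exec(db,
                "CREATE TABLE IF NOT EXISTS tasks (title TEXT NOT NULL)",
                NULL, NULL, NULL) != SQLITE_OK) goto done;
        if (sqlite3_prepare_v2(db, "INSERT INTO tasks (title) VALUES (?)",
                -1, &stmt, NULL) != SQLITE_OK) goto done;
        sqlite3_bind_text(stmt, 1, title, -1, SQLITE_TRANSIENT);
        ok = (sqlite3_step(stmt) == SQLITE_DONE);  /* committed atomically */

    done:
        sqlite3_finalize(stmt);  /* both calls are safe no-ops on NULL */
        sqlite3_close(db);
        return ok;
    }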
c. Safer save. Finally, you can implement a more robust save mechanism that keeps a backup file. Before calling -[NSData writeToFile:options:error:] to write the file, rename the previous file to another name and keep it around just in case. If, when opening the main file, you find that it is damaged, you automatically fall back to the backup (see the sketch after this item).
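Here is a minimal sketch of that; the .bak suffix and the DataLooksValid check are hypothetical, and a real integrity check would be specific to your file format:

    #import <Foundation/Foundation.h>

    /* Hypothetical integrity check; real code would validate your file
       format (magic number, checksum, and so on). */
    static BOOL DataLooksValid(NSData *d)
    {
        return [d length] != 0;
    }

    static BOOL SaveWithBackup(NSData *data, NSString *path)
    {
        NSFileManager *fm = [NSFileManager defaultManager];
        NSString *backupPath = [path stringByAppendingPathExtension:@"bak"];

        if ([fm fileExistsAtPath:path]) {
            (void) [fm removeItemAtPath:backupPath error:NULL];  /* drop any stale backup */
            if (![fm moveItemAtPath:path toPath:backupPath error:NULL]) {
                return NO;
            }
        }
        return [data writeToFile:path options:NSDataWritingAtomic error:NULL];
    }

    static NSData *LoadWithFallback(NSString *path)
    {
        NSData *d = [NSData dataWithContentsOfFile:path];
        if (d == nil || !DataLooksValid(d)) {
            /* Main file is missing or corrupt; fall back to the backup. */
            d = [NSData dataWithContentsOfFile:
                    [path stringByAppendingPathExtension:@"bak"]];
        }
        return d;
    }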
Of these approaches, my preference is b, and especially b with Core Data, because Core Data offers many benefits beyond data integrity. However, for a quick fix, option c is probably your best choice.
Let me know if you have any questions about this.