CouchDB uses an append-only file format. The code never, ever calls fseek(3). Any truncated fragment of a .couch file that starts from the beginning is a valid database file. (CouchDB scans backward from the end to find its "header".)
The cost of this architecture is that it records a lot of duplicate data every time you make a change. Basically, Couch writes your new data to the end of the file, then writes all the metadata updates needed to include that data in the B-tree, and then writes a new header to commit it all permanently.
Thus, you get a lot of duplicate metadata (internal B-tree nodes, etc.), not to mention old document data, building up in the .couch file. Once again, this is the price paid for a bulletproof technique that never rewrites any data.
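To make the write pattern concrete, here is a minimal Python sketch of the idea: new data at the end, then the updated metadata, then a new header. The line-based format and the names `save_doc`, `load_doc` and `read_header` are made up for illustration; this is not CouchDB's real on-disk layout, and a flat index stands in for the B-tree.

```python
import json
import os

# Toy format (an assumption for illustration, NOT CouchDB's real layout):
# one JSON record per line, "D" + document or "H" + header. The header is a
# full {doc_id: byte_offset} index, so every save writes it out again.

def _append(path, kind, payload):
    # Mode "ab" only adds bytes at the end; existing bytes are never rewritten.
    with open(path, "ab") as f:
        f.write(kind + json.dumps(payload).encode("utf-8") + b"\n")

def read_header(path):
    """Scan back from the end for the newest complete header line."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as f:
        data = f.read()
    # Only newline-terminated lines count; a torn tail is ignored.
    complete = data[: data.rfind(b"\n") + 1]
    for line in reversed(complete.splitlines()):
        if line.startswith(b"H"):
            return json.loads(line[1:])
    return {}

def save_doc(path, doc_id, doc):
    index = read_header(path)
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    _append(path, b"D", doc)    # 1. new data goes at the end
    index[doc_id] = offset      # 2. metadata now points at the new data
    _append(path, b"H", index)  # 3. a new header commits the change

def load_doc(path, doc_id):
    offset = read_header(path)[doc_id]
    with open(path, "rb") as f:
        data = f.read()
    line = data[offset:].split(b"\n", 1)[0]
    return json.loads(line[1:])
```

Saving the same document twice leaves both versions and both headers in the file, which is exactly the duplication described above. If the file is cut off in the middle of the last header, `read_header` simply lands on the previous one and the database is still readable, just one change older.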
Compaction scans only the live data in the old .couch file and writes only that to a new .couch file. The B-trees are balanced, and the old documents are gone. It is beautiful and clean.
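Compaction can be sketched the same way. This is a toy under stated assumptions, not CouchDB's actual compactor: the file is assumed to hold one JSON record per line of the form `{"id": ..., "doc": ...}`, with later lines superseding earlier ones, and `compact` is a hypothetical name.

```python
import json

def compact(old_path, new_path):
    """Copy only the newest version of each document into a fresh file."""
    live = {}
    with open(old_path, "r", encoding="utf-8") as old:
        for line in old:
            record = json.loads(line)
            # A later line for the same id replaces the earlier one.
            live[record["id"]] = record["doc"]
    # The old file is only read; the live data goes into a brand-new file.
    with open(new_path, "w", encoding="utf-8") as new:
        for doc_id, doc in live.items():
            new.write(json.dumps({"id": doc_id, "doc": doc}) + "\n")
    return live
```

The old file is never modified, so it stays valid the whole time; once the new file is complete, it can take the old one's place.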