CouchDB filtered replication: amending doc_ids after the first full replication

npm, the Node.js package manager, uses CouchDB to store package metadata and package archives in a CouchDB instance at http://registry.npmjs.org/registry. I use the following replication document (CouchDB 1.1.0) to replicate a subset of the registry to my corporate CouchDB:

{ "_id": "fetch-npm-registry", "doc_ids": [ "coffee-script", "nodeunit" ], "source": "http://couchdb.mycompany.com:5984/registry", "target": "registry" }

[BTW, the CouchApp that handles the registry is at https://github.com/isaacs/npmjs.org (which also has full installation instructions).]

If I want to add another dependency to one of my packages, my naive thought was that I could just change the list of doc_ids (say, ["coffee-script", "nodeunit", "npm"]) and start the replication again.

However, this does not work. The replication completes immediately, and the package that I wanted to add to the replication (in this case "npm") is missing.
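One way to confirm which packages are actually missing on the target is to POST the wanted ids to the target's `_all_docs` endpoint and look for rows that report an error. The following is a minimal Python sketch under stated assumptions: the local CouchDB at `http://localhost:5984`, the helper names `missing_doc_ids` and `fetch_all_docs`, and the revision strings in the sample response are all hypothetical and only serve as illustration.

```python
import json
import urllib.request


def missing_doc_ids(all_docs_response, wanted_ids):
    """Return the wanted ids that are absent (or deleted) in an _all_docs response."""
    present = set()
    for row in all_docs_response.get("rows", []):
        # Rows for unknown keys carry an "error" field instead of an "id".
        if "error" in row:
            continue
        # Deleted documents are reported with value.deleted set to true.
        if (row.get("value") or {}).get("deleted"):
            continue
        present.add(row["key"])
    return [doc_id for doc_id in wanted_ids if doc_id not in present]


def fetch_all_docs(base_url, db, keys):
    """POST the keys to /{db}/_all_docs and return the decoded JSON body.

    Not called below; base_url is an assumption about your local setup.
    """
    req = urllib.request.Request(
        f"{base_url}/{db}/_all_docs",
        data=json.dumps({"keys": keys}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Made-up sample of what the target might answer after the second replication.
sample_response = {
    "total_rows": 2,
    "offset": 0,
    "rows": [
        {"id": "coffee-script", "key": "coffee-script", "value": {"rev": "1-abc"}},
        {"id": "nodeunit", "key": "nodeunit", "value": {"rev": "1-def"}},
        {"key": "npm", "error": "not_found"},
    ],
}

wanted = ["coffee-script", "nodeunit", "npm"]
print(missing_doc_ids(sample_response, wanted))
```

Against a live server, `fetch_all_docs("http://localhost:5984", "registry", wanted)` would supply the response instead of the hard-coded sample.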

[The only workaround I know of is to delete the target database, replicate again, and, because I also use this local registry to publish my own packages, re-publish the local packages. Sigh.]
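The delete-and-replicate part of that workaround could be scripted roughly as follows. This is only a sketch: the helper names `rebuild_steps` and `send` and the base URL `http://localhost:5984` are assumptions, the server may require admin credentials that are not shown here, and re-publishing the local packages is not covered at all.

```python
import json
import urllib.request


def rebuild_steps(base_url, target_db, source_url, doc_ids):
    """Return the HTTP requests (method, url, body) for a delete-and-replicate rebuild."""
    replicate_body = {
        "source": source_url,
        "target": target_db,
        "doc_ids": list(doc_ids),
        "create_target": True,  # recreate the database dropped in step 1
    }
    return [
        # Step 1: drop the target database. Destructive: local packages are lost.
        ("DELETE", f"{base_url}/{target_db}", None),
        # Step 2: run a fresh one-shot replication with the amended doc_ids.
        ("POST", f"{base_url}/_replicate", replicate_body),
    ]


def send(method, url, body=None):
    """Send one request and return the decoded JSON response (not called below)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


steps = rebuild_steps(
    "http://localhost:5984",  # assumed address of the corporate CouchDB
    "registry",
    "http://couchdb.mycompany.com:5984/registry",
    ["coffee-script", "nodeunit", "npm"],
)
for method, url, body in steps:
    print(method, url, json.dumps(body))
```

Running the steps for real would mean calling `send(method, url, body)` for each entry, in order.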


Amendment 11/18/2011

Here's what I think is happening (I'm no expert on CouchDB internals at all, but maybe there is some truth to it):

After the first successful replication, CouchDB saves the last (highest?) sequence ID of the last document it replicated in a hidden document in the database (I once knew how to access these; pointers are welcome). Then, when I change the doc_ids, this cached information about the last successful replication (the sequence ID) is not invalidated or cleared. So when it is told to replicate again against the same database, it compares the sequence IDs and decides that everything is already up to date.
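As a pointer for inspecting that hidden document: CouchDB keeps replication checkpoints in non-replicated `_local/<replication id>` documents, which can be read with a plain GET. The sketch below assumes you already know the replication id (for example, from the `_replication_id` field that CouchDB 1.1 writes into documents in the `_replicator` database, if that is how the replication was started). The helper names, the base URL, and the exact checkpoint field names (`source_last_seq`, `history`, `recorded_seq`, `docs_written`) are assumptions and should be checked against a real checkpoint document.

```python
import json
import urllib.parse
import urllib.request


def checkpoint_url(base_url, db, replication_id):
    """Build the URL of the _local checkpoint document for a replication id."""
    doc_id = urllib.parse.quote(replication_id, safe="")
    return f"{base_url}/{db}/_local/{doc_id}"


def summarize_checkpoint(doc):
    """Pick out the fields that show where the replicator thinks it left off."""
    # Assumption: the most recent session is the first entry in "history".
    history = doc.get("history") or [{}]
    latest = history[0]
    return {
        "source_last_seq": doc.get("source_last_seq"),
        "recorded_seq": latest.get("recorded_seq"),
        "docs_written": latest.get("docs_written"),
    }


def fetch_checkpoint(base_url, db, replication_id):
    """GET the checkpoint document and return it as a dict (not called below)."""
    with urllib.request.urlopen(checkpoint_url(base_url, db, replication_id)) as resp:
        return json.load(resp)


# Made-up checkpoint, only to show the shape that summarize_checkpoint expects.
example_doc = {
    "source_last_seq": 42,
    "history": [{"recorded_seq": 42, "docs_written": 2}],
}
print(checkpoint_url("http://localhost:5984", "registry", "abc123"))
print(summarize_checkpoint(example_doc))
```

If the recorded sequence on the target already matches the source's current update sequence, that would be consistent with the replication finishing immediately without picking up the newly added id.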

2 answers

1) http://registry.npmjs.org/registry is not the registry database, but http://registry.npmjs.org is.

2)

{ "_id": "fetch-npm-registry", "doc_ids": [ "coffee-script", "nodeunit" ], "source": "http://couchdb.mycompany.com:5984/registry", "target": "registry" }

Are you sure you are copying from http://couchdb.mycompany.com:5984/registry and not http://registry.npmjs.org ?


I tried it with Couchbase Single Server 2.0 Developer Preview 5, and there it works. My curl command was the following (it takes quite a few minutes to finish):

 curl -X POST 'http://localhost:5984/_replicate' -H 'Content-Type: application/json' -d '{"doc_ids": [ "coffee-script", "nodeunit", "npm" ], "source": "http://registry.npmjs.org:5984/registry", "target": "registry", "create_target": true}'

That preview is based on some version of the Apache CouchDB trunk.

Could you try it with Apache CouchDB 1.1.1? I remember that there was a replicator bug with empty IDs (and there is such a document in the npm repository) that has been fixed.

Cheers, Volker


Source: https://habr.com/ru/post/1381540/

