I want to replace an index with zero downtime, as described in the ES documentation.
I do it like this:
- Create a new index, my_index_v2, and populate it with the new data.
- Then swap the alias over to the new index in a single atomic operation by running the following request:
POST /_aliases
{
  "actions": [
    { "remove": { "index": "*", "alias": "my_index" }},
    { "add": { "index": "my_index_v2", "alias": "my_index" }}
  ]
}
This usually works as expected, but every so often it fails with a 404 response. The error message names an index that has nothing to do with this alias:
{ "error": { "root_cause": ... (same) "type": "index_not_found_exception", "reason": "no such index", "resource.type": "index_or_alias", "resource.id": "my_unrelated_index_v13", "index": "my_unrelated_index_v13" }, "status": 404 }
- Then, and only if the alias swap succeeded, delete the now-unused indexes that were associated with this alias and only this alias.
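For reference, the body of the alias swap request from the steps above can be built programmatically. This is only a minimal sketch in Python using the standard library; the function name build_alias_swap and its parameters are made up for illustration and are not part of any Elasticsearch client. It reproduces the request exactly as I send it, including the "*" wildcard in the remove action.

```python
import json


def build_alias_swap(alias, new_index, old_pattern="*"):
    """Return the body for POST /_aliases that moves `alias` to `new_index`.

    Hypothetical helper: `old_pattern` is the index pattern the alias is
    removed from; "*" matches the request shown in the question.
    """
    return {
        "actions": [
            # Detach the alias from every index matching the pattern.
            {"remove": {"index": old_pattern, "alias": alias}},
            # Attach the alias to the freshly built index.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = build_alias_swap("my_index", "my_index_v2")
# Serialize to the JSON that would be sent as the request body.
payload = json.dumps(body)
print(payload)
```

Both actions travel in one request, which is what makes the swap atomic from the point of view of anything querying my_index.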
The entire operation runs periodically, every few minutes. Operations like the one described can run at the same time in the cluster, on other aliases / indexes. The error occurs randomly, once every few hours.
Is there a reason these operations would interfere with each other? What's happening?
EDIT: Refined the DELETE step at the end.