Elasticsearch search returns different documents

Question

Some background on the Elasticsearch instance:

One node, on one machine.
The index in question holds 2.6 billion documents and is 1.23 TB in size.
The index is divided into 4 shards.
Heap size is set to 30 GB.
The server has 256 GB of RAM and 40 cores.
Elasticsearch (version 1.4.3) is the only thing running on this server.

I want to return all documents with a specific name. The name attribute is mapped as:

    "name": {
        "type": "string",
        "index": "not_analyzed"
    }

I have tried several kinds of search: filter, query_string, term. All give the same result. The current request looks like this:

    {   "query": {
            "query_string": {
                "default_field" : "name",
                "query": "test_run_435_tc"
            }
        },
        "size" : 10000000
    }
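
Since `name` is mapped as `not_analyzed`, an exact-match `term` query is one of the variants mentioned above. A minimal sketch of building that request body as a Python dict, which could then be passed as the `body` argument of elasticsearch-py's `search`. The helper name `build_name_query` and its default `size` are hypothetical and exist only for illustration:

```python
import json


def build_name_query(name, size=10):
    """Build a search body matching documents whose `name` equals `name`.

    A term query does not analyze its input, which fits the
    not_analyzed mapping of the field.
    """
    return {
        "query": {"term": {"name": name}},
        "size": size,
    }


if __name__ == "__main__":
    # size=0 asks only for the hit count, not the documents themselves.
    body = build_name_query("test_run_435_tc", size=0)
    print(json.dumps(body, sort_keys=True))
```

The dict is plain data, so it can be inspected or logged before it is sent to the cluster.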

The problem is that the request does not return the correct number of documents on the first try. I know that the index contains about 45,000 documents with the name "test_run_435_tc".

, 5000 . , . 3-4 .

elasticsearch-py .

, elasticsearch , .

elasticsearch ? elasticsearch - ? , .

, :

"": 10000000 , , .

With "size": 0, the response is:

 {u'_shards': {u'failed': 0, u'successful': 4, u'total': 4},
  u'hits': {u'hits': [], u'max_score': 0.0, u'total': 28754},
  u'timed_out': True,
  u'took': 130}

Running it again with "size": 0:

 {u'_shards': {u'failed': 0, u'successful': 4, u'total': 4},
  u'hits': {u'hits': [], u'max_score': 0.0, u'total': 39223},
  u'timed_out': True,
  u'took': 134}

, , "size": 0, .....? timeout = 100000 & search_type = count :

    {
        "took": 525,
        "timed_out": false,
        "_shards": {
            "total": 4,
            "successful": 4,
            "failed": 0
        },
        "hits": {
            "total": 49501,
            "max_score": 0,
            "hits": []
        }
    }
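
The three responses above differ in `timed_out`: the two with the lower totals report `True`, while the one with 49501 hits reports `false`. Elasticsearch returns whatever hits it has gathered when a search times out, so such a total is only partial. A small sketch of a check that treats a response as complete only when it did not time out and no shard failed. `is_complete` is a hypothetical helper name, and the dicts are copied from the responses above:

```python
def is_complete(response):
    """Return True if the search neither timed out nor had failed shards."""
    shards = response.get("_shards", {})
    timed_out = response.get("timed_out", False)
    return not timed_out and shards.get("failed", 0) == 0


# Response that hit the timeout (partial total).
partial = {
    "_shards": {"failed": 0, "successful": 4, "total": 4},
    "hits": {"hits": [], "max_score": 0.0, "total": 28754},
    "timed_out": True,
    "took": 130,
}

# Response obtained with a larger timeout.
full = {
    "took": 525,
    "timed_out": False,
    "_shards": {"total": 4, "successful": 4, "failed": 0},
    "hits": {"total": 49501, "max_score": 0, "hits": []},
}

if __name__ == "__main__":
    for response in (partial, full):
        print(response["hits"]["total"], is_complete(response))
```

A caller could retry or raise the timeout whenever `is_complete` returns `False`, instead of trusting `hits.total`.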

, 49501 "hits_total", !

search elasticsearch

ajgustafsson 24 . '15 11:55

Prabin Meitei · Answer 1 · 2015-04-24T15:39:19+0000

, . . python, , - ..

, ( search_type), .

@moliware .

, .

, . Search_type , .

, , 100000 . , .

, . , node 30gb . , . 32 - , java. 256 (humongous) ram, .

.
