I have a Solr index of about 5 million documents (about 8 GB on disk) running on Solr 4.7.0. I need to use grouping in Solr, but I find it too slow. Here is the grouping configuration:
group=on
group.facet=on
group.field=workId
group.ngroups=on
The machine has plenty of memory: 24 GB in total, of which 4 GB is allocated to Solr itself. Requests typically take about 1200 ms with grouping, compared to 90 ms when grouping is turned off.
I came across a plugin called CollapsingQParserPlugin, which uses a filter query to remove all but one document from each group.
fq={!collapse field=workId}
It is designed for indexes with many unique groups; I have about 3.8 million. This approach is much faster, at about 120 ms. It would be a great solution for me, except for one thing. Since it filters out the other members of each group, only the facets of the representative document are counted. For example, if I have the following three documents:
"docs": [
  {
    "id": "1",
    "workId": "abc",
    "type": "book"
  },
  {
    "id": "2",
    "workId": "abc",
    "type": "ebook"
  },
  {
    "id": "3",
    "workId": "abc",
    "type": "ebook"
  }
]
After collapsing, only one document is shown in the results. Because the other two documents are filtered out, the facet counts look like
"type": ["book":1]
instead of
"type": ["book":1, "ebook":1]
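To make the difference concrete, here is a minimal Python sketch of the two counting behaviours as I understand them. It is not Solr code and does not call any Solr API; the function names `collapsed_facets` and `grouped_facets` are made up for illustration, and it assumes the representative of each group is simply the first document encountered (in Solr the collapse filter picks it by score or another criterion).

```python
from collections import Counter

# The three example documents from above.
docs = [
    {"id": "1", "workId": "abc", "type": "book"},
    {"id": "2", "workId": "abc", "type": "ebook"},
    {"id": "3", "workId": "abc", "type": "ebook"},
]


def collapsed_facets(docs, group_field, facet_field):
    """Facet counts after collapsing: only one document per group survives.

    Assumption: the survivor is the first document seen for each group.
    """
    representatives = {}
    for doc in docs:
        representatives.setdefault(doc[group_field], doc)
    return dict(Counter(d[facet_field] for d in representatives.values()))


def grouped_facets(docs, group_field, facet_field):
    """group.facet-style counts: each facet value counts once per distinct group."""
    pairs = {(doc[facet_field], doc[group_field]) for doc in docs}
    # Sort so the resulting dict has a stable key order.
    return dict(Counter(value for value, _ in sorted(pairs)))


print(collapsed_facets(docs, "workId", "type"))  # {'book': 1}
print(grouped_facets(docs, "workId", "type"))    # {'book': 1, 'ebook': 1}
```

The second result is what I am after: the two ebook documents belong to the same work, so "ebook" should count once, but it should not disappear.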
Is there a way to get group.facet-style counts when using the collapse filter query?