MySQL query using HAVING incorrectly limits results

I have an advanced search form that offers many ways to filter your search. Here's a simplified idea (does not include entering keyword text or searching for a date range or other selection menus):

Topic: <select><option>any</option><option>all</option></select>
[] Aging [] Environment [] Health [] Hunger [] Poverty

Document type: <select><option>any</option><option>all</option></select>
[] Case Study [] Policy Brief [] Whitepaper

If someone selects "any" and checks more than one topic or document type, the query should include, for example, topic = "Aging" OR topic = "Health".

If someone selects "all" and checks more than one topic or document type, the query should include, for example, topic = "Aging" AND topic = "Health".

The default between the different filters is AND. So when searching for all documents classified under "Aging" and all documents classified as a whitepaper, the query would be: topic = "Aging" AND doctype = "whitepaper".

Problem: We have a query that works when the search is run for "any". But when the search is for "all", MySQL's EXPLAIN command reports "Impossible WHERE". :(

Here is a query that works when someone selects "any" for both the topic and the document type:

    SELECT DISTINCT *
    FROM research
    JOIN link_resource_doctype ON link_resource_doctype.resource_id = research.research_id
    JOIN doctype ON doctype.id = link_resource_doctype.doctype_id
    JOIN link_resource_issue_area ON link_resource_issue_area.resource_id = research.research_id
    JOIN issue_area ON issue_area.id = link_resource_issue_area.issue_area_id
    WHERE approved = '1'
      AND (doctype.identifier = 'case_study' OR doctype.identifier = 'whitepaper')
      AND (issue_area.identifier = 'aging' OR issue_area.identifier = 'health')

And here is the same query, which does not work when someone selects "all" for the topic and the document type (it also does not work if someone selects only the topic or only the document type):

    SELECT DISTINCT *
    FROM research
    JOIN link_resource_doctype ON link_resource_doctype.resource_id = research.research_id
    JOIN doctype ON doctype.id = link_resource_doctype.doctype_id
    JOIN link_resource_issue_area ON link_resource_issue_area.resource_id = research.research_id
    JOIN issue_area ON issue_area.id = link_resource_issue_area.issue_area_id
    WHERE approved = '1'
      AND (doctype.identifier = 'case_study' AND doctype.identifier = 'whitepaper')
      AND (issue_area.identifier = 'aging' AND issue_area.identifier = 'health')

Possible solution, but there is a problem: I came across this post on Stack Overflow, "Select a row belonging to several categories", which contains a query that I think can solve our problem when someone selects "all". Here it is:

    SELECT DISTINCT *
    FROM research
    JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
    JOIN issue_area ON issue_area.id = link_issue_area.issue_area_id
    JOIN link_doctype ON link_doctype.resource_id = research.research_id
    JOIN doctype ON doctype.id = link_doctype.doctype_id
    WHERE issue_area.identifier IN ('aging', 'health')
      AND doctype.identifier IN ('case_study', 'whitepaper')
    GROUP BY research.research_id
    HAVING COUNT(DISTINCT issue_area.identifier) = 2
       AND COUNT(DISTINCT doctype.identifier) = 2

Problem: This query works for "any" as well as for "all", except for one issue. Say a document is classified under Aging, Health, and Poverty, but the searcher checked only Aging and Health. That document, which is classified under both checked topics as well as the unchecked Poverty, will not appear in the search results. I think this is because of HAVING COUNT(DISTINCT issue_area.identifier) = 2: the 2 excludes any document whose COUNT is actually more than 2. Is there a workaround for this? Or is there a better query to use here?

Any thoughts, ideas, or help are greatly appreciated! Thanks!

Here's the SQLFiddle that has all of it: http://sqlfiddle.com/#!2/847362/1

2 answers

I don't fully understand the question because you don't show the expected result. But based on what I do understand, here is what I have so far. Leave a comment if something is wrong:

A possible solution that may work:

    SELECT research.research_id AS resource_id, research.title
    FROM research
    JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
    JOIN link_doctype ON link_doctype.resource_id = research.research_id
    JOIN (
        SELECT resource_id, COUNT(DISTINCT issue_area_id) AS ISSUE_COUNT
        FROM link_issue_area
        GROUP BY resource_id
    ) TB1_COUNT ON TB1_COUNT.resource_id = research.research_id
    JOIN (
        SELECT resource_id, COUNT(DISTINCT doctype_id) AS DOCTYPE_COUNT
        FROM link_doctype
        GROUP BY resource_id
    ) TB2_COUNT ON TB2_COUNT.resource_id = research.research_id
    WHERE issue_area_id IN (5,10)
      AND doctype_id IN (3,18)
      AND TB1_COUNT.ISSUE_COUNT = 2
      AND TB2_COUNT.DOCTYPE_COUNT = 2
    GROUP BY resource_id
    LIMIT 0,1

Here is SQLFiddle


The existing query from your SQLFiddle is almost all you need, provided you include the HAVING conditions dynamically, only where all of the selected values are required. For instance:

    SELECT research.research_id AS resource_id, research.title
    FROM research
    JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
    JOIN link_doctype ON link_doctype.resource_id = research.research_id
    WHERE issue_area_id IN (5,10) /* dynamically-generated list of issues */
      AND doctype_id IN (3,18) /* dynamically-generated list of doc types */
    GROUP BY resource_id
    HAVING 1=1
       /* dynamically-generated count of user-selected issues - only included when all specified issues required */
       AND COUNT(DISTINCT issue_area_id) = 2
       /* dynamically-generated count of user-selected doc types - only included when all specified types required */
       AND COUNT(DISTINCT doctype_id) = 2

Including the dummy condition 1=1 means that you can always include the HAVING clause, even if neither of the parameters is set to "all".
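As a rough sketch of how the application side might assemble this, the following Python function appends each COUNT condition only when the corresponding menu is set to "all". The function name `build_search_query`, its parameters, and the `?` placeholder style are assumptions made for illustration (many MySQL drivers use `%s` instead); only the table and column names come from the query above.

```python
def build_search_query(issue_ids, doctype_ids, all_issues=False, all_doctypes=False):
    """Return (sql, params) for the search. Hypothetical helper, not from the fiddle."""
    if not issue_ids or not doctype_ids:
        raise ValueError("at least one issue and one doc type are required")

    # One placeholder per selected value, so the IDs are never pasted into the SQL text.
    issue_marks = ", ".join("?" for _ in issue_ids)
    doctype_marks = ", ".join("?" for _ in doctype_ids)

    sql = (
        "SELECT research.research_id AS resource_id, research.title "
        "FROM research "
        "JOIN link_issue_area ON link_issue_area.resource_id = research.research_id "
        "JOIN link_doctype ON link_doctype.resource_id = research.research_id "
        f"WHERE issue_area_id IN ({issue_marks}) "
        f"AND doctype_id IN ({doctype_marks}) "
        "GROUP BY research.research_id "
        "HAVING 1=1"  # dummy condition, so later conditions can always start with AND
    )
    params = list(issue_ids) + list(doctype_ids)

    # Each COUNT condition is added only when that menu is set to "all".
    if all_issues:
        sql += " AND COUNT(DISTINCT issue_area_id) = ?"
        params.append(len(set(issue_ids)))
    if all_doctypes:
        sql += " AND COUNT(DISTINCT doctype_id) = ?"
        params.append(len(set(doctype_ids)))
    return sql, params


# "all" for both menus: both COUNT conditions are present.
sql_all, params_all = build_search_query([5, 10], [3, 18], all_issues=True, all_doctypes=True)

# "any" for both menus: the HAVING clause is just the dummy condition.
sql_any, params_any = build_search_query([10, 20], [15, 18])
```

The parameter list keeps the same order as the placeholders in the SQL text: the IN-list values first, then the counts.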

So your dynamically generated query to return resources that have all of issues 5 and 10, and all of document types 3 and 18, will look like this:

    SELECT research.research_id AS resource_id, research.title
    FROM research
    JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
    JOIN link_doctype ON link_doctype.resource_id = research.research_id
    WHERE issue_area_id IN (5,10)
      AND doctype_id IN (3,18)
    GROUP BY resource_id
    HAVING 1=1
       AND COUNT(DISTINCT issue_area_id) = 2
       AND COUNT(DISTINCT doctype_id) = 2

SQLFiddle here.
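To check the concern from the question (a resource tagged with an extra, unchecked issue), here is a small self-contained sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The three-table schema is guessed from the column names in the query, the sample rows are invented, and issue 7 is a hypothetical ID playing the role of an unchecked topic such as Poverty.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE research (research_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE link_issue_area (resource_id INTEGER, issue_area_id INTEGER);
    CREATE TABLE link_doctype (resource_id INTEGER, doctype_id INTEGER);

    INSERT INTO research VALUES (1, 'Both issues, both types'),
                                (2, 'Both issues plus a third'),
                                (3, 'Only one issue');
    -- issue 7 is the extra, unchecked issue on resource 2
    INSERT INTO link_issue_area VALUES (1, 5), (1, 10),
                                       (2, 5), (2, 10), (2, 7),
                                       (3, 5);
    INSERT INTO link_doctype VALUES (1, 3), (1, 18),
                                    (2, 3), (2, 18),
                                    (3, 3), (3, 18);
""")

# The "all issues, all doc types" query; ORDER BY added only for a stable result order.
ALL_QUERY = """
    SELECT research.research_id AS resource_id, research.title
    FROM research
    JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
    JOIN link_doctype ON link_doctype.resource_id = research.research_id
    WHERE issue_area_id IN (5, 10)
      AND doctype_id IN (3, 18)
    GROUP BY research.research_id
    HAVING 1=1
       AND COUNT(DISTINCT issue_area_id) = 2
       AND COUNT(DISTINCT doctype_id) = 2
    ORDER BY research.research_id
"""

matching_ids = [row[0] for row in conn.execute(ALL_QUERY)]
print(matching_ids)  # [1, 2]
```

Resource 2 still matches: the WHERE clause drops its issue 7 row before grouping, so COUNT(DISTINCT issue_area_id) sees only 5 and 10. Resource 3 is excluded because it has only one of the two required issues.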

Whereas a dynamically generated query to return resources that have any of issues 10 and 20, and any of document types 15 and 18, will look like this:

    SELECT research.research_id AS resource_id, research.title
    FROM research
    JOIN link_issue_area ON link_issue_area.resource_id = research.research_id
    JOIN link_doctype ON link_doctype.resource_id = research.research_id
    WHERE issue_area_id IN (10,20)
      AND doctype_id IN (15,18)
    GROUP BY resource_id
    HAVING 1=1

SQLFiddle here.


Source: https://habr.com/ru/post/1500619/
