Check whether a recipe contains an ingredient - MySQL

Hello all. I'm having performance problems with a MySQL/PHP query. It seems I'm looping through too many result sets in nested loops in my PHP, and I'm sure there is a more efficient way to do this. Any help is greatly appreciated.

I have a table containing 3500 recipes ([recipe]):
rid | recipe_name

And another table that contains 600 different ingredients ([ingredients])
iid | i_name

Each recipe has some number of ingredients associated with it, and I use a join table to create the association ([recipe_ingredients]):
uid | rid | iid
(where uid is just the table's unique identifier)

For instance:

 [recipe]
 rid: 1 | recipe_name: Lemon Tart
 .....

 [ingredients]
 iid: 99 | i_name: lemon curd
 iid: 154 | i_name: flour
 .....

 [recipe_ingredients]
 1 | 1 | 99
 2 | 1 | 154

The query I'm trying to run lets the user enter the ingredients they have, and it tells them everything they can make with those ingredients. They do not have to use all of their ingredients, but they must have every ingredient a recipe requires.

For example, if I had flour, egg, salt, milk and lemon curd, I could make Pancakes and Lemon Tart (if we assume that Lemon Tart has no other ingredients :)), but I could not make Risotto (since I don't have rice or anything else that goes in it).

In my PHP, I have an array containing all the ingredients that the user has. Currently I loop through each recipe (loop 1), and then check every ingredient in that recipe to see whether it is in my array of ingredients (loop 2). As soon as it finds an ingredient in the recipe that is not in my array, it says no and moves on to the next recipe. If every ingredient is found, it stores the rid in a new array, which I use later to display the results.

But looking at the efficiency of this: if I assume 3500 recipes and I've got 40 ingredients in my array, the worst case is 3500 x 40n, where n is the number of ingredients in a recipe. The best case is 3500 x 40 (the first ingredient of every recipe is not found, so the inner loop breaks out immediately).
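For reference, the two loops described above can be sketched as follows. This is Python rather than PHP, purely for illustration; the sample data, the ingredient ids other than 99 and 154, and the name `makeable_recipes` are made up. Keeping the user's ingredients in a set makes each membership test constant time on average, so the work is bounded by the number of recipe-ingredient rows rather than by 3500 x 40n.

```python
# Hypothetical sample data: recipe id -> set of ingredient ids it requires.
recipes = {
    1: {99, 154},        # Lemon Tart: lemon curd, flour
    2: {154, 7, 8, 9},   # Pancakes: flour, egg, salt, milk
    3: {300, 301},       # Risotto: rice, stock
}


def makeable_recipes(recipes, pantry):
    """Return the ids of recipes whose ingredients are all in the pantry."""
    pantry = set(pantry)  # set membership is O(1) on average
    result = []
    for rid, needed in recipes.items():  # loop 1: every recipe
        # loop 2: all() stops at the first missing ingredient
        if all(iid in pantry for iid in needed):
            result.append(rid)
    return result


pantry = [154, 7, 8, 9, 99]  # flour, egg, salt, milk, lemon curd
print(makeable_recipes(recipes, pantry))  # [1, 2]
```

This still touches every recipe, so it only trims the inner cost; pushing the whole check into SQL avoids shipping all 3500 recipes to PHP in the first place.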

I think my whole approach to this is wrong, and that there must be some clever SQL I'm missing here. Any thoughts? I can always build an SQL statement from the ingredient array that I have.

Thanks a lot in advance, greatly appreciated.

+4
5 answers

I would advise storing the number of ingredients for each recipe in the recipe table, purely for efficiency (the query runs faster if it does not have to compute this every time). This is denormalization, which is bad for data integrity but good for performance. Be aware that it can lead to inconsistent data if recipes are updated and you are not careful to update the count everywhere it matters. I've assumed you did this with a new column named ing_count in the recipe table.

Make sure that you escape the values for NAME1, NAME2, etc. if they come from user input; otherwise you are at risk of SQL injection.

 select recipe.rid, recipe.recipe_name, recipe.ing_count, count(ri.iid) as ing_match_count
 from recipe_ingredients ri
 inner join (select iid from ingredients i
             where i.i_name = 'NAME1' or i.i_name = 'NAME2' or i.i_name = 'NAME3') ing
   on ri.iid = ing.iid
 inner join recipe on recipe.rid = ri.rid
 group by recipe.rid, recipe.recipe_name, recipe.ing_count
 -- keep only recipes whose every ingredient was matched
 having ing_match_count = recipe.ing_count
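To make the escaping advice concrete, here is a minimal sketch of the stored-count query with bound parameters instead of quoted strings. It uses Python's built-in sqlite3 module and an in-memory database so it runs anywhere; the sample rows and the function name `recipes_for` are assumptions, and the ing_count column is the one proposed above. In PHP the same idea is available through prepared statements in PDO or mysqli.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe (rid INTEGER PRIMARY KEY, recipe_name TEXT, ing_count INTEGER);
    CREATE TABLE ingredients (iid INTEGER PRIMARY KEY, i_name TEXT);
    CREATE TABLE recipe_ingredients (uid INTEGER PRIMARY KEY, rid INTEGER, iid INTEGER);
    -- hypothetical sample rows
    INSERT INTO recipe VALUES (1, 'Lemon Tart', 2), (2, 'Risotto', 2);
    INSERT INTO ingredients VALUES
        (99, 'lemon curd'), (154, 'flour'), (300, 'rice'), (301, 'stock');
    INSERT INTO recipe_ingredients (rid, iid) VALUES
        (1, 99), (1, 154), (2, 300), (2, 301);
""")


def recipes_for(conn, names):
    """Recipes whose stored ing_count equals the number of matched ingredients."""
    if not names:
        return []
    # One "?" per name: the values are bound by the driver, never concatenated.
    marks = ", ".join("?" for _ in names)
    sql = f"""
        SELECT r.rid, r.recipe_name
        FROM recipe_ingredients ri
        JOIN ingredients i ON i.iid = ri.iid
        JOIN recipe r ON r.rid = ri.rid
        WHERE i.i_name IN ({marks})
        GROUP BY r.rid, r.recipe_name, r.ing_count
        HAVING COUNT(ri.iid) = r.ing_count
        ORDER BY r.rid
    """
    return conn.execute(sql, list(names)).fetchall()


print(recipes_for(conn, ["flour", "lemon curd", "rice"]))
```

Only the placeholder list is built with string formatting; the ingredient names themselves never become part of the SQL text.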

If you do not want to store the ingredient count, you can do something like this:

 select recipe.rid, recipe.recipe_name, count(*) as ing_count, count(ing.iid) as ing_match_count
 from recipe
 inner join recipe_ingredients ri on ri.rid = recipe.rid
 -- left join keeps every ingredient row of the recipe, so count(*) is its full ingredient count
 left join (select iid from ingredients i
            where i.i_name = 'NAME1' or i.i_name = 'NAME2' or i.i_name = 'NAME3') ing
   on ri.iid = ing.iid
 group by recipe.rid, recipe.recipe_name
 having ing_match_count = ing_count
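A runnable sketch of the computed-count idea, again with Python's sqlite3 and made-up sample rows (the name `makeable` and all ingredient ids other than 99 and 154 are assumptions). It joins the matching ingredients with a LEFT JOIN so that COUNT(*) still sees every ingredient row of a recipe, while COUNT(ing.iid) counts only the rows the user can cover.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe (rid INTEGER PRIMARY KEY, recipe_name TEXT);
    CREATE TABLE ingredients (iid INTEGER PRIMARY KEY, i_name TEXT);
    CREATE TABLE recipe_ingredients (uid INTEGER PRIMARY KEY, rid INTEGER, iid INTEGER);
    -- hypothetical sample rows
    INSERT INTO recipe VALUES (1, 'Lemon Tart'), (2, 'Pancakes'), (3, 'Risotto');
    INSERT INTO ingredients VALUES
        (7, 'egg'), (8, 'salt'), (9, 'milk'), (99, 'lemon curd'),
        (154, 'flour'), (300, 'rice'), (301, 'stock');
    INSERT INTO recipe_ingredients (rid, iid) VALUES
        (1, 99), (1, 154),
        (2, 154), (2, 7), (2, 8), (2, 9),
        (3, 300), (3, 301);
""")


def makeable(conn, names):
    """Rows of (rid, recipe_name, ing_count, ing_match_count) for makeable recipes."""
    if not names:
        return []
    marks = ", ".join("?" for _ in names)
    sql = f"""
        SELECT r.rid, r.recipe_name,
               COUNT(*) AS ing_count,              -- all ingredients of the recipe
               COUNT(ing.iid) AS ing_match_count   -- only those the user has
        FROM recipe r
        JOIN recipe_ingredients ri ON ri.rid = r.rid
        LEFT JOIN (SELECT iid FROM ingredients WHERE i_name IN ({marks})) ing
               ON ing.iid = ri.iid
        GROUP BY r.rid, r.recipe_name
        HAVING COUNT(ing.iid) = COUNT(*)
        ORDER BY r.rid
    """
    return conn.execute(sql, list(names)).fetchall()


for row in makeable(conn, ["flour", "egg", "salt", "milk", "lemon curd"]):
    print(row)
```

With the pantry from the question this lists Lemon Tart and Pancakes but not Risotto.
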
+2

You can do an "any"-type query:

 select recipe.rid, count(recipe_ingredients.iid) as cnt
 from recipe
 left join recipe_ingredients on recipe.rid = recipe_ingredients.rid
 where recipe_ingredients.iid in (the, list, of, ingredients, the, user, has)
 group by recipe.rid
 having cnt > some_threshold_amount
 order by cnt desc

This is off the top of my head, but basically it pulls out every recipe that lists at least one of the user-provided ingredients, sorts them by ingredient count, and returns only the recipes with more than the threshold number of ingredients.

I may have gotten the threshold part wrong: I have a sneaking suspicion that it will count the recipe's ingredients rather than the user-provided ones, but the rest of the query should be a good start for what you need.
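One way to check that suspicion is to run the query on a few rows. The sketch below does that with Python's sqlite3; the sample rows and the name `ranked_matches` are assumptions, and the ingredient list is passed as ids. Because WHERE is applied before GROUP BY, the count here covers only the rows that matched the user's list, so it ranks partial matches but does not by itself tell you whether a recipe is complete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe (rid INTEGER PRIMARY KEY, recipe_name TEXT);
    CREATE TABLE recipe_ingredients (uid INTEGER PRIMARY KEY, rid INTEGER, iid INTEGER);
    -- hypothetical sample rows
    INSERT INTO recipe VALUES (1, 'Lemon Tart'), (2, 'Pancakes'), (3, 'Risotto');
    INSERT INTO recipe_ingredients (rid, iid) VALUES
        (1, 99), (1, 154),
        (2, 154), (2, 7), (2, 8), (2, 9),
        (3, 300), (3, 301);
""")


def ranked_matches(conn, iids, threshold):
    """Recipes using more than `threshold` of the given ingredient ids, best first."""
    if not iids:
        return []
    marks = ", ".join("?" for _ in iids)
    sql = f"""
        SELECT r.rid, r.recipe_name, COUNT(ri.iid) AS cnt
        FROM recipe r
        LEFT JOIN recipe_ingredients ri ON r.rid = ri.rid
        WHERE ri.iid IN ({marks})      -- filters rows before they are counted
        GROUP BY r.rid, r.recipe_name
        HAVING COUNT(ri.iid) > ?
        ORDER BY cnt DESC, r.rid
    """
    return conn.execute(sql, [*iids, threshold]).fetchall()


print(ranked_matches(conn, [154, 7, 8, 300], 0))
print(ranked_matches(conn, [154, 7, 8, 300], 1))
```

In this data Pancakes scores 3 even though it needs 4 ingredients, which is why a comparison against the recipe's full ingredient count is still needed for the original question.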

+1

Question: why don't you do this directly in SQL? You can optimize by eliminating the recipes that cannot match:

  • first, eliminate recipes that have more ingredients than the user has.
  • then eliminate greedily, in a loop:
    • pick the first (rid, iid) row
    • if the ingredient is among the user's ingredients, continue
    • if not, remove all rows with that rid from the recipe_ingredients table => new_table
    • restart with new_table; stop when new_table's count = 0

Statistically, this should give the best results.
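The elimination steps above could be sketched like this. It is one possible reading of the description, not necessarily what the answer had in mind: instead of physically rebuilding new_table on every pass, it keeps a set of surviving rids, which has the same effect. The function name `eliminate` and the sample rows are made up, and each (rid, iid) pair is assumed to appear only once.

```python
def eliminate(rows, pantry):
    """rows: iterable of (rid, iid) pairs; pantry: ingredient ids the user has."""
    rows = list(rows)
    pantry = set(pantry)

    # Step 1: drop recipes that need more ingredients than the user owns.
    counts = {}
    for rid, _iid in rows:
        counts[rid] = counts.get(rid, 0) + 1
    candidates = {rid for rid, n in counts.items() if n <= len(pantry)}

    # Step 2: the first missing ingredient discards the whole recipe;
    # rows of already-discarded recipes are skipped without a pantry lookup.
    for rid, iid in rows:
        if rid in candidates and iid not in pantry:
            candidates.discard(rid)

    return sorted(candidates)


rows = [
    (1, 99), (1, 154),                  # Lemon Tart
    (2, 154), (2, 7), (2, 8), (2, 9),   # Pancakes
    (3, 300), (3, 301),                 # Risotto
]
print(eliminate(rows, {154, 7, 8, 9, 99}))  # [1, 2]
```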

Hope this helps

0

Something like this:

 SELECT r.*, COUNT(ri.iid) AS count
 FROM recipe r
 INNER JOIN recipe_ingredients ri ON r.rid = ri.rid
 INNER JOIN ingredients i ON i.iid = ri.iid
 WHERE i.i_name IN ('milk', 'flour')
 GROUP BY r.rid
 HAVING count = 2

This is pretty easy to understand. count holds the number of ingredients from the list (milk, flour) that appear in each recipe. If count matches the number of ingredients in the WHERE clause (in this case: 2), the recipe is returned.
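A small sketch of this query on sample data, using Python's sqlite3 with bound parameters (the rows and ingredient ids other than 99 and 154 are made up). As far as this sketch shows, the query answers "which recipes use every ingredient in the list", which is not quite the same as "which recipes can be made from the list": Lemon Tart is dropped here because it does not use milk, not because anything is missing from the pantry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe (rid INTEGER PRIMARY KEY, recipe_name TEXT);
    CREATE TABLE ingredients (iid INTEGER PRIMARY KEY, i_name TEXT);
    CREATE TABLE recipe_ingredients (uid INTEGER PRIMARY KEY, rid INTEGER, iid INTEGER);
    -- hypothetical sample rows
    INSERT INTO recipe VALUES (1, 'Lemon Tart'), (2, 'Pancakes');
    INSERT INTO ingredients VALUES
        (7, 'egg'), (8, 'salt'), (9, 'milk'), (99, 'lemon curd'), (154, 'flour');
    INSERT INTO recipe_ingredients (rid, iid) VALUES
        (1, 99), (1, 154),
        (2, 154), (2, 7), (2, 8), (2, 9);
""")

wanted = ["milk", "flour"]
marks = ", ".join("?" for _ in wanted)
sql = f"""
    SELECT r.rid, r.recipe_name, COUNT(ri.iid) AS cnt
    FROM recipe r
    JOIN recipe_ingredients ri ON r.rid = ri.rid
    JOIN ingredients i ON i.iid = ri.iid
    WHERE i.i_name IN ({marks})
    GROUP BY r.rid, r.recipe_name
    HAVING COUNT(ri.iid) = ?        -- must match the length of the list
"""
rows = conn.execute(sql, [*wanted, len(wanted)]).fetchall()
print(rows)  # [(2, 'Pancakes', 2)]
```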

0
 SELECT irl.ingredient_amount, r.*, i.thumbnail
 FROM recipes r
 LEFT JOIN recipe_images i ON (i.recipe_id = r.recipe_id)
 LEFT JOIN ingredients_recipes_link irl ON (irl.recipe_id = r.recipe_id)
 WHERE irl.recipe_id IN (
   SELECT recipe_id
   FROM `ingredients_recipes_link`
   WHERE ingredient_id IN (24, 21, 22)
   GROUP BY recipe_id
   HAVING COUNT(*) = 3
 )
 GROUP BY r.recipe_id
0

Source: https://habr.com/ru/post/1346941/

