Find matching object based on dynamic comparisons

I'm not sure the title really makes sense, but I didn't know how else to phrase it. Here's the setup: I have an Item that has many ItemLogic , each of which has one Field . Each Item has, say, 25 ItemLogic entities. The logic determines whether the Item matches the input submitted from a form. For example, Field X has a value greater than A and Field Y has a value equal to B , and so on for each of the 25 fields.

In the current version of the application, all the related objects are queried and looped over, returning the first Item for which every ItemLogic evaluates to true . It's a bit expensive, but the code is simple, and there have never been this many items. Until now.

The application now has to filter through 3000 items to find a match. The existing query has at least two joins and took about 45 seconds on our SQL instance. That is too long.

A stored procedure seems like the natural choice, but here's the catch: the data is dynamic for each set of items, it comes in as a string value and often needs to be cast to another type (most often DateTime or int) to perform the actual comparison, and some logic is ignored rather than compared. That looks like a lot of extra overhead in a stored procedure, at least as far as I can tell.

Alternatively, I could process the data in smaller chunks, but that won't help the poor guy trying to match the last item in the collection.

What are some approaches that could be taken to expedite the match?

Schema and some sample data:

    -- Items has no (max)/LOB column, so TEXTIMAGE_ON is omitted here
    -- (SQL Server rejects it on tables without such columns).
    CREATE TABLE [dbo].[Items](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [Name] [nvarchar](255) NOT NULL
    ) ON [PRIMARY]

    CREATE TABLE [dbo].[ItemLogic](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [ItemId] [int] NOT NULL,
        [FieldId] [int] NOT NULL,
        [Value] [nvarchar](max) NULL,
        [Comparison] [int] NOT NULL
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    CREATE TABLE [dbo].[Fields](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [Value] [nvarchar](max) NOT NULL,
        [Type] [int] NOT NULL
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    INSERT INTO [dbo].[Fields] (Value, Type) VALUES
        ('abc', 0),
        ('def', 0),
        ('123', 1)

    INSERT INTO [dbo].[Items] (Name) VALUES
        ('Item 1'),
        ('Item 2'),
        ('Item 3')

    INSERT INTO [dbo].[ItemLogic] (ItemId, FieldId, Value, Comparison) VALUES
        (1, 1, 'xyz', 1),
        (1, 2, 'qrs', 1),
        (1, 3, '200', 0),
        (2, 1, 'abc', 1),
        (2, 2, 'xyz', 1),
        (2, 3, '123', 2),
        (3, 1, 'abc', 1),
        (3, 2, 'def', 1),
        (3, 2, '100', 0)

The Comparison column maps to an enumeration: 0 = Greater Than, 1 = Equal, 2 = Ignore. The Type column also maps to an enumeration: 0 = string, 1 = int.

Given the data above, the match should return Item 3 .

4 answers

This will never be fast. However, here is the simplest and most compact solution I can think of:

    SELECT *
    FROM Items
    WHERE Id NOT IN (
        SELECT IL.ItemId
        FROM Fields F
        INNER JOIN ItemLogic IL ON F.Id = IL.FieldId
        WHERE NOT (
            IL.Comparison = 2 -- Ignore
            OR F.Type = 0 AND ( -- string types
                IL.Comparison = 0 AND F.Value > IL.Value
                OR IL.Comparison = 1 AND F.Value = IL.Value
            )
            OR F.Type = 1 AND ( -- integer types
                IL.Comparison = 0 AND TRY_CAST(F.Value AS int) > TRY_CAST(IL.Value AS int)
                OR IL.Comparison = 1 AND TRY_CAST(F.Value AS int) = TRY_CAST(IL.Value AS int)
            )
        )
    )

You can use CASE expressions to evaluate each field against your logic. Then just count the number of matches and compare it with the number of fields you expect to match. The following example demonstrates this:

    WITH CTE_TypedFieldValues AS
    (
        -- First pivot fields by data type and convert to a typed value
        SELECT
            [Id]
            ,[DataType]
            ,[0] AS [ValueString]
            ,CONVERT(INT, [1]) AS [ValueInt]
        FROM
        (
            SELECT
                [Id]
                ,[Value]
                ,[Type] AS [DataType]
                ,[Type]
            FROM dbo.Fields
        ) AS [FieldsSource]
        PIVOT
        (
            MAX([Value]) FOR [Type] IN ([0], [1])
        ) AS [FieldTyped]
    ),
    CTE_Compare AS
    (
        SELECT
            [IL].[ItemId]
            ,[IL].[FieldId]
            ,[FV].[DataType]
            ,[IL].[Value] AS [LogicValue]
            ,[IL].[Comparison]
            ,[FV].[ValueString]
            ,[FV].[ValueInt]
            ,(
                CASE
                    WHEN [FV].[DataType] = 0 THEN
                        -- String type: use the [ValueString] column for the comparison
                        CASE
                            WHEN [IL].[Comparison] = 0 THEN
                                -- Greater-than comparison; flag as matched if the condition is met
                                CASE WHEN [FV].[ValueString] > [IL].[Value] THEN 1 ELSE 0 END
                            WHEN [IL].[Comparison] = 1 THEN
                                -- Equality comparison; flag as matched if the condition is met
                                CASE WHEN [FV].[ValueString] = [IL].[Value] THEN 1 ELSE 0 END
                        END
                    WHEN [FV].[DataType] = 1 THEN
                        -- Integer type: use the [ValueInt] column for the comparison
                        CASE
                            WHEN [IL].[Comparison] = 0 THEN
                                CASE WHEN [FV].[ValueInt] > CAST([IL].[Value] AS INT) THEN 1 ELSE 0 END
                            WHEN [IL].[Comparison] = 1 THEN
                                CASE WHEN [FV].[ValueInt] = CAST([IL].[Value] AS INT) THEN 1 ELSE 0 END
                        END
                END
            ) [Match]
        FROM [dbo].[ItemLogic] [IL]
        INNER JOIN CTE_TypedFieldValues [FV] ON [IL].FieldId = [FV].[Id]
        WHERE Comparison < 2 -- Filter out fields marked as ignored.
    )
    SELECT
        ItemId
        ,COUNT([FieldId]) AS [ExpectedMatches]
        ,SUM([Match]) AS [ActualMatches]
    FROM CTE_Compare
    GROUP BY ItemId
    -- Only return ItemIds where the number of matched fields equals the number of expected matches.
    HAVING COUNT([FieldId]) = SUM([Match])

I would suggest a different approach: separate the logic. Run the data queries in SQL, and do all the conversions and checks in the application. The logic would be:

  • Get all Items you need to check.

  • Loop through all Items (break out of the loop as soon as a match is found). On each iteration, query the ItemLogic and Field rows associated with the current Item . Do the conversions and checks in code, which should be more efficient than in SQL (the SELECT queries should also return quickly, because each one is limited to a single Item ).

Running multiple queries may seem expensive, but if you do it over one connection (.NET has connection pooling, so you won’t even have to worry about it), it should be faster.
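A minimal sketch of that loop, written in Python with an in-memory SQLite database standing in for the SQL Server instance (both are assumptions made so the example runs on its own; the answer itself implies a .NET application). The names `rule_passes`, `first_matching_item` and `build_sample_db` are hypothetical, and treating unparseable or NULL values as a failed rule is also an assumption.

```python
import sqlite3

GREATER_THAN, EQUAL, IGNORE = 0, 1, 2  # Comparison enum from the question
STRING, INT = 0, 1                     # Type enum from the question


def rule_passes(field_value, field_type, logic_value, comparison):
    """Evaluate one ItemLogic row against its Field value."""
    if comparison == IGNORE:
        return True
    if field_value is None or logic_value is None:
        return False  # assumption: a NULL value fails the rule
    if field_type == INT:
        try:
            left, right = int(field_value), int(logic_value)
        except ValueError:
            return False  # assumption: unparseable values fail the rule
    else:
        # Note: Python compares strings by code point, which may differ
        # from the collation configured on the SQL Server side.
        left, right = field_value, logic_value
    return left > right if comparison == GREATER_THAN else left == right


def first_matching_item(conn):
    """Return (Id, Name) of the first Item whose rules all pass, else None."""
    rules_sql = (
        "SELECT f.Value, f.Type, il.Value, il.Comparison "
        "FROM ItemLogic il JOIN Fields f ON f.Id = il.FieldId "
        "WHERE il.ItemId = ?"
    )
    items = conn.execute("SELECT Id, Name FROM Items ORDER BY Id").fetchall()
    for item_id, name in items:
        # One small query per Item, all over the same connection.
        rules = conn.execute(rules_sql, (item_id,)).fetchall()
        if all(rule_passes(*row) for row in rules):
            return item_id, name  # break out at the first match
    return None


def build_sample_db():
    """Load the sample data from the question into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Items (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
        CREATE TABLE Fields (Id INTEGER PRIMARY KEY, Value TEXT NOT NULL,
                             Type INTEGER NOT NULL);
        CREATE TABLE ItemLogic (Id INTEGER PRIMARY KEY, ItemId INTEGER NOT NULL,
                                FieldId INTEGER NOT NULL, Value TEXT,
                                Comparison INTEGER NOT NULL);
        INSERT INTO Fields (Value, Type) VALUES ('abc', 0), ('def', 0), ('123', 1);
        INSERT INTO Items (Name) VALUES ('Item 1'), ('Item 2'), ('Item 3');
        INSERT INTO ItemLogic (ItemId, FieldId, Value, Comparison) VALUES
            (1, 1, 'xyz', 1), (1, 2, 'qrs', 1), (1, 3, '200', 0),
            (2, 1, 'abc', 1), (2, 2, 'xyz', 1), (2, 3, '123', 2),
            (3, 1, 'abc', 1), (3, 2, 'def', 1), (3, 2, '100', 0);
    """)
    return conn
```

Calling `first_matching_item(build_sample_db())` on the question's sample data yields Item 3. Note that an Item with no ItemLogic rows at all would count as a match here, since `all()` of an empty list is true.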


As an alternative solution, you can fetch all the records in one go and apply the rules in memory. You can also try caching the related records in memory.
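One way this could look, sketched in Python: the rows of a single joined query (ItemLogic joined to Fields) are grouped by ItemId once, and the grouped rules are then evaluated without any further database access. The `ROWS` list is a hand-written stand-in for that query result using the question's sample data, and `group_rules`, `item_matches` and `find_first_match` are hypothetical names. Failing a rule on NULL or unparseable values is an assumption.

```python
from collections import defaultdict
import operator

# Comparison enum from the question: 0 = Greater Than, 1 = Equal, 2 = Ignore
IGNORE = 2
COMPARATORS = {0: operator.gt, 1: operator.eq}
# Type enum from the question: 0 = string, 1 = int
CONVERTERS = {0: str, 1: int}

# Stand-in for one joined query result:
# (ItemId, Field.Value, Field.Type, ItemLogic.Value, ItemLogic.Comparison)
ROWS = [
    (1, 'abc', 0, 'xyz', 1), (1, 'def', 0, 'qrs', 1), (1, '123', 1, '200', 0),
    (2, 'abc', 0, 'abc', 1), (2, 'def', 0, 'xyz', 1), (2, '123', 1, '123', 2),
    (3, 'abc', 0, 'abc', 1), (3, 'def', 0, 'def', 1), (3, 'def', 0, '100', 0),
]


def group_rules(rows):
    """Group the joined rows by ItemId; this dict is what could be cached."""
    grouped = defaultdict(list)
    for item_id, field_value, field_type, logic_value, comparison in rows:
        grouped[item_id].append((field_value, field_type, logic_value, comparison))
    return grouped


def item_matches(rules):
    """True only if every non-ignored rule of the item passes."""
    for field_value, field_type, logic_value, comparison in rules:
        if comparison == IGNORE:
            continue
        if field_value is None or logic_value is None:
            return False  # assumption: NULL fails the rule
        convert = CONVERTERS[field_type]
        try:
            left, right = convert(field_value), convert(logic_value)
        except ValueError:
            return False  # assumption: unparseable values fail the rule
        if not COMPARATORS[comparison](left, right):
            return False  # stop checking this item at the first failed rule
    return True


def find_first_match(grouped):
    """Return the first ItemId whose rules all pass, or None."""
    return next((item_id for item_id, rules in grouped.items()
                 if item_matches(rules)), None)
```

With the sample rows, `find_first_match(group_rules(ROWS))` returns 3. The grouping step runs once per load, so repeated lookups only pay for the in-memory comparisons.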


Source: https://habr.com/ru/post/1276060/

