Take a look at this pseudo schema (note that this is a simplification, so please try not to comment too heavily on the "advisability" of the schema itself). Assume indexes are in place on the FKs.
    TABLE Lookup (
        Lookup_ID int not null PK,
        Name nvarchar(255) not null
    )

    TABLE Document (
        Document_ID int not null PK,
        Previous_ID int null FK REFERENCES Document(Document_ID)
    )

    TABLE Document_Lookup (
        Document_ID int not null FK REFERENCES Document(Document_ID),
        Lookup_ID int not null FK REFERENCES Lookup(Lookup_ID)
    )
Volumes: Document, 4 million rows, of which 90% have a NULL Previous_ID; Lookup, 6,000 rows; average number of lookups attached to each document, 20, giving Document_Lookup 80 million rows.
Now, a .NET service has a structure to represent a Lookup row, like this:
    struct Lookup
    {
        public int ID;
        public string Name;
        public List<int> DocumentIDs;
    }
and the lookup rows are stored in a Dictionary<int, Lookup>, where the key is the lookup ID. The important point here is that this dictionary should only contain entries where the Lookup is referenced by at least one document, i.e. the DocumentIDs list must have Count > 0.
My task is to populate this dictionary efficiently. So the simple approach would be:
    SELECT dl.Lookup_ID, l.Name, dl.Document_ID
    FROM Document_Lookup dl
    INNER JOIN Lookup l ON l.Lookup_ID = dl.Lookup_ID
    INNER JOIN Document d ON d.Document_ID = dl.Document_ID
    WHERE d.Previous_ID IS NULL
    ORDER BY dl.Lookup_ID, dl.Document_ID
The result could then be used to populate the dictionary fairly efficiently.
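To make "populate the dictionary" concrete, here is a minimal sketch of how the ordered rowset might be consumed. It relies on the ORDER BY so that a new entry is created only when Lookup_ID changes. The names `LookupLoader`, `Build` and `ReadRows` are hypothetical, the struct is repeated so the sketch stands alone, and the rows are passed in as tuples so the logic can be tried without a database.

```csharp
using System.Collections.Generic;
using System.Data;

public struct Lookup
{
    public int ID;
    public string Name;
    public List<int> DocumentIDs;
}

// Hypothetical helper, not part of any existing service.
public static class LookupLoader
{
    // Assumes rows arrive ordered by Lookup_ID, as the ORDER BY requests.
    // If they do not, Dictionary.Add throws on the repeated key.
    public static Dictionary<int, Lookup> Build(
        IEnumerable<(int LookupId, string Name, int DocumentId)> rows)
    {
        var result = new Dictionary<int, Lookup>();
        List<int> currentDocs = null;
        int currentId = 0;

        foreach (var row in rows)
        {
            // Start a new entry whenever the lookup ID changes.
            if (currentDocs == null || row.LookupId != currentId)
            {
                currentId = row.LookupId;
                currentDocs = new List<int>();
                result.Add(currentId, new Lookup
                {
                    ID = currentId,
                    Name = row.Name,
                    DocumentIDs = currentDocs
                });
            }
            currentDocs.Add(row.DocumentId);
        }
        return result;
    }

    // Adapts any IDataReader over the three selected columns
    // (Lookup_ID, Name, Document_ID) to the tuple sequence above.
    public static IEnumerable<(int LookupId, string Name, int DocumentId)> ReadRows(
        IDataReader reader)
    {
        while (reader.Read())
        {
            yield return (reader.GetInt32(0), reader.GetString(1), reader.GetInt32(2));
        }
    }
}
```

Because every entry is created from a row that carries a Document_ID, the Count > 0 requirement holds by construction. The Name value is still read once per row, even though only the first one per lookup is used.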
Question: Does the underlying rowset transfer (TDS?) perform any optimization? It seems to me that queries that de-normalize data are very common, so the likelihood that field values do not change from one row to the next is high, and it would therefore make sense to optimize the stream by not sending field values that have not changed. Does anyone know whether such an optimization is in place? (It appears that no such optimization exists.)
What more sophisticated query could I use to eliminate the duplication (I'm thinking specifically of the repeated Name value)? I have heard of something called a "nested rowset"; can this sort of thing be generated? Would it perform better? How would I access it in .NET?
I could perform two queries: one to populate the Lookup dictionary, and then a second to populate the DocumentIDs lists. I would then add code to knock out the unused Lookups. However, imagine that my predictions are wrong and Lookup turns out to have 1 million rows, of which only a quarter are referenced by any document?
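For reference, the two-query idea might look roughly like the sketch below. The names `TwoPassLoader` and `Build` are hypothetical, the struct is repeated so the sketch stands alone, and the two result sets are passed in as in-memory sequences rather than read from a real connection.

```csharp
using System.Collections.Generic;
using System.Linq;

public struct Lookup
{
    public int ID;
    public string Name;
    public List<int> DocumentIDs;
}

// Hypothetical helper illustrating the two-query approach.
public static class TwoPassLoader
{
    public static Dictionary<int, Lookup> Build(
        IEnumerable<(int LookupId, string Name)> lookups,        // first query: all Lookup rows
        IEnumerable<(int LookupId, int DocumentId)> links)       // second query: Document_Lookup rows
    {
        var result = new Dictionary<int, Lookup>();

        // Pass 1: one entry per Lookup row, each Name transferred only once.
        foreach (var l in lookups)
        {
            result[l.LookupId] = new Lookup
            {
                ID = l.LookupId,
                Name = l.Name,
                DocumentIDs = new List<int>()
            };
        }

        // Pass 2: attach document IDs. The struct is copied out of the
        // dictionary, but the copy still references the same List<int>.
        foreach (var link in links)
        {
            if (result.TryGetValue(link.LookupId, out var entry))
            {
                entry.DocumentIDs.Add(link.DocumentId);
            }
        }

        // Knock out lookups that no document refers to.
        var unused = result
            .Where(kv => kv.Value.DocumentIDs.Count == 0)
            .Select(kv => kv.Key)
            .ToList();
        foreach (var id in unused)
        {
            result.Remove(id);
        }
        return result;
    }
}
```

In this form the first pass allocates an entry and an empty list for every Lookup row before any pruning happens, which is exactly the cost I am worried about if most lookups turn out to be unreferenced.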