SQL table size and query performance

We have a number of products coming from a web service; each element contains an unknown number of properties. We store them in a database with the following schema.

Table Elements
- ItemID
- ItemName

Table Properties
- PropertyID
- PropertyName
- PropertyValue
- PropertyValueType
- TransmitTime
- ItemID [fk]
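In T-SQL, the schema above might look roughly like this (column types and the index are my assumptions, not from the question):

```sql
-- Sketch of the schema described above; column types are assumptions.
CREATE TABLE Elements (
    ItemID   INT IDENTITY PRIMARY KEY,
    ItemName NVARCHAR(200) NOT NULL
);

CREATE TABLE Properties (
    PropertyID        INT IDENTITY PRIMARY KEY,
    PropertyName      NVARCHAR(100) NOT NULL,
    PropertyValue     NVARCHAR(MAX) NULL,
    PropertyValueType NVARCHAR(50)  NULL,
    TransmitTime      DATETIME2     NOT NULL,
    ItemID            INT NOT NULL REFERENCES Elements (ItemID)
);

-- Queries that filter by item and time would benefit from an index such as:
CREATE INDEX IX_Properties_Item_Time ON Properties (ItemID, TransmitTime);
```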

The Properties table becomes quite large because it saves properties for every item each time the web service is called. My question is this: at what point should we stop adding new records to the Properties table and archive old property records by their transmit time? When does a property table become too large for queries to stay fast? Is there a rule of thumb?

Thanks.

4 answers

There is no hard and fast rule.

Some thoughts:

  • Define "large" (we have tables with 160 million rows).
  • Do you have a problem now? If not, don't fix it.
  • Run the profiler, or query some of the dynamic management views (DMVs), to find bottlenecks (missing indexes, etc.).
  • If you need the data at hand, you cannot archive it.
  • You can partition the table, but…
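As a concrete example of the DMV suggestion, SQL Server records suggested missing indexes that you can inspect; a sketch (the `TOP (10)` cutoff and ordering are my choices):

```sql
-- Sketch: list SQL Server's suggested missing indexes, highest estimated impact first.
SELECT TOP (10)
    d.statement          AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    s.avg_user_impact
FROM sys.dm_db_missing_index_details      AS d
JOIN sys.dm_db_missing_index_groups       AS g ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats  AS s ON s.group_handle = g.index_group_handle
ORDER BY s.avg_user_impact DESC;
```

These suggestions are heuristics, so treat them as candidates to evaluate rather than indexes to create blindly.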

I'm not sure about MS SQL Server, but most databases have a way to partition tables: you create a virtual table out of many smaller tables and distribute the data among them based on some simple rule.

This is very well suited to time-based data. Partition the table by a time period such as a day or an hour. Then, once per period, add a new partition and drop the oldest one. That is much more efficient than running DELETE WHERE time < now - '1 hour' or the like.

Or, instead of dropping the oldest partition, archive it or simply leave it in place. As long as your queries always specify a date range, they can touch only the relevant partitions.
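In SQL Server this sliding-window approach can be sketched with a partition function on `TransmitTime` (all names and boundary dates here are illustrative assumptions):

```sql
-- Sketch: monthly partitioning on TransmitTime (names and dates are illustrative).
CREATE PARTITION FUNCTION pf_ByMonth (DATETIME2)
    AS RANGE RIGHT FOR VALUES ('2023-01-01', '2023-02-01', '2023-03-01');

CREATE PARTITION SCHEME ps_ByMonth
    AS PARTITION pf_ByMonth ALL TO ([PRIMARY]);

-- The table would then be created ON ps_ByMonth (TransmitTime).
-- Retiring the oldest period becomes a metadata operation, not a mass DELETE:
-- ALTER TABLE Properties SWITCH PARTITION 1 TO PropertiesArchive;
-- ALTER PARTITION FUNCTION pf_ByMonth() MERGE RANGE ('2023-01-01');
```

The `SWITCH` target (here a hypothetical `PropertiesArchive` table) must have an identical structure and live on the same filegroup as the partition being switched out.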


I do not think there is a golden rule for this. Your schema is fairly normalized, although normalization can come with a significant performance cost.

A few factors to consider:
- The usage scenario
- Server hardware specifications
- The nature of the workload (e.g., more reads than writes? inserts rather than updates?)

In your case, if the number of properties does not exceed a certain count, a single denormalized table may perform better, or maybe not. (I may get flamed for that statement :P)

An archiving strategy also depends on your business needs and requirements. You may need to beef up your hardware to meet them instead.


Depending on how many distinct property types you have, the Observation pattern may help.

In your example:
- Item = Subject
- Property = Observation
- PropertyName = ObservationType.Name
- PropertyValueType = ObservationType.IsTrait

This way you are not repeating PropertyName and PropertyValueType in every record. Depending on your application, if you can cache ObservationType and Subject in the application layer, inserts will also be faster.

- Measurements and traits are both types of observation. A measurement is a numeric observation, such as height. A trait is a descriptive observation, such as color.
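A minimal sketch of that pattern applied to the question's schema (table and column names here are illustrative assumptions, not from the answer):

```sql
-- Sketch of the Observation pattern; names and types are assumptions.
CREATE TABLE Subject (              -- was: Elements / Item
    SubjectID INT IDENTITY PRIMARY KEY,
    Name      NVARCHAR(200) NOT NULL
);

CREATE TABLE ObservationType (      -- PropertyName/PropertyValueType stored once
    ObservationTypeID INT IDENTITY PRIMARY KEY,
    Name    NVARCHAR(100) NOT NULL,
    IsTrait BIT NOT NULL            -- trait (descriptive) vs. measurement (numeric)
);

CREATE TABLE Observation (          -- was: Properties; now holds only the value
    ObservationID     INT IDENTITY PRIMARY KEY,
    SubjectID         INT NOT NULL REFERENCES Subject (SubjectID),
    ObservationTypeID INT NOT NULL REFERENCES ObservationType (ObservationTypeID),
    Value             NVARCHAR(MAX) NULL,
    TransmitTime      DATETIME2 NOT NULL
);
```

The per-row saving comes from replacing the repeated name/type strings with a small integer foreign key into ObservationType.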

[Diagram: monitoring_model_02, the observation model schema]


Source: https://habr.com/ru/post/1303260/
