Database Design: Objects with Different Attributes

I am developing a product database in which products can have very different attributes depending on their type, but the attributes are fixed for each type and the types are not manageable by users at all. For instance:

magazine: title, issue_number, pages, copies, close_date, release_date
web_site: name, bandwidth, hits, date_from, date_to

I want to use InnoDB and ensure database integrity as far as the engine allows. What is the recommended way to handle this?

I hate projects where tables have 100 columns and most are NULL, so I thought of something like this:

    product_type
    ============
    product_type_id    INT
    product_type_name  VARCHAR

    product
    =======
    product_id       INT
    product_name     VARCHAR
    product_type_id  INT -> Foreign key to product_type.product_type_id
    valid_since      DATETIME
    valid_to         DATETIME

    magazine
    ========
    magazine_id   INT
    title         VARCHAR
    product_id    INT -> Foreign key to product.product_id
    issue_number  INT
    pages         INT
    copies        INT
    close_date    DATETIME
    release_date  DATETIME

    web_site
    ========
    web_site_id  INT
    name         VARCHAR
    product_id   INT -> Foreign key to product.product_id
    bandwidth    INT
    hits         INT
    date_from    DATETIME
    date_to      DATETIME

This can handle cascading product deletion, but ... well, I'm not entirely convinced ...

+4
3 answers

This is the classic impedance mismatch between an OO design and relational tables. The table design you created is known as "table per subclass". The three most common designs are all trade-offs against what your objects really look like in your application:

  1. Table per concrete class
  2. Table per hierarchy
  3. Table per subclass

The design you dislike, "where tables have 100 columns and most are NULL", is option 2: one table that stores the entire specialization hierarchy. It is the least flexible of the three for all sorts of reasons, including that whenever your application needs a new subclass, you have to add columns. The design you describe accommodates change much better, because you can extend it by adding a new subclass table, identified by a value in product_type.

The remaining option, 1 (table per concrete class), is usually undesirable because of the duplication involved in implementing all the common fields in each specialization table. Its advantages, though, are that you never need any joins, and the subclass tables can even live on different database instances in a very large system.

The design you described is absolutely viable. The following is a variation of how it might look if you used an ORM tool to perform your CRUD operations. Notice how the id in each subclass table is itself the FK value pointing at the parent table in the hierarchy. A good ORM will automatically manage CRUD on the correct subclass table based solely on the discriminator values in product.id and product.product_type_id. Whether or not you plan to use an ORM, take a look at Hibernate's joined-subclass documentation, if only to see what design decisions they made.

    product
    =======
    id               INT
    product_name     VARCHAR
    product_type_id  INT -> Foreign key to product_type.product_type_id
    valid_since      DATETIME
    valid_to         DATETIME

    magazine
    ========
    id     INT -> Foreign key to product.id
    title  VARCHAR
    ..

    web_site
    ========
    id    INT -> Foreign key to product.id
    name  VARCHAR
    ..
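To make the shared-key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB (an assumption made only so the example runs anywhere; the column types and the sample rows are illustrative, and on MySQL the equivalent DDL would be declared on InnoDB tables). It shows that a subclass row cannot exist without its parent product row, and that deleting the product removes the subclass row through ON DELETE CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product_type (
    product_type_id   INTEGER PRIMARY KEY,
    product_type_name TEXT NOT NULL UNIQUE
);
CREATE TABLE product (
    id              INTEGER PRIMARY KEY,
    product_name    TEXT NOT NULL,
    product_type_id INTEGER NOT NULL REFERENCES product_type (product_type_id),
    valid_since     TEXT,
    valid_to        TEXT
);
-- The subclass primary key doubles as the foreign key to the parent row.
CREATE TABLE magazine (
    id           INTEGER PRIMARY KEY REFERENCES product (id) ON DELETE CASCADE,
    title        TEXT NOT NULL,
    issue_number INTEGER
);
""")

# Illustrative sample data (not from the original post).
conn.execute("INSERT INTO product_type VALUES (1, 'magazine')")
conn.execute(
    "INSERT INTO product (id, product_name, product_type_id) "
    "VALUES (10, 'Example Monthly', 1)"
)
conn.execute(
    "INSERT INTO magazine (id, title, issue_number) "
    "VALUES (10, 'Example Monthly', 42)"
)

# A magazine row whose id matches no product row is rejected.
try:
    conn.execute("INSERT INTO magazine (id, title) VALUES (99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent row removes the subclass row as well.
conn.execute("DELETE FROM product WHERE id = 10")
magazines_left = conn.execute("SELECT COUNT(*) FROM magazine").fetchone()[0]
```

Because the subclass table has no surrogate key of its own, there is at most one magazine row per product, which is the one-to-one relationship the hierarchy implies.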
+5

You seem roughly on the right track, except that you may need to consider the difference between a “product” and what is often called a “stock keeping unit” (SKU). Is a box of 25 paperclips (of a specific kind) the same “product” as a box of 50? As far as a store or any inventory system is concerned, it matters; in some cases, indeed, a mere difference in packaging of what is otherwise the same quantity of the same underlying “product” can give you distinct SKUs to track.

You need to decide where you want to track this, if it matters to your application. It may be fine to represent products the way you do and keep the packaging, for SKU purposes, in other tables, for example, though for some applications that may be a bit of overhead.
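One way to keep packaging out of the product tables could be sketched as follows, again using Python's sqlite3 module as a runnable stand-in for MySQL/InnoDB. The sku table and its sku_code and units_per_pack columns are hypothetical names introduced for illustration; they do not appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
-- Hypothetical table: one row per sellable packaging of a product.
CREATE TABLE sku (
    sku_code       TEXT PRIMARY KEY,
    product_id     INTEGER NOT NULL REFERENCES product (product_id),
    units_per_pack INTEGER NOT NULL CHECK (units_per_pack > 0)
);
INSERT INTO product VALUES (1, 'Paperclip, 28 mm');
INSERT INTO sku VALUES ('CLIP-25', 1, 25), ('CLIP-50', 1, 50);
""")

# One underlying product, two distinct SKUs to track in inventory.
packs = conn.execute(
    "SELECT sku_code, units_per_pack FROM sku "
    "WHERE product_id = ? ORDER BY units_per_pack",
    (1,),
).fetchall()
```

With this split, the type-specific tables describe what the product is, while inventory counts would hang off sku rows.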

+2

This is actually the standard way to “force” an OO design into a classic RDBMS.

All the “common” attributes go into the main table (the price, for example, if it is maintained at the product level, can easily be part of the main table), while the type-specific ones go into the subtables.

In theory, if you have sub-subtypes (for example, magazines could be subtyped into daily newspapers and 4-color periodicals, perhaps with periodicals having a date range for their shelf life), you can add one or more further levels ...

This is a fairly common (and proven) design. The only concern is that the main table will have to be joined with at least one subtable for most operations. If you have a huge number of items, this can have performance implications.
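The join in question might look like the sketch below, which follows the table and column names from the question and uses Python's sqlite3 module as a stand-in for MySQL/InnoDB. The load_magazine helper, the UNIQUE constraint on magazine.product_id and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
CREATE TABLE magazine (
    magazine_id  INTEGER PRIMARY KEY,
    -- UNIQUE keeps the relation one-to-one and gives the join an index.
    product_id   INTEGER NOT NULL UNIQUE REFERENCES product (product_id),
    title        TEXT NOT NULL,
    issue_number INTEGER,
    pages        INTEGER
);
INSERT INTO product VALUES (1, 'Example Monthly'), (2, 'Example Weekly');
INSERT INTO magazine VALUES (1, 1, 'Example Monthly', 7, 96),
                            (2, 2, 'Example Weekly', 31, 48);
""")


def load_magazine(conn, product_id):
    """Return the common and magazine-specific columns for one product."""
    return conn.execute(
        """
        SELECT p.product_name, m.issue_number, m.pages
        FROM product AS p
        JOIN magazine AS m ON m.product_id = p.product_id
        WHERE p.product_id = ?
        """,
        (product_id,),
    ).fetchone()


row = load_magazine(conn, 2)
```

As long as the join column is indexed, each lookup touches one row per table, so the cost of the extra join tends to show up mainly in large scans rather than in single-item reads.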

On the other hand, a common operation such as deleting an item (I would suggest a logical delete, setting a flag to "true" in the main table) is done once, in the main table, whatever the subtype.
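A logical delete of this kind could be sketched as below, with Python's sqlite3 module standing in for MySQL/InnoDB. The is_deleted column and the soft_delete helper are hypothetical names; the answer only says "a flag" in the main table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    is_deleted   INTEGER NOT NULL DEFAULT 0   -- hypothetical flag column
);
CREATE TABLE magazine (
    magazine_id INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES product (product_id),
    title       TEXT NOT NULL
);
CREATE TABLE web_site (
    web_site_id INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES product (product_id),
    name        TEXT NOT NULL
);
INSERT INTO product (product_id, product_name)
VALUES (1, 'Example Monthly'), (2, 'example.org banner');
INSERT INTO magazine VALUES (1, 1, 'Example Monthly');
INSERT INTO web_site VALUES (1, 2, 'example.org');
""")


def soft_delete(conn, product_id):
    """Flag a product as deleted; subtype rows are left untouched."""
    conn.execute(
        "UPDATE product SET is_deleted = 1 WHERE product_id = ?",
        (product_id,),
    )


# The same single UPDATE works whether the product is a magazine or a web site.
soft_delete(conn, 1)

active_magazines = conn.execute("""
    SELECT m.title FROM magazine AS m
    JOIN product AS p ON p.product_id = m.product_id
    WHERE p.is_deleted = 0
""").fetchall()
active_web_sites = conn.execute("""
    SELECT w.name FROM web_site AS w
    JOIN product AS p ON p.product_id = w.product_id
    WHERE p.is_deleted = 0
""").fetchall()
magazine_rows = conn.execute("SELECT COUNT(*) FROM magazine").fetchone()[0]
```

The trade-off is that every read then has to filter on the flag, which is one more reason most queries end up joining the main table.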

In any case, go for it. And maybe google for "object-oriented RDBMS mapping" or somesuch for a full discussion.

+1

Source: https://habr.com/ru/post/1308029/

