What approach would you use for this particular database design problem?

Just looking for opinions on the following two scenarios.

We have a table where we store all our outbound SMS messages. Whenever one of our services sends a premium-rate message, it stores it in this table ... to date, all the important information that needs to be stored has been in the same format.

    SMSMessages
    ----------------------
    ID          int PK NOT NULL Identity
    Mobile      nvarchar(50) -- the number we're sending to
    NetworkID   int FK -> Table containing networks (voda, o2, etc...)
    ShortcodeID int FK -> Table containing our outbound shortcodes
    DateSent    DateTime

Now one of the networks has implemented a completely new API that we need to integrate with, and it requires a lot more parameters. One of these extra parameters is Command. Depending on which command we are sending, there are 4 to 8 additional parameters that we must send. For simplicity, we will say there are just two commands ... "InitialSend" and "AnniversarySend".

Obviously, it would be rather awful database design to simply add all these extra columns to the end of our existing table, so ... we think we have two options.

Option 1.

Create several new tables, one per command, each related back to the original table.

    SMSMessages_CommandTypes --Contains "InitialSend" & "AnniversarySend" + other commands
    --------------------------
    CommandTypeID int PK
    Command       nvarchar(50)

    SMSMessages_OddBallNetwork
    --------------------------
    ID            int PK, FK --> SMSMessages.ID
    CommandTypeID int FK ---> SMSMessages_CommandTypes

    SMSMessages_OddBallNetwork_InitialSend
    --------------------------------------
    ID      int PK, FK --> SMSMessages.ID
    Param1  nvarchar(50)
    Param6  nvarchar(50)
    Param9  nvarchar(50)
    Param14 nvarchar(50)

    SMSMessages_OddBallNetwork_AnniversarySend
    --------------------------------------
    ID      int PK, FK --> SMSMessages.ID
    Param1  nvarchar(50)
    Param2  nvarchar(50)
    Param7  nvarchar(50)
    Param9  nvarchar(50)
    Param12 nvarchar(50)

    //There are 4 other Command Types as well so 4 More Tables...

The pros of this, according to our DBA, are all purist ones: each possible combination is strongly defined, the relationships are clear, and it is the best performing.

From my POV, the cons are development time, the number of touch points, complex lookup rules / procedures for messages with different command types, and the lack of reuse ... for a new command on this network, or another network adopting the same approach, we need database-level design and implementation ... not just code-level.
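To make the "complex lookup rules" concern concrete, here is a minimal sketch of Option 1 using Python's built-in sqlite3 module. It is an illustration under assumptions, not the poster's implementation: the types are simplified to SQLite's, only two of the command tables are created, and `PARAM_TABLES` and `load_params` are hypothetical names introduced here.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SMSMessages (
    ID INTEGER PRIMARY KEY,
    Mobile TEXT,
    DateSent TEXT
);
CREATE TABLE SMSMessages_CommandTypes (
    CommandTypeID INTEGER PRIMARY KEY,
    Command TEXT NOT NULL
);
CREATE TABLE SMSMessages_OddBallNetwork (
    ID INTEGER PRIMARY KEY REFERENCES SMSMessages(ID),
    CommandTypeID INTEGER NOT NULL
        REFERENCES SMSMessages_CommandTypes(CommandTypeID)
);
CREATE TABLE SMSMessages_OddBallNetwork_InitialSend (
    ID INTEGER PRIMARY KEY REFERENCES SMSMessages(ID),
    Param1 TEXT, Param6 TEXT, Param9 TEXT, Param14 TEXT
);
CREATE TABLE SMSMessages_OddBallNetwork_AnniversarySend (
    ID INTEGER PRIMARY KEY REFERENCES SMSMessages(ID),
    Param1 TEXT, Param2 TEXT, Param7 TEXT, Param9 TEXT, Param12 TEXT
);
INSERT INTO SMSMessages_CommandTypes VALUES (1, 'InitialSend'), (2, 'AnniversarySend');
INSERT INTO SMSMessages (ID, Mobile, DateSent) VALUES (1, '0000000000', '2009-01-01');
INSERT INTO SMSMessages_OddBallNetwork VALUES (1, 1);
INSERT INTO SMSMessages_OddBallNetwork_InitialSend VALUES (1, 'a', 'b', 'c', 'd');
""")

# Hypothetical mapping: the code must know which table belongs to which command,
# and it grows by one entry (plus one table) for every new command.
PARAM_TABLES = {
    "InitialSend": "SMSMessages_OddBallNetwork_InitialSend",
    "AnniversarySend": "SMSMessages_OddBallNetwork_AnniversarySend",
}

def load_params(message_id):
    """Return (command, {column: value}) for a message, or None if it has no command."""
    row = conn.execute(
        "SELECT c.Command FROM SMSMessages_OddBallNetwork o "
        "JOIN SMSMessages_CommandTypes c ON c.CommandTypeID = o.CommandTypeID "
        "WHERE o.ID = ?",
        (message_id,),
    ).fetchone()
    if row is None:
        return None
    command = row[0]
    # Table name comes from the fixed mapping above, never from user input.
    cur = conn.execute(
        f"SELECT * FROM {PARAM_TABLES[command]} WHERE ID = ?", (message_id,)
    )
    names = [d[0] for d in cur.description]
    return command, dict(zip(names, cur.fetchone()))

print(load_params(1))
```

The two-step lookup (find the command, then pick the table) is the touch point that has to change, together with the schema, whenever a command is added.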

Option 2

This option is to try to create one dynamic implementation with fewer, reusable structures.

    SMSMessages_AdditionalParameterTypes
    ------------------------------------
    ParamterTypeID int PK NOT NULL Identity
    ParamterType   nvarchar(50)
    /* This table will contain all known parameters for any messages
       CommandName
       Param1
       Param2
       etc.. */

    SMSMessages_AdditionalParameters
    --------------------------------
    ID          int PK NOT NULL Identity
    MessageID   int FK --> SMS Messages
    ParamTypeID int FK --> SMSMessages_AdditionalParameterTypes
    Value       nvarchar(255)

So the pros and cons on this one.

Cons: You have less obvious visibility of which parameters belong to which messages. There is also a small performance hit ... N inserts for each message instead of two.

Pros: It is much easier to develop against (imho). You just get a list of parameter name -> value pairs back for a given message ID.

It is also reusable ... if the oddball network adds a new command, a new parameter to a command, or even if another network goes and implements a similar "I want more information" API, we don't need any structural changes to our system.
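A minimal sketch of Option 2, again using Python's sqlite3 as a stand-in for the real database, shows both the "N inserts" cost and the name -> value lookup. The table and column names (including the original spelling `ParamterType`) follow the question; `save_message`, `load_params` and the sample values are hypothetical, and the UNIQUE constraint on the type name is an assumption added so that new parameter names can be registered on the fly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SMSMessages (ID INTEGER PRIMARY KEY, Mobile TEXT);
CREATE TABLE SMSMessages_AdditionalParameterTypes (
    ParamterTypeID INTEGER PRIMARY KEY,
    ParamterType TEXT NOT NULL UNIQUE   -- UNIQUE is an assumption of this sketch
);
CREATE TABLE SMSMessages_AdditionalParameters (
    ID INTEGER PRIMARY KEY,
    MessageID INTEGER NOT NULL REFERENCES SMSMessages(ID),
    ParamTypeID INTEGER NOT NULL
        REFERENCES SMSMessages_AdditionalParameterTypes(ParamterTypeID),
    Value TEXT
);
""")

def save_message(mobile, params):
    """Insert one message row plus one row per additional parameter (the N inserts)."""
    with conn:  # one transaction for the whole message
        message_id = conn.execute(
            "INSERT INTO SMSMessages (Mobile) VALUES (?)", (mobile,)
        ).lastrowid
        for name, value in params.items():
            # Register the parameter name if it is new; no schema change needed.
            conn.execute(
                "INSERT OR IGNORE INTO SMSMessages_AdditionalParameterTypes "
                "(ParamterType) VALUES (?)",
                (name,),
            )
            type_id = conn.execute(
                "SELECT ParamterTypeID FROM SMSMessages_AdditionalParameterTypes "
                "WHERE ParamterType = ?",
                (name,),
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO SMSMessages_AdditionalParameters "
                "(MessageID, ParamTypeID, Value) VALUES (?, ?, ?)",
                (message_id, type_id, value),
            )
    return message_id

def load_params(message_id):
    """Return the parameter name -> value mapping for one message."""
    rows = conn.execute(
        "SELECT t.ParamterType, p.Value "
        "FROM SMSMessages_AdditionalParameters p "
        "JOIN SMSMessages_AdditionalParameterTypes t "
        "  ON t.ParamterTypeID = p.ParamTypeID "
        "WHERE p.MessageID = ?",
        (message_id,),
    )
    return dict(rows)

message_id = save_message(
    "0000000000", {"CommandName": "InitialSend", "Param1": "a", "Param6": "b"}
)
print(load_params(message_id))
```

Note that every value ends up as text in a single `Value` column, so type checking and "which parameters are required for which command" have to live in application code.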

SO ... What would you do?

+4
6 answers

Why

Why do you need to store this information? Is it reported on? Searched? Used for summaries and categorical analysis? Currently? Frequently?

Do less

If this data is only there for logging, stick it in a text field or an XML field and forget about it. YAGNI (you aren't gonna need it) seems likely ...
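As a rough sketch of the "one text field" approach, the extra parameters can be serialized into a single column on the existing table. Everything specific here is an assumption for illustration: the `ExtraData` column, the helper names, and the choice of JSON as the text format (an XML string would serve equally well).

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# ExtraData is a hypothetical column holding the network-specific parameters as text.
conn.execute(
    "CREATE TABLE SMSMessages (ID INTEGER PRIMARY KEY, Mobile TEXT, ExtraData TEXT)"
)

def log_message(mobile, extra):
    """Store the message with all additional parameters in one text value."""
    payload = json.dumps(extra, sort_keys=True)
    return conn.execute(
        "INSERT INTO SMSMessages (Mobile, ExtraData) VALUES (?, ?)",
        (mobile, payload),
    ).lastrowid

def read_extra(message_id):
    """Decode the stored parameters, or return None for an unknown message."""
    row = conn.execute(
        "SELECT ExtraData FROM SMSMessages WHERE ID = ?", (message_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

message_id = log_message("0000000000", {"Command": "InitialSend", "Param1": "a"})
print(read_extra(message_id))
```

This keeps writes to a single insert, at the price that the database cannot constrain or efficiently filter on the individual parameters.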

Seriously

Without knowing what the data is used for, no one can answer this question, including you.

Yes, a fully normalized logical database structure is great and provides clarity, etc. But is it useful?

Not all data is gold; some is just CYA.

+3

Option 2 wins for me. On the performance issue, you would only have the extra inserts for the oddball network, and that is going to be an issue either way. As for visibility, I think it is a matter of perception: after some time working with the new system, it will probably become second nature to see the parameter requirements of each message.

+2

This sounds like a data-logging situation that only has to be "good enough" for CYA purposes. A fully normalized model is nice and proper, but it seems like too much for what you will be doing. If you only need a few canned queries in case some auditors come to visit, those queries do not sound like they need sub-second response times.

Accordingly, I second the previous answers.

+2

I would create the following tables

    SMSMessages
    SMSMessages_AdditionalParameterTypes(ParamterTypeID, ParamterType, Operator)
    SMSMessages_Parameters(MessageID, ParamTypeID, Value)

+1

"Obviously, it would be a terrible DB Design to just add all these extra columns at the end of our existing table"

I don't think that is awful design: it will certainly keep your code as simple as it can be. Use good names for the columns, not "param1" or something like that.

A foreign key relationship for the command type is a good idea (basically the SQL equivalent of an enum).

Putting regular columns into another table (so-called dynamic columns) adds a lot of complexity. In practice, it is almost never worth it unless you intend to allow end users to add dynamic columns.

Most importantly, once you have created the design, write some sample queries for common tasks. This usually helps to figure out how difficult it really is.
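Following that advice, here is a small sketch of the "just add named columns" design with a couple of sample queries, run against SQLite from Python. The column names `SubscriptionRef` and `AnniversaryDate`, the `CommandTypes` table name and the sample rows are all invented for illustration, since the question only gives placeholder names like Param1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CommandTypes (
    CommandTypeID INTEGER PRIMARY KEY,
    Command TEXT NOT NULL UNIQUE
);
CREATE TABLE SMSMessages (
    ID INTEGER PRIMARY KEY,
    Mobile TEXT NOT NULL,
    DateSent TEXT NOT NULL,
    -- NULL for networks that do not use commands
    CommandTypeID INTEGER REFERENCES CommandTypes(CommandTypeID),
    SubscriptionRef TEXT,   -- hypothetical, descriptively named parameter
    AnniversaryDate TEXT    -- hypothetical, only set for AnniversarySend
);
INSERT INTO CommandTypes VALUES (1, 'InitialSend'), (2, 'AnniversarySend');
INSERT INTO SMSMessages VALUES
    (1, '0000000001', '2009-01-01', 1, 'sub-1', NULL),
    (2, '0000000002', '2009-06-01', 2, 'sub-2', '2009-06-01'),
    (3, '0000000003', '2009-07-01', 2, 'sub-3', '2009-07-01'),
    (4, '0000000004', '2009-07-02', NULL, NULL, NULL);
""")

# Sample query 1: how many messages were sent per command?
per_command = conn.execute(
    "SELECT c.Command, COUNT(*) "
    "FROM SMSMessages m "
    "JOIN CommandTypes c ON c.CommandTypeID = m.CommandTypeID "
    "GROUP BY c.Command ORDER BY c.Command"
).fetchall()

# Sample query 2: which messages carry a given anniversary date?
anniversary_ids = [
    row[0]
    for row in conn.execute(
        "SELECT ID FROM SMSMessages WHERE AnniversaryDate = ? ORDER BY ID",
        ("2009-06-01",),
    )
]

print(per_command)
print(anniversary_ids)
```

Writing the same two queries against each candidate schema is a quick way to compare how much work each design pushes into the SQL.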

+1

Personally, option 2 makes me shudder, as I'm sure it does your DBAs. Entity-value tables are the worst possible way to store data if you need to query it. This is usually not a "small" performance hit but a "major" one from your DBAs' point of view. DBAs are always getting stuck with the poorly designed systems that developers create this way because it looks more object-oriented and understandable to them. That's why we hate it when people propose these types of designs.
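To illustrate why querying an entity-value table gets awkward, here is a small sketch (SQLite via Python, with the Option 2 table names from the question) that finds messages matching several parameter values at once. The helper `find_messages` and the sample rows are hypothetical; the point is only that each additional criterion adds another pair of joins, where a column-per-parameter table would need one more condition in the WHERE clause. It makes no claim about actual timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SMSMessages_AdditionalParameterTypes (
    ParamterTypeID INTEGER PRIMARY KEY,
    ParamterType TEXT NOT NULL
);
CREATE TABLE SMSMessages_AdditionalParameters (
    ID INTEGER PRIMARY KEY,
    MessageID INTEGER NOT NULL,
    ParamTypeID INTEGER NOT NULL,
    Value TEXT
);
INSERT INTO SMSMessages_AdditionalParameterTypes VALUES (1, 'Param1'), (9, 'Param9');
INSERT INTO SMSMessages_AdditionalParameters (MessageID, ParamTypeID, Value) VALUES
    (1, 1, 'x'), (1, 9, 'y'),
    (2, 1, 'x'), (2, 9, 'z');
""")

def find_messages(criteria):
    """Return IDs of messages matching every (parameter name -> value) criterion.

    Each criterion needs its own join to the parameters table and the types table.
    """
    joins, args = [], []
    for i, (name, value) in enumerate(criteria.items()):
        joins.append(
            f"JOIN SMSMessages_AdditionalParameters p{i} "
            f"  ON p{i}.MessageID = m.MessageID AND p{i}.Value = ? "
            f"JOIN SMSMessages_AdditionalParameterTypes t{i} "
            f"  ON t{i}.ParamterTypeID = p{i}.ParamTypeID AND t{i}.ParamterType = ?"
        )
        args += [value, name]  # same order as the placeholders above
    sql = (
        "SELECT DISTINCT m.MessageID FROM SMSMessages_AdditionalParameters m "
        + " ".join(joins)
        + " ORDER BY m.MessageID"
    )
    return [row[0] for row in conn.execute(sql, args)]

print(find_messages({"Param1": "x", "Param9": "y"}))
```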

Designing a database for the convenience of programmers is short-sighted and unprofessional. Databases must be designed to ensure integrity, performance, and security. The time it takes to develop against it is a distant, distant fourth. Databases are not refactored as easily as application code; you may be stuck with this design for the next twenty years.

Now the real question is: do you need to query this data (other than returning the values in a query based on some other criteria)? If you do not need to query it, or only query it very rarely, then just put all the additional data in a varchar(max) field and be done with it.

+1

Source: https://habr.com/ru/post/1300752/

