SQL architecture in relation to (supercomplex?) Schemas

Question

SQL architecture in relation to (supercomplex?) Schemas

I am currently developing a database for a new section of our product. Why do I need to have a “sanity check” or some advice because I don’t feel overly confident in some parts of the setup.

Some background information

The product we are developing is the so-called "ROI maximization marketing system". It processes big data and processes / improves / enriches a huge amount of information before sending it through different marketing channels. This is basically what he does in a nutshell.

Problem area

Currently, the system does not fully perform good data validation and is subjected to “abuse” daily, “advertising” people and “self-service” customers. Given that the new Google Product Listing Product Network is referring to our CEO, I was tasked with finding a good solution on how to process {information / data} for use on the google shopping channel (call it PLA; ads )

This is the problem :
Our product does not offer any form of verification (read: adhering to the specific requirements of the network), and the PLA basically revolves around data integrity by categorizing elements (each category identified mandatory / optional fields), and each field can or should be in a specific format (may even depend on a related category, I don’t know yet: P).

You guess we are a little attached to the current setting. It is not possible to enforce such “strict” products. Allowing our marketing and self-service customers to create and send data to the PLA will mean bughunting / problem in 99% of cases. And since this is just a small company, I would rather look at the real problem. It means; trying to create a real verification system that can be used for PLA marketing campaigns.

What should be done

I talked with our marketing specialists and customers to find out which use cases and what would be the requirements. They can be summarized in the following list items:

Each input feed item must be matched “categorized” with the Google PLA category (see the “links” section to see which categories you can map to.
Validation must be configured for each field for the category.
Each field for each element should be highlighted / displayed in the field defined in the selected category.

Additional additional information

Now I don’t want to worry about things like “How we will associate“ elements ”with“ category ”or“ fields ”with“ definitions of category fields ”or something like that. These“ dynamic things ”will be (why do you ask ?) The system processes / processes data on a schedule, so each operation must be defined and saved for future use), do not worry about the implementation details at the moment.

In addition, a particular concrete implementation is often implemented using dynamic attributes (for example, attributes in the field, determined by the data type, etc.). EAV is also not my primary concern right now. (the above use case will make more sense if you look at the database design).

My current design

First, let me explain my SQL structure using basic entities:

schemas an abstract way to define "categories", think of a PLA category.
fields ; field definitions (in a schema )
datatypes bag types. (mainly used to provide some data integrity over fields).
valueConstraints ; bag of definitions of constraints (not implementation!).

Now. So far, everything is good and dandy. Here's what bothers me:

valueConstraints tied to a data type using the N: M table ( datatype_valueConstraints ), but almost every user-created data type consists of only a subset of the available evaluation features, it does not make sense to have a Price data type, which may have an Email restriction. However, it makes sense to have a restriction of "Min" and "Max", since the price is always a number. For clarity: datatype_valueConstraints holds the "possible" valueConstraints for each data type.

The same problem occurs with primitiveType → constraintValue relationships. Basically, the data type should include "primitiveType" (in my case, the foreign key for the table is a primitive type). The primitive type controls valueConstraints for selection. primitiveTypes and valueConstraints not considered user created, so now they commit data.

Do not understand? Here's an example workflow ((partial) setup for the PLA / clothing scheme):

Add image data type, set {primitive type in TEXT}
- Select the following valueConstraints to use (TEXT)
  - "URL" (make sure it is http | https or something like that, dunno)
  - "MinLength" (make sure he's there)
  - "Regex" (allow certain image extensions .. or something like that)
Add the definition of the "imageURL" field, set {datatype to "image"}
- specific data configuration i.e. Filling in restriction approval data (associated with the EAV template). "MinLength" = 14, "Regex" = "* (gif | jpg | png)", etc.

Since the data type was a primitive "TEXT", the user could only select "TEXT" - a bypass (and inherited from the tree as a data generation system) valueConstraints .

Once the data type has been correctly configured, we can use the "image" of the data type for several fields in the schema (if we want). For instance; for the PLA / CLOTHING scheme, an “additional image” field may be required. This is now entirely possible by reusing the image type “image”, possibly with a different configuration of restrictions.

Visual table of SQL tables showing relationships (brain wave relative to the top of the text):

My DB schema: (click to enlarge)

TL; DR;

See "My current design" and give me your opinion. I think this may be too complicated / not very well thought out and is looking for improvement. Note. I am not a database administrator, just a developer. (Also, I'm not sure if the circuit design will make sense if you haven't read the Problem section: P)

links

I am very happy to see what you guys think. Thanks in advance!

+6

sql schema

Sebastiaan hilbers Jun 04 '13 at 11:50

source share

1 answer

Zelloss · Answer 1 · 2013-06-05T11:15:06+0000

Only personal preference: they don’t really like parental relationships within the table, if they really are not needed. I see them for the schema table, but in this case I feel that primitive types can benefit from a more strict schema by removing the BASIC type and adding basic length constraints to all primitives (no such feat in terms of space and speed). If you really need to make an additional level primitive types: do it, but add only the parent table for only one parent level and combine everything with this.

Of course: it lacks flexibility, but in my experience it’s easier to add data that matches this scheme, and you don’t need to change the code to extract the conditions (and the code will be a simple request) than to allow unlimited levels of nesting primitives that you are not going to use or worse: you will abuse :)

SQL architecture in relation to (supercomplex?) Schemas

Some background information

Problem area

What should be done

Additional additional information

My current design

TL; DR;

links

More articles: