SQL join following a foreign key: statically check that the LHS is key-preserved

Often you join two tables following their foreign key, so that a matching row in the RHS table will always be found. Adding the join does not affect the number of rows returned by the query. For example:

 create table a (x int not null primary key)
 create table b (x int not null primary key, y int not null)
 alter table a add foreign key (x) references b (x)

Now, assuming you have set up some data in these two tables, you get a certain number of rows from:

 select x from a 

Adding a join to b following the foreign key does not change this:

 select a.x from a join b on a.x = b.x

However, this is not true of joins in general, which can filter out some rows or (through a Cartesian product) add more:

 select a.x from a join b on a.x = b.x and b.y != 42 -- probably gives fewer rows
 select a.x from a join b on a.x != b.y -- probably gives more rows
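To make the row counts concrete, here is a minimal sketch that runs the four queries above against sample data. It assumes an in-memory SQLite database as a stand-in for SQL Server, and the inserted rows and the `row_count` helper are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table b (x int not null primary key, y int not null);
    create table a (x int not null primary key references b (x));
    -- Illustrative data, not from the question.
    insert into b values (1, 10), (2, 42), (3, 30);
    insert into a values (1), (2);
""")

def row_count(sql):
    """Return how many rows the query produces."""
    return len(conn.execute(sql).fetchall())

base = row_count("select x from a")
fk_join = row_count("select a.x from a join b on a.x = b.x")
filtered = row_count("select a.x from a join b on a.x = b.x and b.y != 42")
widened = row_count("select a.x from a join b on a.x != b.y")

print(base, fk_join, filtered, widened)  # prints: 2 2 1 6
```

With this data the join along the foreign key keeps the count at 2, the extra predicate drops it to 1, and the non-key inequality join expands it to 6.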

When reading SQL code, there is no obvious way to tell whether a join is of the key-preserving kind, which may add extra columns but does not change the number of rows returned, or whether it has other effects. Over time I have developed a coding convention, which I mostly stick to:

  • if it is a key-preserving join, use join
  • if you want to filter rows, put the filter condition in the where clause
  • if you want more rows, an explicit cross join for a Cartesian product is sometimes the clearest way

These are usually only matters of style, since you can often put a predicate in either the join clause or the where clause, for example.

My question

Is there a way to have these key-preserving joins statically checked by the database server when the query is compiled? I understand that the query optimizer already knows that a join on a foreign key will always find exactly one row in the table the foreign key points to. But I would like to mark it in my SQL code for the benefit of human readers. For example, suppose a new syntax fkjoin were used for a join following a foreign key. Then the following SQL fragments would either raise errors or not:

 a fkjoin b on a.x = b.x -- OK
 a fkjoin b on a.x = b.x and b.y = 42 -- "Error, join can fail due to extra predicate"
 a fkjoin b on a.x = b.y -- "Error, no foreign key from a.x to b.y"

This would be a useful check for me when writing SQL, and also when coming back to read it later. I understand and accept that changing the foreign keys in the database would change which SQL is legal under this scheme. For me that is the desired outcome, because if the necessary FK ceases to exist, the key-preserving semantics of the query are no longer guaranteed, and I would want to know about it.

Alternatively, there might be some external static SQL validation tool that does the job, and a special comment syntax could be used instead of a new keyword. The validation tool would need access to the database schema to find out which foreign keys exist, but it would not need to actually execute the query.

Is there something that does what I want? I am using MSSQL 2008 R2. (Microsoft SQL Server, for the pedantic.)
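As a rough illustration of what such an external checker could look like, here is a sketch that reads foreign key metadata only and never runs the query. Everything in it is an assumption: SQLite (via `pragma foreign_key_list`) stands in for SQL Server, the join is passed in already parsed as column pairs rather than extracted from SQL text, and the names `foreign_keys` and `check_fkjoin` are hypothetical.

```python
import sqlite3

def foreign_keys(conn, child):
    """List each FK of `child` as (parent_table, frozenset of (child_col, parent_col))."""
    fks = {}
    # pragma foreign_key_list yields rows: id, seq, table, from, to, ...
    for fk_id, _seq, parent, col_from, col_to, *_ in conn.execute(
            f"pragma foreign_key_list({child})"):
        fks.setdefault(fk_id, (parent, set()))[1].add((col_from, col_to))
    return [(parent, frozenset(cols)) for parent, cols in fks.values()]

def check_fkjoin(conn, child, parent, equalities, extra_predicates=()):
    """Return None if the join follows a foreign key exactly, else an error string."""
    if extra_predicates:
        return "Error, join can fail due to extra predicate"
    if (parent, frozenset(equalities)) not in foreign_keys(conn, child):
        return f"Error, no foreign key from {child} to {parent} on {sorted(equalities)}"
    return None

# Schema only; no data is needed because the check is purely static.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table b (x int not null primary key, y int not null);
    create table a (x int not null primary key references b (x));
""")

ok = check_fkjoin(conn, "a", "b", [("x", "x")])                 # a.x = b.x
extra = check_fkjoin(conn, "a", "b", [("x", "x")], ["b.y = 42"])  # extra predicate
wrong = check_fkjoin(conn, "a", "b", [("x", "y")])              # a.x = b.y
print(ok, extra, wrong, sep="\n")
```

On SQL Server the same metadata would presumably come from catalog views such as sys.foreign_keys and sys.foreign_key_columns. The sketch also ignores nullability: a nullable foreign key column would let the inner join drop rows, so a fuller check would need to require not null as well.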

1 answer

I gather that you are interested in whether a particular join is on certain columns in a FK, or is a restriction, or perhaps some other case or none of the above. (And it is not clear what you mean by the “success” or “failure” of a join, or its relevance.) But focusing on that information, as explained below, means not focusing on more important and fundamental things.

A base table has a "meaning" or "predicate (expression)", which is a fill-in-the-(named-)blanks statement given by the database administrator. The names of the statement's blanks are the columns of the table. Rows that fill in the blanks to make a true proposition about the world go into the table. Rows that fill in the blanks to make a false proposition about the world are left out. I.e. a table holds the rows that satisfy its statement. You cannot set a base table to a particular value without knowing its statement, observing the world and putting the appropriate rows into the table. You cannot learn about the world from base tables except by knowing their statements and taking the propositions of present rows to be true and those of absent rows to be false. I.e. you need the statements in order to use the database.

Note that the typical syntax for declaring a table looks like a shorthand for its statement:

 -- employee [eid] is named [name] and lives at [address] in ...
 EMPLOYEE(eid,name,address,...)

You can make bigger statements by putting the logical operators AND, OR, AND NOT, EXISTS name, AND condition, etc. between or around other statements. If you translate a statement into a relational/SQL expression by converting

  • a table's statement to its name
  • AND to JOIN
  • OR to UNION
  • AND NOT to EXCEPT / MINUS
  • EXISTS C, ... [...] to SELECT all columns but C,... FROM ...
  • AND condition to ON/WHERE condition
  • IMPLIES to SUBSETOF
  • IFF to =

then you get a relational expression that calculates the rows that make the statement true. (The arguments of UNION and EXCEPT / MINUS need the same columns.) So just as every table holds the rows that satisfy its statement, a query expression holds the rows that satisfy its statement. You cannot learn about the world from a query result except by knowing its statement and taking the propositions of present rows to be true and those of absent rows to be false. I.e. you need its statement in order to compose or interpret a query. (Note that this is true regardless of what constraints hold.)
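The translation rules above can be tried out on a small example. This is only a sketch under assumptions: the tables `likes` and `sells`, their statements and their rows are invented for illustration, and SQLite is used so that the snippet runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- person [p] likes food [f]
    create table likes (p text, f text);
    -- shop [s] sells food [f]
    create table sells (s text, f text);
    insert into likes values ('ann', 'tea'), ('bob', 'jam');
    insert into sells values ('acme', 'tea'), ('zed', 'tea');
""")

def rows(sql):
    """Run a query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())

# AND -> JOIN: "person [p] likes food [f] AND shop [s] sells food [f]"
both = rows("select p, f, s from likes natural join sells")

# EXISTS s -> drop column s: "[p] likes [f] AND EXISTS s: shop s sells [f]"
can_buy = rows("select distinct p, f from likes natural join sells")

# AND NOT -> EXCEPT: "[p] likes [f] AND NOT ([p] likes [f] AND EXISTS s: shop s sells [f])"
cannot_buy = rows(
    "select p, f from likes "
    "except select distinct p, f from likes natural join sells")

print(both)        # [('ann', 'tea', 'acme'), ('ann', 'tea', 'zed')]
print(can_buy)     # [('ann', 'tea')]
print(cannot_buy)  # [('bob', 'jam')]
```

Each result is read through its statement: the row ('bob', 'jam') in the last result says that bob likes jam and no shop sells it. No constraint was declared on either table, and the readings hold anyway.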

This is the foundation of the relational model: table expressions calculate the rows that satisfy the corresponding statements. (To the extent that SQL differs, it is literally illogical.)

For example: if table T holds the rows that make the statement T(...,T.Ci,...) true, and table U holds the rows that make the statement U(...,U.Cj,...) true, then the table T JOIN U holds the rows that make the statement T(...,T.Ci,...) AND U(...,U.Cj,...) true. This is the semantics of JOIN that matters for using a database. You can always join, and a join always has a meaning, and it is always the AND of the meanings of its operands. Whether any tables happen to have FKs to others is just not particularly helpful for reasoning about updates or queries. (The DBMS uses constraints to reject erroneous updates.)

A constraint expression simply corresponds to a proposition, aka an always-true statement, about the world and simultaneously to one about the base tables. For example, for C UNIQUE NOT NULL in U, the following three expressions are equivalent to each other:

  • FOREIGN KEY T (C) REFERENCES U (C)
  • EXISTS columns other than C T(...,C,...)
    IMPLIES EXISTS columns other than C U(...,C,...)
  • (SELECT C FROM T) SUBSETOF (SELECT C FROM U)

It is true that this means that SELECT C FROM T JOIN U ON T.C = U.C = SELECT C FROM T, i.e. a join on a FK returns the same number of rows. But so what? The meaning of the join is still the same function of its arguments.

Whether a particular join on a particular set of columns is on a foreign key is simply not relevant to understanding the meaning of the query.
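A small check of that equivalence, as a sketch with assumptions: the tables `t` and `u`, their extra columns and their rows are invented, SQLite stands in for the DBMS, and EXCEPT is used to express SUBSETOF (the subset holds exactly when the difference is empty).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table u (c int not null unique, d text);
    create table t (c int not null references u (c), e text);
    -- Illustrative rows; every t.c value also appears in u.c.
    insert into u values (1, 'p'), (2, 'q'), (3, 'r');
    insert into t values (1, 'x'), (1, 'y'), (3, 'z');
""")

# (SELECT C FROM T) SUBSETOF (SELECT C FROM U) holds iff this difference is empty.
violations = conn.execute("select c from t except select c from u").fetchall()

# Joining along the FK returns exactly the C values of T, duplicates included.
joined = sorted(conn.execute("select t.c from t join u on t.c = u.c").fetchall())
plain = sorted(conn.execute("select c from t").fetchall())

print(violations)  # []
print(joined)      # [(1,), (1,), (3,)]
print(plain)       # [(1,), (1,), (3,)]
```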


Source: https://habr.com/ru/post/1241581/

