Join tables if link exists

Question

I have a PostgreSQL database with 4 tables:

Table A

    ---------------------------
    | ID | ID_B | ID_C | ID_D |
    ---------------------------
    | 1  | 1    | NULL | NULL |
    | 2  | NULL | 1    | NULL |
    | 3  | 2    | 2    | 1    |
    | 4  | NULL | NULL | 2    |
    ---------------------------

Table B

    -------------
    | ID | DATA |
    -------------
    | 1  | 123  |
    | 2  | 456  |
    -------------

Table C

    -------------
    | ID | DATA |
    -------------
    | 1  | 789  |
    | 2  | 102  |
    -------------

Table D

    -------------
    | ID | DATA |
    -------------
    | 1  | 654  |
    | 2  | 321  |
    -------------

I am trying to get a result set that appends the data from table B and the data from table C, but only for rows where at least one of the corresponding IDs is not NULL.

    SELECT "Table_A"."ID",
           "Table_A"."ID_B",
           "Table_A"."ID_C",
           "Table_A"."ID_D",
           "Table_B"."DATA",
           "Table_C"."DATA"
    FROM "Table_A"
    LEFT JOIN "Table_B" ON "Table_A"."ID_B" = "Table_B"."ID"
    LEFT JOIN "Table_C" ON "Table_A"."ID_C" = "Table_C"."ID"
    WHERE "Table_A"."ID_B" IS NOT NULL
       OR "Table_A"."ID_C" IS NOT NULL;

Is this recommended, or am I better off splitting it into multiple queries?

Is there a way to make an inner join between these tables?

Expected Result:

    -------------------------------------------------
    | ID | ID_B | ID_C | ID_D | DATA (B) | DATA (C) |
    -------------------------------------------------
    | 1  | 1    | NULL | NULL | 123      | NULL     |
    | 2  | NULL | 1    | NULL | NULL     | 789      |
    | 3  | 2    | 2    | NULL | 456      | 102      |
    -------------------------------------------------

EDIT: ID_B, ID_C and ID_D are foreign keys referencing table_b, table_c and table_d.

+6

sql join select postgresql

wiizzard May 25, '13 at 9:25

4 answers

Given your requirements, your query looks fine to me.

An alternative would be to use nested selects in the projection. Depending on your data, indexes and constraints, that can be slower, since nested selects usually result in nested loops, whereas joins can be performed as merge joins or nested loops.

    SELECT "Table_A"."ID",
           "Table_A"."ID_B",
           "Table_A"."ID_C",
           "Table_A"."ID_D",
           (SELECT "DATA" FROM "Table_B" WHERE "Table_A"."ID_B" = "Table_B"."ID"),
           (SELECT "DATA" FROM "Table_C" WHERE "Table_A"."ID_C" = "Table_C"."ID")
    FROM "Table_A"
    WHERE "Table_A"."ID_B" IS NOT NULL
       OR "Table_A"."ID_C" IS NOT NULL;

If Postgres performs scalar subquery caching (as Oracle does), then nested selects might help when the referenced values repeat many times in Table_A.
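To check that the two formulations return the same rows, here is a minimal sketch. It is an assumption-laden stand-in: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of PostgreSQL, purely so it runs without a server, and the table aliases are mine. It says nothing about which form is faster in Postgres.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; only result equivalence is checked.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Table_B" ("ID" INTEGER PRIMARY KEY, "DATA" INTEGER);
CREATE TABLE "Table_C" ("ID" INTEGER PRIMARY KEY, "DATA" INTEGER);
CREATE TABLE "Table_A" ("ID" INTEGER PRIMARY KEY,
                        "ID_B" INTEGER, "ID_C" INTEGER, "ID_D" INTEGER);
INSERT INTO "Table_B" VALUES (1, 123), (2, 456);
INSERT INTO "Table_C" VALUES (1, 789), (2, 102);
INSERT INTO "Table_A" VALUES (1, 1, NULL, NULL), (2, NULL, 1, NULL),
                             (3, 2, 2, 1), (4, NULL, NULL, 2);
""")

# Variant 1: the LEFT JOIN query from the question.
join_sql = """
SELECT a."ID", a."ID_B", a."ID_C", a."ID_D", b."DATA", c."DATA"
FROM "Table_A" a
LEFT JOIN "Table_B" b ON a."ID_B" = b."ID"
LEFT JOIN "Table_C" c ON a."ID_C" = c."ID"
WHERE a."ID_B" IS NOT NULL OR a."ID_C" IS NOT NULL
ORDER BY a."ID"
"""

# Variant 2: scalar subqueries in the projection.
subquery_sql = """
SELECT a."ID", a."ID_B", a."ID_C", a."ID_D",
       (SELECT "DATA" FROM "Table_B" b WHERE a."ID_B" = b."ID"),
       (SELECT "DATA" FROM "Table_C" c WHERE a."ID_C" = c."ID")
FROM "Table_A" a
WHERE a."ID_B" IS NOT NULL OR a."ID_C" IS NOT NULL
ORDER BY a."ID"
"""

join_rows = conn.execute(join_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()
print(join_rows == subquery_rows)
```

On the sample data both variants return rows 1, 2 and 3 and skip row 4, which has neither ID_B nor ID_C set.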

+2

Lukas Eder May 25 '13 at 9:32

Since you have foreign key constraints, referential integrity is guaranteed and the query in your question is already the best answer.

Indexes on Table_B.ID and Table_C.ID are a given, since foreign keys must reference a primary key or unique constraint.

If matching rows in Table_A are rare (less than ~5%, depending on row width and data distribution), a partial multi-column index will help performance:

    CREATE INDEX table_a_special_idx ON "Table_A" ("ID_B", "ID_C")
    WHERE "ID_B" IS NOT NULL OR "ID_C" IS NOT NULL;

In PostgreSQL 9.2 or later, a covering index (an index-only scan in Postgres terms) may help even more. In that case you would include all columns of interest in the index (not done in my example). Whether it pays off depends on several factors, such as row width and how frequently the table is updated.
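Before creating the partial index it may be worth measuring how rare the matching rows really are. The sketch below is only an illustration under assumptions: it runs on an in-memory SQLite database through Python's sqlite3 module (SQLite happens to accept the same partial-index statement), and the 96 padding rows are made up so that 3% of the rows match. It does not show what the PostgreSQL planner would do with the index.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; planner behaviour will differ.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE "Table_A" ("ID" INTEGER PRIMARY KEY,
                        "ID_B" INTEGER, "ID_C" INTEGER, "ID_D" INTEGER)
""")
# The four sample rows from the question ...
conn.executemany(
    'INSERT INTO "Table_A" VALUES (?, ?, ?, ?)',
    [(1, 1, None, None), (2, None, 1, None), (3, 2, 2, 1), (4, None, None, 2)],
)
# ... plus hypothetical padding rows without any link to B or C.
conn.executemany(
    'INSERT INTO "Table_A" VALUES (?, NULL, NULL, NULL)',
    [(i,) for i in range(5, 101)],
)

predicate = '"ID_B" IS NOT NULL OR "ID_C" IS NOT NULL'
total = conn.execute('SELECT COUNT(*) FROM "Table_A"').fetchone()[0]
matching = conn.execute(
    f'SELECT COUNT(*) FROM "Table_A" WHERE {predicate}'
).fetchone()[0]
fraction = matching / total
print(f"{matching} of {total} rows match ({fraction:.0%})")

# The partial index from the answer, with the same predicate as the query.
conn.execute(
    f'CREATE INDEX table_a_special_idx ON "Table_A" ("ID_B", "ID_C") '
    f'WHERE {predicate}'
)
index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```

The index predicate has to match the WHERE clause of the query for the planner to consider the index at all, which is why the same `predicate` string is reused for both.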

+2

Erwin Brandstetter May 25 '13 at 18:38

As a rule, the recommended way is to do this in a single query and let the database do as much of the work as possible, especially if you later add other operations such as sorting (ORDER BY) or pagination (LIMIT ... OFFSET ...). We made some measurements, and in Java/Scala there is no way to sort or paginate faster if you use higher-level collections such as lists.

RDBMSs do very well with single, complex statements, but they have difficulty handling many small queries. For example, if you query both the "one" side and the "many" side of a relationship in a single query, it will be faster than doing it in 1 + n select statements.

As for the outer joins, we made measurements, and there is no real performance degradation compared to inner joins. So if your data model and/or your query requires an outer join, just do it. If it does turn out to be a performance problem, you can tune it later.

As for your null comparisons, they might indicate that your data model could be optimized, but that is just a guess. Chances are you can improve the design so that NULL is not allowed in these columns.
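To make the "1 + n" pattern concrete, here is a small sketch, again using Python's sqlite3 module and an in-memory database as a stand-in for PostgreSQL. The function names are hypothetical, and the statement counter only illustrates the number of round trips; it measures no timings.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; only statement counts are shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Table_B" ("ID" INTEGER PRIMARY KEY, "DATA" INTEGER);
CREATE TABLE "Table_A" ("ID" INTEGER PRIMARY KEY,
                        "ID_B" INTEGER, "ID_C" INTEGER, "ID_D" INTEGER);
INSERT INTO "Table_B" VALUES (1, 123), (2, 456);
INSERT INTO "Table_A" VALUES (1, 1, NULL, NULL), (2, NULL, 1, NULL),
                             (3, 2, 2, 1), (4, NULL, NULL, 2);
""")


def fetch_one_plus_n(conn):
    """One query for the Table_A rows, then one lookup per row (1 + n)."""
    statements = 1
    a_rows = conn.execute(
        'SELECT "ID", "ID_B" FROM "Table_A" '
        'WHERE "ID_B" IS NOT NULL ORDER BY "ID"'
    ).fetchall()
    result = []
    for a_id, id_b in a_rows:
        statements += 1
        b_row = conn.execute(
            'SELECT "DATA" FROM "Table_B" WHERE "ID" = ?', (id_b,)
        ).fetchone()
        result.append((a_id, b_row[0] if b_row else None))
    return result, statements


def fetch_single_query(conn):
    """The same data in one statement, letting the database do the join."""
    rows = conn.execute(
        'SELECT a."ID", b."DATA" FROM "Table_A" a '
        'LEFT JOIN "Table_B" b ON a."ID_B" = b."ID" '
        'WHERE a."ID_B" IS NOT NULL ORDER BY a."ID"'
    ).fetchall()
    return rows, 1


n_plus_one_rows, n_plus_one_statements = fetch_one_plus_n(conn)
single_rows, single_statements = fetch_single_query(conn)
print(n_plus_one_statements, single_statements)
```

Both functions return the same rows; the first one needs one statement per matching Table_A row on top of the initial query, so the gap grows with the size of the result.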

0

Beryllium May 25 '13 at 11:16

wildplasser · Accepted Answer · 2013-05-25T11:01:30+0000

The clause WHERE "Table_A"."ID_B" IS NOT NULL OR "Table_A"."ID_C" IS NOT NULL; can be replaced by the corresponding clause on tables B and C: WHERE "Table_B"."ID" IS NOT NULL OR "Table_C"."ID" IS NOT NULL;. This also works if table_a.id_b and table_a.id_c are not foreign keys into tables B and C. With the original clause and no foreign keys, a table_a row with {5, 5, 5, 5} would still be returned, with NULL data from both B and C.

    SELECT ta."ID"   AS a_id
         , ta."ID_B" AS b_id
         , ta."ID_C" AS c_id
         , ta."ID_D" AS d_id
         , tb."DATA" AS bdata
         , tc."DATA" AS cdata
    FROM "Table_A" ta
    LEFT JOIN "Table_B" tb ON ta."ID_B" = tb."ID"
    LEFT JOIN "Table_C" tc ON ta."ID_C" = tc."ID"
    WHERE tb."ID" IS NOT NULL
       OR tc."ID" IS NOT NULL;
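The difference between the two WHERE clauses only shows up when a reference is dangling. A minimal sketch of that case, assuming an in-memory SQLite database (via Python's sqlite3 module) in place of PostgreSQL, tables deliberately created without foreign key constraints, and the {5, 5, 5, 5} row from above added to the sample data:

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; no FOREIGN KEY constraints are
# declared, so Table_A may reference IDs that do not exist in B or C.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Table_B" ("ID" INTEGER PRIMARY KEY, "DATA" INTEGER);
CREATE TABLE "Table_C" ("ID" INTEGER PRIMARY KEY, "DATA" INTEGER);
CREATE TABLE "Table_A" ("ID" INTEGER PRIMARY KEY,
                        "ID_B" INTEGER, "ID_C" INTEGER, "ID_D" INTEGER);
INSERT INTO "Table_B" VALUES (1, 123), (2, 456);
INSERT INTO "Table_C" VALUES (1, 789), (2, 102);
INSERT INTO "Table_A" VALUES (1, 1, NULL, NULL), (2, NULL, 1, NULL),
                             (3, 2, 2, 1), (4, NULL, NULL, 2),
                             (5, 5, 5, 5);  -- dangling references
""")

query = """
SELECT ta."ID", tb."DATA", tc."DATA"
FROM "Table_A" ta
LEFT JOIN "Table_B" tb ON ta."ID_B" = tb."ID"
LEFT JOIN "Table_C" tc ON ta."ID_C" = tc."ID"
WHERE {predicate}
ORDER BY ta."ID"
"""

# Filter on the referencing columns of Table_A (as in the question).
filter_on_a = conn.execute(
    query.format(predicate='ta."ID_B" IS NOT NULL OR ta."ID_C" IS NOT NULL')
).fetchall()

# Filter on the joined tables (as in this answer).
filter_on_bc = conn.execute(
    query.format(predicate='tb."ID" IS NOT NULL OR tc."ID" IS NOT NULL')
).fetchall()

print(filter_on_a)
print(filter_on_bc)
```

With the filter on Table_A, row 5 comes back with NULL in both DATA columns; with the filter on B and C it is dropped. When the foreign keys are enforced, such a row cannot exist and the two clauses return the same result.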
