Preventing Primary Key Collisions Between Separate MySQL Databases

I have several servers, each running its own instance of a particular MySQL database, which unfortunately cannot be set up with replication or a cluster. Each server inserts data into several user-related tables that have foreign key constraints between them (for example, user, user_vote). This is how the process goes:

  • All servers start with the same data.
  • Each server generates its own data independently of the other servers.
  • The data from all servers is merged manually and applied back to each server (so the process then repeats from the first step).

This is possible because, in addition to its primary key, the user table contains a unique email field. That field lets me identify which users already exist in each database and merge the new ones, changing primary and foreign keys along the way to avoid collisions and keep the foreign key constraints valid. It works, but it is rather complicated precisely because the primary and foreign keys have to be changed, so my question is:

Is there a way for each server to use primary keys that do not collide with those of the other servers, to make merging easier?

At first I wanted to use a composite primary key (for example, server_id, id), but I use Doctrine, which does not support primary keys composed of several foreign keys, so I would run into problems with the foreign key constraints.

I thought about using a VARCHAR id with part of the string as a prefix (SERVER1-1, SERVER1-2, SERVER2-1, SERVER2-2 ...), but I think that would make the DB slower, because I would have to manipulate the identifiers (for example, on every insert I would have to parse the existing ids, extract the maximum, increment it, and combine it with the server identifier ...).

PS: Another option is to set up replication, reading from the slaves and writing to the master, but this was discarded because of problems such as replication lag and the single point of failure on the master server, which cannot be resolved yet.

2 answers

You can make sure that each server uses its own auto-increment starting offset together with a common increment step that is at least as large as the number of servers, by changing the step that auto_increment fields increment by (assuming you use auto-increments).

I have used this with only two servers, so in my setup one server generated even ids and the other generated odd ids.

When the data is merged, nothing will collide as long as you make sure that all the tables follow the scheme above.

For an implementation with 4 servers, you would, say, set the following offsets:

  • Server 1 = 1
  • Server 2 = 2
  • Server 3 = 3
  • Server 4 = 4

You would set the increment as follows (I used 10 to leave room for additional servers):

  • Server 1 = 10
  • Server 2 = 10
  • Server 3 = 10
  • Server 4 = 10
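
In MySQL these two settings correspond to the system variables auto_increment_increment and auto_increment_offset. A minimal sketch for server 2, assuming the values from the lists above (to survive a restart, the same values would also have to go into that server's option file):

```sql
-- Run on server 2; the other servers use offset 1, 3 and 4 respectively.
SET GLOBAL auto_increment_increment = 10;  -- same step on every server
SET GLOBAL auto_increment_offset    = 2;   -- unique per server, must not exceed the step

-- GLOBAL values only apply to connections opened afterwards,
-- so set the session values too if the current connection will insert rows.
SET SESSION auto_increment_increment = 10;
SET SESSION auto_increment_offset    = 2;

-- Check the effective values; on an empty table new ids would be 2, 12, 22, ...
SHOW VARIABLES LIKE 'auto_increment_%';
```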

And after you merge, before copying the data back to each server, you just need to update the autoinc value of each table so that it has the correct offset again. Imagine each server has created 100 rows; the next autoinc values would be:

  • Server 1 = 1001
  • Server 2 = 1002
  • Server 3 = 1003
  • Server 4 = 1004

This is where it gets complicated with four servers. Imagine that, for some tables, no rows were inserted from a particular server. You could then end up with tables whose last autoinc id came not from server 4 but from server 2, which makes it very tricky to work out what the next autoinc value should be for any particular table.

For this reason, it is probably best to also include a column in each of your tables that records the server number when any rows are inserted.

 id | field1 | field2 | ... | server 

That way you can easily find the last autoinc value used by a particular server by running the following against any of your tables:

 SELECT MAX(id) FROM `table` WHERE `server`=4 LIMIT 0,1 
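
To see the last id of every server at once, a grouped variant of the same query could be used. This is only a sketch and assumes the `server` column suggested above exists in the table:

```sql
-- One row per server with the highest id that server has generated so far.
SELECT `server`, MAX(id) AS last_id
FROM `table`
GROUP BY `server`
ORDER BY `server`;
```

A server that has not inserted any rows into the table will simply be missing from the result.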

Using this value, you can reset the next autoinc value of each table for each server before pushing the merged data set back to that server.

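 -- information_schema is read-only; ALTER TABLE needs a literal, hence PREPARE
 SET @next = ( SELECT MAX(id) FROM `table` WHERE `server`=s )+n;
 SET @sql = CONCAT('ALTER TABLE `table` AUTO_INCREMENT = ', @next);
 PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt;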

Where s is the server number and n is the increment step, so in my example it would be 10.


The ID prefix would do the trick. As for the DB being slower, that depends on how much traffic it handles. You could also split the prefixed id into two columns, "prefix" and "id", and they can be of any type. It takes some extra logic to handle this in queries, but it may be worth evaluating.
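
A rough sketch of what the two-column variant might look like. The table and column names here (user_example, user_vote_example, server_id) are hypothetical, and the question already notes that Doctrine may have trouble with keys like this, so it is only an option if that limitation can be worked around:

```sql
-- Hypothetical parent table: the server prefix and the local id form the key.
CREATE TABLE user_example (
    server_id TINYINT UNSIGNED NOT NULL,  -- the "prefix" column
    id        INT UNSIGNED     NOT NULL,  -- local id, unique only within one server
    email     VARCHAR(190)     NOT NULL,
    PRIMARY KEY (server_id, id)
) ENGINE=InnoDB;

-- Hypothetical child table referencing both columns of the parent key.
CREATE TABLE user_vote_example (
    vote_id        INT UNSIGNED     NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_server_id TINYINT UNSIGNED NOT NULL,
    user_id        INT UNSIGNED     NOT NULL,
    FOREIGN KEY (user_server_id, user_id)
        REFERENCES user_example (server_id, id)
) ENGINE=InnoDB;
```

Because id is not an auto-increment column in this sketch, each server would have to generate its own local ids.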


Source: https://habr.com/ru/post/1448089/

