Distributed (XA) transaction performance tuning - how?

Following up on my other post, I realized that more could be said on Stack Overflow about distributed (XA) transactions and their internals. The general consensus is that distributed transactions are slow.

What are the internals of XA transactions, and how can they be tuned?

1 answer

First, let's establish a common vocabulary. There are two or more parties.

  • A Transaction Coordinator is where our business logic lives. This is the party that manages the distributed transaction.
  • A Transaction Participant (XAResource) can be any database that supports distributed transactions, or some other entity that supports the XA protocol, such as a messaging service.

Let's highlight the main API functions that are invoked during an XA transaction:

  • start(XID)
  • end(XID)
  • prepare(XID)
  • commit(XID)

The first 2 operations are visible in our source code. This is when we start a transaction, do some work, and then say commit. Once we send the commit message from the source code, the transaction coordinator and the transaction participants take over and do the rest of the work.

The XID parameter is a unique key that identifies the transaction. Each transaction coordinator and each participant can take part in more than one transaction at any time, so the XID is needed to tell them apart. The XID consists of two parts: one part identifies the global transaction, the other identifies the participant. This means that each participant in the same transaction has its own sub-identifier.

Once we reach the prepare phase, each transaction participant writes its work to its transaction log, and each participant (XAResource) votes on whether its part is OK or FAILED. Once all votes are in and all of them are OK, the transaction is committed. If the power goes down, both the transaction coordinator and the transaction participants keep their transaction logs durable and can resume their work. If one of the participants votes FAILED, a rollback is initiated.
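The flow described above can be sketched as a small in-memory simulation. This is only an illustration in Python, not a real XA implementation: `Xid`, `Participant`, `Vote` and `run_transaction` are hypothetical names, the `log` list stands in for a durable transaction log, and the coordinator here always asks every participant to prepare before deciding.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    OK = "OK"
    FAILED = "FAILED"


@dataclass(frozen=True)
class Xid:
    """Hypothetical XID: a global transaction id plus a per-participant branch id."""
    global_id: str
    branch_id: str


@dataclass
class Participant:
    """In-memory stand-in for an XAResource; the list replaces a durable log."""
    name: str
    can_commit: bool = True
    log: list = field(default_factory=list)

    def start(self, xid):
        self.log.append(("start", xid))

    def end(self, xid):
        self.log.append(("end", xid))

    def prepare(self, xid):
        # A real participant would force this record to disk before voting.
        self.log.append(("prepare", xid))
        return Vote.OK if self.can_commit else Vote.FAILED

    def commit(self, xid):
        self.log.append(("commit", xid))

    def rollback(self, xid):
        self.log.append(("rollback", xid))


def run_transaction(global_id, participants):
    """Coordinator: returns True if committed, False if rolled back."""
    # Same global id for everyone, a different branch id per participant.
    xids = {p.name: Xid(global_id, p.name) for p in participants}
    for p in participants:
        p.start(xids[p.name])
        # ... business logic against this participant would run here ...
        p.end(xids[p.name])
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(xids[p.name]) for p in participants]
    # Phase 2: commit only if every vote is OK, otherwise roll back.
    if all(v is Vote.OK for v in votes):
        for p in participants:
            p.commit(xids[p.name])
        return True
    for p in participants:
        p.rollback(xids[p.name])
    return False
```

Calling `run_transaction("tx-1", [Participant("db"), Participant("mq")])` commits on both participants, while a single participant created with `can_commit=False` causes every participant to roll back.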

Performance Implications

According to the CAP theorem, every application (functionality) sits somewhere inside the triangle defined by consistency, availability, and partition tolerance. The main problem with XA / distributed transactions is that they require extreme consistency.

This requirement leads to very high I/O activity on the network and on disk.

Disk Activity

Both the transaction coordinator and the transaction participant must maintain a transaction log. This log lives on disk, and for each transaction the information has to be forced into the log: these writes are not buffered. High concurrency therefore results in a large number of small writes being forced to disk in each transaction log. Normally, copying one 1 GB file from one hard drive to another is a very fast operation. If we split the same file into 1,000,000 tiny pieces and write them one at a time, the transfer becomes very slow.
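The difference between many forced small writes and one larger write can be tried out with a short script. This is a rough sketch in Python, not a benchmark: the function names, record count and record size are arbitrary choices for illustration, and the measured times depend heavily on the disk and file system.

```python
import os
import tempfile
import time


def write_forced(path, records):
    """Write each record and fsync after every one, like a transaction log."""
    with open(path, "wb") as f:
        for record in records:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())  # force to disk before continuing


def write_buffered(path, records):
    """Write all records in one go and force them to disk once at the end."""
    with open(path, "wb") as f:
        f.write(b"".join(records))
        f.flush()
        os.fsync(f.fileno())


def compare(record_count=200, record_size=64):
    """Time both strategies on the same data and report the results."""
    records = [b"x" * record_size for _ in range(record_count)]
    with tempfile.TemporaryDirectory() as tmp:
        forced_path = os.path.join(tmp, "forced.log")
        buffered_path = os.path.join(tmp, "buffered.log")
        t0 = time.perf_counter()
        write_forced(forced_path, records)
        t1 = time.perf_counter()
        write_buffered(buffered_path, records)
        t2 = time.perf_counter()
        sizes = (os.path.getsize(forced_path), os.path.getsize(buffered_path))
    return {"forced_s": t1 - t0, "buffered_s": t2 - t1, "sizes": sizes}
```

Both files end up with identical contents; only the number of forced writes differs, which is the cost a transaction log pays per transaction.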

The number of disk forces grows with the number of participants:

  • 1 participant is treated as a normal transaction
  • 2 participants: 5 disk forces
  • 3 participants: 7 disk forces
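The two figures above fit a simple pattern of 2 forces per participant plus 1. Note that this formula is an extrapolation from those two data points, not something stated in the source; one plausible reading is that each participant forces a prepare and a commit record and the coordinator forces its decision once. The function name `disk_forces` is hypothetical.

```python
def disk_forces(participants):
    """Assumed model: 2 forced writes per participant plus 1 for the coordinator."""
    if participants < 1:
        raise ValueError("need at least one participant")
    if participants == 1:
        # Treated as a normal (local) transaction; no count is given for it.
        return None
    return 2 * participants + 1
```

Under this assumption, a fourth participant would bring the count to 9 forced writes per transaction.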

Network Activity

To judge a distributed XA transaction, we need something to compare it with. The network activity in a normal transaction is as follows: 3 network trips (enlist the transaction, send some SQL queries, commit).

For an XA transaction it is slightly more complicated. Suppose we have 2 participants. We enlist the resources in the transaction: 2 network trips. Then we send the prepare message: 2 more trips. Then we commit: 2 more trips.

The network activity for two resources is already higher, and it grows further with every participant you enlist in the transaction.
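The trip counting for the XA case can be written down as a small helper. It assumes, following the example with 2 participants, that each of the three phases costs one network trip per participant and that the trips for the actual SQL work are left out of the count; `xa_network_trips` is a hypothetical name.

```python
def xa_network_trips(participants):
    """Assumed model: one trip per participant in each of the three phases."""
    if participants < 1:
        raise ValueError("need at least one participant")
    phases = ("enlist", "prepare", "commit")
    return {phase: participants for phase in phases}
```

For 2 participants, `sum(xa_network_trips(2).values())` gives the 6 trips counted above.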

Conclusion: how to make a distributed transaction fast

  • Make sure you have a fast network with minimal latency.
  • Make sure you have hard drives with minimal latency and maximum random write speed. A good SSD can do wonders.
  • Try to enlist as few distributed resources as possible in a transaction.
  • Try to split your data into data with strict consistency and availability requirements (live operational data) and data with low consistency requirements. For live operational data, use a distributed transaction. For offline data, use a regular transaction, or no transaction at all if your data does not require one.

My answer is based on what I read in "XA Exposed" (and on personal experience). That article no longer seems to be available on the Internet, which is what prompted me to write this.

Source: https://habr.com/ru/post/1261501/

