Best way to store deleted user data?

I am working on an application that tracks and processes work orders / tickets. Each ticket is tied to the user who creates / owns it through a foreign key that cascades any changes in MySQL. Obviously, if a user ever deletes their account for some reason, we would still like to keep a record of their tickets and basic information.

The first way to achieve this that came to mind is to have a column in the user table that indicates whether they are active or inactive, i.e. deleted or not. When they close / delete their account, we simply flip this value and they can no longer access the application.

Another idea I had was to move the user record to a deleted-users table when the account is deleted. This would keep the user table small, which can be a huge deal when it gets large, but it adds extra queries to move the record.

Obviously, part of this comes down to preference, but I'm interested in the performance aspects. Essentially, the question is how a select query compares with an insert query, and at what point overall performance would improve by adding insert queries (moving records to the deleted-users table) to the mix.

+4
2 answers

have a column in the user table that indicates whether they are active or inactive, i.e. deleted or not.

Good.
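A minimal sketch of the flag approach, for illustration only. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the table names, column names (users, tickets, is_active) and helper functions are assumptions, not anything from the original application.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1   -- 1 = active, 0 = deactivated
);
CREATE TABLE tickets (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON UPDATE CASCADE,
    title   TEXT NOT NULL
);
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO tickets (user_id, title) VALUES (1, 'Printer is broken')")


def deactivate_user(conn, user_id):
    """Soft-delete: flip the flag instead of removing the row."""
    conn.execute("UPDATE users SET is_active = 0 WHERE id = ?", (user_id,))


def can_log_in(conn, user_id):
    """Only users whose flag is still set may access the application."""
    row = conn.execute(
        "SELECT is_active FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row is not None and row[0] == 1


deactivate_user(conn, 1)

# The ticket and its owner are both still available after the "deletion".
owner_tickets = conn.execute(
    "SELECT t.title, u.name FROM tickets t JOIN users u ON u.id = t.user_id"
).fetchall()
```

Because the user row never leaves the table, the existing foreign key from tickets keeps working unchanged.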

Another idea I had was to move the user record to a deleted-users table

Bad. Now you have two relationships: user to ticket and former user to ticket. This is unnecessary complexity.

which can be a huge deal when it gets large,

If "large" means millions of users, then you're right. If, however, "large" means thousands of users, you won't be able to measure much of a difference.

And if you really do see a measurable slowdown in the future, you can use things like "materialized views" to automatically create a view / table of the subset of "active" users.
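As a rough illustration of exposing only the "active" subset: MySQL has no built-in materialized views, so in practice this would likely be an ordinary view or a summary table kept up to date by the application. The sketch below uses an ordinary view in SQLite (via Python's sqlite3) as a stand-in; the view name active_users and the is_active column are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1
);

-- Ordinary (non-materialized) view that exposes only active users.
CREATE VIEW active_users AS
    SELECT id, name FROM users WHERE is_active = 1;

INSERT INTO users (id, name, is_active) VALUES
    (1, 'alice', 1),
    (2, 'bob',   0),
    (3, 'carol', 1);
""")

# Application code can query the view and never see deactivated accounts.
active_names = [
    row[0] for row in conn.execute("SELECT name FROM active_users ORDER BY id")
]
```

An ordinary view does not store rows, so it mainly simplifies queries; whether a precomputed table is worth maintaining is something to decide after measuring.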

Obviously, part of this comes down to preference,

Not really. Deactivating (but not deleting) users has numerous advantages and no real disadvantages.

There are many levels of activity: locked out for security reasons (but not disabled), temporarily disabled, delegated to another user. Many, many status changes. Few reasons to delete. No reason to move to another table.
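The "many levels of activity" idea could be modeled as a status column rather than a yes/no flag. This is only a sketch under assumptions: the status names, the UserStatus enum and the helper functions are invented for illustration, and SQLite (via Python's sqlite3) again stands in for MySQL.

```python
import sqlite3
from enum import Enum


class UserStatus(str, Enum):
    """Hypothetical account states; adjust to the application's needs."""
    ACTIVE = "active"
    LOCKED = "locked"          # security lock, but not disabled
    SUSPENDED = "suspended"    # temporarily disabled
    DELEGATED = "delegated"    # handed over to another user
    CLOSED = "closed"          # account "deleted" by its owner


conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE users (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'active'
        CHECK (status IN ('active', 'locked', 'suspended', 'delegated', 'closed'))
)
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")


def set_status(conn, user_id, status):
    """A status change is a single UPDATE; the row never moves."""
    conn.execute(
        "UPDATE users SET status = ? WHERE id = ?", (status.value, user_id)
    )


def get_status(conn, user_id):
    row = conn.execute(
        "SELECT status FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return UserStatus(row[0])


set_status(conn, 1, UserStatus.LOCKED)
locked = get_status(conn, 1)
set_status(conn, 1, UserStatus.CLOSED)
closed = get_status(conn, 1)
```

In MySQL the same constraint could be expressed with an ENUM column instead of a CHECK clause.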

how does a select query compare with an insert query, and at what point would overall performance improve by adding insert queries (moving records to the deleted-users table) to the mix?

Only you can measure this for your tables, indexes, your server and your number of transactions. There is no general answer.
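One possible way to start measuring is a small timing harness like the one below. It is a sketch only: it runs against in-memory SQLite rather than your MySQL server, the table sizes and all names are arbitrary assumptions, and the numbers it yields say nothing about a production workload. It times flipping a flag against the insert-then-delete needed to move a row.

```python
import sqlite3
import time


def build(n_users):
    """Create a fresh in-memory database with n_users rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY, name TEXT, is_active INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE deleted_users (id INTEGER PRIMARY KEY, name TEXT);
    """)
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        ((i, f"user{i}") for i in range(n_users)),
    )
    conn.commit()
    return conn


def time_it(fn, repeat):
    """Call fn(i) for i in range(repeat) and return elapsed seconds."""
    start = time.perf_counter()
    for i in range(repeat):
        fn(i)
    return time.perf_counter() - start


def flag_delete(conn):
    # Approach 1: one UPDATE per deleted account.
    return lambda i: conn.execute(
        "UPDATE users SET is_active = 0 WHERE id = ?", (i,)
    )


def move_delete(conn):
    # Approach 2: copy the row to deleted_users, then remove it from users.
    def run(i):
        conn.execute(
            "INSERT INTO deleted_users SELECT id, name FROM users WHERE id = ?", (i,)
        )
        conn.execute("DELETE FROM users WHERE id = ?", (i,))
    return run


N, REPEAT = 10_000, 1_000
flag_conn, move_conn = build(N), build(N)
flag_seconds = time_it(flag_delete(flag_conn), REPEAT)
move_seconds = time_it(move_delete(move_conn), REPEAT)
remaining = move_conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
moved = move_conn.execute("SELECT COUNT(*) FROM deleted_users").fetchone()[0]
print(f"flag: {flag_seconds:.4f}s  move: {move_seconds:.4f}s")
```

To get meaningful results, the same idea would need to be run against the real schema, indexes and transaction mix.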

+10

In my opinion, the best approach is to mark the user as deleted or not. The second approach, with a new table, forces changes in every table that references the user table: each one needs a new foreign key to the deleted-users table. That in turn changes every query that selects rows from those tables.

As you wrote, the application deals with tickets, so logically most queries will select and edit tickets. The load will therefore fall on that table; I do not think you will be running heavy queries against users.

Optimizing the "user" table at the cost of more complex queries against the "ticket" table will not pay off.

+1

Source: https://habr.com/ru/post/1387438/

