Manipulate data in a database query or in code

How do you decide where to manipulate data when it can be done either in code or in the query?

For example, if you need to display a date in a specific format, do you produce the required format directly in the SQL query, or do you fetch the date and then format it in code?

What helps you decide: performance, best practices, a preference for SQL over the application language, the complexity of the task ...?

+4
7 answers

All things being equal, I prefer to do the manipulation in code. I try to return the data as raw as possible, so that it can be used by a wider range of consumers. If it is something very specialized, such as a report, then I may do the manipulation on the SQL side.

Another case where I prefer SQL-side manipulation is when it can be done as a set-based operation.

If it is not set-based and a loop would be involved, I would do the manipulation in code.

Basically, let the database do what it is good at; otherwise, do it in code.
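To illustrate the set-based versus loop distinction, here is a minimal sketch using Python's built-in sqlite3 module. The products table, its columns, and the describe helper are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the contrast.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price_cents INTEGER)")
conn.executemany(
    "INSERT INTO products (price_cents) VALUES (?)",
    [(1000,), (2000,), (4500,)],
)

# Set-based: a single statement applies a 50-cent surcharge to every row.
conn.execute("UPDATE products SET price_cents = price_cents + 50")
prices = [row[0] for row in conn.execute("SELECT price_cents FROM products ORDER BY id")]


def describe(product_id, price_cents):
    # Hypothetical per-row step; row-by-row work like this stays in code.
    return f"product {product_id}: {price_cents / 100:.2f}"


# Loop-based: iterate over the raw rows and handle each one in the application.
lines = [
    describe(product_id, price_cents)
    for product_id, price_cents in conn.execute(
        "SELECT id, price_cents FROM products ORDER BY id"
    )
]
print(lines)
```

The UPDATE touches all rows in one statement, while the per-row description is left to the application, which receives the data unformatted.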

+6

Formatting is a user interface concern; it is not "manipulation".

+3

My answer is the opposite of everyone else's.

If you need to apply the same formatting logic (the same goes for calculation logic) in several places in an application, or in separate applications, I would encapsulate the formatting in a view inside the database and SELECT from the view. You do not need to hide the source data, which can remain available as well. But by putting the logic in a database view, you make consistent formatting across modules and applications trivial.

For example, a Customer table would have an associated CustomerEx view with a derived MailingAddress column that formats the various parts of the address as needed, combining city, state, and zip code, dropping blank lines, and so on. My application code SELECTs from the CustomerEx view for addresses. If I extend my data model with, say, an Apt # field, or to handle international addresses, I only need to change that single view. I do not need to modify or even recompile my application.
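A minimal sketch of such a view, run against an in-memory SQLite database from Python. Only the names Customer, CustomerEx, and MailingAddress come from the answer; the other columns, the sample row, and the exact address layout are assumptions, and blank-line handling is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Column names other than MailingAddress are assumed for illustration.
    CREATE TABLE Customer (
        Id INTEGER PRIMARY KEY,
        Name TEXT, Street TEXT, City TEXT, State TEXT, Zip TEXT
    );

    -- The view exposes the raw columns plus one formatted column.
    CREATE VIEW CustomerEx AS
    SELECT Id, Name, Street, City, State, Zip,
           Street || char(10) || City || ', ' || State || ' ' || Zip AS MailingAddress
    FROM Customer;

    INSERT INTO Customer (Name, Street, City, State, Zip)
    VALUES ('Ann Lee', '1 Main St', 'Springfield', 'IL', '62701');
    """
)

# Application code only selects the already formatted column.
address = conn.execute(
    "SELECT MailingAddress FROM CustomerEx WHERE Id = 1"
).fetchone()[0]
print(address)
```

If the address layout changes later, only the view definition needs to be edited; the SELECT in the application stays the same.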

+3

I would never (ever) specify any formatting in the query itself. It is up to the consumer to decide how to format. All data manipulation should be done on the client side, with the exception of bulk operations.

+2

If it's just formatting, and it doesn't always have to be the same formatting, I would do it in the application, which is likely to do it faster.

However, the fastest formatting is the formatting that is done only once, so if there is a standard format that I always want to use (for example, displaying American phone numbers as (###) ###-####), then I store the data in the database in that format (this may still involve application code, but on the insert rather than on the select). This is especially true if you might need to reformat a million records for a report. If you have several formats, you can consider calculated columns (we have one for the full name and one for lastname, firstname, while our raw data is name, middlename, lastname, suffix) or triggers to persist the data. In general, I say store the data the way you need to see it, as long as you can keep it in the appropriate data type for the real manipulations you need to do, such as date math or ordinary math on monetary values.
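The "format once, on insert" idea could look roughly like the following sketch. The contacts table and the format_us_phone helper are hypothetical, and the helper assumes plain 10-digit US numbers with no country code.

```python
import re
import sqlite3


def format_us_phone(raw: str) -> str:
    """Normalize a 10-digit US number to (###) ###-#### (hypothetical helper)."""
    digits = re.sub(r"\D", "", raw)  # strip everything that is not a digit
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, phone TEXT)")

# Formatting happens once, when the row is written.
for raw in ["555.123.4567", "5559876543"]:
    conn.execute("INSERT INTO contacts (phone) VALUES (?)", (format_us_phone(raw),))

# Reads need no further work, however many rows a report pulls.
phones = [row[0] for row in conn.execute("SELECT phone FROM contacts ORDER BY id")]
print(phones)
```

The trade-off is that the stored value is tied to one display format, which is why the answer limits this to formats that are genuinely standard for the application.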

+1

In the case of a date column, I would store the full date in the database and, when I retrieve it, specify in code how I want to show it to the user. This way you can drop the time part or even change the order of the date parts when you display it in a datagrid, for example: mm/dd/yyyy, dd/mm/yyyy, or only mm/yyyy.
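As a small sketch of that approach, the same stored value can be rendered in each of the formats mentioned using Python's datetime.strftime. The sample timestamp is made up for illustration.

```python
from datetime import datetime

# A full date and time, as it might come back from the database (sample value).
stored = datetime(2010, 3, 7, 14, 30, 0)

# Display formats chosen in code; the stored value itself never changes.
formats = {
    "mm/dd/yyyy": "%m/%d/%Y",
    "dd/mm/yyyy": "%d/%m/%Y",
    "mm/yyyy": "%m/%Y",
}
rendered = {label: stored.strftime(pattern) for label, pattern in formats.items()}
print(rendered)
```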

0

The only thing I do in the query that could probably be done in code is converting datetimes to the user's time zone.

The MySQL CONVERT_TZ() function is easy to use and accurate. I store all my data in UTC and retrieve it in the user's time zone. Daylight saving time rules change. This matters especially for client applications, since relying on a native library depends on the user having updated their OS.

Even for server-side code, such as on a web server, I only need to update a few tables to get the latest time zone information, rather than updating the OS on the server.

Beyond these kinds of problems, it is probably best to push most functionality to the application server or client rather than making your database a bottleneck. Application servers are easier to scale than database servers.

If you can write a stored procedure or something similar that starts with a large dataset and does some inexpensive calculation or simple iteration to return a single row or value, then it probably makes sense to do that on the server, to avoid sending large datasets over the wire. So, if the processing is inexpensive, why not have the database return exactly what you need?
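The point about not shipping large datasets can be sketched with a plain aggregate query instead of a stored procedure (SQLite has none). The orders table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO orders (amount_cents) VALUES (?)",
    [(n * 100,) for n in range(1, 1001)],  # 1000 sample rows
)

# Database-side: a cheap aggregate, and only one row comes back.
total_in_db = conn.execute("SELECT SUM(amount_cents) FROM orders").fetchone()[0]

# Application-side: every row is transferred first, then summed in code.
rows = conn.execute("SELECT amount_cents FROM orders").fetchall()
total_in_app = sum(amount for (amount,) in rows)

print(total_in_db, total_in_app, len(rows))
```

Both totals agree; the difference is that the second approach moves a thousand rows to the application to produce one number, which is the cost the answer suggests avoiding when the calculation itself is inexpensive.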

0

Source: https://habr.com/ru/post/1304581/

