Best practice: should I save HTML tags in the database or store the HTML entity values?

I was wondering how best to handle the following. I use the TinyMCE WYSIWYG editor, which formats user input with the correct HTML tags. Now I need to save the data entered into the editor in a database table.

Should I encode the HTML tags to their corresponding entities when inserting into the database? Then, when I read the data back from the table, I don't need to encode it for XSS purposes, but I still have to use eval on the HTML tags so that they format the text.

OR

Should I save the HTML tags in the database as they are and, when I read the data back, encode the tags to their entities? Then the tags would appear to the user as literal text, so I would have to use eval to actually format the data as it was entered.

My thoughts are with the first option; I just wondered what you guys think.

+4
5 answers

Neither. Store the HTML as is, so that when you pull it out it is ready for rendering. You should not be converting back and forth: what you put in should be what you display. What you want to do is filter the input before placing it in the database. Both TinyMCE and CKEditor / FCKeditor can limit the tags that may be used in the editor, and they will strip out disallowed tags for you. After that, do whatever other validation or formatting is necessary.
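The editors do their tag stripping in the browser. As a rough illustration of the same allowlist idea on the server side, here is a minimal Python sketch using only the standard library. The names `ALLOWED_TAGS`, `DROP_WITH_CONTENT`, `AllowlistFilter`, and `filter_html` are made up for this example, the tag list is an assumption, and a real site would be better served by a vetted sanitizer library than by hand-rolled code like this.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist for illustration; adjust to whatever the editor is configured to emit.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li"}
# Tags whose text content is dropped along with the tag itself.
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistFilter(HTMLParser):
    """Keeps allowlisted tags (without attributes) and escapes all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # every attribute is discarded

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data))  # text is always re-escaped


def filter_html(raw: str) -> str:
    """Return raw with non-allowlisted tags removed (hypothetical helper)."""
    parser = AllowlistFilter()
    parser.feed(raw)
    parser.close()
    return "".join(parser.out)
```

For example, `filter_html('<p onclick="x()">Hi <script>alert(1)</script><b>there</b></p>')` returns `<p>Hi <b>there</b></p>`: the attribute and the script element are gone, and the allowed tags survive.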

+3

I would suggest storing data in the database in as close to its "natural" form as possible. Generally, your database layer should not be concerned with whether a field contains HTML, Base64-encoded binary, or plain text. Those are concerns for your view layer when it decides how to render the content.

Thus, although you might want to pre-screen for XSS attacks before inserting data into the database, you should always screen for XSS before sending "untrusted" information to the browser.

This also has the advantage that if your XSS prevention algorithms improve in the future, you can apply the improvement throughout your application simply by changing the routines that display the data, instead of scanning your database for fields that may contain HTML and then updating them.
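To make the "screen at display time" idea concrete, here is a small Python sketch. It assumes the stored value is untrusted text that should be shown literally. The function name `render_untrusted` and the `comment` CSS class are invented for the example.

```python
from html import escape


def render_untrusted(stored_text: str) -> str:
    """View-layer helper (hypothetical): escape at output, not at storage.

    The database keeps the text exactly as entered; only this routine
    decides how it is made safe for the browser.
    """
    return f'<div class="comment">{escape(stored_text)}</div>'
```

Because the escaping lives in one display routine, tightening it later changes how every stored row is shown without touching the table.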

+4

When I first started my blog, I decided to convert BBCode to HTML (and run sanity checks) before putting it in the database. Well, months rolled by, and it turned out that I had a problem with the layout. Since the old HTML was already fixed in place in the database, I learned the hard way that you should always store the source text the user entered and convert it later, at request time.

That way you can correct bugs in the HTML conversion and the XSS handling, and the corrections apply retroactively.
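A toy Python sketch of that workflow, assuming a tiny BBCode subset (`[b]` and `[i]` only). The function name `bbcode_to_html` is invented here and this is nowhere near a complete BBCode parser. The point is only that the table holds the user's source text and the conversion runs when the page is requested.

```python
import re
from html import escape


def bbcode_to_html(source: str) -> str:
    """Convert a tiny, assumed BBCode subset to HTML at request time."""
    # Escape first, so any HTML the user typed is shown literally.
    text = escape(source)
    # Then turn the supported BBCode tags into real markup.
    text = re.sub(r"\[b\](.*?)\[/b\]", r"<strong>\1</strong>", text, flags=re.S)
    text = re.sub(r"\[i\](.*?)\[/i\]", r"<em>\1</em>", text, flags=re.S)
    return text


# What the database would hold: the source exactly as the user typed it.
stored_source = "[b]hi[/b] <script>"

# What is produced per request. Fixing a bug in bbcode_to_html later
# automatically fixes the output for every post already stored.
html_for_page = bbcode_to_html(stored_source)
```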

+2

I would just guard against SQL injection when inserting into the database and leave the HTML as it is, in its "raw" form.
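As a sketch of what that can look like, here is a Python example using the standard `sqlite3` module with parameterized queries. The `posts` table and the helper names `open_db`, `save_post`, and `load_post` are assumptions made for the example. The placeholders keep the HTML out of the SQL text, so it is stored byte for byte without any encoding.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical posts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    return conn


def save_post(conn: sqlite3.Connection, body_html: str) -> int:
    # The ? placeholder passes the value separately from the SQL statement,
    # so quotes or SQL fragments inside the HTML cannot alter the query.
    cur = conn.execute("INSERT INTO posts (body) VALUES (?)", (body_html,))
    conn.commit()
    return cur.lastrowid


def load_post(conn: sqlite3.Connection, post_id: int):
    row = conn.execute("SELECT body FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0] if row else None
```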

I know that Drupal does this and applies filters on the fly (for example: all HTML tags allowed (no filter), only certain tags, an XSS filter, PHP code format, tokens, and so on). The advantage of this approach is that you never destructively alter the input, so you remain free to change the filter later.
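The general shape of "pick a filter at render time" can be sketched in Python like this. This is not Drupal's actual API. The format names and the functions `full_html`, `plain_text`, `plain_text_with_breaks`, and `render` are invented for the illustration.

```python
from html import escape


def full_html(text: str) -> str:
    """No filtering: only suitable for fully trusted authors."""
    return text


def plain_text(text: str) -> str:
    """Show everything literally."""
    return escape(text)


def plain_text_with_breaks(text: str) -> str:
    """Show everything literally, but keep the author's line breaks."""
    return escape(text).replace("\n", "<br>\n")


# Hypothetical registry: the stored body never changes, only the chosen filter.
FORMATS = {
    "full_html": full_html,
    "plain_text": plain_text,
    "plain_text_with_breaks": plain_text_with_breaks,
}


def render(raw: str, format_name: str) -> str:
    """Apply the named filter to the raw stored text when it is displayed."""
    try:
        text_filter = FORMATS[format_name]
    except KeyError:
        raise ValueError(f"unknown text format: {format_name!r}") from None
    return text_filter(raw)
```

Switching a post from one format to another is then a change of a name, not a rewrite of stored data.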

+1

One possibility is to save both the sanitized version and the original version. I do this with HTMLPurifier to avoid the performance cost of sanitizing on every request, while still letting users edit their content in its original form.

Needless to say, it requires twice as much storage space, but that is usually less of a concern than speed and control.
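A minimal Python sketch of the two-column idea, using `sqlite3`. The `sanitize` function here is only a stand-in that escapes everything. It is not HTMLPurifier, and the table layout and helper names are assumptions for the example.

```python
import sqlite3
from html import escape


def sanitize(raw: str) -> str:
    # Stand-in for a real (and slower) sanitizer; it simply escapes everything.
    return escape(raw)


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, "
        "body_raw TEXT NOT NULL, "    # exactly what the user typed
        "body_clean TEXT NOT NULL)"   # sanitized once, at write time
    )
    return conn


def save_post(conn: sqlite3.Connection, raw: str) -> int:
    cur = conn.execute(
        "INSERT INTO posts (body_raw, body_clean) VALUES (?, ?)",
        (raw, sanitize(raw)),
    )
    conn.commit()
    return cur.lastrowid


def body_for_display(conn: sqlite3.Connection, post_id: int) -> str:
    # Page views read the cached clean copy, so no sanitizing per request.
    return conn.execute(
        "SELECT body_clean FROM posts WHERE id = ?", (post_id,)
    ).fetchone()[0]


def body_for_editing(conn: sqlite3.Connection, post_id: int) -> str:
    # The edit form gets the untouched original back.
    return conn.execute(
        "SELECT body_raw FROM posts WHERE id = ?", (post_id,)
    ).fetchone()[0]
```

If the sanitizer is improved later, the clean column can be regenerated from the raw column, which is the "control" half of the trade-off.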

0

Source: https://habr.com/ru/post/1308942/

