Saving HTML data in a field using BLOB or TEXT / CLOB

I need to save an HTML page in the ProjectDescription field of a MySQL database using Spring and JPA 2.1. I have read this question and all the other questions tagged BLOB, but I need clarity on why the fields are stored the way they are in my database. I created the fields as follows, using the field types String and byte[].

Method 1: save the data as TEXT (after Base64-encoding it, I save the HTML data as a String using the field below)

@Basic(fetch = FetchType.LAZY) @Lob private String projectDescription = ""; 

Method 2: save the data as binary using BLOB

@Basic(fetch = FetchType.LAZY) @Lob @Column(length = 5000) private byte[] projectDescription1 = new byte[0];

My assumption: since the HTML page data is not very large, TEXT should be fine compared to BLOB.

I tested both, and the fields are stored as follows in the MySQL database.

Method 1:

  • Type: TEXT
  • DisplaySize is constantly 1431655765.

This size does not change regardless of my @Column(length = 5000) annotation.

Method 2:

  • Type: BLOB
  • DisplaySize: -1

Question 1: Where does this DisplaySize come from? It seems rather large in the case of TEXT and very small (-1) in the case of the byte[] field type. Why does changing the @Column length not seem to change DisplaySize?

Question 2: Is it OK to store HTML data as a String field type (ending up as TEXT) rather than byte[] (ending up as BLOB)?

Note: I have read all the questions tagged BLOB, and it is clear that images and documents should be saved as BLOB and text as CLOB / TEXT. However, I would like to reconfirm this for HTML data, given how large a DisplaySize the DB reports for TEXT.

Thanks.

1 answer

If this is a whole page, why go through the extra layer of a database table? If it is only part of a page, I recommend TEXT CHARACTER SET utf8mb4. Any text on the page that is not valid UTF-8 will cause problems; you may as well catch it early.

And the database industry is converging on UTF-8 for all text.

Base64 makes the data 8/6 times larger. And all it does is sidestep non-UTF-8 characters, which should not be there in the first place. If anything, compress it in the client and store it in a BLOB, thereby shrinking it by about 3:1.
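To make the size argument concrete, here is a minimal sketch that compares the raw UTF-8 size of a page with its Base64 and compressed sizes. The class name HtmlSizeDemo, the sample page, and the choice of DEFLATE are assumptions made for illustration; the actual compression ratio depends entirely on the content.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Deflater;

class HtmlSizeDemo {

    // Compress bytes with DEFLATE (zlib format). One possible client-side choice.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Hypothetical sample page; real pages will compress differently.
    static String sampleHtml() {
        StringBuilder sb = new StringBuilder("<html><body>");
        for (int i = 0; i < 200; i++) {
            sb.append("<p class=\"desc\">Project description paragraph ")
              .append(i)
              .append("</p>");
        }
        return sb.append("</body></html>").toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleHtml().getBytes(StandardCharsets.UTF_8);
        String base64 = Base64.getEncoder().encodeToString(raw);
        byte[] compressed = compress(raw);

        System.out.println("raw UTF-8 bytes:  " + raw.length);
        System.out.println("Base64 chars:     " + base64.length());   // about 4/3 of raw
        System.out.println("compressed bytes: " + compressed.length); // goes in a BLOB
    }
}
```

The compressed bytes would be stored in a byte[] / BLOB field, while uncompressed HTML can go straight into a String / TEXT field with no Base64 step.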

In MySQL, TEXT is limited to 64 KB. You might be better off with MEDIUMTEXT, which has a 16 MB limit. I say "bytes" because, for example, Chinese needs 3, sometimes 4, bytes per character, so only about 21 thousand characters of Chinese text will fit in TEXT.
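The bytes-versus-characters point can be checked with a short sketch. The class and method names are hypothetical; 65,535 is the TEXT limit in bytes, and the characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.nio.charset.StandardCharsets;

class TextCapacityDemo {

    // Maximum length of a MySQL TEXT column, in bytes.
    static final int TEXT_MAX_BYTES = 65_535;

    // Number of bytes a string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // How many characters fit in TEXT if each one needs bytesPerChar bytes.
    static int maxChars(int bytesPerChar) {
        return TEXT_MAX_BYTES / bytesPerChar;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // 1 (ASCII)
        System.out.println(utf8Length("\u4e2d"));       // 3 (common CJK character U+4E2D)
        System.out.println(utf8Length("\uD840\uDC00")); // 4 (CJK Extension B, U+20000)
        System.out.println(maxChars(3));                // 21845
        System.out.println(maxChars(4));                // 16383
    }
}
```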

"DisplaySize is constantly 1431655765" - What??? A gigabyte for a web page? Never! Even if that includes images (which it should not), it is completely unreasonable. Edit: eggyal's comment about 2^32 / 3 probably explains this odd number.

In MySQL, SELECT length(my_text) ... returns the number of bytes in that column.
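The arithmetic behind that number is easy to verify. The sketch below only checks that 2^32 / 3, with integer division, gives the reported value; reading it as "a 2^32-byte maximum divided by 3 bytes per character" is an assumption, not something confirmed here, and the class and method names are hypothetical.

```java
class DisplaySizeDemo {

    // Integer division of a byte capacity by the maximum bytes per character.
    static long charsFor(long maxBytes, int maxBytesPerChar) {
        return maxBytes / maxBytesPerChar;
    }

    public static void main(String[] args) {
        long twoTo32 = 1L << 32;                    // 4294967296
        System.out.println(charsFor(twoTo32, 3));   // 1431655765
    }
}
```

So the value looks like a reported upper bound rather than space actually allocated for the row.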


Source: https://habr.com/ru/post/1274762/

