Long text rendering algorithm in a text editor

I was thinking of writing a text editor control that can edit text that can be of any arbitrary length (say, hundreds of megabytes), similar to some Scintilla editor. The goal is to read the file lazily, so the user does not need to read five hundred megabytes of data to view a small portion. I have two problems:

  • It seems to me that it is impossible to implement any reasonable scroll function for such an editor, if I have not previously read the entire file once to find out line breaks. It's true? Or is there a way to get closer to things that I don't think about?

  • Due to various problems with Unicode (for example, it allows many bytes to represent only one character, not only due to variable length encoding, but also due to accents, etc.), it seems almost impossible to determine exactly how much text will fit on the screen - I will have to use TextOut () or something to draw one character, measure its size, and then draw the next character. And even then, which has not yet been said, as I would have sketched, the user clicks on the correct position of the text.

Is there anything I could read on the Internet regarding algorithms to solve these problems? I searched but found nothing.

Thanks!

+4
source share
2 answers

You can set a "rough" position based on the size of the data instead of rows. The โ€œfineโ€ position of your text box may be based on local scanning around an arbitrary entry point.

This means that you will need to write functions that can scan locally (back and forth) to find the beginning of a line, count Unicode characters, etc. It should not be too complicated; Thus, UTF8 is easily analyzed.

You may especially consider what to do with extremely long lines. Since the upper limit with respect to the length of the line does not exist, this allows us to find the beginning (or end) of the line of the unlimited task; I believe that everything else you need to display the screen should be local.

Finally, if you need a general text editor, you need to figure out what you are going to do when you want to save the file into which you inserted / deleted things. The direct thing is to rewrite the file; however, obviously, it will take longer with a huge file. You can expect the user to run into problems if there is not enough space for the modified copy, so at least you want to check that there is enough space in the file system.

+4
source

@comingstorm is basically right. To display, you start with the cursor and scroll back until you are sure that you are behind the top of the screen. Then you scan back to the end of the line, assuming that you can identify the scan of the end of the line back. Now you are looking ahead, calculating and saving the starting positions of the screen line until you go far enough. Finally, you select the line that you want to start displaying and turning off.

For plain text, this can be done on an archaic processor fast enough to redraw the video image displayed on the map at each key press. [I invented this technology 30 years ago]. The right way to do this is to fix the cursor in the middle line of the screen.

For actual file changes, you can look at using Gnu ropes. A rope is basically a linked list of buffers. The idea is that all local changes can be made in only one small buffer, sometimes adding a new buffer and mixing neighboring buffers from time to time.

I would think about combining this technology with differential storage: this is what all modern version control systems do. Basically there is to use this kind of transaction-based editing if you want to implement the undo function.

The key to this is a reversible transaction, that is, one that contains enough information to apply back to reverse what it did when applied forward. The main transaction of the editor:

at pos p replace old with new 

which has the opposite

 at pos p replace new with old 

This handles the insert (the old one is empty) and deletes (the new one is empty) and also replaces. Given the list of transactions, you can undo changes in place in the row by applying the opposite to the list of reverse transactions.

Now you use the concept of the old checkpoint: you save a fairly recent modified image in place, along with some recent transactions that have not yet been applied. To display, you apply transactions on the fly. To cancel, you simply throw away some transactions. Sometimes you actually apply transactions to create a โ€œcheckpointโ€ image. This speeds up the display by reducing the undo time.

Finally: to rewrite a huge text file, you usually rewrite all the text, which is terrible. If you can fool a bit and allow arbitrary 0 characters in the text, and you have access to the virtual memory system manager and access to a low-level disk, you can do much better by preserving all the unchanged pages of the text and simply rebuilding them: in other words, the idea ropes on the disk.

+3
source

Source: https://habr.com/ru/post/1332140/


All Articles