How are text editors typically implemented?

This question is likely to make me sound rather ignorant. This is because I am.

I'm just wondering: if I were hypothetically interested in developing my own GUI text editor control, widget, or whatever you want to call it (which I'm not), how would I even do it?

The temptation for a novice like me would be to store the contents of a text editor as a string, which seems rather costly (not that I'm too familiar with how string implementations differ from one language or platform to the next, but I know that in .NET, for example, they are immutable, so the frequent manipulation a text editor needs to support would be magnificently wasteful, constructing one string instance after another in very rapid sequence).

Presumably some mutable data structure containing text is used instead, but figuring out what that structure looks like strikes me as something of a challenge. Random access would be good (I would think, anyway; don't you want the user to be able to jump to anywhere in the text?), but then I wonder about the cost of, say, navigating to somewhere in the middle of a huge document and immediately starting to type. Again, the novice approach (say you store the text as a resizable array of characters) would lead to very poor performance, I think, since with every character typed by the user there would be a huge amount of data to "shift" over.
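To put a number on the "shift" cost, here is a minimal Python sketch. The helper name insert_flat is made up for illustration, and a Python list stands in for the resizable character array; it only counts how many characters sit after the insertion point and therefore have to move.

```python
def insert_flat(buf: list[str], pos: int, ch: str) -> int:
    """Insert ch at pos in a flat character list.

    Returns how many existing characters had to move one slot to the
    right to make room (everything at or after pos).
    """
    shifted = len(buf) - pos
    buf.insert(pos, ch)  # list.insert performs the shifting internally
    return shifted


doc = list("a" * 1_000_000)                          # a "huge document"
moved_middle = insert_flat(doc, len(doc) // 2, "x")  # type in the middle
moved_end = insert_flat(doc, len(doc), "y")          # type at the very end
```

Typing in the middle moves half the document for a single keystroke, while typing at the end moves nothing, which is the asymmetry the structures discussed in the answers try to avoid.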

So, if I had to guess, I would suppose that text editors use some kind of structure that breaks the text into smaller pieces (lines, maybe?), which individually contain character arrays with random access, and which are themselves randomly accessible as discrete chunks. Even that sounds like it must be a rather monstrous oversimplification, though, if it is even remotely close to begin with.

Of course, I also understand that there may not be a "standard" way to implement text editors; perhaps it varies greatly from one editor to another. But I thought that since this is clearly a problem that has been solved many, many times, perhaps a relatively common approach has surfaced over the years.

In any case, I’m just curious to find out if anyone has knowledge on this topic. As I said, I definitely do not want to write my own text editor; I'm just curious.

+42
string language-agnostic user-interface text-editor
Oct 28 '10 at 18:55
4 answers

One technique that is common (especially in older editors) is called a split buffer (also known as a gap buffer). Basically, you "break" the text into everything before the cursor and everything after the cursor. Everything before goes at the beginning of the buffer. Everything after goes at the end of the buffer.

When the user types text, it goes into the empty space in between, without moving any data. When the user moves the cursor, you move the appropriate amount of text from one side of the "break" to the other. Typically there is a lot of movement within one general area, so you usually only move a small amount of text at a time. The biggest exception is if you have a "go to line xxx" kind of capability.
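A minimal Python sketch of the idea described above, not taken from any particular editor. The class name GapBuffer, its method names, and the default gap size are all assumptions made for illustration.

```python
class GapBuffer:
    """Text stored in one list as [before cursor][gap][after cursor]."""

    def __init__(self, text: str = "", gap_size: int = 16) -> None:
        self.buf = list(text) + [""] * gap_size
        self.gap_start = len(text)    # cursor position; the gap begins here
        self.gap_end = len(self.buf)  # first index after the gap

    def __len__(self) -> int:
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _grow(self, extra: int) -> None:
        # Widen the gap by inserting empty slots just before the tail text.
        self.buf[self.gap_end:self.gap_end] = [""] * extra
        self.gap_end += extra

    def insert(self, s: str) -> None:
        """Type s at the cursor: fills gap slots, moves no existing text."""
        for ch in s:
            if self.gap_start == self.gap_end:  # gap is full
                self._grow(max(16, len(self.buf)))
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete_before_cursor(self) -> None:
        """Backspace: the slot before the cursor just rejoins the gap."""
        if self.gap_start > 0:
            self.gap_start -= 1

    def move_cursor(self, pos: int) -> None:
        """Carry characters across the gap until it sits at pos."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:  # moving left: head text goes to the tail
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:  # moving right: tail text goes to the head
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def text(self) -> str:
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


g = GapBuffer("hello world")
g.move_cursor(5)  # moves 6 characters across the gap
g.insert(",")     # no existing text moves
```

The cost of move_cursor is proportional to the distance moved, which is why a long jump such as "go to line xxx" is the expensive case.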

Charles Crowley wrote a much more comprehensive discussion of Text Editing, which covers split buffers (and other possibilities) in considerably greater depth.

+31
Oct 28 '10 at 7:01

A while back, I wrote my own text editor in Tcl (actually, I stole the code from somewhere and extended it beyond recognition; ah, the wonders of open source).

As you mentioned, performing string operations on very, very large strings can be expensive. So the editor splits the text into smaller strings at each newline ("\n" or "\r" or "\r\n"). All that is left, then, is editing small strings at the line level and performing list operations when moving between lines.
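The editor itself is written in Tcl; the following is only a rough Python sketch of the same list-of-lines idea, with a made-up class name (LineBuffer) and made-up method names. As a simplification, it does not remember which terminator each line had, or whether the file ended with a newline.

```python
class LineBuffer:
    """Text held as a list of short strings, one per line."""

    def __init__(self, text: str = "") -> None:
        # str.splitlines() splits on "\n", "\r" and "\r\n" (plus a few
        # other line boundaries) and drops the terminators.
        self.lines = text.splitlines() or [""]

    def insert(self, row: int, col: int, s: str) -> None:
        """Insert s (assumed to contain no newline) into a single line."""
        line = self.lines[row]
        self.lines[row] = line[:col] + s + line[col:]

    def split_line(self, row: int, col: int) -> None:
        """Enter pressed at (row, col): one list entry becomes two."""
        line = self.lines[row]
        self.lines[row:row + 1] = [line[:col], line[col:]]

    def join_with_next(self, row: int) -> None:
        """Delete pressed at the end of a line: two entries become one."""
        if row + 1 < len(self.lines):
            self.lines[row:row + 2] = [self.lines[row] + self.lines[row + 1]]

    def text(self) -> str:
        return "\n".join(self.lines)


b = LineBuffer("one\r\ntwo\rthree\nfour")
b.insert(1, 3, "!")   # only the short string for line 1 is rebuilt
b.split_line(2, 2)    # a list operation; no other line's text is touched
```

Each keystroke rebuilds only one short string, and moving between lines is just indexing into the list.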

Another advantage of this is that it is a simple and natural concept to work with. My mind already treats each line of text as separate, a habit reinforced by years of programming, where newlines are stylistically or syntactically significant.

It also helps that the use case for my text editor is as a programmer's editor. For example, I implemented syntax highlighting but not word/line wrap. So in my case there is a 1:1 mapping between newlines in the text and lines drawn on the screen.

If you want to see, here is the source code for my editor: http://wiki.tcl.tk/16056

BTW, this is not a toy. I use it every day as my standard console text editor unless the file is too large to fit in RAM. (Seriously, what kind of text file is? Even novels, which are usually 4 to 5 MB, fit into RAM. I have only seen log files grow to hundreds of MB.)

+2
Oct. 31 '10 at 21:13

Depending on the amount of text that needs to be in the editor at one time, a one-string-for-the-entire-buffer approach would probably work fine. I think Notepad does this. Ever notice how much slower it gets to insert text into a large file?

Having one string per line in a hash table seems like a good compromise. It would make navigating to a particular line and deleting/inserting efficient without much complexity.

If you want to implement an undo function, you will want a representation that lets you go back to previous versions without storing 30 copies of the entire file for 30 changes, although again, that would probably be fine if the file were small enough.
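One way to avoid keeping whole-file copies, sketched here in Python with made-up names (Edit, UndoableText), is to record for each change only its position and the text it removed and inserted. This is an illustration, not a description of how any particular editor does it, and a plain str is used as the backing store purely to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    pos: int        # where the change happened
    removed: str    # text that was there before
    inserted: str   # text that replaced it


class UndoableText:
    def __init__(self, text: str = "") -> None:
        self.text = text
        self.undo_stack: list[Edit] = []

    def replace(self, pos: int, length: int, new: str) -> None:
        """Replace `length` characters at `pos` with `new`, logging the edit."""
        removed = self.text[pos:pos + length]
        self.text = self.text[:pos] + new + self.text[pos + length:]
        self.undo_stack.append(Edit(pos, removed, new))

    def undo(self) -> bool:
        """Reverse the most recent edit; returns False if there is none."""
        if not self.undo_stack:
            return False
        e = self.undo_stack.pop()
        end = e.pos + len(e.inserted)
        self.text = self.text[:e.pos] + e.removed + self.text[end:]
        return True


t = UndoableText("hello world")
t.replace(5, 0, ",")       # pure insertion
t.replace(7, 5, "there")   # overwrite "world"
```

Thirty changes then cost thirty small records rather than thirty copies of the file.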

+1
Oct 28 2018-10-28

The easiest way is to use some kind of string buffer class provided by the language. Even a simple array of char objects could do this.

Appending, replacing and searching text are all relatively quick. Other operations are potentially more time-consuming, of course, with inserting a sequence of characters at the start of the buffer being one of the more expensive actions.

However, this may be perfectly acceptable for simple use.

If the cost of inserts and deletes were especially significant, I would be tempted to optimize by creating a buffer wrapper class that internally maintains a list of buffer objects. Any action (other than a simple replacement) that was not at the tail of an existing buffer would cause the corresponding buffer to be split at the relevant point, so that the buffer could be modified at its tail. However, the outer wrapper would keep the same interface as a simple buffer, so I would not have to rewrite, e.g., my search action.

Of course, this simple approach would quickly end up with an extremely fragmented buffer, and I would think about having some kind of rule for coalescing the buffers when appropriate, or for deferring the split of a buffer in the case of, e.g., a single-character insert. Maybe the rule would be that I would only ever have at most 2 internal buffers, and I would coalesce them before creating a new one, or whenever something asked me for a view of the entire buffer at once. Not sure.
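A rough Python sketch of that wrapper, under stated assumptions: the class name ChunkedBuffer, the max_chunks parameter, and the "merge everything once there are too many chunks" rule are all invented here as one possible coalescing rule, not something settled above.

```python
class ChunkedBuffer:
    """Looks like one buffer from outside; holds a list of chunks inside."""

    def __init__(self, text: str = "", max_chunks: int = 8) -> None:
        self.chunks: list[list[str]] = [list(text)]
        self.max_chunks = max_chunks

    def __len__(self) -> int:
        return sum(len(c) for c in self.chunks)

    def _locate(self, pos: int) -> tuple[int, int]:
        """Map a position to (chunk index, offset), preferring a chunk's tail."""
        for i, chunk in enumerate(self.chunks):
            if pos <= len(chunk):
                return i, pos
            pos -= len(chunk)
        raise IndexError("position past end of buffer")

    def insert(self, pos: int, s: str) -> None:
        i, off = self._locate(pos)
        chunk = self.chunks[i]
        if off < len(chunk):
            # Not at a tail: split here so the insert becomes a tail append.
            self.chunks[i:i + 1] = [chunk[:off], chunk[off:]]
            chunk = self.chunks[i]
        chunk.extend(s)
        if len(self.chunks) > self.max_chunks:
            self._coalesce()

    def _coalesce(self) -> None:
        # Assumed rule: merge all chunks back into a single one.
        self.chunks = [[ch for chunk in self.chunks for ch in chunk]]

    def text(self) -> str:
        return "".join("".join(c) for c in self.chunks)

    def find(self, needle: str) -> int:
        # Same interface as a simple buffer; callers never see the chunks.
        return self.text().find(needle)


cb = ChunkedBuffer("hello world", max_chunks=2)
cb.insert(5, ",")    # splits into "hello," and " world"
cb.insert(12, "!")   # already at a tail: plain append, no split
```

Callers such as find only ever go through the outer interface, so the splitting and coalescing policy can be changed later without touching them.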

The point is, I would start simple, but access the mutable buffer through a carefully chosen interface, and play with the internal implementation if profiling showed me that I needed to.

However, I would definitely not start with immutable String objects!

+1
Oct 28 '10 at 19:14


