Capturing the edits between two strings

I am going to go into detail about my problem; you can skip to the TL;DR if you do not want to read all of this.

What I am trying to do

I need to save a “file” (a text document) that can be edited by the user. Suppose I have a source file (which can be huge):

Lorem ipsum dolor sit amet

and the user changes it to:

Foo ipsum amet_ sit

Basically, I have a source string and a user-edited string. I want to find the differences (the "edits") so that I do not store duplicate copies of very large strings. I want to keep the original plus the "edits", and then apply the edits to the original. This is similar to data deduplication. The problem is that I have no idea what kinds of changes there can be, and I also need to be able to apply those changes to the string.

Attempts

Since the text can be huge, I wonder what would be the most “efficient” way to store the edited text without saving two separate versions. My first guess was something like:

 var str = 'Original String of text...'.split(' ') || [],
     mod = 'Modified String of text...'.split(' ') || [],
     i, edits = [];

 for (i = 0; i < str.length; i += 1) {
     edits.push(str[i] === mod[i] ? undefined : mod[i]);
 }

 console.log(edits); // ["Modified", null, null, null] (desired output)

and then, to apply the edits:

 for (i = 0; i < str.length; i += 1) {
     str[i] = edits[i] || str[i];
 }

 str.join(' '); // "Modified String of text..."

Basically, I split the text on spaces into arrays, compare the arrays, and save the differences. Then I apply the differences to create the modified version.

Problems

But if the number of spaces changes, problems arise:

 str: Original String of text...
 mod: OriginalString of text...

Output: OriginalString of text... text...

My desired result: OriginalString of text...


Even if I switched str.length to mod.length and edits.length as follows:

 // Get edits
 var str = 'Original String of text...'.split(' ') || [],
     mod = 'Modified String of text...'.split(' ') || [],
     i, edits = [];

 for (i = 0; i < mod.length; i += 1) {
     edits.push(str[i] === mod[i] ? undefined : mod[i]);
 }

 // Apply edits
 var final = [];
 for (i = 0; i < edits.length; i += 1) {
     final[i] = edits[i] || str[i];
 }
 final = final.join(' ');

edits will be ["ModifiedString", "of", "text..."], which defeats the whole point of saving only the changes. It gets even worse if a word is added or deleted: if str becomes Original String of lots of text..., the output will still be the same.


I can see that my approach has many shortcomings, but I can’t think of anything else.

Snippet:

 document.getElementById('go').onclick = function() {
     var str = document.getElementById('a').value.split(' ') || [],
         mod = document.getElementById('b').value.split(' ') || [],
         i, edits = [];

     for (i = 0; i < mod.length; i += 1) {
         edits.push(str[i] === mod[i] ? undefined : mod[i]);
     }

     // Apply edits
     var final = [];
     for (i = 0; i < edits.length; i += 1) {
         final[i] = edits[i] || str[i];
     }
     final = final.join(' ');
     alert(final);
 };

 document.getElementById('go2').onclick = function() {
     var str = document.getElementById('a').value.split(' ') || [],
         mod = document.getElementById('b').value.split(' ') || [],
         i, edits = [];

     for (i = 0; i < str.length; i += 1) {
         edits.push(str[i] === mod[i] ? undefined : mod[i]);
     }

     for (i = 0; i < str.length; i += 1) {
         str[i] = edits[i] || str[i];
     }
     alert(str.join(' ')); // "Modified String of text..."
 };

 Base String: <input id="a">
 <br/>Modified String: <input id="b" />
 <br/>
 <button id="go">Second method</button>
 <button id="go2">First Method</button>

TL;DR:

How do you find the changes between two strings?


I am dealing with large chunks of text, each of which can be about one hundred megabytes. This is done in the browser.

+6
4 answers

Running a proper diff in pure JavaScript can be slow, but how slow depends on your performance requirements, the quality of diff you need, and of course how often it has to run.

A more efficient approach is to track changes while the user is actually editing the document and to save only those changes once editing is finished. For this you can use, for example, the ACE editor or any other editor that supports change tracking.

http://ace.c9.io/

ACE tracks changes while a document is being edited and reports them as commands in an easily understandable format, for example:

 {"action":"insertText","range":{"start":{"row":0,"column":0}, "end":{"row":0,"column":1}},"text":"d"} 

You can hook into the ACE editor and listen to its change events:

 var changeList = []; // list of changes

 // editor here is the ACE editor instance, for example
 var editor = ace.edit(document.getElementById("editorDivId"));
 editor.setValue("original text contents");

 editor.on("change", function(e) {
     // e.data has the change
     var cmd = e.data;
     var range = cmd.range;
     if (cmd.action == "insertText") {
         changeList.push([1, range.start.row, range.start.column,
                          range.end.row, range.end.column, cmd.text]);
     }
     if (cmd.action == "removeText") {
         changeList.push([2, range.start.row, range.start.column,
                          range.end.row, range.end.column, cmd.text]);
     }
     if (cmd.action == "insertLines") {
         changeList.push([3, range.start.row, range.start.column,
                          range.end.row, range.end.column, cmd.lines]);
     }
     if (cmd.action == "removeLines") {
         changeList.push([4, range.start.row, range.start.column,
                          range.end.row, range.end.column, cmd.lines, cmd.nl]);
     }
 });

To find out how this works, simply do a few test runs and capture the changes. Basically there are only four commands:

  • insertText
  • removeText
  • insertLines
  • removeLines

Removing a newline from the text can be a bit tricky.

Once you have this list of changes, you are ready to replay the changes against the text file. You can even combine similar or overlapping changes into one change; for example, inserts of consecutive characters can be combined into a single change.
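Merging consecutive inserts could look roughly like the sketch below. It assumes a simplified, hypothetical change record ({ type, pos, text } with a flat character offset) rather than ACE's row/column ranges, and the function name mergeChanges is made up for illustration.

```javascript
// Hypothetical, simplified change record: { type: 'insert' | 'remove', pos, text }.
// `pos` is a character offset into the document (an assumption, not ACE's format).
function mergeChanges(changes) {
  var merged = [];
  changes.forEach(function (change) {
    var last = merged[merged.length - 1];
    // An insert that starts exactly where the previous insert ended
    // can be folded into that previous insert.
    if (last && last.type === 'insert' && change.type === 'insert' &&
        change.pos === last.pos + last.text.length) {
      last.text += change.text;
    } else {
      // Copy the record so the input list is left untouched.
      merged.push({ type: change.type, pos: change.pos, text: change.text });
    }
  });
  return merged;
}

var typed = [
  { type: 'insert', pos: 0, text: 'F' },
  { type: 'insert', pos: 1, text: 'o' },
  { type: 'insert', pos: 2, text: 'o' },
  { type: 'remove', pos: 3, text: 'Lorem' }
];
var compact = mergeChanges(typed);
console.log(JSON.stringify(compact));
```

Here the three single-character inserts collapse into one insert of "Foo", while the remove is kept as a separate record. Merging removes or overlapping ranges would need additional cases.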

You will run into some problems while testing: applying the changes back to the text is not trivial, but it is quite feasible and should not take more than about 100 lines of code.

The nice part is that when you are done, you also have undo and redo readily available, so you can replay the entire editing process.
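As a rough starting point, replaying the recorded list could be sketched as follows. This assumes the [type, startRow, startColumn, endRow, endColumn, text] layout used above and handles only type 1 (insertText) and type 2 (removeText); the helper names toOffset and replay are hypothetical, and insertLines/removeLines are deliberately left out.

```javascript
// Convert a (row, column) position into a character offset in `text`.
function toOffset(text, row, column) {
  var lines = text.split('\n');
  var offset = 0;
  for (var r = 0; r < row; r += 1) {
    offset += lines[r].length + 1; // +1 for the newline that ends the row
  }
  return offset + column;
}

// Replay changes recorded as [type, startRow, startCol, endRow, endCol, text].
// Only type 1 (insertText) and type 2 (removeText) are handled in this sketch.
function replay(original, changeList) {
  return changeList.reduce(function (text, change) {
    var at = toOffset(text, change[1], change[2]);
    var payload = change[5];
    if (change[0] === 1) {
      return text.slice(0, at) + payload + text.slice(at);
    }
    if (change[0] === 2) {
      return text.slice(0, at) + text.slice(at + payload.length);
    }
    throw new Error('Unhandled change type: ' + change[0]);
  }, original);
}

var result = replay('Lorem ipsum\ndolor sit amet', [
  [2, 0, 0, 0, 5, 'Lorem'], // remove "Lorem" from row 0
  [1, 0, 0, 0, 3, 'Foo'],   // insert "Foo" at row 0, column 0
  [1, 1, 5, 1, 6, '_']      // insert "_" at row 1, column 5
]);
console.log(result);
```

Splitting the whole text on every change is wasteful for very large documents; a real implementation would more likely keep the text as an array of lines and update it in place.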

+1

Edit: Added a modified script that can handle multiple text areas.

Here is a JSFiddle for a page with more than one editable textarea. (Remember to open the developer tools to see the changes.) You just need to assign a unique id to each textarea. Then create a map that uses these ids as keys and each textarea's array of edits as the value. The following is the updated script:

 'use strict';

 function Edit(type, position, text) {
     this.type = type;
     this.position = position;
     this.text = text;
 }

 var ADD = 'add';
 var DELETE = 'delete';
 var textAreaEditsMap = {};
 var cursorStart = -1;
 var cursorEnd = -1;
 var currentEdit = null;
 var deleteOffset = 1;

 window.addEventListener('load', function() {
     var textareas = document.getElementsByClassName('text-editable');
     for (var i = 0; i < textareas.length; ++i) {
         var textarea = textareas.item(i);
         var id = textarea.getAttribute('id');
         textAreaEditsMap[id] = [];
         textarea.addEventListener('mouseup', handleMouseUp);
         textarea.addEventListener('keydown', handleKeyDown);
         textarea.addEventListener('keypress', handleKeyPress);
     }
 });

 function handleMouseUp(event) {
     cursorStart = this.selectionStart;
     cursorEnd = this.selectionEnd;
     currentEdit = null;
 }

 function handleKeyDown(event) {
     cursorStart = this.selectionStart;
     cursorEnd = this.selectionEnd;
     if (event.keyCode >= 35 && event.keyCode <= 40) { // detect cursor movement keys
         currentEdit = null;
     }
     // deleting text
     // Note: innerHTML holds the textarea's initial content; the live text is in .value
     if (event.keyCode === 8 || event.keyCode === 46) {
         if (currentEdit != null && currentEdit.type !== 'delete') {
             currentEdit = null;
         }
         if (cursorStart !== cursorEnd) { // Deleting highlighted text
             var edit = new Edit(DELETE, cursorStart, this.innerHTML.substring(cursorStart, cursorEnd));
             textAreaEditsMap[this.getAttribute('id')].push(edit);
             currentEdit = null;
         } else if (event.keyCode === 8) { // backspace
             if (currentEdit == null) {
                 deleteOffset = 1;
                 var edit = new Edit(DELETE, cursorStart, this.innerHTML[cursorStart - 1]);
                 textAreaEditsMap[this.getAttribute('id')].push(edit);
                 currentEdit = edit;
             } else {
                 ++deleteOffset;
                 currentEdit.text = this.innerHTML[cursorStart - 1] + currentEdit.text;
             }
         } else if (event.keyCode === 46) { // delete
             if (currentEdit == null) {
                 deleteOffset = 1;
                 var edit = new Edit(DELETE, cursorStart, this.innerHTML[cursorStart]);
                 textAreaEditsMap[this.getAttribute('id')].push(edit);
                 currentEdit = edit;
             } else {
                 currentEdit.text += this.innerHTML[cursorStart + deleteOffset++];
             }
         }
     }
     console.log(textAreaEditsMap)
 }

 function handleKeyPress(event) {
     if (currentEdit != null && currentEdit.type !== 'add') {
         currentEdit = null;
     }
     if (currentEdit == null) {
         currentEdit = new Edit(ADD, cursorStart, String.fromCharCode(event.charCode));
         textAreaEditsMap[this.getAttribute('id')].push(currentEdit);
     } else {
         currentEdit.text += String.fromCharCode(event.charCode);
     }
     console.log(textAreaEditsMap);
 }

The original answer, with the original script that handles only a single textarea:

I made an example script that does what you need. I put a working example on JSFiddle. Make sure you press Ctrl+Shift+J on the JSFiddle sample page to open the developer tools so that you can see the array of edits being logged as you make changes. Edits are added to the edits array in chronological order, so you can get back to the original text by applying the inverse of each edit (i.e., add the deleted text back, delete the added text) in reverse chronological order (i.e., iterate over the array backwards). I did not handle copy, paste, delete, or re-edit actions from the context menu or key bindings, but I think you can use this example as a guide for taking care of those things. Here is the script:

 'use strict';

 function Edit(type, position, text) {
     this.type = type;
     this.position = position;
     this.text = text;
 }

 window.addEventListener('load', function() {
     var ADD = 'add';
     var DELETE = 'delete';
     var cursorStart = -1;
     var cursorEnd = -1;
     var edits = [];
     var currentEdit = null;
     var deleteOffset = 1;
     var textarea = document.getElementById('saved-text');

     textarea.addEventListener('mouseup', function(event) {
         cursorStart = this.selectionStart;
         cursorEnd = this.selectionEnd;
         currentEdit = null;
     });

     textarea.addEventListener('keydown', function(event) {
         cursorStart = this.selectionStart;
         cursorEnd = this.selectionEnd;
         if (event.keyCode >= 35 && event.keyCode <= 40) { // detect cursor movement keys
             currentEdit = null;
         }
         // deleting text
         // Note: innerHTML holds the textarea's initial content; the live text is in .value
         if (event.keyCode === 8 || event.keyCode === 46) {
             if (currentEdit != null && currentEdit.type !== 'delete') {
                 currentEdit = null;
             }
             if (cursorStart !== cursorEnd) {
                 var edit = new Edit(DELETE, cursorStart, textarea.innerHTML.substring(cursorStart, cursorEnd));
                 edits.push(edit);
                 currentEdit = null;
             } else if (event.keyCode === 8) { // backspace
                 if (currentEdit == null) {
                     deleteOffset = 1;
                     var edit = new Edit(DELETE, cursorStart, textarea.innerHTML[cursorStart - 1]);
                     edits.push(edit);
                     currentEdit = edit;
                 } else {
                     ++deleteOffset;
                     currentEdit.text = textarea.innerHTML[cursorStart - 1] + currentEdit.text;
                 }
             } else if (event.keyCode === 46) { // delete
                 if (currentEdit == null) {
                     deleteOffset = 1;
                     var edit = new Edit(DELETE, cursorStart, textarea.innerHTML[cursorStart]);
                     edits.push(edit);
                     currentEdit = edit;
                 } else {
                     currentEdit.text += textarea.innerHTML[cursorStart + deleteOffset++];
                 }
             }
         }
         console.log(edits)
     });

     textarea.addEventListener('keypress', function(event) {
         if (currentEdit != null && currentEdit.type !== 'add') {
             currentEdit = null;
         }
         // adding text
         if (currentEdit == null) {
             currentEdit = new Edit(ADD, cursorStart, String.fromCharCode(event.charCode));
             edits.push(currentEdit);
         } else {
             currentEdit.text += String.fromCharCode(event.charCode);
         }
         console.log(edits);
     });
 });
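Getting back to the original text by undoing the edits newest-first could be sketched like this. It assumes a plain edit shape of { type, position, text } in which position is the index where text starts; that is a simplification for illustration and may not match exactly what the script above records for backspace. The helper names applyEdit, invert and revertAll are hypothetical.

```javascript
// Assumed edit shape: { type: 'add' | 'delete', position, text }, where
// `position` is the index at which `text` starts (an assumption for this sketch).
function applyEdit(text, edit) {
  if (edit.type === 'add') {
    return text.slice(0, edit.position) + edit.text + text.slice(edit.position);
  }
  // 'delete': drop edit.text.length characters starting at edit.position
  return text.slice(0, edit.position) + text.slice(edit.position + edit.text.length);
}

// The inverse of an add is a delete of the same text at the same place,
// and the inverse of a delete is an add.
function invert(edit) {
  return {
    type: edit.type === 'add' ? 'delete' : 'add',
    position: edit.position,
    text: edit.text
  };
}

// Undo every edit, newest first, to get back to the original text.
function revertAll(current, edits) {
  return edits.slice().reverse().reduce(function (text, edit) {
    return applyEdit(text, invert(edit));
  }, current);
}

var recorded = [
  { type: 'delete', position: 0, text: 'Lorem' },
  { type: 'add', position: 0, text: 'Foo' }
];
var edited = recorded.reduce(applyEdit, 'Lorem ipsum');
var restored = revertAll(edited, recorded);
console.log(edited, '->', restored);
```

The same applyEdit function, run over the list in chronological order, rebuilds the edited text from the original, so only the original and the edit list need to be stored.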
+3

This is essentially a versioning problem, like version control for code: you save only the changes between versions.

Take a look at jsdiff.

You can create a patch, save it, and apply it later to the source code to get the modified text.
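To illustrate the create-a-patch, apply-it-later idea without depending on any library, here is a minimal sketch that trims the common prefix and suffix of the two strings and stores only the differing middle. This is not jsdiff's API or algorithm; makeSplicePatch and applySplicePatch are hypothetical names, and the approach only stays compact when the edits are confined to one region of the text.

```javascript
// Build a one-region patch by trimming the common prefix and suffix.
// makeSplicePatch / applySplicePatch are hypothetical names, not part of jsdiff.
function makeSplicePatch(original, modified) {
  var max = Math.min(original.length, modified.length);
  var start = 0;
  while (start < max && original[start] === modified[start]) {
    start += 1;
  }
  var tail = 0;
  // Limit the suffix so it never overlaps the already matched prefix.
  while (tail < max - start &&
         original[original.length - 1 - tail] === modified[modified.length - 1 - tail]) {
    tail += 1;
  }
  return {
    start: start,                                  // where the change begins
    removed: original.length - start - tail,       // how many characters to drop
    inserted: modified.slice(start, modified.length - tail) // what to put there
  };
}

function applySplicePatch(original, patch) {
  return original.slice(0, patch.start) + patch.inserted +
         original.slice(patch.start + patch.removed);
}

var original = 'Original String of text...';
var modified = 'OriginalString of lots of text...';
var patch = makeSplicePatch(original, modified);
console.log(JSON.stringify(patch));
console.log(applySplicePatch(original, patch) === modified); // true
```

Both passes are linear in the length of the strings, which matters for very large texts. Edits scattered across the document would make the stored middle section large, and that is the case where a real diff library that produces several small hunks is the better fit.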

+3

Try creating basic diff markers, such as the "+" and "-" used in the JS below. Use .map() to compare the original string o with the edited string e and return a diff array of the differences between o and e; then set o, e, and diff as properties of an object.

 var o = "Lorem ipsum dolor sit amet",
     e = "Foo ipsum amet_ sit",
     res = {
         "original": o,
         "edited": e,
         "diff": o.split("").map(function(val, key) {
             // log edits
             // `+` preceding character: added character,
             // `-` preceding character: removed character;
             // `+` preceding "|": no changes,
             // `-` preceding "": no changes;
             // `"index"`: character `index` of original `o` input string
             return e[key] !== val
                 ? "[edits:" + "+" + (e[key] || "") + "|-" + val + ", index:" + key + "]" + (e[key] || "")
                 : "[edits:+|-, index:" + key + "]" + val
         })
     };

 document.getElementsByTagName("pre")[0].textContent = JSON.stringify(res, null, 2);

 <pre></pre>
+1

Source: https://habr.com/ru/post/988633/

