JavaScript (jQuery): remove the last sentence of a long text

I am looking for a JavaScript function that is smart enough to remove the last sentence of a long chunk of text (a single paragraph, actually). Some example text to show the complexity:

<p>Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said. He later described it as: "Something insane."</p> 

I could split on . and remove the last entry of the array, but that would not work for sentences ending in ? or !, and some sentences end with a quote, e.g. something: "stuff."
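A minimal sketch of why the naive split falls short, using a shortened version of the sample text (the variable names are only for illustration):

```javascript
// Shortened sample: one sentence ends in ?" and the last one ends in ."
var text = 'I asked: "What is it doing up there?" He later described it as: "Something insane."';

var pieces = text.split('.');   // only '.' is treated as a sentence end
pieces.pop();                   // drops the chunk after the final '.', which is just the closing quote
var result = pieces.join('.');

console.log(result);
// I asked: "What is it doing up there?" He later described it as: "Something insane
```

Only the trailing `."` is removed: the question mark is never seen as a boundary, and the closing quote after the final period counts as its own "sentence".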

    function removeLastSentence(text) {
        sWithoutLastSentence = ...; // ??
        return sWithoutLastSentence;
    }

How to do it? What is the correct algorithm?

Edit: By "long text" I mean all the content in my paragraph, and by "sentence" I mean an actual sentence (not a line). So in my example the last sentence is: He later described it as: "Something insane." Once that one is removed, the next one is: She did not know, "I think we should move past the fence!", she quickly said.

+6
3 answers

Define your rules:
1. A sentence starts with a capital letter.
2. A sentence is preceded by nothing or by [.!?], but not by [,:;].
3. A sentence may be preceded by a quotation mark, such as ["'], if the text is not formatted properly.
4. A sentence may be detected incorrectly in this case if the word following a quote is a name.

Any additional rules?

Define your goal:
1. Remove the last sentence.

Assumptions: if you start at the last character of the string and work backwards, you identify the beginning of the last sentence as:
1. The chunk of text that comes right after a chunk ending in [.?!], OR
2. The chunk of text that comes right after a chunk ending in ["'], provided that chunk starts with a capital letter.
3. Every sentence-ending [.] is followed by a space.
4. We are not correcting for HTML tags.
5. These assumptions are not robust and will need to be adapted regularly.

Possible solution: read in your string and split it on the space character, which gives chunks of the string that can be reviewed in reverse order.

 var characterGroups = $('#this-paragraph').html().split(' ').reverse(); 

If your string is:

 var originalString = 'Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said. He later described it as: "Something insane."'; 

Then your array in characterGroups will be:

    ['insane."', '"Something', 'as:', 'it', 'described', 'later', 'He', 'said.', 'quickly', 'she', 'fence!",', 'the', 'past', 'move', 'should', 'we', 'think', '"I', 'know,', 'not', 'did', 'She', 'there?"', 'up', 'doing', 'it', 'is', '"What', 'mind:', 'to', 'came', 'that', 'thing', 'first', 'the', 'asked', 'I', 'over.', 'flying', 'plane', 'a', 'saw', 'I', 'and', 'window', 'the', 'up', 'looked', 'I', 'harder!', 'any', 'sentence', 'the', 'of', '"selection"', 'the', 'make', 'not', 'should', 'that', 'but', 'used', 'is', 'code', 'html', 'basic', 'Sometimes', 'here.', 'text', 'more', 'some', 'Blabla,']

Note: HTML tags are stripped if you read the paragraph with jQuery's .text() method instead of .html(), which is why no tags appear in the array above.
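To make the difference concrete, here is a small sketch that runs without jQuery. The regular expression is only a crude stand-in for what .text() would return (an assumption for illustration, not how jQuery works internally):

```javascript
var html = 'Sometimes <span>basic</span> html code is used.';

// Splitting the raw markup keeps the tags glued to their word.
var withTags = html.split(' ').reverse();

// Crude tag stripping, standing in for the result of .text().
var plain = html.replace(/<[^>]*>/g, '');
var withoutTags = plain.split(' ').reverse();

console.log(withTags[4]);    // <span>basic</span>
console.log(withoutTags[4]); // basic
```

Either way the chunk count is the same here, because the tags contain no spaces; a tag with attributes, such as a class, would be split into several chunks.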

Every chunk is followed by a space, so once we have identified the chunk that starts the last sentence (by its array index), we can find the matching spot in the original string and cut there, right after the terminator that ends the previous sentence.

Give yourself a flag variable for whether we have found it yet, and a variable to hold the index of the array element that we identify as the start of the last sentence:

 var found = false; var index = null; 

Loop through the array and look for any element ending in [.!?], OR ending in " where the previously examined element started with a capital letter.

    var position = 1, // skip the first chunk, since we know it is the end anyway
        elements = characterGroups.length,
        element = null,
        prevHadUpper = false,
        last = null,
        lookFor = '';

    while (!found && position < elements) {
        element = characterGroups[position].split('');
        if (element.length > 0) {
            last = element[element.length - 1];
            // test last character rule
            if (
                last == '.' ||                  // ends in '.'
                last == '!' ||                  // ends in '!'
                last == '?' ||                  // ends in '?'
                (last == '"' && prevHadUpper)   // ends in '"' and previous started [A-Z]
            ) {
                found = true;
                index = position - 1;
                lookFor = last + ' ' + characterGroups[position - 1];
            } else {
                // true for any first character that equals its uppercase form, quotes included
                prevHadUpper = element[0] == element[0].toUpperCase();
            }
        } else {
            prevHadUpper = false;
        }
        position++;
    }

If you run the above script, it will correctly identify 'He' as the start of the last sentence.

 console.log(characterGroups[index]); // He at index=6 

Now you can trim the string that you had before:

    var trimPosition = originalString.lastIndexOf(lookFor) + 1;
    var updatedString = originalString.substr(0, trimPosition);
    console.log(updatedString);
    // Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said.

Run it again and get: Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?"

Run it again and get: Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder! I looked up the window and I saw a plane flying over.

Run it again and get: Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the sentence any harder!

Run it again and get: Blabla, some more text here.

Run it again and get: Blabla, some more text here.

So, I think it matches what you are looking for?

As a function:

    function trimSentence(string) {
        var found = false;
        var index = null;
        var characterGroups = string.split(' ').reverse();

        var position = 1, // skip the first chunk, since we know it is the end anyway
            elements = characterGroups.length,
            element = null,
            prevHadUpper = false,
            last = null,
            lookFor = '';

        while (!found && position < elements) {
            element = characterGroups[position].split('');
            if (element.length > 0) {
                last = element[element.length - 1];
                // test last character rule
                if (
                    last == '.' ||                  // ends in '.'
                    last == '!' ||                  // ends in '!'
                    last == '?' ||                  // ends in '?'
                    (last == '"' && prevHadUpper)   // ends in '"' and previous started [A-Z]
                ) {
                    found = true;
                    index = position - 1;
                    lookFor = last + ' ' + characterGroups[position - 1];
                } else {
                    // true for any first character that equals its uppercase form, quotes included
                    prevHadUpper = element[0] == element[0].toUpperCase();
                }
            } else {
                prevHadUpper = false;
            }
            position++;
        }

        // if nothing was found, lookFor is '' and the whole string is returned unchanged
        var trimPosition = string.lastIndexOf(lookFor) + 1;
        return string.substr(0, trimPosition);
    }

It is trivial to turn this into a plugin if needed, but beware of the ASSUMPTIONS! :)

Does it help?

Thanks AE

+2

That should do it.

    /*
    Assumptions:
    - Sentence separators are a combination of terminators (.!?) + doublequote (optional) + spaces + capital letter.
    - I haven't preserved tags if it gets down to removing the last sentence.
    */
    function removeLastSentence(text) {
        var lastSeparator = Math.max(
            text.lastIndexOf("."),
            text.lastIndexOf("!"),
            text.lastIndexOf("?")
        );
        var revtext = text.split('').reverse().join('');
        // in the reversed text: capital letter, spaces, optional quote, terminator
        var sep = revtext.search(/[A-Z]\s+(\")?[\.\!\?]/);
        // position of the "<" that opens the last closing tag
        var lastTag = text.length - revtext.search(/\/\</) - 2;
        var lastPtr = (lastTag > lastSeparator) ? lastTag : text.length;
        var sWithoutLastSentence;

        if (sep > -1) {
            var text1 = revtext.substring(sep + 1, revtext.length).trim().split('').reverse().join('');
            var text2 = text.substring(lastPtr, text.length).replace(/['"]/g, '').trim();
            sWithoutLastSentence = text1 + text2;
        } else {
            sWithoutLastSentence = '';
        }
        return sWithoutLastSentence;
    }

    /*
    TESTS:

    var text = '<p>Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the text any harder! I looked up the window and I saw a plane flying over. I asked the first thing that came to mind: "What is it doing up there?" She did not know, "I think we should move past the fence!", she quickly said. He later described it as: "Something insane. "</p>';

    alert(text + '\n\n' + removeLastSentence(text));
    alert(text + '\n\n' + removeLastSentence(removeLastSentence(text)));
    alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(text))));
    alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(text)))));
    alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(text))))));
    alert(text + '\n\n' + removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(removeLastSentence(text)))))));
    alert(text + '\n\n' + removeLastSentence('<p>Blabla, some more text here. Sometimes <span>basic</span> html code is used but that should not make the "selection" of the text any harder! I looked up the '));
    */
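The same separator assumption (terminator, optional double quote, spaces, capital letter) can also be applied without reversing the string. The sketch below is not the code from this answer: `removeLastSentenceForward` is a hypothetical name, it works on plain text only, and like the original it would treat an abbreviation such as "Mr. Smith" as a boundary.

```javascript
// Hypothetical helper: scan forward and remember the last sentence boundary.
function removeLastSentenceForward(text) {
    // terminator, optional closing double quote, whitespace, then a capital letter (lookahead)
    var boundary = /[.!?]"?\s+(?=[A-Z])/g;
    var cut = -1;
    var match;

    while ((match = boundary.exec(text)) !== null) {
        // end of the preceding sentence: terminator plus optional quote, without the spaces
        cut = match.index + match[0].replace(/\s+$/, '').length;
    }

    // no boundary means there is at most one sentence, so nothing is left
    return cut === -1 ? '' : text.slice(0, cut);
}

var sample = 'I asked: "What is it?" She did not know. He left.';
console.log(removeLastSentenceForward(sample));
// I asked: "What is it?" She did not know.
```

Calling it again on the result leaves `I asked: "What is it?"`, and a third call returns an empty string, mirroring the `sep == -1` branch above.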
+1

This is a good one. Why don't you create a temporary variable, convert every '!' and '?' into '.', split that temporary variable, remove the last sentence, join the temporary array back into a string and take its length? Then trim the original paragraph to that length.
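A rough sketch of that idea, under a few assumptions that go beyond what is described above: the function name `removeLastSentenceByLength` is made up, trailing chunks without any word characters (an empty string or a lone closing quote) are skipped before the last sentence is dropped, and a closing quote directly after the terminator is kept.

```javascript
function removeLastSentenceByLength(text) {
    // Temporary copy in which every terminator is a '.', so one split is enough.
    var parts = text.replace(/[!?]/g, '.').split('.');

    // Discard trailing chunks with no word characters (empty string, closing quote, spaces).
    while (parts.length > 0 && !/\w/.test(parts[parts.length - 1])) {
        parts.pop();
    }
    parts.pop(); // this chunk is the last sentence

    if (parts.length === 0) {
        return ''; // there was at most one sentence
    }

    // The replacement kept every character position, so this length maps onto the original.
    var end = parts.join('.').length + 1; // +1 keeps the terminator itself
    if (/["']/.test(text.charAt(end))) {
        end += 1; // keep a closing quote that directly follows the terminator
    }
    return text.slice(0, end);
}

console.log(removeLastSentenceByLength('A. B! C?'));            // A. B!
console.log(removeLastSentenceByLength('A. He said: "B." C.')); // A. He said: "B."
```

Because every `.`, `!` and `?` counts as a sentence end, a terminator inside a quotation in the middle of a sentence (as in `"I think we should move past the fence!", she quickly said.`) is treated as a boundary too, so this approach does not handle that case from the question.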

0

Source: https://habr.com/ru/post/897912/

