Algorithm: divide the string into N parts using spaces so that all parts have almost the same length

I am looking for an algorithm that takes a string and breaks it into a certain number of parts. These parts should contain complete words (therefore, spaces are used to separate the line), and the parts should be almost the same length or contain the longest parts.

I know that it is difficult to program a function that can do what I want, but I wonder if there is a well-established and fast algorithm for this purpose?

edit: To clarify my question, I will describe your problem that I am trying to solve.

I generate images with a fixed width. In these images, I write usernames using GD and Freetype in PHP. Since I have a fixed width, I want to split the names into 2 or 3 lines if they do not fit into one.

To fill as much space as possible, I want to split the names so that each line contains as many words as possible. By this I mean that in one line there should be as many words as necessary in order to keep the length of each line next to the average line length of the entire text block. Therefore, if there is one long word and two short words, two short words should stand on the same line if all lines are equal.

(Then I calculate the width of the text block using 1, 2 or 3 lines, and if it fits into my image, I visualize it. Just if there are 3 lines and this does not fit, I reduce the font size until everything is fine. )

Example: This is a long text should display something like this:

 This is a long text 

or

 This is a long text 

but not:

 This is a long text 

and also no:

 This is a long text 

Hope I can explain what I'm looking for.

+4
source share
7 answers

If you're talking about line breaks, see Dynamic Line Break, which gives Dynamic Programming to split words into lines.

+6
source

I do not know about the verified ones, but it seems that the simplest and most effective solution would be to divide the string length by N, and then find the nearest white space for the divided places (you need to look both forward and backward).

The code below works, although there are many errors that it does not handle. It looks like it will work in O (n), where n is the number of rows required.

 class Program { static void Main(string[] args) { var s = "This is a string for testing purposes. It will be split into 3 parts"; var p = s.Length / 3; var w1 = 0; var w2 = FindClosestWordIndex(s, p); var w3 = FindClosestWordIndex(s, p * 2); Console.WriteLine(string.Format("1: {0}", s.Substring(w1, w2 - w1).Trim())); Console.WriteLine(string.Format("2: {0}", s.Substring(w2, w3 - w2).Trim())); Console.WriteLine(string.Format("3: {0}", s.Substring(w3).Trim())); Console.ReadKey(); } public static int FindClosestWordIndex(string s, int startIndex) { int wordAfterIndex = -1; int wordBeforeIndex = -1; for (int i = startIndex; i < s.Length; i++) { if (s[i] == ' ') { wordAfterIndex = i; break; } } for (int i = startIndex; i >= 0; i--) { if (s[i] == ' ') { wordBeforeIndex = i; break; } } if (wordAfterIndex - startIndex <= startIndex - wordBeforeIndex) return wordAfterIndex; else return wordBeforeIndex; } } 

The output for this is:

 1: This is a string for 2: testing purposes. It will 3: be split into 3 parts 
+3
source

Again, after Brian's answer, I made a PHP version of my code:

 // Input text $txt = "This is a really long string that should be broken up onto lines of about the same number of characters."; // Number of lines $numLines = 3; /* Do it, result comes as an array: */ $aResult = splitLinesByClosestWhitespace($txt, $numLines); /* Output result: */ if ($aResult) { for ($x=1; $x<=$numLines; $x++) echo "Line ".$x.": ".$aResult[$x]."<br>"; } else { echo "Not enough spaces to generate the lines!"; } /**********************/ /** * Splits a string into multiple lines of the closest possible same length, * using the closest whitespaces * @param string $txt String to split * @param integer $numLines Number of lines * @return array|false */ function splitLinesByClosestWhitespace($txt, $numLines) { $p = intval( strlen($txt) / $numLines ); $aTxtIndx = array(); $aTxt = array(); // Check we have enough whitespaces to generate the number of lines $wsCount = count( explode(" ", $txt) ) - 1; if ($wsCount<$numLines) return false; // Get the indexes for ($x=1; $x<=$numLines; $x++) { $aTxtIndx[$x] = FindClosestWordIndex($txt, $p * ($x-1) ); } // Do the split for ($x=1; $x<=$numLines; $x++) { if ($x != $numLines) $aTxt[$x] = substr( $txt, $aTxtIndx[$x], trim($aTxtIndx[$x+1]) ); else $aTxt[$x] = substr( $txt, trim($aTxtIndx[$x]) ); } return $aTxt; } /** * Finds the closest word to a string index * @param string $s String to search * @param integer $startIndex Index at which to find the closest word * @return integer */ function FindClosestWordIndex($s, $startIndex) { $wordAfterIndex = 0; $wordBeforeIndex = 0; for ($i = $startIndex; $i < strlen($s); $i++) { if ($s[$i] == ' ') { $wordAfterIndex = $i; break; } } for ($i = $startIndex; $i >= 0; $i--) { if ($s[$i] == ' ') { $wordBeforeIndex = $i; break; } } if ($wordAfterIndex - $startIndex <= $startIndex - $wordBeforeIndex) return $wordAfterIndex; else return $wordBeforeIndex; } 
+1
source

Equal Division NP-Complete

0
source

The way word wrapping is usually done is to place as many words as possible on one line and move on to the next when there is no more space. This assumes, of course, that you have the maximum width in mind.

No matter what algorithm you use, keep in mind that if you are not working with a fixed-width font, you want to work with the physical width of the word, not the number of letters.

0
source

Following Brian, I made a version of my JavaScript code: http://jsfiddle.net/gmoz22/CPGY2/ .

 // Input text var txt = "This is a really long string that should be broken up onto lines of about the same number of characters."; // Number of lines var numLines = 3; /* Do it, result comes as an array: */ var aResult = splitLinesByClosestWhitespace(txt, numLines); /* Output result: */ if (aResult) { for (var x = 1; x<=numLines; x++) document.write( "Line "+x+": " + aResult[x] + "<br>" ); } else { document.write("Not enough spaces to generate the lines!"); } /**********************/ // Original algorithm by http://stackoverflow.com/questions/2381525/algorithm-split-a-string-into-n-parts-using-whitespaces-so-all-parts-have-nearl/2381772#2381772, rewritten for JavaScript by Steve Oziel /** * Trims a string for older browsers * Used only if trim() if it is not already available on the Prototype-Object * since overriding it is a huge performance hit (generally recommended when extending Native Objects) */ if (!String.prototype.trim) { String.prototype.trim = function(){return this.replace(/^\s+|\s+$/g, '');}; } /** * Splits a string into multiple lines of the closest possible same length, * using the closest whitespaces * @param {string} txt String to split * @param {integer} numLines Number of lines * @returns {Array} */ function splitLinesByClosestWhitespace(txt, numLines) { var p = parseInt(txt.length / numLines); var aTxtIndx = []; var aTxt = []; // Check we have enough whitespaces to generate the number of lines var wsCount = txt.split(" ").length - 1; if (wsCount<numLines) return false; // Get the indexes for (var x=1; x<=numLines; x++) { aTxtIndx[x] = FindClosestWordIndex(txt, p * (x-1) ); } // Do the split for (var x=1; x<=numLines; x++) { if (x != numLines) aTxt[x] = txt.slice(aTxtIndx[x], aTxtIndx[x+1]).trim(); else aTxt[x] = txt.slice(aTxtIndx[x]).trim(); } return aTxt; } /** * Finds the closest word to a string index * @param {string} s String to search * @param {integer} startIndex Index at which to find the closest word * @returns {integer} */ function FindClosestWordIndex(s, startIndex) { var wordAfterIndex = 0; var wordBeforeIndex = 0; for (var i = startIndex; i < s.length; i++) { if (s[i] == ' ') { wordAfterIndex = i; break; } } for (var i = startIndex; i >= 0; i--) { if (s[i] == ' ') { wordBeforeIndex = i; break; } } if (wordAfterIndex - startIndex <= startIndex - wordBeforeIndex) return wordAfterIndex; else return wordBeforeIndex; } 

It works great when the number of lines you need is not too close to the number of spaces. In the example I gave, there are 19 spaces, and it begins to appear when you ask to break it into 17, 18 or 19 lines. Editing is welcome!

0
source

Source: https://habr.com/ru/post/1303144/


All Articles