Splitting a string without string.Split

I am doing a home question to split a string without using the framework method.

Below is the working code I came across.

I would like to know how to improve runtime to O (n)?

Any suggestions for improvement are also welcome.

public static string[] split(string txt, char[] delim) { char[] text = txt.ToCharArray(); string[] result = new string[0]; int count = 0; int i = 0; StringBuilder buff = new StringBuilder(); while(i < text.Length) { bool found = false; foreach(char del in delim) { if(del == txt[i]) { found = true; break; } } if(found) { count++; Array.Resize(ref result, count); result[count - 1] = buff.ToString(); buff = new StringBuilder(); } else { buff.Append(txt[i]); } i++; } if(buff.Length != 0) { count++; Array.Resize(ref result, count); result[count - 1] = buff.ToString(); } return(result); } 
+6
source share
5 answers

I have several changes that will simultaneously make this function more like C and reduce the runtime to O (n):

1) Instead of dynamically resizing the result array many times, you should create it with enough points to hold the maximum number of lines (you know that the number is less than txt.Length ), and then resize it only once in the end before you return it.

2) Instead of collecting your results using StringBuilder, make a char[] buff length txt.Length and an index variable j and make buff[j++] = txt[i] .

I think your function should be O (N) after you do this. Well technically it will be O (N * M), where M is the number of delimiters.

EDIT 1:

Here is the change that O (N) + O (M) will do instead of O (N * M):

Instead of iterating over the delimiters for each character in the string, you should skip the delimiters in ADVANCE and configure this array:

 bool[] isDelimiter = new bool[128]; // increase size if you are allowing non-ascii foreach(char delim in isDelimiter) { isDelimiter[(int)char] = true; } 

Then you can simply use this array to check each character of the string in constant time.

+6
source

I think your professor is looking for an API that uses only one character, not an array. Not an array of characters. I mean, if your separator string is "abcd", you will not split into all instances of "a", "b", "c", "d". If you find the whole line, you will split only.

Your current algorithm is not O (n), because for each element of the input array you are comparing it with each element of the delimiter array. This results in runtime O (n * m).

I do not think that this can be converted to O (n), because each element in the input needs to be compared with each element of the delimiter array. I think that most likely your professor is asking another question regarding an array of delimiters.

 public static String[] Split(String input, String delimiter) { List<String> parts = new List<String>(); StringBuilder buff = new StringBuilder(); if (delimiter.Length > 1) //you are splitting on a string not a character { //perform string searching algorithm here } else if(delimiter.Length == 0) { throw new InvalidOperationException("Invalid delimiter."); } else //you are splitting on a character { char delimChar = delimiter[0]; for (int i = 0; i < input.Length; i++) { if (input[i] == delimChar) { parts.Add(buff.ToString()); buff.Clear(); } else { buff.Append(input[i]); } } } return parts.ToArray(); } 

C # String.Split() takes in an array of delimiters, but I don't think it performs splitting in O (n) time.

If you are looking for String search algorithms, this might be useful. http://en.wikipedia.org/wiki/String_searching_algorithm

edit: I incorrectly referenced the fact that the C # String.Split() API did not accept an array of delimiters.

+2
source

You can do this O (n) if you put separator characters in a HashSet. Testing for the existence of a value in a HashSet is O (1).

 var delimterSet = new HashSet<char>(delim); 

...

 if(delimterSet.Contains(txt[i]) { ... } 

However, for a small number of delimiters, this will not improve performance.

+1
source

It is not possible to execute String.Split in O (n), since the list of delimiters must be traversed / searched.

0
source

Perhaps you can try to do all the work in one go.

 public static String[] Split(String txt, char[] delim) { if (txt == null) return new String[0]; // or exception if (delim == null || delim.Length == 0) return new String[0]; // or exception char[] text = txt.ToCharArray(); string[] result = new string[1]; // If there is no delimiter in the string, return the whole string int part = 0; int itemInArray = 1; for (int i = 0; i < text.Length; i++) { if (IsIn(delim, text[i])) { Array.Resize(ref result, ++itemInArray); // Is it consider as a framework method ??? part++; } else result[part] += text[i]; } return result; } public static Boolean IsIn(char[] delim, char c) { for (int i = 0; i < delim.Length; i++) if (c == delim[i]) return true; return false; } 
0
source

Source: https://habr.com/ru/post/907245/


All Articles