How to convert int to char[] without generating garbage in C#

Sure, this seems like a strange request, given the existence of ToString() and Convert.ToString(), but I need to convert an unsigned integer (i.e. UInt32) to its string representation, and I need to store the result in a char[].

The reason is that I work with character arrays for efficiency. Since the target char[] is initialized as a char[10] member (big enough to hold the string representation of UInt32.MaxValue) when the object is created, it should theoretically be possible to do the conversion without creating garbage (by which I mean without allocating any temporary objects on the managed heap).

Can anyone see a neat way to achieve this?

(I'm on .NET Framework 3.5 SP1, if that matters.)

4 answers

In addition to my comment above, I wondered whether log10 really was too slow, so I wrote a version that doesn't use it.

For four-digit numbers, this version is about 35% faster, and for ten-digit numbers, about 16%.

One drawback is that it needs room for all ten digits in the buffer, regardless of the length of the actual result.

I don't swear it's bug-free!

```csharp
public static int ToCharArray2(uint value, char[] buffer, int bufferIndex)
{
    const int maxLength = 10; // UInt32.MaxValue has 10 digits

    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }

    // Write the digits right-aligned at the end of a 10-char window...
    int startIndex = bufferIndex + maxLength - 1;
    int index = startIndex;
    do
    {
        buffer[index] = (char)('0' + value % 10);
        value /= 10;
        --index;
    }
    while (value != 0);

    int length = startIndex - index;

    // ...then shift them to the front of the window if they are not
    // already there.
    if (bufferIndex != index + 1)
    {
        while (index != startIndex)
        {
            ++index;
            buffer[bufferIndex] = buffer[index];
            ++bufferIndex;
        }
    }
    return length;
}
```
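For illustration, here is a minimal usage sketch of that routine (the harness is mine, not part of the original answer). The buffer is allocated once and reused, so steady-state conversions allocate nothing; only the `new string(...)` calls used for display allocate.

```csharp
using System;

// Reusable buffer: allocated once up front, so repeated conversions
// create no garbage. Note the routine needs bufferIndex + 10 chars of room.
char[] buffer = new char[10];

int len = ToCharArray2(1234567u, buffer, 0);
Console.WriteLine(new string(buffer, 0, len)); // prints "1234567"

len = ToCharArray2(0u, buffer, 0);
Console.WriteLine(new string(buffer, 0, len)); // prints "0"

// Same routine as in the answer above, condensed.
static int ToCharArray2(uint value, char[] buffer, int bufferIndex)
{
    const int maxLength = 10;
    if (value == 0) { buffer[bufferIndex] = '0'; return 1; }
    int startIndex = bufferIndex + maxLength - 1;
    int index = startIndex;
    do { buffer[index--] = (char)('0' + value % 10); value /= 10; } while (value != 0);
    int length = startIndex - index;
    if (bufferIndex != index + 1)
        while (index != startIndex) buffer[bufferIndex++] = buffer[++index];
    return length;
}
```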

Update

I should add that I'm using a Pentium 4; more recent processors calculate transcendental functions faster.

Results

Yesterday I realized I had made a rookie mistake and run the tests on a debug build. So I reran them, but it turned out not to make much difference. The first column shows the number of digits in the converted number; the remaining columns show the time in milliseconds to convert 500,000 numbers.

Results for uint:

```
digits   luc1    arx  henk1   luc3  henk2   luc2
     1    715    217    966    242    837    244
     2    877    420   1056    541    996    447
     3   1059    608   1169    835   1040    610
     4   1184    795   1282   1116   1162    801
     5   1403    969   1405   1396   1279    978
     6   1572   1149   1519   1674   1399   1170
     7   1740   1335   1648   1952   1518   1352
     8   1922   1675   1868   2233   1750   1545
     9   2087   1791   2005   2511   1893   1720
    10   2263   2103   2139   2797   2012   1985
```

Results for ulong:

```
digits   luc1    arx  henk1   luc3  henk2   luc2
     1    802    280    998    390    856    317
     2    912    516   1102    729    954    574
     3   1066    746   1243   1060   1056    818
     4   1300   1141   1362   1425   1170   1210
     5   1557   1363   1503   1742   1306   1436
     6   1801   1603   1612   2233   1413   1672
     7   2269   1814   1723   2526   1530   1861
     8   2208   2142   1920   2886   1634   2149
     9   2360   2376   2063   3211   1775   2339
    10   2615   2622   2213   3639   2011   2697
    11   3048   2996   2513   4199   2244   3011
    12   3413   3607   2507   4853   2326   3666
    13   3848   3988   2663   5618   2478   4005
    14   4298   4525   2748   6302   2558   4637
    15   4813   5008   2974   7005   2712   5065
    16   5161   5654   3350   7986   2994   5864
    17   5997   6155   3241   8329   2999   5968
    18   6490   6280   3296   8847   3127   6372
    19   6440   6720   3557   9514   3386   6788
    20   7045   6616   3790  10135   3703   7268
```

luc1: Lucero's first function

arx: my function

henk1: Henk's function

luc3: Lucero's third function

henk2: Henk's function without the copy into the char array, i.e. just measuring the performance of ToString()

luc2: Lucero's second function

The odd-looking order is simply the order in which the functions were written.

I also ran a test without henk1 and henk2 so that no garbage collection occurred. The times for the remaining functions were almost identical. Once the test passed three digits, memory usage was stable, so the GC activity happened during the Henk functions and did not skew the results of the others.

Conclusion: just call ToString().


The following code does the job, with one caveat: it ignores culture settings and always emits plain decimal digits.

```csharp
public static int ToCharArray(uint value, char[] buffer, int bufferIndex)
{
    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }
    // Digit count. Note that Ceiling(Log10(value)) would be off by one
    // for 1 and for exact powers of ten, so use Floor(Log10(value)) + 1
    // (the (int) cast truncates, which is Floor for positive values).
    int len = (int)Math.Log10(value) + 1;
    for (int i = len - 1; i >= 0; i--)
    {
        buffer[bufferIndex + i] = (char)('0' + (value % 10));
        value /= 10;
    }
    return len;
}
```

The return value indicates how much of the char[] was used.
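One subtlety worth double-checking in any Log10-based digit count: Math.Ceiling(Math.Log10(v)) undercounts for 1 and for exact powers of ten, while Math.Floor(Math.Log10(v)) + 1 gives the right answer (assuming Log10 returns an exact result at powers of ten, which it does on typical runtimes). A quick sketch of mine illustrating the difference:

```csharp
using System;

// Compare two ways of computing the decimal digit count of v.
// At v = 1, 10, 100, ... the Ceiling form is one short.
foreach (uint v in new uint[] { 1, 9, 10, 99, 100 })
{
    int ceil = (int)Math.Ceiling(Math.Log10(v));
    int floorPlus1 = (int)Math.Floor(Math.Log10(v)) + 1;
    Console.WriteLine($"{v}: Ceiling -> {ceil}, Floor+1 -> {floorPlus1}, actual -> {v.ToString().Length}");
}
```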

Edit (for arx): the following version avoids floating-point math and reverses the buffer in place:

```csharp
public static int ToCharArray(uint value, char[] buffer, int bufferIndex)
{
    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }
    // Emit the digits in reverse order...
    int bufferEndIndex = bufferIndex;
    while (value > 0)
    {
        buffer[bufferEndIndex++] = (char)('0' + (value % 10));
        value /= 10;
    }
    int len = bufferEndIndex - bufferIndex;
    // ...then reverse them in place.
    while (--bufferEndIndex > bufferIndex)
    {
        char ch = buffer[bufferEndIndex];
        buffer[bufferEndIndex] = buffer[bufferIndex];
        buffer[bufferIndex++] = ch;
    }
    return len;
}
```

And here is another variation that calculates the number of digits in a small loop:

```csharp
public static int ToCharArray(uint value, char[] buffer, int bufferIndex)
{
    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }
    // Count the digits in a small loop instead of using Log10.
    int len = 1;
    for (uint rem = value / 10; rem > 0; rem /= 10)
    {
        len++;
    }
    for (int i = len - 1; i >= 0; i--)
    {
        buffer[bufferIndex + i] = (char)('0' + (value % 10));
        value /= 10;
    }
    return len;
}
```

I leave the benchmarking to anyone who wants to do this ...;)
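For anyone taking up that offer, here is a minimal Stopwatch harness of my own (the delegate-based shape is an assumption for brevity; going through a Func adds a little call overhead compared to calling the methods directly, as the benchmarks in the accepted answer presumably did):

```csharp
using System;
using System.Diagnostics;

// Times `iterations` conversions of `value` through `convert`,
// reusing a single buffer so the conversions themselves allocate nothing.
static long Benchmark(Func<uint, char[], int, int> convert, uint value, int iterations)
{
    char[] buffer = new char[10];
    convert(value, buffer, 0);              // warm-up, lets the JIT compile the path
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
        convert(value, buffer, 0);
    sw.Stop();
    return sw.ElapsedMilliseconds;
}

// Example: time a trivial one-digit converter 500,000 times.
long ms = Benchmark((v, buf, idx) => { buf[idx] = (char)('0' + v % 10); return 1; }, 7u, 500000);
Console.WriteLine($"500,000 calls: {ms} ms");
```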


Let's simplify this and make the most of existing code:

```csharp
public static int ToCharArray(uint value, char[] buffer, int bufferIndex)
{
    string txt = value.ToString();
    txt.CopyTo(0, buffer, bufferIndex, txt.Length);
    return txt.Length;
}
```

Since txt is nothing but super-cheap gen-0 garbage, this is still very efficient.
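If you want to see that trade-off rather than take it on faith, gen-0 collection counts make the allocation visible. A rough sketch of mine (exact counts vary by runtime and GC mode):

```csharp
using System;

// Same approach as the answer above: ToString() allocates a short-lived
// string on each call, which the buffer copy then discards.
static int ToCharArray(uint value, char[] buffer, int bufferIndex)
{
    string txt = value.ToString();          // gen-0 allocation
    txt.CopyTo(0, buffer, bufferIndex, txt.Length);
    return txt.Length;
}

char[] buffer = new char[10];
int before = GC.CollectionCount(0);
for (int i = 0; i < 5000000; i++)
    ToCharArray((uint)i, buffer, 0);
int after = GC.CollectionCount(0);
// The ToString() temporaries force gen-0 collections; an in-place
// converter would leave this count (almost) unchanged.
Console.WriteLine($"gen-0 collections during the loop: {after - before}");
```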


I'm a little late to the party, but I suspect you can't get any faster or allocate any less than with a simple reinterpretation of the memory:

```csharp
[System.Security.SecuritySafeCritical]
public static unsafe char[] GetChars(int value, char[] chars)
{
    // TODO: if this needs to work across machines, it should also use
    // BitConverter.IsLittleEndian to detect endianness and order the
    // bytes appropriately.
    fixed (char* numPtr = chars)
        *(int*)numPtr = value;
    return chars;
}

[System.Security.SecuritySafeCritical]
public static unsafe int ToInt32(char[] value)
{
    // TODO: same endianness caveat as above.
    fixed (char* numPtr = value)
        return *(int*)numPtr;
}
```

This is just a demonstration of the idea: you obviously need to add a check on the size of the char array and make sure the byte order is correct. You can look at the decompiled BitConverter helper methods for examples of such checks.
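For illustration, here is a safe sketch of the same idea (my variant, not from the answer) that needs no unsafe or fixed, just shifting the bits of the value into two chars. Note that, like the pointer version, it stores the raw bits, not decimal digits, so the resulting chars are not printable text:

```csharp
using System;

// An Int32 is 4 bytes and a char is 2 bytes, so the value fits
// exactly into two chars; shifts make this endian-independent.
static char[] GetChars(int value, char[] chars)
{
    chars[0] = (char)(value & 0xFFFF);        // low 16 bits
    chars[1] = (char)((uint)value >> 16);     // high 16 bits
    return chars;
}

static int ToInt32(char[] value)
{
    return value[0] | value[1] << 16;
}

char[] chars = GetChars(123456789, new char[2]);
Console.WriteLine(ToInt32(chars)); // prints 123456789, but chars holds raw bits, not "123456789"
```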


Source: https://habr.com/ru/post/1342515/

