If you can get the int array underlying BitArray, this should provide much better performance:
Assuming you don't know the number of bits that are set:
    // requires: using System.Collections; using System.Collections.Generic;
    public static int[] GetIndexesForPositives(BitArray data)
    {
        var idIndexes = new List<int>();

        // Pull out the int[] that BitArray uses internally (private field "m_array").
        System.Reflection.FieldInfo field = data.GetType().GetField("m_array",
            System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
        int[] values = field.GetValue(data) as int[];

        for (var i = 0; i < values.Length; i++)
        {
            int _i = values[i];
            if (_i != 0)   // skip ints that contain no set bits at all
            {
                for (var j = 0; j < 32; j++)
                {
                    if ((_i & (1 << j)) != 0)
                        idIndexes.Add(i * 32 + j);
                }
            }
        }
        return idIndexes.ToArray();
    }
If you know the number of bits that are set, you can do this instead:
    public static int[] GetIndexesForPositives(BitArray data, int length)
    {
        var idIndexes = new int[length];
        var idx = 0;

        System.Reflection.FieldInfo field = data.GetType().GetField("m_array",
            System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
        int[] values = field.GetValue(data) as int[];

        for (var i = 0; i < values.Length; i++)
        {
            int _i = values[i];
            if (_i != 0)
            {
                for (var j = 0; j < 32; j++)
                {
                    if ((_i & (1 << j)) != 0)
                        idIndexes[idx++] = i * 32 + j;
                }
            }
        }
        return idIndexes;
    }
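For reference, calling these looks like this (assuming, as in the snippets above, that the BitArray is passed in as a parameter):

    var data = new BitArray(100);
    data[3] = true;
    data[42] = true;

    int[] all = GetIndexesForPositives(data);       // { 3, 42 }
    int[] known = GetIndexesForPositives(data, 2);   // same result, without the List<int> growth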
In my tests, these two are faster than your method, even the one that does not know how large the returned array will be.
I tested with a random BitArray of 50 million entries:
    1) 25001063 records found in 50000000, took 1415.5752ms
    2) 25001063 records found in 50000000, took 1099.67ms
    3) 25001063 records found in 50000000, took 1045.6862ms
    4) 25001063 records found in 50000000, took 745.7762ms

1) is your code, but using a list instead of some `GetPositiveCount` call to get the output length up front. 2) is your code as posted. 3) is my (revised) first example. 4) is my (revised) second example.
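If you want to reproduce that kind of measurement, a minimal `Stopwatch` sketch (not the exact harness behind the numbers above) would be:

    // requires: using System; using System.Collections; using System.Diagnostics;
    var rnd = new Random();
    var data = new BitArray(50000000);
    for (int i = 0; i < data.Length; i++)
        data[i] = rnd.Next(2) == 1;   // roughly half the bits set, as in the run above

    var sw = Stopwatch.StartNew();
    int[] result = GetIndexesForPositives(data);
    sw.Stop();
    Console.WriteLine("{0} records found in {1}, took {2}ms",
        result.Length, data.Length, sw.Elapsed.TotalMilliseconds);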
Edit: it's also worth noting that this is a problem that can really benefit from multithreaded processing. Divide the BitArray (or rather its underlying int[]) into 4 parts and you have 4 threads that can each scan their part at the same time.
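A rough sketch of what that could look like, splitting the underlying int[] into chunks and scanning them with `Parallel.For` (the chunk count and the merge step here are illustrative assumptions, not tuned code):

    // requires: using System; using System.Collections; using System.Collections.Generic;
    //           using System.Threading.Tasks;
    public static int[] GetIndexesForPositivesParallel(BitArray data, int chunks)
    {
        var field = data.GetType().GetField("m_array",
            System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
        int[] values = field.GetValue(data) as int[];

        var results = new List<int>[chunks];
        int chunkSize = (values.Length + chunks - 1) / chunks;

        Parallel.For(0, chunks, c =>
        {
            var local = new List<int>();
            int end = Math.Min((c + 1) * chunkSize, values.Length);
            for (int i = c * chunkSize; i < end; i++)
            {
                int v = values[i];
                if (v == 0) continue;
                for (int j = 0; j < 32; j++)
                    if ((v & (1 << j)) != 0)
                        local.Add(i * 32 + j);
            }
            results[c] = local;
        });

        // Chunks are contiguous, so concatenating keeps the indexes in ascending order.
        var merged = new List<int>();
        foreach (var part in results)
            merged.AddRange(part);
        return merged.ToArray();
    }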
Edit: I know this is already accepted, but here is one more tweak you can make to improve performance if you know that most of the time your list will be very sparse:
    for (var j = 0; j < 32; j++)
    {
        if (_i == 0)
            break;                       // no set bits left in this int, stop early
        if ((_i & 1) != 0)
            idIndexes.Add(i * 32 + j);
        _i = _i >> 1;                    // move the next bit into position 0
    }
It's a little slower if 40% or more of the bits are set, but if you know the list will always be around 10% ones and 90% zeros, then this will work even faster for you.
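Plugged back into the first method, the sparse-friendly loop looks something like this (same reflection trick as above; the method name is just illustrative):

    public static int[] GetIndexesForPositivesSparse(BitArray data)
    {
        var idIndexes = new List<int>();
        var field = data.GetType().GetField("m_array",
            System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
        int[] values = field.GetValue(data) as int[];

        for (var i = 0; i < values.Length; i++)
        {
            int _i = values[i];
            for (var j = 0; j < 32; j++)
            {
                if (_i == 0) break;          // nothing left in this int
                if ((_i & 1) != 0)
                    idIndexes.Add(i * 32 + j);
                _i = _i >> 1;
            }
        }
        return idIndexes.ToArray();
    }

One detail: `>>` on a negative int is an arithmetic shift, so when bit 31 is set the value never reaches zero and the early break doesn't fire for that int; the result is still correct because the loop is capped at 32 iterations. If you want the early exit in that case too, shift as unsigned: `_i = (int)((uint)_i >> 1);`.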