You create a large number of threads, each of which then only works very little. To sort 25,000 ints, you create about 12,500 threads that spawn other threads and combine their results, and about 12,500 threads that sort only two ints.
The overhead of creating all of these threads far outweighs the benefits of parallel processing.
To avoid this, make sure that each thread has enough work. For example, if one thread discovers that it needs to sort <10,000 numbers, it can simply sort them by itself with the usual merge sort, instead of creating new threads.
source share