I am implementing a Floyd-Warsall algorithm with C ++, C # and java. in each language I use a serial and parallel implementation after testing, the result:
(Elapsed time - only for basic functions and reading files, Inti of Variable and ... not measured.)
download sources here SourceCodes
C ++
- IDE: Netbeans
- Compiler: MinGW-4.8.1
- sequential time: 9.333000
- Parallel Time: 2.539000
- using
OpenMp
for parallel
if NumOfThreads
= 1, then Sequential else Parallel
variables
#define n 1000 double dist[n][n]; void floyd_warshall(int NumOfThreads) { int i, j, k; omp_set_num_threads(NumOfThreads); for (k = 0; k < n; ++k) #pragma omp parallel for private(i,j) for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) if ((dist[i][k] * dist[k][j] != 0) && (i != j)) if ((dist[i][k] + dist[k][j] < dist[i][j]) || (dist[i][j] == 0)) dist[i][j] = dist[i][k] + dist[k][j]; }
Java
- IDE: Netbeans
- Compiler: Netbeans by default
- sequential time: 11.632
- Parallel Time: 3.089
- -Xms512m -Xmx1024m
- import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
java variables
public final int numNodes =1000; public final double[][] costs= new double[numNodes][numNodes] ;
I did not put java code here because its work is on the right (I think)
WITH#
- IDE: Visual Studio 2012
- Compiler: Visual Studio 2012 by default
- sequential time: 31.312
- Parallel Time: 8.920
- using System.Threading.Tasks;
variable
const int n = 1000; static double[,] dist = new double[n, n];
parallel code
static void floyd_warshall(ParallelOptions pOp) { int k; for (k = 0; k < n; ++k) Parallel.For<int>(0, n, pOp, () => 0, (i, loop, j) => { for (j = 0; j < n; ++j) if ((dist[i, k] * dist[k, j] != 0) && (i != j)) if ((dist[i, k] + dist[k, j] < dist[i, j]) || (dist[i, j] == 0)) dist[i, j] = dist[i, k] + dist[k, j]; return 0; }, (j) => { });
one code
static void floyd_warshallSingle() { int i, j, k; for (k = 0; k < n; ++k) for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) if ((dist[i,k] * dist[k,j] != 0) && (i != j)) if ((dist[i,k] + dist[k,j] < dist[i,j]) || (dist[i,j] == 0)) dist[i,j] = dist[i,k] + dist[k,j]; }
What happened to my C # implementation?
everyone uses the same file
Now my question is, why does solving this algorithm take longer with C #? java and C ++ have almost the same time, but I think my implementation with C # is wrong, because this difference between C # and C ++ is weird!
please help me improve my C # implementation or name some reasons Thank you!
change 1
i changed my arrays to a notched array, and the result changed to:
- consecutive time: 19.22
- Parallel Time: 4.903
still its a huge difference between C # and C ++ or java! any idea why?
new variables
const int n = 1000; static double[][] dist = new double[n][];
new codes:
static void floyd_warshallSingle() { int i, j, k; for (k = 0; k < n; ++k) for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) if ((dist[i][k] * dist[k][j] != 0) && (i != j)) if ((dist[i][k] + dist[k][j] < dist[i][j]) || (dist[i][j] == 0)) dist[i][j] = dist[i][k] + dist[k][j]; } static void floyd_warshall(ParallelOptions pOp) { int k; for (k = 0; k < n; ++k) Parallel.For<int>(0, n, pOp, () => 0, (i, loop, j) => { for (j = 0; j < n; ++j) if ((dist[i][k] * dist[k][j] != 0) && (i != j)) if ((dist[i][ k] + dist[k][j] < dist[i][ j]) || (dist[i][j] == 0)) dist[i][ j] = dist[i][k] + dist[k][j]; return 0; }, (j) => { }); }