Parallel Implementation of Backward Substitution in Gaussian Elimination to Solve Systems of Linear Equations
I just uploaded another paper on Parallel Computing. This time I implemented parallel algorithm in the backward Substitution portion of the Gaussian ELimination algorithm to solve Systems of Linear Equations. I used C++ and the MPI Library from the Microsoft Compute Cluster Pack that is part of Microsoft High Performance Computing.
Here's the abstract of the paper:
Many practical problems, including problems in Mathematics itself can be modeled as Systems of Linear Equations. This suggests that methods to solve such systems are of high interest. One method that is often used in practice is the Gaussian Elimination as it uses reasonable computing time and storage demand. In addition to that, the computing time can be improved by implementing the algorithm in parallel. This paper shows the parallel implementation of the backward substitution process within the Gaussian Elimination. The implementation uses distributed memory model or message passing mechanism. Specifically, it uses the C++ programming language and the MPI library.
Note that I only implement the Backward Substitution part of Gaussian Elimination. I haven't got time to do parallel implementation of the portion of the algorithm to find the Upper Triangular Form of the augmented matrix of the System of Linear Equations using Elementary Row Operation. Probably you would? 
If you'd like to have some idea on how the sequential code and the parallel code look like, here it goes:
The Sequential Code:
// Backward Substitution (for parallelism), it works for Triangular Matrix
double result[colSize-1];
result[colSize-2] = augMat[rowSize-1][colSize-1]/augMat[rowSize-1][colSize-2];
for(int i = rowSize-2; i>=0; i--)
{
double sum = 0.0;
for(int j = colSize-2;j >= i+1;j--)
{
sum = sum + augMat
[j]*result[j];
}
result
= (1/augMat
)*(augMat
[colSize-1] - sum);
}
// printing out the result
cout << "solution";
for(int i = 0; i <= colSize-2; i++)
{
cout << endl;
cout << result
;
}
The Parallel Code:
// Backward Substitution (for parallelism)
// it works for Triangular Matrix
int rank, size;
double row[colSize];
double x = augMat[rowSize-1][colSize-1]/augMat[rowSize-1][colSize-2];
int currRoot = colSize-2;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
// Distribute matrix rows to processes (one row/process)
MPI_Scatter(&augMat, colSize, MPI_DOUBLE, &row, colSize, MPI_DOUBLE, 0, MPI_COMM_WORLD);
// Broadcast x to each process to do backward substitution in parallel
while(currRoot >= 0)
{
MPI_Bcast(&x, 1, MPI_DOUBLE, currRoot, MPI_COMM_WORLD);
if(rank == currRoot)
{
row[colSize-1] = x;
}
else if(rank == currRoot-1)
{
row[currRoot] = row[currRoot] * x;
row[colSize-1] = row[colSize-1] - row[currRoot];
x = row[colSize-1]/row[currRoot-1];
row[colSize-1] = x;
}
else
{
row[currRoot] = row[currRoot] * x;
row[colSize-1] = row[colSize-1] - row[currRoot];
}
MPI_Barrier(MPI_COMM_WORLD);
currRoot = currRoot - 1;
}
// Collect solution from each process
MPI_Gather(&row[colSize-1], 1, MPI_DOUBLE, solution, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
// Print Output
if(rank == 0)
{
for(int i = 0; i < colSize-1; i++)
{
cout << solution
;
cout << endl;
}
}
MPI_Finalize();
See the paper for more details.