If you are faced with a loop nest, one simple approach is to unroll the inner loop. Unrolling the innermost loop in a nest isn’t any different from what we saw above. You just pretend the rest of the loop nest doesn’t exist and approach it in the normal way. However, there are times when you want to apply loop unrolling not just to the inner loop but to outer loops as well, or perhaps only to the outer loops. Here’s a typical loop nest:
for (i=0; i<n; i++)
  for (j=0; j<n; j++)
    for (k=0; k<n; k++)
      a[i][j][k] = a[i][j][k] + b[i][j][k] * c;
To unroll an outer loop, you pick one of the outer loop index variables and replicate the innermost loop body so that several iterations are performed at the same time, just as we saw earlier when unrolling a single loop. The difference is in the index variable for which you unroll. In the code below, we have unrolled the middle (j) loop by a factor of two:
for (i=0; i<n; i++)
  for (j=0; j<n; j+=2)
    for (k=0; k<n; k++) {
      a[i][j][k] = a[i][j][k] + b[i][j][k] * c;
      a[i][j+1][k] = a[i][j+1][k] + b[i][j+1][k] * c;
    }
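One way to convince yourself that unrolling the j loop leaves the result unchanged is to run both versions on the same data and compare. The following is only a minimal sketch under stated assumptions: the fixed even size N, the test values, and the function names nest_original, nest_unroll_j, and outer_unroll_matches are all illustrative and do not come from the text, and b is indexed as in the original loop nest.

```c
#include <assert.h>
#include <string.h>

#define N 8  /* assumed size; must be even because there is no cleanup loop */

/* Reference version: the original triple loop. */
static void nest_original(double a[N][N][N], double b[N][N][N], double c)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                a[i][j][k] = a[i][j][k] + b[i][j][k] * c;
}

/* Same computation with the middle (j) loop unrolled by a factor of two. */
static void nest_unroll_j(double a[N][N][N], double b[N][N][N], double c)
{
    assert(N % 2 == 0);  /* no preconditioning here, so N must be even */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j += 2)
            for (int k = 0; k < N; k++) {
                a[i][j][k]   = a[i][j][k]   + b[i][j][k]   * c;
                a[i][j+1][k] = a[i][j+1][k] + b[i][j+1][k] * c;
            }
}

/* Returns 1 if both versions leave identical contents in a, 0 otherwise.
   The inputs are small integers and c is 0.5, so every result is exactly
   representable and a byte-wise comparison is safe. */
int outer_unroll_matches(void)
{
    static double a1[N][N][N], a2[N][N][N], b[N][N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++) {
                a1[i][j][k] = a2[i][j][k] = (double)(i + j + k);
                b[i][j][k] = (double)(i * N * N + j * N + k);
            }

    nest_original(a1, b, 0.5);
    nest_unroll_j(a2, b, 0.5);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

Calling outer_unroll_matches() from a small driver program and checking that it returns 1 confirms that the unrolled nest touches every element exactly once, just as the original does.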
We left the k loop untouched; however, we could unroll that one, too. That would give us outer and inner loop unrolling at the same time:
for (i=0; i<n; i++)
  for (j=0; j<n; j+=2)
    for (k=0; k<n; k+=2) {
      a[i][j][k] = a[i][j][k] + b[i][j][k] * c;
      a[i][j+1][k] = a[i][j+1][k] + b[i][j+1][k] * c;
      a[i][j][k+1] = a[i][j][k+1] + b[i][j][k+1] * c;
      a[i][j+1][k+1] = a[i][j+1][k+1] + b[i][j+1][k+1] * c;
    }
We could even unroll the i loop too, leaving eight copies of the loop innards. (Notice that we completely ignored preconditioning, so the unrolled loops as written are correct only when n is even; in a real application, of course, we couldn’t ignore it.)
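To make the point about preconditioning concrete, here is one possible way to handle a trip count that is not a multiple of the unrolling factor when the j loop is unrolled by two. This is a sketch, not the only approach: the bound NMAX, the cube typedef, and the names nest_reference, nest_unroll_j_precond, and precond_matches are assumptions introduced for illustration, and b is indexed as in the original loop nest.

```c
#include <assert.h>
#include <string.h>

#define NMAX 8  /* assumed upper bound on n for this sketch */

typedef double cube[NMAX][NMAX][NMAX];

/* Reference version: the original triple loop over the first n entries. */
static void nest_reference(int n, cube a, cube b, double c)
{
    assert(n >= 0 && n <= NMAX);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                a[i][j][k] = a[i][j][k] + b[i][j][k] * c;
}

/* j loop unrolled by two, with a preconditioning loop for odd n. */
static void nest_unroll_j_precond(int n, cube a, cube b, double c)
{
    assert(n >= 0 && n <= NMAX);
    int jstart = n % 2;  /* number of leftover j iterations: 0 or 1 */

    for (int i = 0; i < n; i++) {
        /* Preconditioning loop: do the leftover j iteration first. */
        for (int j = 0; j < jstart; j++)
            for (int k = 0; k < n; k++)
                a[i][j][k] = a[i][j][k] + b[i][j][k] * c;

        /* Unrolled loop: the remaining trip count is a multiple of two. */
        for (int j = jstart; j < n; j += 2)
            for (int k = 0; k < n; k++) {
                a[i][j][k]   = a[i][j][k]   + b[i][j][k]   * c;
                a[i][j+1][k] = a[i][j+1][k] + b[i][j+1][k] * c;
            }
    }
}

/* Returns 1 if both versions agree for the given n, 0 otherwise.
   Inputs are small integers and c is 0.5, so all results are exact. */
int precond_matches(int n)
{
    static cube a1, a2, b;

    if (n < 0 || n > NMAX)
        return 0;

    for (int i = 0; i < NMAX; i++)
        for (int j = 0; j < NMAX; j++)
            for (int k = 0; k < NMAX; k++) {
                a1[i][j][k] = a2[i][j][k] = (double)(i + j + k);
                b[i][j][k] = (double)(i * NMAX * NMAX + j * NMAX + k);
            }

    nest_reference(n, a1, b, 0.5);
    nest_unroll_j_precond(n, a2, b, 0.5);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

With an odd size such as n = 7, the preconditioning loop handles j = 0 and the unrolled loop then covers the pairs (1,2), (3,4), and (5,6), so a[i][j+1][k] never reaches past the last valid index. Unrolling k or i as well would need the same treatment for each unrolled index.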

"The purpose of Chuck Severance's book, High Performance Computing, has always been to teach new programmers and scientists about the basics of High Performance Computing. This book is for learners […]"