Skip to content Skip to navigation

OpenStax-CNX

You are here: Home » Content » Understanding Parallelism - Ambiguous References

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Endorsed by Endorsed (What does "Endorsed by" mean?)

This content has been endorsed by the organizations listed. Click each link for a list of all content endorsed by the organization.
  • HPC Open Edu Cup

    This module is included inLens: High Performance Computing Open Education Cup 2008-2009
    By: Ken Kennedy Institute for Information TechnologyAs a part of collection: "High Performance Computing"

    Click the "HPC Open Edu Cup" link to see all content they endorse.

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • NSF Partnership display tagshide tags

    This module is included inLens: NSF Partnership in Signal Processing
    By: Sidney BurrusAs a part of collection: "High Performance Computing"

    Click the "NSF Partnership" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

  • Featured Content

    This module is included inLens: Connexions Featured Content
    By: ConnexionsAs a part of collection: "High Performance Computing"

    Comments:

    "The purpose of Chuck Severence's book, High Performance Computing has always been to teach new programmers and scientists about the basics of High Performance Computing. This book is for learners […]"

    Click the "Featured Content" link to see all content affiliated with them.

Also in these lenses

  • UniqU content

    This module is included inLens: UniqU's lens
    By: UniqU, LLCAs a part of collection: "High Performance Computing"

    Click the "UniqU content" link to see all content selected in this lens.

  • Lens for Engineering

    This module is included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

  • eScience, eResearch and Computational Problem Solving

    This module is included inLens: eScience, eResearch and Computational Problem Solving
    By: Jan E. OdegardAs a part of collection: "High Performance Computing"

    Click the "eScience, eResearch and Computational Problem Solving" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

Understanding Parallelism - Ambiguous References

Module by: Charles Severance, Kevin Dowd. E-mail the authors

Every dependency we have looked at so far has been clear cut; you could see exactly what you were dealing with by looking at the source code. But other times, describing a dependency isn’t so easy. Recall this loop from the “Antidependencies” section (Reference) earlier in this chapter:


DO I=1,N A(I) = B(I) * E B(I) = A(I+2) * C ENDDO

Because each variable reference is solely a function of the index, I, it’s clear what kind of dependency we are dealing with. Furthermore, we can describe how far apart (in iterations) a variable reference is from its definition. This is called the dependency distance. A negative value represents a flow dependency; a positive value means there is an antidependency. A value of zero says that no dependency exists between the reference and the definition. In this loop, the dependency distance for A is +2 iterations.

However, array subscripts may be functions of other variables besides the loop index. It may be difficult to tell the distance between the use and definition of a particular element. It may even be impossible to tell whether the dependency is a flow dependency or an antidependency, or whether a dependency exists at all. Consequently, it may be impossible to determine if it’s safe to overlap execution of different statements, as in the following loop:


DO I=1,N A(I) = B(I) * E B(I) = A(I+K) * C ← K unknown ENDDO

If the loop made use of A(I+K), where the value of K was unknown, we wouldn’t be able to tell (at least by looking at the code) anything about the kind of dependency we might be facing. If K is zero, we have a dependency within the iteration and no loop-carried dependencies. If K is positive we have an antidependency with distance K. Depending on the value for K, we might have enough parallelism for a superscalar processor. If K is negative, we have a loop-carried flow dependency, and we may have to execute the loop serially.

Ambiguous references, like A(I+K) above, have an effect on the parallelism we can detect in a loop. From the compiler perspective, it may be that this loop does contain two independent calculations that the author whimsically decided to throw into a single loop. But when they appear together, the compiler has to treat them conservatively, as if they were interrelated. This has a big effect on performance. If the compiler has to assume that consecutive memory references may ultimately access the same location, the instructions involved cannot be overlapped. One other option is for the compiler to generate two versions of the loop and check the value for K at runtime to determine which version of the loop to execute.

A similar situation occurs when we use integer index arrays in a loop. The loop below contains only a single statement, but you can’t be sure that any iteration is independent without knowing the contents of the K and J arrays:


DO I=1,N A(K(I)) = A(K(I)) + B(J(I)) * C ENDDO

For instance, what if all of the values for K(I) were the same? This causes the same element of the array A to be rereferenced with each iteration! That may seem ridiculous to you, but the compiler can’t tell.

With code like this, it’s common for every value of K(I) to be unique. This is called a permutation. If you can tell a compiler that it is dealing with a permutation, the penalty is lessened in some cases. Even so, there is insult being added to injury. Indirect references require more memory activity than direct references, and this slows you down.

Pointer Ambiguity in Numerical C Applications

FORTRAN compilers depend on programmers to observe aliasing rules. That is, programmers are not supposed to modify locations through pointers that may be aliases of one another. They can become aliases in several ways, such as when two dummy arguments receive pointers to the same storage locations:


CALL BOB (A,A) ... END SUBROUTINE BOB (X,Y) ← X,Y become aliases

C compilers don’t enjoy the same restrictions on aliasing. In fact, there are cases where aliasing could be desirable. Additionally, C is blessed with pointer types, increasing the opportunities for aliasing to occur. This means that a C compiler has to approach operations through pointers more conservatively than a FORTRAN compiler would. Let’s look at some examples to see why.

The following loop nest looks like a FORTRAN loop cast in C. The arrays are declared or allocated all at once at the top of the routine, and the starting address and leading dimensions are visible to the compiler. This is important because it means that the storage relationship between the array elements is well known. Hence, you could expect good performance:


#define N ... double *a[N][N], c[N][N], d; for (i=0; i<N; i++) for (j=0; j<N; j++) a[i][j] = a[i][j] + c[j][i] * d;

Now imagine what happens if you allocate the rows dynamically. This makes the address calculations more complicated. The loop nest hasn’t changed; however, there is no guaranteed stride that can get you from one row to the next. This is because the storage relationship between the rows is unknown:


#define N ... double *a[N], *c[N], d; for (i=0; i<N; i++) { a[i] = (double *) malloc (N*sizeof(double)); c[i] = (double *) malloc (N*sizeof(double)); } for (i=0; i<N; i++) for (j=0; j<N; j++) a[i][j] = a[i][j] + c[j][i] * d;

In fact, your compiler knows even less than you might expect about the storage relationship. For instance, how can it be sure that references to a and c aren’t aliases? It may be obvious to you that they’re not. You might point out that malloc never overlaps storage. But the compiler isn’t free to assume that. Who knows? You may be substituting your own version of malloc!

Let’s look at a different example, where storage is allocated all at once, though the declarations are not visible to all routines that are using it. The following subroutine bob performs the same computation as our previous example. However, because the compiler can’t see the declarations for a and c (they’re in the main routine), it doesn’t have enough information to be able to overlap memory references from successive iterations; the references could be aliases:


#define N... main() { double a[N][N], c[N][N], d; ... bob (a,c,d,N); } bob (double *a,double *c,double d,int n) { int i,j; double *ap, *cp; for (i=0;i<n;i++) { ap = a + (i*n); cp = c + i; for (j=0; j<n; j++) *(ap+j) = *(ap+j) + *(cp+(j*n)) * d; } }

To get the best performance, make available to the compiler as many details about the size and shape of your data structures as possible. Pointers, whether in the form of formal arguments to a subroutine or explicitly declared, can hide important facts about how you are using memory. The more information the compiler has, the more it can overlap memory references. This information can come from compiler directives or from making declarations visible in the routines where performance is most critical.

Content actions

Download module as:

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks