Skip to content Skip to navigation Skip to collection information

Connexions

You are here: Home » Content » High Performance Computing » Problem Decomposition

Navigation

Table of Contents

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Endorsed by Endorsed (What does "Endorsed by" mean?)

This content has been endorsed by the organizations listed. Click each link for a list of all content endorsed by the organization.
  • HPC Open Edu Cup

    This collection is included inLens: High Performance Computing Open Education Cup 2008-2009
    By: Ken Kennedy Institute for Information Technology

    Click the "HPC Open Edu Cup" link to see all content they endorse.

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • NSF Partnership display tagshide tags

    This collection is included inLens: NSF Partnership in Signal Processing
    By: Sidney Burrus

    Click the "NSF Partnership" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

  • Featured Content

    This collection is included inLens: Connexions Featured Content
    By: Connexions

    Comments:

    "The purpose of Chuck Severence's book, High Performance Computing has always been to teach new programmers and scientists about the basics of High Performance Computing. This book is for learners […]"

    Click the "Featured Content" link to see all content affiliated with them.

Also in these lenses

  • UniqU content

    This collection is included inLens: UniqU's lens
    By: UniqU, LLC

    Click the "UniqU content" link to see all content selected in this lens.

  • Lens for Engineering

    This module and collection are included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

  • eScience, eResearch and Computational Problem Solving

    This collection is included inLens: eScience, eResearch and Computational Problem Solving
    By: Jan E. Odegard

    Click the "eScience, eResearch and Computational Problem Solving" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

Problem Decomposition

Module by: Charles Severance, Kevin Dowd. E-mail the authors

There are three main approaches to dividing or decomposing work for distribution among multiple CPUs:

  • Decomposing computations: We have already discussed this technique. When the decomposition is done based on computations, we come up with some mechanism to divide the computations (such as the iterations of a loop) evenly among our processors. The location of the data is generally ignored, and the primary issues are iteration duration and uniformity. This is the preferred technique for the shared uniform memory systems because the data can be equally accessed by any processor.
  • Decomposing data: When memory access is nonuniform, the tendency is to focus on the distribution of the data rather than computations. The assumption is that retrieving "remote" data is costly and should be minimized. The data is distributed among the memories. The processor that contains the data performs the computations on that data after retrieving any other data necessary to perform the computation.
  • Decomposing tasks: - When the operations that must be performed are very independent, and take some time, a task decomposition can be performed. In this approach a master process/thread maintains a queue of work units. When a processor has available resources, it retrieves the next "task" from the queue and begins processing. This is a very attractive approach for embarrassingly parallel computations.1

In some sense, the rest of this chapter is primarily about data decomposition. In a distributed memory system, the communication costs usually are the dominant performance factor. If your problem is so embarrassingly parallel that it can be distributed as tasks, then nearly any technique will work. Data-parallel problems occur in many disciplines. They vary from those that are extremely parallel to those that are just sort of parallel. For example, fractal calculations are extremely parallel; each point is derived independently of the rest. It's simple to divide fractal calculations among processors. Because the calculations are independent, the processors don't have to coordinate or share data.

Our heat flow problem when expressed in its red-black (or FORTRAN 90) form is extremely parallel but requires some sharing of data. A gravitational model of a galaxy is another kind of parallel program. Each point exerts an influence on every other. Therefore, unlike the fractal calculations, the processors do have to share data.

In either case, you want to arrange calculations so that processors can say to one another, "you go over there and work on that, and I'll work on this, and we'll get together when we are finished."

Problems that offer less independence between regions are still very good candidates for domain decomposition. Finite difference problems, short-range particle interaction simulations, and columns of matrices can be treated similarly. If you can divide the domain evenly between the processors, they each do approximately the same amount of work on their way to a solution.

Other physical systems are not so regular or involve long-range interactions. The nodes of an unstructured grid may not be allocated in direct correspondence to their physical locations, for instance. Or perhaps the model involves long-range forces, such as particle attractions. These problems, though more difficult, can be structured for parallel machines as well. Sometimes various simplifications, or "lumping" of intermediate effects, are needed. For instance, the influence of a group of distant particles upon another may be treated as if there were one composite particle acting at a distance. This is done to spare the communications that would be required if every processor had to talk to every other regarding each detail. In other cases, the parallel architecture offers opportunities to express a physical system in different and clever ways that make sense in the context of the machine. For instance, each particle could be assigned to its own processor, and these could slide past one another, summing interactions and updating a time step.

Depending on the architecture of the parallel computer and problem, a choice for either dividing or replicating (portions of ) the domain may add unacceptable overhead or cost to the whole project.

For a large problem, the dollar value of main memory may make keeping separate local copies of the same data out of the question. In fact, a need for more memory is often what drives people to parallel machines; the problem they need to solve can't fit in the memory of a conventional computer.

By investing some effort, you could allow the domain partitioning to evolve as the program runs, in response to an uneven load distribution. That way, if there were a lot of requests for As, then several processors could dynamically get a copy of the A piece of the domain. Or the A piece could be spread out across several processors, each handling a different subset of the A definitions. You could also migrate unique copies of data from place to place, changing their home as needed.

When the data domain is irregular, or changes over time, the parallel program encounters a load-balancing problem. Such a problem becomes especially apparent when one portion of the parallel computations takes much longer to complete than the others. A real-world example might be an engineering analysis on an adaptive grid. As the program runs, the grid becomes more refined in those areas showing the most activity. If the work isn't reapportioned from time to time, the section of the computer with responsibility for the most highly refined portion of the grid falls farther and farther behind the performance of the rest of the machine.

Footnotes

  1. The distributed RC5 key-cracking effort was coordinated in this fashion. Each processor would check out a block of keys and begin testing those keys. At some point, if the processor was not fast enough or had crashed, the central system would reissue the block to another processor. This allowed the system to recover from problems on individual computers.

Collection Navigation

Content actions

Download:

Collection as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add:

Collection to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks

Module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks