The shared-memory programming model typically exploits a shared-memory system, where any memory location is directly accessible by any of the computing processes (i.e. there is a single global address space). This programming model is similar in some respects to the sequential single-processor programming model, with the addition of new constructs for synchronising concurrent accesses to shared variables and memory locations.
The shared-memory architecture and programming model are illustrated in Figure 1. In this model, any processing element (PE) can reference any memory address within the global address space. The memory request is forwarded through an interconnection network to memory, and the result is returned, via the network, to the PE.
Typically, a shared-memory system will implement a multi-threaded programming model, where each processing element executes a program thread. Currently, OpenMP and Pthreads are the two application programming interfaces (APIs) predominantly used to implement multi-threaded applications on shared-memory systems.
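The synchronisation constructs mentioned above can be sketched in C with OpenMP. This is a minimal illustration, not a definitive implementation: the function name shared_sum is hypothetical, and the code assumes an OpenMP-capable compiler (for example, GCC or Clang with -fopenmp). Without OpenMP support the pragma is ignored and the loop simply runs sequentially.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the n elements of an array that all threads can read directly
 * through the single global address space.
 * The reduction clause gives each thread a private partial sum and
 * combines the partial sums safely at the end of the loop, so the
 * shared variable `sum` is never updated by two threads at once. */
double shared_sum(const double *a, int n)
{
    assert(n >= 0 && (a != NULL || n == 0));

    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Apart from the single directive, the routine is identical to its sequential form, which is the similarity to the single-processor model noted above.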
Which of the following properties of the shared-memory programming model (e.g. using OpenMP) are generally true?
(a) It's a simpler programming model than the programming models for distributed-memory systems
(b) Provides precise control over data locality and processor affinity
(c) Supports finer-grain parallelism e.g., loop-level parallelism
(d) Supports incremental parallelization
(e) Shared-memory programming bugs are easier to track down
(a) TRUE - It's a simpler programming model than the programming models for distributed-memory systems
Arguably, a single global address space simplifies memory references (e.g. array and variable references) within the program code, akin to the sequential programming model.
(b) FALSE - Provides precise control over data locality and processor affinity
Generally, shared-memory programming models provide little or no support for controlling data placement or for binding threads to particular processors.
(c) TRUE - Supports finer-grain parallelism e.g., loop-level parallelism
Generally, in a multi-threaded programming model, program loops can be parallelized by executing independent iterations on different threads. In OpenMP this can be accomplished with the PARALLEL DO directive:
! Assumes A, B, C, N, CHUNK and the integer TID are declared earlier, and USE OMP_LIB is in effect
!$OMP PARALLEL DO SCHEDULE(DYNAMIC,CHUNK) PRIVATE(TID)
DO I = 1, N
   TID = OMP_GET_THREAD_NUM()   ! ID of the thread executing this iteration
   C(I) = A(I) + B(I)
   WRITE(*,100) TID,I,C(I)
ENDDO
!$OMP END PARALLEL DO
100 FORMAT(' Thread',I2,': C(',I3,')=',F8.2)
(d) TRUE - Supports incremental parallelization
As illustrated in the example above, individual code segments can be parallelized by simply wrapping them in OpenMP directives.
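The incremental approach can be sketched in C as a before-and-after pair. The function names are hypothetical and the chunk size of 4 is an arbitrary illustrative value; the only difference between the two routines is the added directive, and the rest of the program can stay sequential.

```c
#include <assert.h>

/* Step 1: the original sequential routine, left untouched. */
void vector_add_serial(const double *a, const double *b, double *c, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Step 2: the same loop with one OpenMP directive added.
 * Iterations are independent, so they can be handed out to threads
 * in chunks of 4 (an arbitrary value chosen for this sketch). */
void vector_add_parallel(const double *a, const double *b, double *c, int n)
{
    assert(n >= 0);
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

Because both versions must produce the same result, the sequential routine can be kept as a reference while further loops are parallelized one at a time.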
(e) FALSE - Shared-memory programming bugs are easier to track down
Shared-memory programming bugs involving synchronization and data races can be very difficult to trace.
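A small C sketch of such a bug, with hypothetical function names. The first routine is deliberately incorrect and is shown only as a counterexample: it compiles without warnings and may give the right answer on small inputs or on a single thread, which is part of what makes this class of bug hard to trace. The second routine shows one possible fix using the OpenMP atomic directive.

```c
#include <assert.h>

/* INCORRECT when run with multiple threads (counterexample only):
 * `count++` is a read-modify-write on a shared variable, so two
 * threads can read the same old value and one update is lost.
 * The result may differ from run to run. */
int count_even_racy(const int *v, int n)
{
    int count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            count++;                /* data race */
        }
    }
    return count;
}

/* One possible fix: make the update of the shared counter atomic. */
int count_even_atomic(const int *v, int n)
{
    assert(n >= 0);
    int count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            #pragma omp atomic
            count++;
        }
    }
    return count;
}
```

A reduction clause would be another reasonable fix here; atomic is used only because it keeps the two versions visually close.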
The distributed-memory programming model exploits a distributed-memory system where each processor maintains its own local memory and has no direct knowledge about another processor's memory (a "share nothing" approach). For data to be shared, it must be passed from one processor to another as a message.
Data that resides on the local memory of a processor can be accessed much more quickly than data that resides on another processor.
The distributed-memory architecture and programming model are illustrated in Figure 2. If a processing element (PE) requires data from another PE to continue its execution, it must request the data. The request is made by sending a message, through the interconnection network, to the destination PE. When the data is available on the destination PE, it is returned in a message, through the interconnection network, to the requesting PE.
Typically, a distributed-memory system will implement a message-passing programming model, where processing elements communicate data by sending and receiving messages.
Currently, MPI is the de facto standard specification for implementing message-passing on distributed-memory systems. MPI programs generally follow a Single Program Multiple Data (SPMD) programming style.
Which of the following properties of the distributed-memory programming model (using MPI) are generally true?
(a) For many architectures, it can result in near-optimal performance
(b) Provides precise control over data locality and processor affinity
(c) Allows natural mapping of algorithms to implementation
(d) Supports incremental parallelization
(e) Runs on most parallel platforms
(a) TRUE - For many architectures, it can result in near-optimal performance
(b) TRUE - Provides precise control over data locality and processor affinity
Each MPI process has direct access to its local memory for reading and writing data values. Accessing local data is more efficient than accessing data on a remote process.
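One common way to exercise this control is a block decomposition, where each process owns a contiguous slice of a global array and works on it locally. The C sketch below is an illustration under stated assumptions: the names block_t, block_range and owner_of are hypothetical, and the rank and process count are passed as plain parameters so that the code has no MPI dependency. In an MPI program they would normally be obtained from MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>

/* Half-open range [lo, hi) of global indices owned by one process. */
typedef struct {
    int lo;
    int hi;
} block_t;

/* Split n elements across nprocs processes as evenly as possible:
 * the first (n % nprocs) ranks each receive one extra element. */
block_t block_range(int rank, int nprocs, int n)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs && n >= 0);

    int base = n / nprocs;
    int extra = n % nprocs;
    block_t b;
    b.lo = rank * base + (rank < extra ? rank : extra);
    b.hi = b.lo + base + (rank < extra ? 1 : 0);
    return b;
}

/* Rank that owns global index i under the same decomposition.
 * If the owner is not the calling process, the value would have to
 * be obtained by exchanging a message with that rank. */
int owner_of(int i, int nprocs, int n)
{
    assert(nprocs > 0 && i >= 0 && i < n);

    int base = n / nprocs;
    int extra = n % nprocs;
    int boundary = extra * (base + 1);  /* first index in a smaller block */
    if (i < boundary) {
        return i / (base + 1);
    }
    return extra + (i - boundary) / base;
}
```

For example, with 10 elements on 4 processes the ranks own the ranges [0,3), [3,6), [6,8) and [8,10), so rank 1 can read index 5 from local memory but needs a message from rank 2 to read index 6.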
(e) TRUE - Runs on most parallel platforms