Practical 4 - Basic OpenMP

Module by: Tim Stitt Ph.D.

Based on: Practical 3 - Basic MPI by Tim Stitt Ph.D.

Summary: In this module you will gain some experience with the fundamentals of the OpenMP programming paradigm by executing some simple OpenMP codes in a series of simple exercises.

Introduction

In this practical you will develop and execute some simple codes that incorporate the most fundamental OpenMP directives. The exercises range from querying the threads of a trivial OpenMP code to calculating the speedup obtained when computing pi in parallel.

If you require any assistance, please do not hesitate to contact the available support staff.

Disclaimer:

This course and its accompanying practical exercises are not designed to teach you OpenMP programming; they are provided to give you a feel for the OpenMP programming paradigm. A detailed introduction to OpenMP will be given in further courses that are under development.

Objectives

The objectives of this practical are to gain experience in:

  1. executing an OpenMP code with different thread counts
  2. using basic OpenMP directives to query the threads of an OpenMP program
  3. calculating speedup for a simple OpenMP program

OpenMP: A Crash Course

OpenMP is a portable thread-based programming model, whereby a single thread of execution (referred to as the initial thread) can spawn (or fork) additional threads to perform work in parallel. Ideally each thread of execution is mapped onto an individual processing core for maximum performance. Each thread (within a thread team) is uniquely identified by its thread ID and individual threads can communicate with each other through shared memory resources.

OpenMP Note:

OpenMP is not a programming language but rather is a specification that enables shared-memory parallelism, in base languages such as Fortran and C, through the following components:
  1. Compiler Directives
  2. Runtime Library Routines
  3. Environment Variables

Figure 1 shows a typical OpenMP thread-based execution whereby the initial thread creates a parallel region to perform some work in parallel. Within the parallel region a team of n+1 threads is forked (spawned) and each thread is assigned an individual task to complete. After the final task is complete, the threads are synchronised (joined) and serial execution continues with the initial thread.

Figure 1: The OpenMP Programming Model (the OpenMP fork-join model)

The most fundamental OpenMP operations can be classified into three (3) groups:

  1. Invoking a Parallel Region

    Fortran Example

    
    !$omp parallel private(var1,var2,...),shared(var3,var4,...)
    ... code block ...
    !$omp end parallel

    C Example

    
    #pragma omp parallel private(var1,var2,...),shared(var3,var4,...)
    {
    ... code block ...
    }

    Private and Shared Variables:

    Variables that are in scope at the invocation of a parallel region can be declared either private or shared within the ensuing parallel region; in most cases, a variable that is not listed is shared by default. If a variable is designated as private, each thread within the parallel region creates its own instance of the variable (uninitialised on entry to the region) and can modify it without any conflict from other threads. A variable that is designated as shared remains shared among all threads, and must be accessed carefully to avoid race conditions.

  2. Querying Thread ID and Team Size (within the current parallel region)

    Fortran Examples

    use omp_lib ! provides the interfaces of the OpenMP runtime routines
    integer :: id, threads
    
    id=OMP_GET_THREAD_NUM() ! returns ID of this thread
    threads=OMP_GET_NUM_THREADS() ! returns the number of threads in parallel region

    C Examples

    #include <omp.h> // declares the OpenMP runtime routines

    int id, threads;

    id=omp_get_thread_num(); // returns ID of this thread
    threads=omp_get_num_threads(); // returns the number of threads in parallel region
    

  3. Loop-level Parallelism (splitting loop iterations over threads)

    Fortran Example

    
    !$omp parallel do private(var1,...),shared(var2,...)
    ... do loop ...
    !$omp end parallel do
    

    C Example

    
    #pragma omp parallel for private(var1,...),shared(var2,...)
    ... for loop ...
    

Note:

For more detailed information on the OpenMP standard, download and review the latest OpenMP specification.

Basic OpenMP: "Hello World"

Exercise 1: Executing A Simple OpenMP Code

Develop, compile and execute a code (which uses the OpenMP framework) to display a "Hello World" message on multiple threads. Test that your OpenMP code works correctly for 1 and 2 threads.

Code Tip:

If you are not brave enough (yet) to write the code, you can use the OpenMP template provided in the ../Practicals/Practical_4/Exercise_1 folder.

Batch Script Tip:

Make sure you modify your submission script to request multiple threads in your job. For example, to test your code with 2 threads, modify your script to contain the following:
...
#PBS -l mppwidth=1        # Request 1 process
#PBS -l mppnppn=1         # Request 1 process per node
#PBS -l mppdepth=2        # Request 2 threads per process
...
...
export OMP_NUM_THREADS=2
aprun -n 1 -N 1 -d 2 ./hello_world  # Execute code with 1 process and 2 threads

OpenMP Tip:

For each execution, you need to set the OMP_NUM_THREADS environment variable to the number of threads that each parallel region in your code should use by default.

aprun Tip:

To notify aprun that you require multiple threads per process, you need to set the -d option. For example, to request 4 threads per process you should use
aprun -n 1 -N 1 -d 4 ./foo
For more information see man aprun

Exercise 2: Querying Thread ID and Team Size

Modify your solution in Exercise 1 to print the "Hello World" message, along with the ID of the current thread and the total number of threads in the current thread team.

Hint A

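Use the OMP_GET_THREAD_NUM() and OMP_GET_NUM_THREADS() runtime routines to find the ID of the current thread and the size of the thread team, respectively. In C the routine names must be written in lowercase: omp_get_thread_num() and omp_get_num_threads().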

Fortran Solution B

...
integer :: id, threads

!$OMP PARALLEL private(id,threads)

id=OMP_GET_THREAD_NUM()
threads=OMP_GET_NUM_THREADS()

print *,"Hello World from thread ",id," of ",threads

!$OMP END PARALLEL
... 

C Solution C

...
int id,threads;

#pragma omp parallel private(id,threads)
{
  id=omp_get_thread_num();
  threads=omp_get_num_threads();

  printf("Hello World from thread %d of %d\n",id,threads);
}
... 

Measuring Speedup

In parallel computing, speedup refers to how much faster a parallel algorithm is than a corresponding sequential algorithm.

Speedup is defined as:

                                        
$$S_p = \frac{T_1}{T_p}$$

where

  • $p$ is the number of processors
  • $T_1$ is the time of the sequential algorithm
  • $T_p$ is the time of the parallel algorithm with $p$ processors

Exercise 3: Calculating PI

The value of pi can be computed approximately using an integration technique (see the course slides for more information). Using the serial and OpenMP code implementations found in ../Practicals/Practical_4/Exercise_2, calculate the speedup obtained for 1 and 2 threads.

Tip:

Briefly review the serial and parallel code implementations so you are familiar with the computational method and its OpenMP parallelisation.

Record your results in a table similar to Table 1.

Table 1: Speedup Results for PI Calculation

Threads | Time Taken (sec) | Speedup
1       |                  |
2       |                  |

Glossary

Parallel Region:

Within the OpenMP specification, a parallel region encapsulates a block of instructions which will be executed by a team of threads.

See Also: Thread Team

Speedup:

In parallel computing, speedup refers to how much faster a parallel algorithm is than a corresponding sequential algorithm.

Figure 2: Speedup Diagram

Thread-Based:

A programming model whereby work is carried out in parallel using concurrent threads of execution.

Thread ID:

Each thread within a team of T threads is uniquely identified by an ID in the range 0..(T-1).

Thread Team:

A group of threads (including the initial thread that spawned the team) which can work in parallel on some task.
