OpenMP for Fortran


  • OpenMP Directive
  • Syntax of OpenMP compiler directive for Fortran:
     !$OMP  DirectiveName Optional_CLAUSES...
    ...
    ... Program statements between the !$OMP lines
    ... are executed in parallel by all threads
    ...
    !$OMP END DirectiveName
  • Program statements between the opening !$OMP line and the matching !$OMP END line are executed by multiple threads


  • Setting the level of parallelism in OpenMP programs
  • The number of threads that will be created to execute parallel sections in an OpenMP program is controlled by the environment variable OMP_NUM_THREADS
  • To set this environment variable use:
      export OMP_NUM_THREADS=...            
    
    Example:
    
      export OMP_NUM_THREADS=8
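
    To check that the setting took effect, a small program can ask the OpenMP runtime how many threads a parallel section would get. This is a minimal sketch, not part of the original notes; the program name ShowThreads is illustrative, and it uses the support function omp_get_max_threads() from the OpenMP runtime library. Compile it with the OpenMP option, like any other OpenMP program.

```fortran
PROGRAM ShowThreads
   IMPLICIT NONE
   INTEGER, EXTERNAL :: OMP_GET_MAX_THREADS

   ! Upper bound on the team size of the next parallel section;
   ! normally this reflects OMP_NUM_THREADS when the variable is set.
   print *, "Threads available: ", OMP_GET_MAX_THREADS()
END PROGRAM ShowThreads
```

    After export OMP_NUM_THREADS=8 the program should normally report 8.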
    


  • Compiling OpenMP programs
    • Fortran

      • Compile:

          f90 -O -c -xopenmp -stackvar Prog.f90
        
      • Link:
          f90 -O -o Executable \
        -xopenmp -stackvar \
        Prog1.o Prog2.o ....


  • Introductory Example
    • Parallel "Hello World" OpenMP program:

         PROGRAM  Main
      
         !$OMP PARALLEL
      
         print *, "Hello World !"                 
      
         !$OMP END PARALLEL
      
         END
      

    • Compile with:
          f90 -O -xopenmp -stackvar openMP01.f90
    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out

      Make sure you run it on compute.

      You will see "Hello World !" printed eight times, once per thread. (Remove the two !$OMP lines and you get one line.)



  • Defining shared and private (non-shared) variables in a parallel section
  • Recall:
    • Fortran has traditionally had no block scopes, so a variable cannot be made private simply by declaring it inside the parallel section

    Instead, Fortran uses clauses on the directive to declare variables private (non-shared) or shared.


  • Defining shared and private variables in a PARALLEL section
    • A variable is by default shared among all threads
    • A private variable in a PARALLEL section must be declared with the PRIVATE clause

  • Fortran example of SHARED variable:
       PROGRAM  Main
       IMPLICIT NONE

       integer :: N                 ! Shared

       N = 1001

       print *, "Before parallel section: N = ", N

       !$OMP PARALLEL
       N = N + 1
       print *, "Inside parallel section: N = ", N
       !$OMP END PARALLEL

       print *, "After parallel section: N = ", N

       END

  • Compile with:
        f90 -O -xopenmp -stackvar openMP02a.f90
  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

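    The value of N printed at the end is not always 1009; it can be less. All eight threads read and update the same shared N without any synchronization, so two threads can read the same old value and one of the increments is lost (a race condition).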
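
    One way to make the update safe is to let only one thread at a time execute it, using the CRITICAL directive that is covered later in these notes. This is a minimal sketch, not the original demo file; the program name SharedSafe is illustrative.

```fortran
PROGRAM SharedSafe
   IMPLICIT NONE
   INTEGER :: N              ! Shared by all threads

   N = 1001

   !$OMP PARALLEL
   !$OMP CRITICAL
   N = N + 1                 ! Only one thread at a time runs this update
   !$OMP END CRITICAL
   !$OMP END PARALLEL

   ! Every increment is now counted: N = 1001 + number of threads
   print *, "After parallel section: N = ", N
END PROGRAM SharedSafe
```

    With OMP_NUM_THREADS=8 this should print 1009 on every run.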



  • Fortran example of NON-SHARED (private) variable:
       PROGRAM  Main
       IMPLICIT NONE

       integer :: N                 ! Private inside the parallel section

       N = 1001

       print *, "Before parallel section: N = ", N

       !$OMP PARALLEL PRIVATE(N)
       N = N + 1                    ! Updates this thread's own copy of N
       print *, "Inside parallel section: N = ", N
       !$OMP END PARALLEL

       print *, "After parallel section: N = ", N

       END

  • Compile with:
        f90 -O -xopenmp -stackvar openMP02b.f90
  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

  • Output:
        Before parallel section: N =  1001
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    Inside parallel section: N = 1
    After parallel section: N = 1001

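    Each thread has its own variable N.

    This private N is a different variable from the "program" variable N defined in the main program. It is not initialized from the program variable, so the value 1 printed above (an initial 0 plus 1) is not guaranteed; use the FIRSTPRIVATE clause if each private copy must start with the value 1001.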
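
    If each thread's private copy should start from the value of the program variable, OpenMP provides the FIRSTPRIVATE clause. The following is a minimal sketch, not part of the original notes; the program name PrivateCopy is illustrative.

```fortran
PROGRAM PrivateCopy
   IMPLICIT NONE
   INTEGER :: N

   N = 1001

   ! FIRSTPRIVATE gives every thread its own N, initialized to 1001
   !$OMP PARALLEL FIRSTPRIVATE(N)
   N = N + 1
   print *, "Inside parallel section: N = ", N    ! each thread prints 1002
   !$OMP END PARALLEL

   ! The threads only changed their own copies of N
   print *, "After parallel section: N = ", N
END PROGRAM PrivateCopy
```

    Every thread should print 1002 inside the parallel section, because each one increments its own copy exactly once.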



  • OpenMP support functions
  • Most useful support functions in OpenMP:

    Function name                              Effect
    ----------------------------------------   ------------------------------------------------
    SUBROUTINE omp_set_num_threads(nthreads)   Set the size of the thread team (nthreads is an INTEGER)
    INTEGER omp_get_num_threads()              Return the size of the current thread team
    INTEGER omp_get_max_threads()              Return the maximum size of the thread team (typically equal to the number of processors, unless OMP_NUM_THREADS is set)
    INTEGER omp_get_thread_num()               Return the thread ID of the thread that calls this function
    INTEGER omp_get_num_procs()                Return the number of processors
    LOGICAL omp_in_parallel()                  Return .TRUE. if currently in a PARALLEL section
  • Here is a simple OMP program in Fortran:
       PROGRAM  Main
       IMPLICIT NONE

       INTEGER :: nthreads, myid
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

       !$OMP PARALLEL private(nthreads, myid)

       myid = OMP_GET_THREAD_NUM()

       print *, "Hello I am thread ", myid

       if (myid == 0) then
          nthreads = OMP_GET_NUM_THREADS()
          print *, "Number of threads = ", nthreads
       end if

       !$OMP END PARALLEL

       END
  • Compile using the following command:
        f90 -O -xopenmp -stackvar hello.f90
  • Run with:
    • export OMP_NUM_THREADS=8
    • a.out

  • Output:
      Hello I am thread  7
    Hello I am thread 5
    Hello I am thread 1
    Hello I am thread 0
    Hello I am thread 2
    Number of threads = 8
    Hello I am thread 4
    Hello I am thread 3
    Hello I am thread 6


  • Caveat with Fortran
    • Recall:

      • Array indices in Fortran by default start with 1 (ONE)
    • Observed from "Hello" program:
      • Thread IDs start with 0 (ZERO)
    • Caveat:
      • Use ThreadID+1 as index to an array in Fortran !!!
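
    The caveat can be illustrated with a small program in which every thread writes into its own slot of a shared array. This is a minimal sketch, not part of the original notes; the names ThreadSlots, slot and MAXT are illustrative, and MAXT = 64 is an assumed upper limit on the number of threads.

```fortran
PROGRAM ThreadSlots
   IMPLICIT NONE
   INTEGER, PARAMETER :: MAXT = 64      ! assumed upper limit on team size
   INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
   INTEGER :: slot(MAXT)                ! shared, one element per thread
   INTEGER :: id, nthreads, i

   slot = -1
   nthreads = 1

   !$OMP PARALLEL PRIVATE(id)
   id = OMP_GET_THREAD_NUM()            ! 0 .. team size - 1
   IF (id < MAXT) slot(id+1) = id * id  ! thread 0 writes slot(1), and so on
   IF (id == 0) nthreads = OMP_GET_NUM_THREADS()
   !$OMP END PARALLEL

   DO i = 1, MIN(nthreads, MAXT)
      print *, "slot(", i, ") filled by thread ", i-1, ": ", slot(i)
   END DO
END PROGRAM ThreadSlots
```

    Using slot(id) instead of slot(id+1) would make thread 0 write to slot(0), which is outside the array.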


  • Example OpenMP Program: Find minimum in an array
    • A sequential program in C++ can be found here: (click here)
    • We will write this program using OpenMP in Fortran

    • Parallel Find Min program in Fortran:
        PROGRAM Min
        IMPLICIT NONE

        INTEGER, PARAMETER :: MAX = 10000000

        DOUBLE PRECISION, DIMENSION(MAX) :: x
        DOUBLE PRECISION, DIMENSION(10) :: my_min   ! One slot per thread (at most 10 threads)
        DOUBLE PRECISION :: rmin

        INTEGER :: num_threads, team_size
        INTEGER :: i, n
        INTEGER :: id, start, stop

        ! ===========================================================
        ! Declare the OpenMP functions
        ! ===========================================================
        INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

        ! Fill x with test data (this step is not shown in the original listing)
        CALL RANDOM_NUMBER(x)

        ! ===================================
        ! Parallel section: Find local minima
        ! ===================================
        !$OMP PARALLEL PRIVATE(i, id, start, stop, num_threads, n)

        num_threads = omp_get_num_threads()
        n = MAX/num_threads

        id = omp_get_thread_num()

        ! Thread 0 saves the team size in a shared variable, because the
        ! private num_threads is undefined after the parallel section
        if ( id == 0 ) team_size = num_threads

        ! ----------------------------------
        ! Find my own starting index
        ! ----------------------------------
        start = id * n + 1            !! Arrays start at 1

        ! ----------------------------------
        ! Find my own stopping index
        ! ----------------------------------
        if ( id /= (num_threads-1) ) then
           stop = start + n - 1
        else
           stop = MAX                 !! Last thread also takes the remainder
        end if

        ! ----------------------------------
        ! Find my own min
        ! ----------------------------------
        my_min(id+1) = x(start)

        DO i = start+1, stop
           IF ( x(i) < my_min(id+1) ) THEN
              my_min(id+1) = x(i)
           END IF
        END DO

        !$OMP END PARALLEL

        ! ===================================
        ! Find min over the local minima
        ! ===================================
        rmin = my_min(1)

        DO i = 2, team_size
           IF ( my_min(i) < rmin ) THEN
              rmin = my_min(i)
           END IF
        END DO

        print *, "min = ", rmin

        END PROGRAM
    • Compile with:
          f90 -O -xopenmp -stackvar min-mt1.f90
    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out


  • Mutual exclusion synchronization primitives
  • Mutual exclusion is achieved in OpenMP for Fortran using the following directive:
       !$OMP CRITICAL

           ... statements are guaranteed to be executed
           ... by ONE thread at any one time

       !$OMP END CRITICAL


  • Example OpenMP program with synchronization: compute Pi
  • Example:
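      PROGRAM Compute_PI
      IMPLICIT NONE

      INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

      INTEGER N, i
      INTEGER id, num_threads
      DOUBLE PRECISION w, x
      DOUBLE PRECISION pi, mypi

      N = 50000000            !! Number of intervals
      w = 1.0d0/N             !! width of each interval

      pi = 0.0d0              !! Shared total, must start at zero

      !$OMP PARALLEL PRIVATE(i, id, num_threads, x, mypi)

      num_threads = omp_get_num_threads()
      id = omp_get_thread_num()

      mypi = 0.0d0

      DO i = id, N-1, num_threads
         x = w * (i + 0.5d0)
         mypi = mypi + w*f(x)
      END DO

      !$OMP CRITICAL
      pi = pi + mypi          !! One thread at a time adds its partial sum
      !$OMP END CRITICAL

      !$OMP END PARALLEL

      PRINT *, "Pi = ", pi

      CONTAINS

      !! Integrand; not shown in the original listing, assumed to be
      !! 4/(1+x^2), whose integral over [0,1] equals Pi
      DOUBLE PRECISION FUNCTION f(a)
         DOUBLE PRECISION, INTENT(IN) :: a
         f = 4.0d0 / (1.0d0 + a*a)
      END FUNCTION f

      END PROGRAM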
  • Compile with:
        f90 -O -xopenmp -stackvar openMP_compute_pi2.f90
  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out



  • Parallel For Loop in OpenMP

    The division of labor (splitting the iterations of a loop among the threads) can be done in OpenMP through a special Parallel LOOP construct.

  • A Parallel Loop construct MUST appear within a Parallel region of the program !
  • The syntax of a Parallel LOOP construct in Fortran is:
       !$OMP DO

       DO  index = ....
          ....              ! Division of labor is taken care of
                            ! by the Fortran compiler
       END DO

       !$OMP END DO
  • The meaning of this Parallel LOOP construct is to distribute the iterations of the DO loop among the threads.

    Each iteration of the loop is executed exactly once, by exactly one of the threads.

    The loop variable used in the Parallel LOOP construct is by default PRIVATE (other variables are still by default SHARED)
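
    The distribution of iterations can be checked with a small program in which every iteration increments its own array element. This is a minimal sketch, not part of the original notes; the names ParallelFill, a and M are illustrative.

```fortran
PROGRAM ParallelFill
   IMPLICIT NONE
   INTEGER, PARAMETER :: M = 1000
   INTEGER :: a(M)           ! shared array
   INTEGER :: i              ! loop variable, private inside the !$OMP DO

   a = 0

   !$OMP PARALLEL
   !$OMP DO
   DO i = 1, M
      a(i) = a(i) + 1        ! element i is touched only by the thread that runs iteration i
   END DO
   !$OMP END DO
   !$OMP END PARALLEL

   ! If every iteration ran exactly once, every element is 1 and the sum is M
   print *, "Iterations executed: ", SUM(a), " of ", M
END PROGRAM ParallelFill
```

    No CRITICAL section is needed here, because no two iterations write to the same element.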


  • Example: compute Pi with parallel DO loop
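      PROGRAM Compute_PI
      IMPLICIT NONE

      INTEGER N, i
      DOUBLE PRECISION w, x
      DOUBLE PRECISION pi, mypi

      N = 50000000            !! Number of intervals
      w = 1.0d0/N             !! width of each interval

      pi = 0.0d0              !! Shared total, must start at zero

      !$OMP PARALLEL PRIVATE(x, mypi)

      mypi = 0.0d0

      !$OMP DO
      DO i = 0, N-1           !! Parallel Loop
         x = w * (i + 0.5d0)
         mypi = mypi + w*f(x)
      END DO
      !$OMP END DO

      !$OMP CRITICAL
      pi = pi + mypi
      !$OMP END CRITICAL

      !$OMP END PARALLEL

      PRINT *, "Pi = ", pi

      CONTAINS

      !! Integrand; not shown in the original listing, assumed to be
      !! 4/(1+x^2), whose integral over [0,1] equals Pi
      DOUBLE PRECISION FUNCTION f(a)
         DOUBLE PRECISION, INTENT(IN) :: a
         f = 4.0d0 / (1.0d0 + a*a)
      END FUNCTION f

      END PROGRAM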
  • Compile with:
        f90 -O -xopenmp -stackvar openMP_compute_pi3.f90
  • Run with:
    • export OMP_NUM_THREADS=8
    • a.out
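
    The pattern of a private partial sum followed by a CRITICAL update can also be written with OpenMP's REDUCTION clause, which is not used in the notes above. The following is a minimal sketch of that alternative; the program name is illustrative, and the integrand 4/(1+x^2) is an assumption about the function f used in the Pi examples.

```fortran
PROGRAM Compute_PI_Reduction
   IMPLICIT NONE
   INTEGER :: N, i
   DOUBLE PRECISION :: w, x, pi

   N  = 50000000              ! Number of intervals
   w  = 1.0d0 / N             ! Width of each interval
   pi = 0.0d0

   !$OMP PARALLEL PRIVATE(x)
   ! Each thread sums into its own copy of pi; the copies are
   ! added into the shared pi at the end of the loop
   !$OMP DO REDUCTION(+:pi)
   DO i = 0, N-1
      x  = w * (i + 0.5d0)
      pi = pi + w * 4.0d0 / (1.0d0 + x*x)   ! assumed integrand 4/(1+x^2)
   END DO
   !$OMP END DO
   !$OMP END PARALLEL

   PRINT *, "Pi = ", pi
END PROGRAM Compute_PI_Reduction
```

    The result should agree with the CRITICAL version up to floating-point rounding, since the partial sums may be combined in a different order.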


  • Final Notes
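  • The stack size of each thread can be controlled by setting another environment variable:
      setenv   STACKSIZE    nBytes        (csh/tcsh)
      export   STACKSIZE=nBytes           (sh/bash)

    STACKSIZE is a compiler-specific variable; the standard OpenMP variable for this purpose (OpenMP 3.0 and later) is OMP_STACKSIZE.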
    
  • For more information on OpenMP, see: http://www.openmp.org




