

TUTORIAL: Accelerator Programming with OpenACC and CUDA
    Monday July 16, 2012 8:00am - 5:00pm @ Valencia, Lobby Level


    Abstract: Tutorial scope: 

    Full day 

    Beginning through intermediate material 

    o Jon Urbanic (PSC): Introduction to OpenACC

    o Lars Koesterke (TACC): Intermediate CUDA


    This full-day tutorial will address the two main programming models in use today for adapting HPC codes to use GPU accelerators effectively. The two half-day sessions will share some common techniques for achieving the best performance with an accelerator. The included accelerator-tutorial.pdf file contains draft material that will be reworked before the final tutorial date.


    Introduction to OpenACC: 

    The Intro to OpenACC session will cover the newest programming model for accelerators, which is based on OpenMP-like directives. While there are no prerequisites, reading or skimming one of the many introductory CUDA tutorials would be helpful, and a working knowledge of C or Fortran is required. Students may participate in hands-on programming sessions hosted on a cluster at PSC. Topics and hands-on sessions will include:


    - Welcome/Intro to the Environment (10 mins)

    - Parallel Computing Overview (10 mins)

    - Introduction to OpenACC (2 hrs)

    - OpenACC with CUDA Libraries (30 mins)
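
    For readers unfamiliar with the OpenMP-like directive style mentioned above, the following is a minimal sketch in C and is not taken from the tutorial material. The function name saxpy and its parameters are illustrative assumptions. The directive asks an OpenACC-capable compiler to offload the loop and to copy the arrays to and from the accelerator; a compiler without OpenACC support ignores the pragma and runs the loop serially on the host.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n).
 * copyin(x[0:n]) copies x to the device before the loop;
 * copy(y[0:n]) copies y to the device and back to the host afterwards.
 * Without OpenACC support the pragma is ignored and the loop runs on the CPU. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

    The same source builds as ordinary C, which is one reason the directive approach suits incremental porting of existing codes.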


    Intermediate CUDA: 

    This portion of the tutorial will cover intermediate programming techniques and performance tools/tips for CUDA programmers. There are many introductory materials for the beginning CUDA programmer and one or more of these is considered a prerequisite for this portion of the tutorial. 




    The intermediate CUDA material will be presented in lecture-discussion style using well-developed example codes and walkthroughs. Students will receive working sample code and routines they can incorporate into their own HPC projects. Topics and examples will include: 


    - CUDA Fortran 

    - CUDA C 

    - Optimizing shared memory use in the accelerator

    - Using streams to overlap communication and computation on the accelerator

    - MPI and accelerators with CUDA

    - Driving multiple GPUs per process

    - Optimizing CPU core to GPU device affinity for maximum memory bandwidth (see http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/DellNVIDIACluster/Doc/Architecture.html#Affinity)



    Type: Tutorial

