Diary

This page will contain a diary of the course, i.e. a short description of the lectures. The table will be filled in as we go along.


How to register (a PDF-file)

Since I have added new material to the course, I will not have time to lecture on C and Fortran. Read the following pages instead. I will assume that you have read them (at least superficially) when we start the second lecture.

Introduction to C, Fortran 90, FORTRAN 77, tcsh and bash

I will use Beamer to change the layout of the lecture notes, but I have only had time to convert two chapters. That is why the notes have been split into several files. The following files are available in two formats (two OH-pages/A4 or four OH-pages/A4); choose one. Note that the Beamer-files use a larger font.

Since ps2pdf has problems with fixed-width fonts, I have put the PostScript-files (PS) here as well. You can print those directly.

  1. Lecture notes, two OH-pages/A4, four OH-pages/A4, PS, two, four.
  2. Code Optimization, Beamer, two OH-pages/A4, four OH-pages/A4
  3. OpenMP, Beamer, two OH-pages/A4, four OH-pages/A4
    I noticed that I had put up the PDC-version which has the wrong title page and a few extra pages which I will skip in the lecture (they have been covered earlier in the parallel intro). There is no need to print the new version.
  4. A few words about CUDA, two OH-pages/A4, four OH-pages/A4, PS, two, four.

Some old handouts (read if you like):

springer.pdf

I do not lecture about these topics any longer:
Process control under Unix, interprocess communication, nonblocking communication


Week Day Activity Comments
1 Tue Introduction. Course-adm. Talked about registration. First part of programming languages for HPC. OH-pages 1-14. No lab today.

Registration at the GU Student Portal should work now.
  Thu Make. Computer architecture. OH-pages 15-33. Read the free chapter in Triebel's book. Fetch the "Architecture Optimization Reference Manual" from this page. Skim through chapter 2, "INTEL® 64 AND IA-32 PROCESSOR ARCHITECTURES". My lecture does not cover all the terminology used in the above manuals, so it will be hard to understand every small detail.

"Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel and AMD CPU's" can be found on this page.
2 Tue The rest of computer architecture. OH-pages 34-45.
First part of code optimization (in a separate handout), OH-pages 1-13.
  Thu Talked about different assembler instructions for floating point and how they might be used by Matlab.
The conclusion is that Matlab uses SSE and multithreading (at least the matrix multiplication routine does).

More code optimization, OH-pages 14-35.
The little example I wanted to show was:
1 + (1e16 - 1e16) becomes 1, but
(1 + 1e16) - 1e16 is 0, since 1 + 1e16 rounds to 1e16 in double precision.
3 Tue The last part of code optimization, OH-pages 36-67.

First part of profiling (we are skipping Valgrind for the time being), OH-pages 53-54 (back to the ordinary slides, called Lecture notes above).
I have updated OH-pages 48-50 in Lecture notes (new Valgrind runs and a new comment on PAPI).
  Thu Handed out a quick reference guide (PostScript) for the BLAS and talked about it for some time. Profiling, gprof, gcov. Valgrind. PAPI. Calling a LAPACK-routine from Fortran. Calling a LAPACK-routine from C.
OH-pages 55-58, then back to Valgrind, 47-52, 59-65.
Page 47, change lines 2-3 to http://valgrind.org .
Page 56, delete the second half of the second line.
4 Tue Mex-files and libraries. OH-pages 66-84.
  Thu Intro. to parallel computing. OH-pages 85-104.
5 Tue Deadlock and other communication issues. POSIX-threads, intro. to PVM and MPI. OH-pages 105-122.
  Thu More MPI. OH-pages 123-139. Answer to a question: No, a process cannot belong to two communicators in MPI_Comm_split (MPI hangs if you try :-). I could not find anything written about it.

wrap was OK (I was a bit tired though :-). On page 138, the second parameter is out of bounds, but it is OK, since the second element of wrap is 1.

Page 136: floor(rank / 5) can be written rank / 5, since integer division already truncates (so floor is not necessary).

Monday: the labs are graded, look outside my office. I will bring the labs to Thursday's lecture.
6 Thu OpenMP. Pages 1-28 (in the latest version). Bring your calendar next time. We are going to decide the exam schedule.
7 Tue More OpenMP. OH-pages 29-59.

Two typos:
page 41: should be double *private_sum
page 55: should be {99, 99, 99, 99}
The exam takes place May 24-May 25.
There is a list outside my office where you can sign up for a time.

I noticed that ps2pdf destroys the PostScript-files somewhat. For that reason I have put the PS-files on this page as well.
  Thu The last part of OpenMP. OH-pages 60-72.
An intro. to GPU-programming using CUDA (see the special handout). OH-pages 1-8 in some detail. Pages 9-15 superficially.
This was the last lecture.
Had the course evaluation today.
8 Tue No lecture, but lab as usual. I have graded lab 2. You will find it outside my office in the red plastic box.