Diary

This page will contain a diary of the course, i.e. a short description of the lectures, handouts etc. The table will be filled in as we go along.

If printing is a problem, it may be because you have not defined a default printer. This is how you do it: Choose Preferences from the Applications menu. Then pick More Preferences and then click on Default Printer. Click on Set Default when you have marked the correct printer. (There should be a label on the printer itself, giving the printer's name.)


About registration: if you are a Chalmers student and have not registered, you should contact your Student Centre. GU students and PhD students do not have to register this way.

springer.ps, springer.pdf

Day Activity Comments Handouts
 Tue Introduction. No lab today. 1.pdf
Remove math. from my mail address on page 2 of the handouts.
Thu Fortran90 (77) and C. OH-pages 16-34 (28-32 on your own).   2.pdf
Tue cpp, man, indent, lint/splint, LDA in Fortran77, dangerous things in C and Fortran, make. OH-pages 35-45, 51-56. About pretty printers for Fortran. Here are a few pages with references to such software (search for pretty). I have not tested all the programs.

http://www.ifremer.fr/ditigo/molagnon/fortran90/ I tried this on a small example, and it worked fine. There is a Makefile and manual pages.

http://www.ifremer.fr/ditigo/molagnon/fortran90/engfaq.html
http://www.faqs.org/faqs/fortran-faq/
http://www.fortran.com/
http://ray.met.fsu.edu/~bret/fpret.html Old
http://www.math.utah.edu/~beebe/software/fortran-tools.html Old

Kalle Kempe (one of the course participants) sent me the following link about HPC on the Mac: http://hpc.sourceforge.net/
3.pdf
Thu Computer architecture. Showed some pages from the AMD Opt. Ref. Manual, the IBM-Power4-manual and a page from "Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel and AMD CPU's" as well. OH-pages 57-74. Read the free chapter in Triebel's book. Read about the Pentium 4. Look at the end of this page and fetch the Architecture Optimization Reference Manual, and read the first chapter. A good, but technical, reference for the AMD64-machines (the student machines) is Appendix A in the Software Optimization Guide for AMD64 Processors. My lecture does not cover all the terminology used in the above manuals.

"Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel and AMD CPU's" you can find on this page.
 
Tue Virtual memory. Code optimization. OH-pages 75-97 (read 77-79 on your own).    
Thu More on the Horner benchmark. A few words about const. Code optimization. OH-pages 98-113. Here is the link to Sun's double precision routines for elementary functions. 4.pdf
Tue The last part of optimization. Handed out a quick reference guide (PostScript) for the BLAS and talked about it for some time. Profiling. Valgrind. OH-pages: 114-129. A typo: OH-page 128, middle of page, it should say
cg_annotate --pid source-file
(so --pid and not -pid).
 
Thu PAPI, gprof, gcov. Calling a Lapack-routine from Fortran and C. OH-pages: 129-147.    
Tue Answers to some questions from the previous lecture (see handouts in the right column). Mex-files, static libraries. OH-pages: 148-158. Here are the handouts (ordinary text files):
complex_ex, Matlab_profiling
 
Thu The rest of libraries. Discussion of the copy-operation in the Mex-file. Intro. to parallel computing. OH-pages 158-178.   5.pdf
Tue A simple example. Speedup, Amdahl's law. Load balancing. More terminology. Threads. OH-pages 179-189, 198-202. Read 190-197 on your own (if you are interested). Returned the Matlab-labs. If you have not picked up yours you can find it in the plastic magazine holder outside my office.

Some questions I got during the break (and the answers):
 
Thu PVM and MPI. Point-to-point communication. Deadlock. Collective communication. OH-pages 203-221.    
Tue The rest of MPI. Parallel programming in Matlab. First half of OpenMP. OH-pages 222-223, 233-252.
Read 224-232 (nonblocking communication in MPI) on your own (if you are interested).
Typos: Page 236, the for n-loop should only loop over 800, 1600 and 3200. After the program: yo should be you.
Page 237, the first table: the time for x = A \ b, when n = 1600, should be 0.6 not 9.6.
 
Thu Info. about the exam. The date will be June 1 & 2. I have put up a booking list outside my office.
More OpenMP. OH-pages 253-268.
Typo on page 258: it should say a = 2.0 * cos(b) + 3.0 * sin(c), and parallel workshare shared(a, b, c)
 
Tue The second case study. OH-pages 267-278.    
May 27: I have graded the uniprocessor lab. Look in the plastic magazine holder outside my office.

I have graded the pthreads lab. If you have not received a complaint by e-mail, the code is OK.