Diary

This page will contain a diary of the course, i.e. a short description of the lectures, handouts etc. The table will be filled in as we go along.

If printing is a problem, it may be because you have not defined a default printer. This is how you do it: choose Preferences from the Applications menu, pick More Preferences, and then click on Default Printer. Mark the correct printer and click on Set Default. (There should be a label on the printer itself, giving the printer's name.)

 

Date Activity Comments Handouts

April 1

Introduction.
About registration: if you are a Chalmers student and you are not one of the three who have registered, you should contact your Student Centre (it turns out that we, the math department, cannot register you). GU students and PhD students do not have to register this way.
1.pdf
On OH-page 14, line 22, remove: "Add to Personal Folders"
3
Fortran90 (77) and C. cpp, man. OH-pages 16-36 (28-32 on your own).
2.pdf
8
indent, lint/splint, LDA in Fortran77, dangerous things in C and Fortran, make, intro. to computer architecture, OH-pages 37-45, 50-59. 46-49 for reference only.

3.pdf
10
More about computer architecture. Showed some pages from the AMD Opt. Ref. Manual and an IBM-Power4-manual as well. OH-pages 60-76. Read pages 77-79 (about reading assembly code) on your own. Read the free chapter in Triebel's book. Read about the Pentium 4. Look at the end of this page and fetch the Architecture Optimization Reference Manual, read the first chapter. A good, but technical, reference for the AMD64-machines (the student machines) is Appendix A in the Software Optimization Guide for AMD64 Processors. My lecture does not cover all the terminology used in the above manuals.
springer.ps
15
Talked about the following job-ad (it reads almost like the contents of the HPC-course). Virtual memory. Code optimization. OH-pages 80-95.


17
More on code optimization. OH-pages 96-106. Here is the link to the double precision routines for elementary functions.
22 The last part of optimization. OH-pages 107-125.


24
Profiling. Valgrind, PAPI, gprof, gcov. Calling a Lapack-routine from Fortran. OH-pages 126-141.5.

4.pdf
29
Handed out a quick reference guide for the BLAS and talked about it for some time. Calling Fortran from C and Matlab. OH-pages 141.5-153.
I have given you extra time for the uniproc lab, the new deadline is 2008-05-13.

19:08, note that OH-page 154 (using Mex-files) has been updated.

21:06, I have added some help for the Lapack-lab. Some routines have become hard to find, so I made them available on the student system. See the lab-text.

May 6
The rest of Mex-files. Libraries. Intro. to parallel computing. OH-pages 154-169.
Returned the Matlab-labs.

8
More about networks. A simple example. Speedup, Amdahl's law. Load balancing. OH-pages 170-187.
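As a reminder of what Amdahl's law says, here is a minimal sketch in C. It assumes the usual textbook form of the law, where a fraction f of the run time parallelises perfectly and the rest stays serial. The function names amdahl_speedup and amdahl_limit are made up for this illustration and do not come from the OH-pages.

```c
/* Amdahl's law: speedup on p processors when a fraction f of the
   run time (0 <= f <= 1) can be parallelised perfectly and the
   remaining 1 - f is serial. */
double amdahl_speedup(double f, int p)
{
    return 1.0 / ((1.0 - f) + f / p);
}

/* Upper bound on the speedup as p grows without limit.
   Assumes f < 1, otherwise the bound is infinite. */
double amdahl_limit(double f)
{
    return 1.0 / (1.0 - f);
}
```

For example, with f = 0.9 the speedup on 10 processors is about 5.3, and no number of processors can give more than about 10.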
A talk you may attend: Supercomputing for the Future, Supercomputing from the Past, 2008-05-08 17:30. Honorary doctorate talk by Mateo Valero, Barcelona Supercomputing Center, SPAIN.

I changed the deadline for the threads-lab since we are somewhat behind schedule.

On page 187, the second paragraph should read:
Even if the processors are identical (and with equal amount of memory) we may have to compute a more complicated partitioning. Suppose that A is upper triangular (zeros below the diagonal). (We would not use an iterative method to compute an eigenvector in this case.) The triangular matrix is easy to partition, it is worse if A is a general sparse matrix (many elements are zero).
5.pdf The last handout.
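The corrected paragraph for page 187 says that an upper triangular matrix is easy to partition. One way to see this is the sketch below, which is only an illustration under my own assumptions and is not taken from the OH-pages: rows are handed out in contiguous blocks, row i (counting from 0) holds n - i elements, and a block is closed as soon as the running element count reaches its share of the total. The function name partition_upper_triangular and the layout of the start array are hypothetical.

```c
/* Split rows 0..n-1 of an n x n upper triangular matrix into p
   contiguous blocks holding roughly equal numbers of elements.
   Row i (0-based) holds n - i elements on and above the diagonal.
   On return, block k owns rows start[k] .. start[k+1]-1, so the
   array start must have room for p + 1 entries. */
void partition_upper_triangular(int n, int p, int *start)
{
    long total = (long)n * (n + 1) / 2;  /* all stored elements */
    long seen = 0;                       /* elements handed out so far */
    int k = 1;                           /* next block boundary to set */

    start[0] = 0;
    for (int i = 0; i < n && k < p; i++) {
        seen += n - i;
        /* Close block k once the first k blocks together hold
           at least k/p of all elements. */
        while (k < p && seen * p >= total * k) {
            start[k++] = i + 1;
        }
    }
    /* Any boundaries not yet set (always including start[p]) are n. */
    while (k <= p) {
        start[k++] = n;
    }
}
```

With n = 1000 and p = 2 the first block gets well under half of the rows, since the top rows are the long ones. For small n the balance is coarse, because whole rows are never split.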
13
More terminology. Threads. PVM and MPI. OH-pages 188-189, 198-211.
There will be no talks this year, since the group is too large (it would take too much time).
 
Some questions I got during the break (and the answers):
  • Q: Where can I see the source code for the ls-command?
    A: You can fetch it from http://www.gnu.org/software/coreutils/. Fetch the tar-file, unpack it, and look in coreutils-6.9/src/ls.c. It may not be that easy to read; it is 4430 lines.
  • Q: How is a mutex-lock implemented at a low level?
    A: Here is a partial answer.
  • Q: How many threads can one create?
    A: Read this page.
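To give a flavour of the mutex question, here is a toy spinlock written with C11 atomics. It is a sketch under my own assumptions, not how any particular pthreads library implements its mutex: the only idea it shows is that a lock rests on an atomic test-and-set operation. Real mutexes typically also let waiting threads sleep with help from the operating system instead of spinning. The names toy_lock, toy_lock_try, toy_lock_acquire and toy_lock_release are made up for the illustration.

```c
#include <stdatomic.h>

/* A toy lock: one flag that is set while some thread holds the lock. */
typedef struct {
    atomic_flag held;
} toy_lock;

#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once. Returns 1 if we got the lock, 0 if it was already held.
   Test-and-set atomically sets the flag and returns its old value. */
int toy_lock_try(toy_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin (busy-wait) until the lock becomes free and we grab it. */
void toy_lock_acquire(toy_lock *l)
{
    while (!toy_lock_try(l)) {
        /* keep trying; a real mutex would let the thread sleep here */
    }
}

/* Clear the flag so that another thread can take the lock. */
void toy_lock_release(toy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A thread would wrap its critical section between toy_lock_acquire and toy_lock_release on a lock initialised with TOY_LOCK_INIT. Spinning wastes CPU time while waiting, which is one reason it is not used on its own for long waits.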

15
More about MPI. Deadlock. Collective communication. Parallel programming in Matlab. OH-pages 212-223, 233-239.
224-232, nonblocking communication, read if you like.


20
First half of OpenMP. OH-pages 240-259.
We decided the exam schedule. I will put up a booking list outside my office tomorrow.
New to me: an article about Kazushige Goto, the person behind the Goto-BLAS routines.
22
The rest of OpenMP.
I added some hints to the second MPI-lab.

  May 27: The Uniprocessor lab is graded. Some of you have received a number of comments. Think about them. You will find the labs outside my office.

May 28, 17:45: I have now graded all the pthreads, MPI and OpenMP labs. If you have not received a mail about it, send me a mail.