Diary
This page will contain a diary of the course, i.e. a short description of the lectures, handouts etc. The table will be filled in as we go along.
If printing is a problem, it may be because you have not defined a default printer. This is how you do it: Choose Preferences from the Applications menu. Then pick More Preferences and then click on Default Printer. Click on Set Default when you have marked the correct printer. (There should be a label on the printer itself, giving the printer's name.)
About registration: if you are a Chalmers student and if you have not registered you should contact your Student Centre. GU-students and PhD-students do not have to register this way.
Day | Activity | Comments | Handouts |
Tue | Introduction. | No lab today. | 1.pdf Remove "math." from my e-mail address on page 2 of the handouts. |
Thu | Fortran90 (77) and C. OH-pages 16-34 (28-32 on your own). | 2.pdf | |
Tue | cpp, man, indent, lint/splint, LDA in Fortran77, dangerous things in C and Fortran, make. OH-pages 35-45, 51-56. | About pretty printers for Fortran. Here are a few pages with references to such software (search for pretty).
I have not tested all the programs. http://www.ifremer.fr/ditigo/molagnon/fortran90/ (I tried this on a small example, and it worked fine. There is a Makefile and manual pages.) http://www.ifremer.fr/ditigo/molagnon/fortran90/engfaq.html http://www.faqs.org/faqs/fortran-faq/ http://www.fortran.com/ http://ray.met.fsu.edu/~bret/fpret.html (old) http://www.math.utah.edu/~beebe/software/fortran-tools.html (old) Kalle Kempe (one of the course participants) sent me the following link about HPC on the Mac: http://hpc.sourceforge.net/ | 3.pdf |
Thu | Computer architecture. Showed some pages from the AMD Opt. Ref. Manual, the IBM-Power4-manual and a page from "Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel and AMD CPU's" as well. OH-pages 57-74. | Read the free chapter in Triebel's book. Read about the Pentium 4. Look at the end of this page and fetch the Architecture Optimization Reference Manual; read the first chapter. A good, but technical, reference for the AMD64-machines (the student machines) is Appendix A in the Software Optimization Guide for AMD64 Processors. My lecture does not cover all the terminology used in the above manuals. You can find "Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel and AMD CPU's" on this page. |
|
Tue | Virtual memory. Code optimization. OH-pages 75-97 (read 77-79 on your own). | ||
Thu | More on the Horner benchmark. A few words about const. Code optimization. OH-pages 98-113. | Here is the link to Sun's double precision routines for elementary functions. | 4.pdf |
Tue | The last part of optimization. Handed out a quick reference guide (PostScript) for the BLAS and talked about it for some time. Profiling. Valgrind. OH-pages: 114-129. | A typo: OH-page 128, middle of page, it should say cg_annotate --pid source-file (so --pid and not -pid). |
|
Thu | PAPI, gprof, gcov. Calling a Lapack-routine from Fortran and C. OH-pages: 129-147. | ||
Tue | Answers to some questions from the previous lecture (see handouts in the right column). Mex-files, static libraries. OH-pages: 148-158. | Here are the handouts (ordinary text files): complex_ex, Matlab_profiling |
|
Thu | The rest of libraries. Discussion of the copy-operation in the Mex-file. Intro. to parallel computing. OH-pages 158-178. | 5.pdf | |
Tue | A simple example. Speedup, Amdahl's law. Load balancing. More terminology. Threads. OH-pages 179-189, 198-202. Read 190-197 on your own (if you are interested). | Returned the Matlab-labs. If you have not picked up yours you can find it in the plastic magazine holder outside my office. Some questions I got during the break (and the answers):
|
|
Thu | PVM and MPI. Point-to-point communication. Deadlock. Collective communication. OH-pages 203-221. | ||
Tue | The rest of MPI. Parallel programming in Matlab. First half of OpenMP. OH-pages 222-223, 233-252. Read 224-232 (nonblocking communication in MPI) on your own (if you are interested). |
Typos: Page 236: the for-loop over n should only loop over 800, 1600 and 3200. After the program: "yo" should be "you". Page 237, the first table: the time for x = A \ b, when n = 1600, should be 0.6, not 9.6. |
|
Thu | Info. about the exam. The dates will be June 1 & 2. I have put up a booking list outside my office. More OpenMP. OH-pages 253-268. |
Typo on page 258: it should say a = 2.0 * cos(b) + 3.0 * sin(c), and parallel workshare shared(a, b, c) |
|
Tue | The second case study. OH-pages 267-278. | ||
May 27 | I have graded the uniprocessor lab. Look in the plastic magazine holder outside my office. I have graded the pthreads lab. If you have not received a complaint by e-mail, the code is OK. |