Diary
This page will contain a diary of the course, i.e. a short description of the lectures. The table will be filled in as we go along.
How to register (a PDF-file)
Introduction to C, tcsh and bash
springer.pdf (read if you like)
Some old handouts; I no longer lecture on these topics (read if you like):
Process control under unix, interprocess communication, nonblocking communication
Day | Activity | Comments |
Tue | Introduction. Course-adm. Registration. First part of programming languages for HPC. OH-pages 1-16. | No lab today |
Thu | Fortran, OH-pages 17-26. | |
Tue | LDA in Fortran77, dangerous things in C and Fortran. Make. Computer architecture. OH-pages 27-44. | |
Thu | Computer architecture. OH-pages 45-62. | Read the free chapter in Triebel's book. Fetch the "Architecture Optimization Reference Manual" from this page. Skim through chapter 2, "INTEL® 64 AND IA-32 PROCESSOR ARCHITECTURES". My lecture does not cover all the terminology used in the above manuals, so it may be hard to understand every small detail. You can find "Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel and AMD CPU's" on this page. Typo page 53, penultimate line: Division is still slow. |
Tue | Virtual memory, code optimization. OH-pages 63-78. | Typo on page 77, in the prototype for add: change int n to double f, int n. |
Thu | Note in the handouts: Specifications of the student machines (see under Comments to the right). Hints to one lab. More about aliasing. Code optimization contd. OH-pages 79-86. |
The model is Intel Core i5-650 (4M Cache, 3.20 GHz); see http://ark.intel.com/Product.aspx?id=43546 for some technical details. I fetched a new and better cpuid-code from http://linux.softpedia.com (search for cpuid) and ran it on a student machine. Here are some of the results. The student machines have a maximum clock frequency of 3.2 GHz. They have two cores with hyper-threading, two threads each (so /proc/cpuinfo and top list four cores). Here is some information about the caches. L1 instruction cache: 32K, 4-way, 64-byte lines; L1 data cache: 32K, 8-way, 64-byte lines; L2 cache: 256K, 8-way, 64-byte lines; L3 cache: 4M, 16-way, 64-byte lines. |
Tue | More on code optimization. Handed out a quick reference guide (PostScript) for the BLAS and talked about it for some time. OH-pages 87-101. | |
Thu | Profiling. Valgrind. PAPI, gprof, gcov. Calling a Lapack-routine from Fortran. Calling a Lapack-routine from C. OH-pages 102-125. | I have added some information for the Lapack- and MEX-labs (how to get things working on the student machines). I have updated OH-page 59 (details about the student machines) and page 133 (Mex-files on the student machines). Changed the last sentence on page 128 as well. Update 2011-04-29: I have updated the threads-example in the handouts so it works on the 64-bit student system (pages 170-173). |
Tue | Mexfiles and libraries. OH-pages 126-133, 137-145. | Returned lab1. If you were not there, you can find the lab in the plastic magazine holder outside my office. G (for Godkänt) means a passing grade (there is only pass/no pass on the labs). OH-page 130 is updated to take care of the case when bandsolve is called with only one output argument. |
Thu | Answers to two questions (a text file). Tar-files. Intro. to parallel computing. OH-pages 146-167. |
May 5 at 18:50. Updated the text for the inlining lab. Added a hint to the Lapack-lab (about how to link with the RedHat Lapack-library). |
Tue | Deadlock and other communication issues. POSIX-threads, intro. to PVM and MPI. OH-pages 166-183. | |
Thu | The rest of MPI and the beginning of OpenMP. OH-pages 184-207. | Bring your calendar next time. We are going to decide the exam-schedule. |
Tue | More OpenMP, OH-pages 208-227. | Changed last due date to May 20. Handed back lab 2 today. Decided exam dates. I have put up a booking schedule outside my office, where you can sign up for a time. On OH-page 227 we need to add a barrier, as I said during the lecture, so add the following two lines above the comment "// Add the partial...": first "// Must wait for all partial sums to be ready", then "#pragma omp barrier". Answers to two questions: Reduction variables and subtraction (a text-file). Are there other HPC-courses? Here is an incomplete list:
|
Thu | The rest of OpenMP. Did not have time for the second case study. OH-pages 228-243. | Course evaluation. |