How is a mutex-lock implemented?

I was asked this question during the break. Here is a partial answer.
When I started working with threads I fetched a few pthread implementations and read part of the code. I have seen mutex code in C, but the one we are using is written in assembler at the lowest level. How can one find out what it looks like?

The pthread library is part of the GNU C library. To look at the code, fetch the tar file and unpack it. It is a big package; the number of C files and header files is roughly 8000. There is support for several platforms, and some directories contain code that is tailor-made for specific hardware. The pthread code is located in the directory nptl, directly under the glibc directory. cd to nptl and have a look at pthread_mutex_lock.c. There is a lot of stuff in the file, but I imagine that the actual locking is done by the line:

      /* We have to get the mutex.  */
      LLL_MUTEX_LOCK (mutex);


The macro LLL_MUTEX_LOCK is defined at the beginning of the file (using a #define). It expands to a call to a routine named lll_lock. The definition of this routine depends on the platform. The code for the lab computers can be found in the header file sysdeps/unix/sysv/linux/i386/lowlevellock.h. Note that the sysdeps directory contains code for several architectures. The code, in assembler, starts on line 286 of the header file. It seems that lll_lock calls __lll_lock_wait_private, which is defined in sysdeps/unix/sysv/linux/lowlevellock.c. Judging by the name, the function atomic_compare_and_exchange_val_acq seems to be responsible for performing an atomic operation. The function is defined in atomic.h in the top-level include directory. I kind of lost track after that :-).
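
The structure described above (try an atomic compare-and-exchange first, fall back to a wait routine if that fails) can be sketched in portable C11. This is only an illustration, not the glibc code: the names sketch_lock, sketch_lock_acquire, sketch_lock_wait, sketch_lock_release and run_counter_demo are made up for this note, and the slow path simply calls sched_yield and retries, as a stand-in for the real wait routine, which on Linux puts the thread to sleep in the kernel using a futex.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock type for this sketch: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int state; } sketch_lock;

/* Slow path: retry the compare-and-exchange, giving up the CPU between
   attempts. (Assumption: a real implementation would sleep in the
   kernel here instead of yielding in a loop.) */
static void sketch_lock_wait(sketch_lock *l)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(
               &l->state, &expected, 1,
               memory_order_acquire, memory_order_relaxed)) {
        expected = 0;   /* a failed exchange overwrote expected */
        sched_yield();
    }
}

static void sketch_lock_acquire(sketch_lock *l)
{
    int expected = 0;
    /* Fast path: if the lock word is 0, atomically set it to 1. */
    if (atomic_compare_exchange_strong_explicit(
            &l->state, &expected, 1,
            memory_order_acquire, memory_order_relaxed))
        return;
    sketch_lock_wait(l);   /* somebody else holds the lock */
}

static void sketch_lock_release(sketch_lock *l)
{
    atomic_store_explicit(&l->state, 0, memory_order_release);
}

#define N_THREADS 4
#define N_ITERS   100000

static sketch_lock counter_lock;   /* static storage: starts unlocked */
static long counter;

/* Each thread increments the shared counter under the lock. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_ITERS; i++) {
        sketch_lock_acquire(&counter_lock);
        counter++;
        sketch_lock_release(&counter_lock);
    }
    return NULL;
}

/* Runs the threads and returns the final counter value. */
long run_counter_demo(void)
{
    pthread_t t[N_THREADS];
    counter = 0;
    for (int i = 0; i < N_THREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

If the lock does its job, run_counter_demo() returns N_THREADS * N_ITERS. Compile with something like gcc -std=c11 -pthread and call the function from main.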

Running the file through cpp to see the preprocessed source, I saw that an assembly instruction, cmpxchgl, is used. Page 65 of the AMD64 Architecture Programmer's Manual Volume 1: Application Programming finally confirms that this is an atomic operation. It says "The test and load are performed atomically, so that concurrent processes or threads which use the semaphore to access a shared object will not conflict."
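
To get a feeling for what a compare-and-exchange does without reading assembler, one can use the C11 function atomic_compare_exchange_strong from <stdatomic.h>. On x86, compilers typically translate it to a cmpxchg instruction with a lock prefix, but that is up to the compiler. The helper try_take below is a hypothetical name for this note: it compares the lock word with 0 and, only if they are equal, stores 1, all as one atomic step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Try to take a lock word. Succeeds (returns true) only if the word
   currently holds 0, in which case it is atomically set to 1.
   If the word holds something else, it is left unchanged and the
   value found is written into expected (ignored here). */
bool try_take(atomic_int *word)
{
    int expected = 0;   /* the value we hope to find */
    return atomic_compare_exchange_strong(word, &expected, 1);
}
```

Called twice on a word that starts at 0, the first call returns true and the second returns false, because the word is already 1. No other thread can sneak in between the comparison and the store, which is exactly the property a mutex needs.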