CST 334 week 5 (29/100)

This week we covered concurrency using multithreaded programming with the POSIX API. Concurrency addresses two problems: exploiting parallelism and avoiding blocking on I/O. This is important to operating systems because it allows better sharing of hardware resources. Threads provide independent execution contexts within a process. Each thread has its own program counter, stack, and registers, but all threads share the same address space, which makes them cheaper to create than processes. One thing to be careful about is critical sections: regions of code where multiple threads access a shared resource. We can't allow them to operate on that resource simultaneously, or we get a race condition, where the output depends on the timing of thread execution. A program whose output differs from one run to the next is called indeterminate. To avoid this we use locks to guard the shared data so that only one thread can access it at a time - mutual exclusion. Atomic operations are operations that execute completely or not at all; they are the building blocks for implementing mutual exclusion with locks.


We reviewed the pthreads API, including create, join, lock, unlock, wait, and signal. Create makes a new thread. Join sleeps until a given thread completes. Lock deals with mutexes: if the mutex is free, the calling thread acquires it and proceeds; otherwise it blocks until the mutex is released. Unlock releases a mutex. Wait puts a thread to sleep and releases its mutex atomically. Signal wakes up a thread waiting on a condition variable, and the woken thread has to re-acquire its mutex before resuming. An important note: always wrap pthread_cond_wait in a while loop that re-checks the condition, since a woken thread may find the condition has changed again or may wake spuriously.


Implementing locking is complicated and is best done with hardware instructions that guarantee atomicity. These approaches are evaluated on correctness, fairness, and performance. Test-and-set atomically reads and writes a memory location in one instruction. It is correct, but it produces a spin lock, which wastes CPU time in a tight loop while waiting for a held lock to free. Compare-and-swap is like the previous but only writes the new value if the old one matches what we expect. This gives better control over an atomic update by ensuring that a thread only writes when it started from a known-good state, i.e. the expected old value. Fetch-and-add enables a ticket lock. It's like a deli counter where each thread takes a number and they take turns in FIFO order.


To avoid spinning while waiting for locks, we can call yield() to give up the CPU, but that can leave a thread repeatedly losing the race for the lock. Better is to queue and sleep with park/unpark so the sleeping threads are woken up in order. In the real world, Linux provides futex_wait and futex_wake, which perform well when there is no lock contention. pthread_mutex uses a two-phase approach built on futexes: it spins briefly in case the lock frees quickly, then sleeps, getting the best performance while also being correct and fair.


Lastly, we examined some concurrent data structures and how to integrate locking into their access patterns. This was reminiscent of our work in Android programming and in databases, where we used table locks in both classes to ensure consistent concurrent access to data. Understanding how this works at a low level is enlightening. It also ties back to the previous section on virtual memory: last week we looked at how to guarantee each process exclusive access to its memory, while threads give us a way to scale program execution with shared resources. This is a critical aspect of modern computing because it minimizes the cost of program execution, but we have to almost break virtualization to do it. The limit of virtualization then sits at the process level, and within a process the operating system gives us more tools to eke out some performance gains.
