CST 334 Week 2
Josh Goldberg
CST 334-40_2262
Mar 17, 2026
This week we learned about processes: what a process is, how it differs from a program, and how the OS manages processes. A process is a running instantiation of a program and has state; the program is just code and data on disk, which becomes a process when the OS loads it and starts executing the code.
Last week we talked about CPU virtualization, which is the gateway to this week. Through virtualization, each process gets to behave as if it has its own CPU, and the OS has time-sharing mechanisms and policies by which it shares the CPU between processes. Mechanisms describe how things work; policies describe what the mechanisms act on. For example, how the OS performs a context switch (mechanism) vs. how it decides which process to switch in and out of the running state (policy).
Each process's state includes its address space reserved for code, heap, and stack; the contents of its registers; and information about any I/O the process is attached to. On a context switch, all of that data must be saved. Processes can transition from ready to running, running to ready, running to blocked, and blocked to ready. Other state transitions are verboten.
We discussed how a process is created and the difference between fork and exec. Back before multithreading was easy, this is how we did business! fork() returns PID 0 to the child and the child's PID to the parent; a negative return value means the fork failed. With fork, the program code manages execution flow. A child process becomes a zombie if it exits and the parent never calls wait() or waitpid() to reap it; a child whose parent exits first becomes an orphan and gets adopted by init. exec does not return on success; it replaces the calling process's image, handing execution flow over to the new program.
Signals are used in the OS to control processes. Processes can register signal handlers in their code to catch specific signal types and take action specific to the program's design. SIGINT, SIGKILL, SIGTSTP, SIGTERM, and SIGHUP are my favorites.
We learned about the mechanisms of switching between user mode and kernel mode and how this works with system calls and traps. While some processes are cooperative and yield the CPU nicely, some are not and require a timer interrupt to force a context switch, e.g. a long-running loop that never makes a system call. Interrupts can be blocked during interrupt handling.
Some scheduling policies were reviewed and measured in terms of response time and turnaround time, usually looking at the averages and how the policies performed differently. FIFO (first in, first out) runs the oldest job next. SJF runs the shortest job first, non-preemptively. STCF is the preemptive version: whenever a new job arrives, it runs whichever job has the shortest remaining time to completion. SJF and STCF require foreknowledge of how long each job will take. RR (round robin) runs jobs in equal time slices. RR has the best response time; SJF/STCF have the best turnaround time.
MLFQ is the multi-level feedback queue. It is designed to optimize turnaround time without the oracle problem of knowing ahead of time how long a job will take. If two jobs have equal priority, round-robin between them; otherwise run the higher-priority job. New jobs get top priority, and a job's priority is reduced once it uses up its time allotment. Periodically, all jobs get boosted back to the top priority so long-running jobs don't starve. Many OSes use something like this, since these rules handle most edge cases that would lead to poor treatment of some jobs. On Linux, nice values range from -20 to 19, and we use nice to change a process's priority manually.
What was the worst part of this week? I am really bad at transcribing numbers. After several runs, and after doing some schedules in spreadsheets instead of on paper, I got the hang of the calculations, and that practice helped me develop a good system for doing them by hand.
Whereas this week we focused on scheduling processes onto the CPU, I expect we'll next talk about the other side: scheduling for hardware devices other than the CPU, and virtualizing I/O. This week also made me think about userland async event loops and how they share some similarities with lower-level scheduling in terms of managing those kinds of patterns. What is the best scheduling policy on an RTOS, where deterministic timing is a requirement but we also sometimes need to optimize turnaround time? Maybe the oracle problem doesn't exist there, since on an RTOS all job timing can be pre-calculated. Maybe in that case the right policy depends on the applications that are running, to give the most responsive system without starving long-running jobs.