Resources
A process is one of the most fundamental abstractions that the operating system (OS) provides. A process is a running program. The program itself is a file waiting on disk that is ready to spring into action.
How do we provide the illusion of multiple CPUs?
The OS creates this illusion by virtualizing the CPU. This is a technique known as time sharing the CPU. To do this, the OS needs two things: low-level machinery mechanisms and high-level intelligence.
For example, context switching allows the OS to stop running one program and start running another on a given CPU. This is a time sharing mechanism employed by modern OSes. On top of these mechanisms you’ll find the intelligence of the OS in the form of policies. Policies are algorithms for the OS to make decisions.
Processes
What makes up a process? Well let’s get into what comprises machine state, what a computer can read or update when it’s running:
- program counter (PC) or the instruction pointer (IP) which tells us which instruction to execute next
- stack pointer and associated frame pointer manage the stack for function parameters, local variables, and return addresses
- its memory (address space)
Process API
These APIs in some form are available on any modern OS:
- create
- destroy
- wait
- miscellaneous control
- status
Process Creation
How are programs transformed into processes? To start, we need to load the program’s code and any static data (variables) into memory and into the address space of the process. This loading is done lazily, but we’ll get into that more when we talk about paging and swapping.
Then, memory must be allocated for the program’s run-time stack (variables, function parameters, and return addresses). Then, the OS may allocate memory for the program’s heap. The OS also has to do some initialization tasks related to I/O (input/output). All of this then sets the stage for the execution of a program.
Process States
Now we’ll get into what different states a process can be in while running:
- running
- ready
- blocked
Here’s an example with two processes running that only use the CPU and no I/O:
Time | Process0 | Process1 | Notes |
---|---|---|---|
1 | Running | Ready | |
2 | Running | Ready | |
3 | Running | Ready | |
4 | Running | Ready | Process0 now done |
5 | – | Running | |
6 | – | Running | |
7 | – | Running | |
8 | – | Running | Process1 now done |
An example with I/O, with decisions being made by the scheduler:
Time | Process0 | Process1 | Notes |
---|---|---|---|
1 | Running | Ready | |
2 | Running | Ready | |
3 | Running | Ready | Process0 initiates I/O |
4 | Blocked | Running | Process0 is blocked, |
5 | Blocked | Running | so Process1 runs |
6 | Blocked | Running | |
7 | Ready | Running | I/O done |
8 | Ready | Running | Process1 now done |
9 | Running | – | |
10 | Running | – | Process0 now done |
Data Structures
The OS itself is a program. So, it has key data structures that track various relevant pieces of information:
- process list
- register context (for context switching)
Summary
A process is a running program. Now that we understand the abstraction, we can get into the details about low-level mechanisms to implement processes and the high-level policies needed to schedule them intelligently. With mechanisms and policies, we will understand how an OS virtualizes the CPU!
Homework Answers
- The usage is 100% since both processes use the CPU 100% of the time with no I/O
- The first process takes 4 ticks, then 1 tick to issue the I/O, then 5 ticks to complete the I/O, and then 1 tick to specify the I/O is done. So:
Stats: Total Time 11
Stats: CPU Busy 6 (54.55%)
Stats: IO Busy 5 (45.45%)
- Since we issue the I/O call first, that process becomes blocked letting the other process run. So, the I/O process takes 1 tick on the CPU to start running, the and the CPU process takes over and uses the CPU for 4 ticks, finally the CPU is idle for 1 tick, then 1 tick to specify the I/O is done. So:
Stats: Total Time 7
Stats: CPU Busy 6 (85.71%)
Stats: IO Busy 5 (71.43%)
- We once again arrive at a total time of 11 when switching is disabled since the idle CPU cannot be used:
Stats: Total Time 11
Stats: CPU Busy 6 (54.55%)
Stats: IO Busy 5 (45.45%)
- With switching on (default but here we specified it with a flag) we get a more optimal result like before:
Stats: Total Time 7
Stats: CPU Busy 6 (85.71%)
Stats: IO Busy 5 (71.43%)
- We initiate the first I/O instruction but then don’t switch back to it until the CPU is ready again. This ends up saving all the rest of the I/O for the end since the subsequent processes are using 100% of the CPU. Then at the end it can finally issue the final I/O calls. This isn’t efficient since you could’ve had more I/O going on in the background while the CPU was being used.
Stats: Total Time 31
Stats: CPU Busy 21 (67.74%)
Stats: IO Busy 15 (48.39%)
- When we can immediately switch to run the I/O process that finished, we get more background I/O going and can more efficient use both CPU and I/O:
Stats: Total Time 21
Stats: CPU Busy 21 (100.00%)
Stats: IO Busy 15 (71.43%)
- For this one it really depends on the seed of the process list. But, I noted having
IO_RUN_IMMEDIATE
since that process that issued I/O will likely have more we can run in the background andSWITCH_ON_IO
is beneficial because you can switch to other tasks while other processes are blocked by I/O.