Lec 17: Process Groups and Terminal Signaling
1 Pipelines and Process Groups
In the last lesson and lab, we've been discussing job control and
the mechanisms that enable it. Generally, job control is a feature
of the shell and supported by the terminal device driver. The shell
manages which jobs are stopped or running and notifies the terminal
driver which job is currently in the foreground. The terminal device
driver listens for special keys, like Ctrl-c or Ctrl-z, and
delivers the appropriate signal to the foreground process, like
terminate or stop.
That narrative is fairly straightforward as long as there is only one process running within a job, but a job may contain more than one process, which could complicate the actions of the terminal device driver. Additionally, jobs can be further grouped together into sessions, and the mechanisms that enable all this interaction require further discussion. In this lesson, we will explore process grouping and how these operating system services support the job control and shell features we've grown to rely on (and love?).
1.1 Pipeline of processes
Consider the following pipeline:
sleep 10 | sleep 20 | sleep 30 | sleep 50 &
Here we have four different sleep commands running in a
pipeline. The sleep command doesn't read from or write to the terminal;
it just sleeps for that many seconds and then exits. None of the
sleep commands is blocking or waiting on input from another sleep
command, so they can all run independently. We just happened to put
them in a pipeline, but what is the impact of that? How long will
this job take to complete?
One possibility is that each sleep command will run in
sequence. First sleep 10 runs, then sleep 20, then sleep 30, and
finally sleep 50, and thus it would take
10+20+30+50 = 110 seconds for the pipeline to finish. Another
possibility is that they all run at the same time, that is, concurrently
or in parallel, in which case the job would complete when the
longest sleep finishes, after 50 seconds.
These two possibilities, in sequence and in parallel, also describe two possibilities for how a pipeline is executed. In sequence would imply that the shell forks the first item in the pipeline and lets it run to completion, then forks the second item and lets it run, and so on. In parallel would imply that the shell forks all the items in the pipeline at once and lets them run concurrently. The major difference between these two choices is that a pipeline executing in sequence would have a single process running at a time for each job, while a pipeline executing in parallel would have multiple processes running at once per job.
By now, hopefully, you've already plugged that pipeline into the
shell and found out that, yes, the pipeline executes in parallel,
not in sequence. We can see this as well using the ps command.
#> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 4128
#> ps -o pid,args
  PID COMMAND
 3981 -bash
 4125 sleep 10
 4126 sleep 20
 4127 sleep 30
 4128 sleep 50
 4129 ps -o pid,args
1.2 Process Grouping for Jobs
The implication of this discovery, that all processes in the pipeline run concurrently, is that the shell must fork each of the processes individually. But then, how are these processes linked? They are supposed to be a single job after all, and we also know that the terminal device driver is responsible for delivering signals to the foreground job. There must be some underlying mechanism that enables this behavior, and, of course, there is.
The operating system provides a number of ways to group processes together. Processes can be grouped into both process groups and sessions. A process group is a way to group processes into distinct jobs that are linked, and a session is a way to link process groups under a single interruptive unit, like the terminal.
The key to understanding how the pipeline functions is that all of
these processes are placed in the same process group, and we can see
that by running the pipeline again. This time, however, we also
request that ps output the parent pid (ppid) and the process
group id (pgid) in addition to the process id (pid) and the command
arguments (args).
#> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 4134
#> ps -o pid,pgid,ppid,args
  PID  PGID  PPID COMMAND
 3981  3981  3980 -bash
 4131  4131  3981 sleep 10
 4132  4131  3981 sleep 20
 4133  4131  3981 sleep 30
 4134  4131  3981 sleep 50
 4135  4135  3981 ps -o pid,pgid,ppid,args
Notice first that the shell, bash, has a pid of 3981 and a process
group id (pgid) that is the same. The shell is in its own process
group. Similarly, the ps command itself also has a pid that is the
same as its process group id. However, the sleep commands are all in
process group 4131, which is also the pid of the first process
in the pipeline. To summarize: bash is alone in process group 3981,
the four sleep processes share process group 4131, and ps is alone
in process group 4135.
As you can see, the rule of thumb for process grouping is that processes executing as part of the same job, e.g., a single pipeline given as input to the shell, are placed in the same group. Also, the process group id chosen is the pid of the first process in the job.
2 Programming with Process Groups
Below, we will look at how to program with process groups using system calls, and we will investigate this from the perspective of the programmer as well as how the shell automatically groups processes. We will use a series of fairly straightforward system calls, and to bootstrap that discussion, we outline them below with brief descriptions.
Retrieving pid's or pgid's:
pid_t getpid(): get the process id of the calling process
pid_t getppid(): get the process id of the parent of the calling process
pid_t getpgrp(): get the process group id of the calling process
pid_t getpgid(pid_t pid): get the process group id of the process identified by pid
Setting pgid's:
pid_t setpgrp(): set the process group of the calling process to itself, i.e., after a call to setpgrp(), the condition getpid() == getpgrp() holds
int setpgid(pid_t pid, pid_t pgid): set the process group id of the process identified by pid to pgid; if pid is 0, set the process group id of the calling process, and if pgid is 0, the pid of the process identified by pid is used as its process group id, i.e., setpgid(0,0) is equivalent to calling setpgrp()
2.1 Retrieving the Process Group
Each process group has a unique process group identifier, or pgid,
which is typically the pid of a process that is a member of the
group. Upon a fork(), the child process inherits the parent's
process group. We can see how this works with a small program that
forks a child and prints the process group identifiers of both parent
and child.
/*inherit_pgid.c*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char * argv[]){

  pid_t c_pid, pgid, pid;

  c_pid = fork();

  if(c_pid == 0){
    /* CHILD */
    pgid = getpgrp();
    pid = getpid();
    printf("Child: pid: %d pgid: *%d*\n", pid, pgid);

  }else if (c_pid > 0){
    /* PARENT */
    pgid = getpgrp();
    pid = getpid();
    printf("Parent: pid: %d pgid: *%d*\n", pid, pgid);

  }else{
    /* ERROR */
    perror(argv[0]);
    _exit(1);
  }

  return 0;
}
Here is the output of running this program.
#> ./inherit_pgid
Parent: pid: 3630 pgid: *3630*
Child: pid: 3631 pgid: *3630*
Notice that the process groups are the same, and that's because a child inherits the process group of its parent. Now let's look at a similar program that doesn't fork, and instead just prints the process group identifier of itself and its parent, which is the shell.
/*getpgrp.c*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char * argv[]){

  pid_t pid, pgid;   //process id and process group for this program
  pid_t ppid, ppgid; //process id and process group for the _parent_

  //current
  pid = getpid();
  pgid = getpgrp();

  //parent
  ppid = getppid();
  ppgid = getpgid(ppid);

  //print the pid and pgid of this process and of its parent
  printf("%s: (current) pid:%d pgid:%d\n", argv[0], pid, pgid);
  printf("%s: (parrent) ppid:%d pgid:%d\n", argv[0], ppid, ppgid);

  return 0;
}
If we were to run this program in the shell, you might expect the program and its parent to report the same process group. Why shouldn't this be the case? The program is the result of a fork from the shell, so the parent is the shell and the child is the program, and in the previous example the parent and child had the same process group. But, looking at the output, that is not what occurs here.
#> ./getpgrp
./getpgrp: (current) pid:3760 pgid:3760
./getpgrp: (parrent) ppid:369 pgid:369
Instead, we find that the parent, which is the shell, is not in the
same process group as the child, the getpgrp program. Why is that?
This is because the new process is also a job in the shell and each
job needs to run in its own process group for the purpose of terminal
signaling. What we can now recognize from these examples, starting
with the pipeline of sleep commands, is that a shell will fork each
process separately in a job and assign the process group id based on
the first child forked, as is clear upon further inspection of the
output of these two examples:
#> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 4134
#> ps -o pid,pgid,ppid,args
  PID  PGID  PPID COMMAND
 3981  3981  3980 -bash
 4131  4131  3981 sleep 10
 4132  4131  3981 sleep 20
 4133  4131  3981 sleep 30
 4134  4131  3981 sleep 50
 4135  4135  3981 ps -o pid,pgid,ppid,args

#> ./inherit_pgid
Parent: pid: 3630 pgid: *3630*
Child: pid: 3631 pgid: *3630*
2.2 Setting the Process Group
Finally, now that we have learned to identify the process group,
the next thing to do is to assign new process groups. There are two
functions that do this: setpgrp() and setpgid().
setpgrp(): sets the process group of the calling process to itself. That is, the calling process joins a process group of one, containing only itself, where its pid is the same as its pgid.
setpgid(pid_t pid, pid_t pgid): sets the process group of the process identified by pid to pgid. If pid is 0, then it sets the process group of the calling process to pgid. If pgid is 0, then it sets the process group of the process identified by pid to pid. Thus, setpgid(0,0) is the same as setpgrp().
Let's consider a small program that sets the process group of the
child after a fork using a setpgrp() call from the child. The
program below will print the process id's and process groups from
the child's and parent's perspective.
/*setpgrp.c*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char * argv[]){

  pid_t cpid, pid, pgid, cpgid; //process id's and process groups

  cpid = fork();

  if( cpid == 0 ){
    /* CHILD */

    //set process group to itself
    setpgrp();

    //print the pid, and pgid of child from child
    pid = getpid();
    pgid = getpgrp();
    printf("Child: pid:%d pgid:*%d*\n", pid, pgid);

  }else if( cpid > 0 ){
    /* PARENT */

    //print the pid, and pgid of parent
    pid = getpid();
    pgid = getpgrp();
    printf("Parent: pid:%d pgid: %d \n", pid, pgid);

    //print the pid, and pgid of child from parent
    cpgid = getpgid(cpid);
    printf("Parent: Child's pid:%d pgid:*%d*\n", cpid, cpgid);

  }else{
    /*ERROR*/
    perror("fork");
    _exit(1);
  }

  return 0;
}
And, here's the output:
#> ./setpgrp
Parent: pid:20178 pgid: 20178
Parent: Child's pid:20179 pgid:*20178*
Child: pid:20179 pgid:*20179*
Clearly, something is not right. The pgid the child reports for itself (20179) differs from the pgid the parent reports for the child (20178). What we have here is a race condition: when two processes run in parallel, you do not know which one will reach a given point first.
Consider the two possibilities for how the above program executes following the fork. In one, the child runs before the parent, and the process group is already set by the time the parent reads it. In the other, the parent runs first and reads the child's process group before the child gets a chance to set it. It is the latter that we see above: the parent ran before the child, and thus reported the old pgid.
To avoid these issues, when setting the process group of a child,
you should set it in both the parent and the child, with setpgid()
or setpgrp(), before anything depends on those values. In this way,
it will not matter which runs first, the parent or the child; the
result is always the same, and the child is placed in the
appropriate process group. Below is an example of that and the
output.
/*setpgid.c*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char * argv[]){

  pid_t cpid, pid, pgid, cpgid; //process id's and process groups

  cpid = fork();

  if( cpid == 0 ){
    /* CHILD */

    //set process group to itself
    setpgrp(); //<---------------------------!

    //print the pid, and pgid of child from child
    pid = getpid();
    pgid = getpgrp();
    printf("Child: pid:%d pgid:*%d*\n", pid, pgid);

  }else if( cpid > 0 ){
    /* PARENT */

    //set the process group of child
    setpgid(cpid, cpid); //<------------------!

    //print the pid, and pgid of parent
    pid = getpid();
    pgid = getpgrp();
    printf("Parent: pid:%d pgid: %d \n", pid, pgid);

    //print the pid, and pgid of child from parent
    cpgid = getpgid(cpid);
    printf("Parent: Child's pid:%d pgid:*%d*\n", cpid, cpgid);

  }else{
    /*ERROR*/
    perror("fork");
    _exit(1);
  }

  return 0;
}
#> ./setpgid
Parent: pid:20335 pgid: 20335
Parent: Child's pid:20336 pgid:*20336*
Child: pid:20336 pgid:*20336*
3 Process Groups and Terminal Signaling
Where process groups fit into the ecosystem of process settings is
within the terminal settings. Let's return to the terminal control
function, tcsetpgrp(). Before, we discussed this function as
setting the foreground process, but as its name suggests,
tcsetpgrp() actually sets the foreground process group.
3.1 Foreground Process Group
This distinction is important because of terminal signaling. We know
now that when we execute a pipeline, the shell will fork all the
processes in the job and place them in the same process group. We also
know that when we use special control keys, like Ctrl-c or Ctrl-z,
the terminal will deliver special signals to the
foreground job, such as indicating to terminate or stop. For
example, this sequence of shell interaction makes sense:
#> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 24253
#> ps
  PID TTY          TIME CMD
 4038 pts/3    00:00:00 bash
24250 pts/3    00:00:00 sleep
24251 pts/3    00:00:00 sleep
24252 pts/3    00:00:00 sleep
24253 pts/3    00:00:00 sleep
24254 pts/3    00:00:00 ps
#> fg
sleep 10 | sleep 20 | sleep 30 | sleep 50
^C
#> ps
  PID TTY          TIME CMD
 4038 pts/3    00:00:00 bash
24255 pts/3    00:00:00 ps
We started the sleep commands in the background, we see that there
are 4 instances of sleep running, and we can move them from the
background to the foreground, where they are signaled with Ctrl-c to
terminate via the terminal. All good, right? There is something
missing: given that there are multiple processes running in the
foreground, how does the terminal know which of those to send the
stop or terminate signal? How does it differentiate which processes
are in the foreground?
The answer is that the terminal does not identify foreground processes individually. Instead, it identifies a foreground process group. All processes associated with the foreground job are in the foreground process group, and instead of signaling processes individually, both the shell and the terminal think of execution in terms of process groups.
3.2 Orphaned Stopped Process Groups
Process group interaction has other side effects when you consider
programs that fork children. For example, consider the program
(orphan) below, which simply forks a child, and then both child and
parent loop forever:
/*orphan.c*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char * argv[]){

  pid_t cpid;

  cpid = fork();

  if( cpid == 0 ){
    /* CHILD */

    //child loops forever!
    while(1);

  }else if( cpid > 0 ){
    /* PARENT */

    //parent loops forever
    while(1);

  }else{
    /*ERROR*/
    perror("fork");
    _exit(1);
  }

  return 0;
}
If we were to run this program, we can see that, yes, indeed, it
forks and now we have two instances of orphan running in the same
process group.
#> ./orphan &
[1] 24468
#> ps -o pid,pgid,ppid,comm
  PID  PGID  PPID COMMAND
 4038  4038  4037 bash
24468 24468  4038 orphan
24469 24468 24468 orphan
24470 24470  4038 ps
Moving the orphan program to the foreground, it can then be
terminated by the terminal using Ctrl-c.
#> fg
./orphan
^C
#> ps -o pid,pgid,ppid,comm
  PID  PGID  PPID COMMAND
 4038  4038  4037 bash
24471 24471  4038 ps
The termination applies to both parent and child, which is as expected since they are both in the foreground process group. While we might have expected an orphan to be created, this does not occur. However, let's consider the same program, but this time the child is placed in a different process group than the parent:
/*orphan_group.c*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char * argv[]){

  pid_t cpid;

  cpid = fork();

  if( cpid == 0 ){
    /* CHILD */

    //set process group to itself
    setpgrp();

    //child loops forever!
    while(1);

  }else if( cpid > 0 ){
    /* PARENT */

    //set the process group of child
    setpgid(cpid, cpid);

    //parent loops forever
    while(1);

  }else{
    /*ERROR*/
    perror("fork");
    _exit(1);
  }

  return 0;
}
Let's do the same experiment as before:
#> ./orphan_group &
[1] 24487
#> ps -o pid,pgid,ppid,comm
  PID  PGID  PPID COMMAND
 4038  4038  4037 bash
24487 24487  4038 orphan_group
24488 24488 24487 orphan_group
24489 24489  4038 ps
#> fg
./orphan_group
^C
#> ps -o pid,pgid,ppid,comm
  PID  PGID  PPID COMMAND
 4038  4038  4037 bash
24488 24488     1 orphan_group
24490 24490  4038 ps
This time, yes, we see that we have created an orphan process. This
is clear from the PPID field, which indicates that the parent of
the remaining orphan_group process is now init (pid 1), which
inherits all orphaned processes. This happens because the signal
generated by Ctrl-c is delivered to the foreground process group
only, but the child is not in that group. The child is in its own
process group and never receives the signal, and, thus, never
terminates. It just continues on its merry way, never realizing
that it just lost its parent. In this example lies the danger of
using process groups: it's very easy to create a bunch of orphans
that will just carry on if not killed. To rid yourself of them, you
must explicitly kill them with a command like killall:
#> killall orphan_group
#> ps -o pid,pgid,ppid,comm
  PID  PGID  PPID COMMAND
 4038  4038  4037 bash
24494 24494  4038 ps
And good riddance …